Transcript slides

The Complexity of
Information-Theoretic
Secure Computation
Yuval Ishai
Technion
2014 European School of Information Theory
Information-Theoretic Cryptography
• Any question in cryptography that makes sense
even if everyone is computationally unbounded
• Typically: unconditional security proofs
• Focus of this talk:
Secure Multiparty Computation (MPC)
Talk Outline
• Gentle introduction to MPC
• Communication complexity of MPC
– PIR, LDC, and related problems
• Open problems
How much do we earn?
(Figure: k=6 parties arranged in a ring, holding private salaries x1,…,x6.)
Goal: compute x1+…+x6 without revealing anything else
A better way?
(Figure: ring protocol. P1 picks a random r, 0≤r<M, and sends m1=r+x1 to P2;
each Pi forwards mi=mi-1+xi around the ring; finally P1 receives m6=m5+x6
and announces m6−r.)
Assumption: xi<M (say, M=10^10)
(+ and − operations carried modulo M)
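The ring protocol above can be written as a minimal Python sketch (not from the slides; function and variable names are illustrative):

```python
import random

def ring_sum(inputs, M):
    """Ring protocol: P1 masks its input with a random r in [0, M);
    each party adds its input to the running message; P1 unmasks at
    the end. Every message mi is uniform mod M, so no single party
    learns anything beyond the final sum."""
    r = random.randrange(M)
    m = (r + inputs[0]) % M          # m1 = r + x1
    for x in inputs[1:]:
        m = (m + x) % M              # mi = m(i-1) + xi
    return (m - r) % M               # P1 outputs mk - r

salaries = [30, 55, 42, 61, 28, 47]
assert ring_sum(salaries, M=10**10) == sum(salaries)
```

Correctness needs the assumption xi<M so that the true sum never wraps around the modulus.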
A security concern
(Figure: P2 receives m1=r+x1 and sends m2=m1+x2. If P1 and P3 collude,
they can recover x2 from m1 and m2.)
Resisting collusions
(Figure: every pair of parties (i,j) shares a random pad rij, e.g.
r12, r16, r25, r32, r43, r51, r65.)
Each party announces xi + inboxi − outboxi; the pads cancel in the sum.
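The collusion-resistant variant can be sketched as follows (a minimal Python illustration, not from the slides; here every ordered pair shares a pad, a slight simplification of the figure):

```python
import random

def pairwise_masked_sum(inputs, M):
    """Each party masks its input with pads shared with every other
    party; the pads cancel in the total, while any single party's
    announcement is uniformly random mod M."""
    k = len(inputs)
    # r[i][j] is a random pad sent from party i to party j.
    r = [[random.randrange(M) for _ in range(k)] for _ in range(k)]
    announced = []
    for i in range(k):
        inbox = sum(r[j][i] for j in range(k) if j != i)
        outbox = sum(r[i][j] for j in range(k) if j != i)
        announced.append((inputs[i] + inbox - outbox) % M)
    # Every pad appears once in an inbox and once in an outbox.
    return sum(announced) % M
```

Unlike the ring protocol, no proper subset of colluding parties can learn more than the sum of the honest parties' inputs.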
More generally
• P1,…,Pk want to securely compute f(x1,…,xk)
– Up to t parties can collude
– Should learn (essentially) nothing but the output
• Questions
– When is this at all possible?
– How efficiently?
• Information-theoretic (unconditional) security possible when t<k/2
[BGW88,CCD88,RB89]
• Computational security possible for any t
(under standard cryptographic assumptions) [Yao86,GMW87,CLOS02]
• Or: information-theoretic security using correlated randomness [Kil88,BG89]
• Similar feasibility results for security against malicious parties
More generally
• P1,…,Pk want to securely compute f(x1,…,xk)
– Up to t parties can collude
– Should learn (essentially) nothing but the output
• Questions
– When is this at all possible?
– How efficiently?
• Several efficiency measures:
communication, randomness, rounds, computation
• Typical assumptions for rest of talk:
* t=1, k = small constant
* information-theoretic security
* “semi-honest” parties, secure channels
Communication Complexity
Fully Homomorphic Encryption
Gentry ‘09
• Settles main communication complexity questions in
complexity-based cryptography
– Even under “nice” assumptions! [BV11]
• Main open questions
– Further improve assumptions
– Improve practical computational overhead
• FHE >> PKE >> SKE >> one-time pad
One-Time Pads for MPC
[IKMOP13]
(Figure: a trusted dealer gives correlated randomness RA to Alice(x) and
RB to Bob(y); both parties should output f(x,y).)
• Offline:
– Dealer sets R[u,v] = f[u−dx, v−dy] for random dx, dy
– Picks random RA, RB such that R = RA+RB
– Alice gets (RA, dx), Bob gets (RB, dy)
• Protocol on inputs (x,y):
– Alice sends u=x+dx, Bob sends v=y+dy
– Alice sends zA=RA[u,v], Bob sends zB=RB[u,v]
– Both output z=zA+zB (= R[u,v] = f(x,y))
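The offline/online phases above can be sketched in Python (illustrative names; inputs taken from Z_n and outputs from Z_p for concreteness):

```python
import random

def dealer(f, n, p):
    """Offline phase: shift f's truth table by random offsets (dx, dy)
    over the input domain Z_n, then additively share it mod p (the
    output domain)."""
    dx, dy = random.randrange(n), random.randrange(n)
    R = [[f((u - dx) % n, (v - dy) % n) for v in range(n)] for u in range(n)]
    RA = [[random.randrange(p) for _ in range(n)] for _ in range(n)]
    RB = [[(R[u][v] - RA[u][v]) % p for v in range(n)] for u in range(n)]
    return (RA, dx), (RB, dy)

def online(f, x, y, n, p):
    """Online phase: each party sends a masked input and a share of
    the output; perfect security against a semi-honest party."""
    (RA, dx), (RB, dy) = dealer(f, n, p)
    u, v = (x + dx) % n, (y + dy) % n    # masked inputs, sent in the clear
    zA, zB = RA[u][v], RB[u][v]          # exchanged output shares
    return (zA + zB) % p                 # = R[u,v] = f(x, y)
```

Note the offline cost: the dealer materializes the whole shifted truth table, which is exactly the "exponential offline communication" drawback discussed below.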
3-Party MPC for g(x,y,z)
• Define f((x,zA),(y,zB)) = g(x,y,zA+zB)
(Figure: Carol(z) plays the dealer: she splits z into random shares zA+zB
and hands (zA, RA) to Alice(x) and (zB, RB) to Bob(y); Alice and Bob then
run the 2-party protocol for f, so everyone can learn g(x,y,z).)
One-Time Pads for MPC
• The good:
– Perfect security
– Great online communication
• The bad:
– Exponential offline communication
• Can we do better?
– Yes if f has small circuit complexity
– Idea: process circuit gate-by-gate
• k=3, t=1: can use one-time pad approach
• k>2t: use “multiplicative” (aka MPC-friendly) codes
• Communication ∝ circuit size, rounds ∝ circuit depth
MPC vs. Communication Complexity
(Setting: three parties holding inputs a, b, c.)

              Communication Complexity     MPC
Goal          Each party learns f(a,b,c)   Each party learns only f(a,b,c)
Upper bound   O(n)  (n = input length)     O(size(f)) [BGW88,CCD88]
Lower bound   Ω(n)  (for most f)           Ω(n)  (for most f)

Big open question: poly(n) communication for all f?
(the “fully homomorphic encryption of information-theoretic cryptography”)
Question Reformulated
Is the communication complexity of MPC strongly correlated with
the computational complexity of the function being computed?
(Figure: within the space of all functions, the efficiently computable
functions admit communication-efficient MPC; for most functions no
communication-efficient MPC is known [KT00, IK04].)
• These problems (MPC communication, PIR, LDC) are closely related
Private Information Retrieval
[Chor-Goldreich-Kushilevitz-Sudan95]
(Figure: a client holding index i queries k servers, each storing a copy
of the database x∈{0,1}^n, and learns xi; no server learns i.)
Main question: minimize communication (logn vs. n)
Two flavors: “information-theoretic” vs. computational privacy
A Simple I.T. PIR Protocol
(Figure: database X arranged as an n^{1/2} × n^{1/2} bit matrix, held by
servers S1 and S2.)
• Client: pick random q1, q2 ∈ {0,1}^{n^{1/2}} with q1 + q2 = ei
(the indicator of the column containing xi)
• Server Sj replies aj = X·qj
• Client computes a1 + a2 = X·ei, the column of X containing xi
⇒ 2-server PIR with O(n^{1/2}) communication
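A minimal Python sketch of this 2-server scheme over GF(2) (illustrative names, not from the slides):

```python
import random

def pir_2server(x, i):
    """2-server PIR: arrange the n-bit database as an s x s matrix
    (s = sqrt(n)); the client secret-shares the indicator of its
    column, so each server's query q is uniformly random."""
    s = int(len(x) ** 0.5)
    X = [x[r * s:(r + 1) * s] for r in range(s)]   # database as s x s matrix
    row, col = divmod(i, s)
    q1 = [random.randrange(2) for _ in range(s)]   # uniform random query
    q2 = [q1[j] ^ (j == col) for j in range(s)]    # q1 + q2 = e_col over GF(2)
    # Each server answers with X*q (s bits): a GF(2) combination of columns.
    a1 = [sum(X[r][j] & q1[j] for j in range(s)) % 2 for r in range(s)]
    a2 = [sum(X[r][j] & q2[j] for j in range(s)) % 2 for r in range(s)]
    # a1 + a2 = X*e_col is the col-th column; the client reads entry `row`.
    return a1[row] ^ a2[row]
```

Queries and answers are s = n^{1/2} bits each, matching the O(n^{1/2}) bound.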
A Simple Computational PIR Protocol
[Kushilevitz-Ostrovsky97]
Tool: (linear) homomorphic encryption: E(a)·E(b) = E(a+b)
Protocol:
• Client sends E(ei) entrywise, e.g. E(0) E(0) E(1) E(0) (= c1 c2 c3 c4)
• Server, holding X as an n^{1/2} × n^{1/2} 0/1 matrix, replies with E(X·ei):
for each row it multiplies the ciphertexts cj with a 1 in that row
• Client decrypts to recover the i-th column of X
⇒ 1-server CPIR with ~O(n^{1/2}) communication
Why Information-Theoretic PIR?
Cons:
• Requires multiple servers
• Privacy against limited collusions
• Worse asymptotic complexity (with const. k):
2^{Õ(√logn)} [Yekhanin07, Efremenko09] vs.
polylog(n) [Cachin-Micali-Stadler99, Lipmaa05, Gilboa-I14]
Pros:
• Interesting theoretical question
• Unconditional privacy
• Better “real-life” efficiency
• Allows for very short (logarithmic) queries or very short
(constant-size) answers
• Closely related to locally decodable codes & friends
Locally Decodable Codes
(Encoding: x ∈ {0,1}^n ↦ y ∈ Σ^m; a decoder reads only k positions of y
to recover any desired bit xi.)
Requirements:
• High robustness: if < 1% of y is corrupted, xi is recovered w/prob > 0.51
• Local decoding: only k queries to y
Question: how large should m(n) be in a k-query LDC?
k=2: 2^{Θ(n)}
k=3: between ~n^2 and 2^{2^{Õ(√logn)}}
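The classic 2-query example behind the k=2 bound is the Hadamard code; a small Python sketch (illustrative, and only practical for tiny n since the code has length 2^n):

```python
import random
from itertools import product

def hadamard_encode(x):
    """2-query LDC of length 2^n: y[q] = <x,q> mod 2 for every q in {0,1}^n."""
    return [sum(xi & qi for xi, qi in zip(x, q)) % 2
            for q in product(range(2), repeat=len(x))]

def local_decode(y, n, i):
    """Read just two positions: x_i = <x,q> + <x,q+e_i> for a random q.
    Each probe is individually uniform, so if a delta fraction of y is
    corrupted, decoding succeeds with probability >= 1 - 2*delta."""
    q = [random.randrange(2) for _ in range(n)]
    q2 = q[:]
    q2[i] ^= 1                                   # flip the i-th coordinate
    idx = int("".join(map(str, q)), 2)           # q's position in the product order
    idx2 = int("".join(map(str, q2)), 2)
    return y[idx] ^ y[idx2]
```

By linearity, y[q] ⊕ y[q⊕e_i] = ⟨x, e_i⟩ = x_i whenever both probed positions are uncorrupted.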
From I.T. PIR to LDC
[Katz-Trevisan00]
Simplifying assumptions:
• Servers compute the same function of (x,q)
• Each query is uniform over its support set

k-server PIR with α-bit queries and β-bit answers
⇒ k-query LDC of length 2^α over Σ = {0,1}^β, via y[q] = Answer(x,q)
• Uniform PIR queries ⇒ “smooth” LDC decoder ⇒ robustness
• Binary LDC ⇒ PIR with one answer bit per server
• Arrows can be reversed
Applications of Local Decoding
• Coding
– LDC, Locally Recoverable Codes (robustness)
– Batch Codes (load balancing)
• Cryptography
– Instance Hiding, PIR (secrecy)
– Efficient MPC for “worst” functions
• Complexity theory
– Locally random reductions, PCPs
– Worst-case to average-case reductions,
hardness amplification
Complexity of PIR: Total Communication
• Mainly interesting for k=2
• Upper bound (k=2): O(n^{1/3}) [CGKS95]
– Tight in a restricted model [RY07]
• Lower bound (k=2): 5·logn [Man98,…,WW05]
• No natural coding analogue
Complexity of PIR: Short Answers
• Short answers = O(1) bit from each server
– Closely related to k-query binary LDCs
• k=2
– Simple O(n) upper bound [CGKS95]
• PIR analogue of the Hadamard code
– Ω(n) lower bound [GKST02, KdW04]
• k > logn / loglogn
– Simple polylog(n) upper bound [BF90, CGKS95]
• PIR analogue of the RM code
– Binary LDCs of length poly(n) with k=polylog(n) queries
Complexity of PIR: Short Answers
• k=3
– Lower bound: 2·logn [KdW04,…,Woo07]
– Upper bounds:
• O(n^{1/2}) [CGKS95]
• n^{O(1/loglogn)} [Yekhanin07] (assuming infinitely many Mersenne primes)
• n^{Õ(1/√logn)} [Efremenko09]
– More practical variant [BIKO12]
Complexity of PIR: Short Answers
• k=4,5,6,…
– Lower bound: c(k)·logn [KdW04,…,Woo07]
– Upper bounds:
• O(n^{1/(k-1)}) [CGKS95]
• n^{O(1/loglogn)} [Yekhanin07] (assuming infinitely many Mersenne primes)
• n^{Õ(1/(logn)^{c’(k)})} [Efremenko09]
Complexity of PIR: Short Queries
• Short queries = O(logn) bits to each server
– Closely related to poly(n)-length LDCs over large Σ
– Application: PIR with preprocessing [BIM00]
• k=2,3,4,…
– Answer length = O(n^{1/k+ε}) [BI01]
– Lower bounds: ???
Complexity of PIR: Low Storage
• Different servers may store different functions of x
– Goal: minimize communication subject to storage rate=1-ε
– Corresponds to binary LDCs with rate 1-ε
• Rate = 1−ε, k = O(n^ε), 1-bit answers
– Multiplicity codes [DGY11]
– Lifting of affine-invariant codes [GKS13]
– Expander codes [HOW13]
Best 2-Server PIR
[CGKS95,BI01]
• Reduce to private polynomial evaluation over F2:
– Servers: x ↦ px = degree-3 polynomial in m ≈ n^{1/3} variables
– Client: i ↦ zi ∈ F2^m
– Local mappings must satisfy px(zi) = xi for all x, i
– Simple implementation: zi = i-th weight-3 binary vector
• Privately evaluate px(z):
– Client:
• splits z into z = a+b, where a, b are random
• sends a to S1 and b to S2
– Servers:
• write px(z) = px(a+b) as pa(b) + pb(a), where deg(pa), deg(pb) ≤ 1,
pa is known to S1, and pb is known to S2
• send descriptions of pa, pb to the client, who outputs pa(b) + pb(a)
• d=O(logn) ⇒ O(logn)-bit queries, O(n^{1/2+ε})-bit answers
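The degree-3 scheme above can be made concrete in Python (a sketch under my reading of the slide; all names are illustrative). Since each support set is a distinct weight-3 vector, the monomial of support s evaluates to 1 at zi iff s = supp(zi), giving px(zi) = xi:

```python
import random
from itertools import combinations

def pir_weight3(x, i, m):
    """p_x(z) = sum over set bits of x of prod_{j in supp} z_j; index i
    is encoded as the i-th weight-3 support (needs C(m,3) >= len(x))."""
    supports = [set(c) for c in combinations(range(m), 3)][:len(x)]
    z = supports[i]                                    # client's secret point
    a = {j for j in range(m) if random.randrange(2)}   # random share for S1
    b = z ^ a                                          # a + b = z over F_2
    pa = [0] * (m + 1)   # S1's linear poly in b: pa[0] + sum pa[1+j]*b_j
    pb = [0] * (m + 1)   # S2's linear poly in a
    for idx, s in enumerate(supports):
        if not x[idx]:
            continue
        # Expand prod_{j in s} (a_j + b_j); T = variables taken from b.
        for t in range(4):
            for T in combinations(sorted(s), t):
                rest = s - set(T)
                if t <= 1:                   # deg <= 1 in b: S1's part (knows a)
                    if all(j in a for j in rest):
                        pa[0 if t == 0 else 1 + T[0]] ^= 1
                else:                        # deg >= 2 in b, <= 1 in a: S2's part
                    if all(j in b for j in T):
                        if rest:
                            (j0,) = tuple(rest)
                            pb[1 + j0] ^= 1
                        else:
                            pb[0] ^= 1
    def ev(p, point):    # the client knows both a and b, so it can evaluate both
        return (p[0] + sum(p[1 + j] for j in point)) % 2
    return (ev(pa, b) + ev(pb, a)) % 2
```

Each server sees a single uniformly random share of z, so neither learns i; each answer is a degree-1 polynomial description, i.e. m+1 bits.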
Tool: Secret Sharing
• Randomized mapping of a secret s to shares (s1,s2,…,sk)
– Linear secret sharing: shares = L(s,r1,…,rm)
• Access structure: subset A of 2^[k] specifying authorized sets
– Sets of shares not in A should reveal nothing about s
– Optimal share complexity for a given A is wide open
– Here: k=3, each share hides s, all shares determine s
• Useful examples of linear schemes:
– Additive sharing: s = s1+s2+s3
– Shamir’s secret sharing: si = p(i), where p(x) = s+rx
– CNF secret sharing: s = r1+r2+r3, s1=(r2,r3), s2=(r1,r3), s3=(r1,r2)
– CNF is “maximal”, additive is “minimal”
• For any linear scheme: [v], x ↦ [⟨v,x⟩] (without interaction)
– PIR with short answers reduces to the client sharing [ei] while hiding i
– Enough to share a multiple of [ei]
Tool: Matching Vectors
[Yek07,Efr09, DGY10]
• Vectors u1,…,un in Zm^h are S-matching if:
– ⟨ui,ui⟩ = 0
– ⟨ui,uj⟩ ∈ S for i≠j (0∉S)
• Surprising fact: super-polynomial n(h) when m is composite
– For instance, n = h^{O(logh)} for m=6, S={1,3,4}
– Based on large set systems with restricted intersections modulo m [BF80, Gro00]
• Matching vectors can be used to compress a “negated” shared unit vector:
– [v] = [⟨ui,u1⟩, ⟨ui,u2⟩, …, ⟨ui,un⟩]
– v is 0 only in the i-th entry
• Apply local share conversion to obtain shares of [v’], where v’ is nonzero
only in the i-th entry
– Efremenko09: share conversion from Shamir’s to additive, requires large m
– Beimel-I-Kushilevitz-Orlov12: share conversions from CNF to additive, m=6,15,…
Matching Vectors & Circuits
Actual dimension wide open; related to the size of:
• Set systems with restricted intersections mod m [BF80, Gro00]
• Matching vector sets [Yek07, Efr09, DGY10]
• Degree of representing “OR” modulo m [BBR92]
(Figure: OR of x1,…,xh computed by mod-6 gates; the known bounds leave a
gap between roughly 2^{h^{logh}} and 2^{2^h}, the latter via VC-dimension.)
Share Conversion
Given: CNF shares of s mod 6
Want: shares of s’ such that:
s=0 ⇒ s’≠0
s∈{1,3,4} (i.e., s≠0) ⇒ s’=0
Big Set System with Limited
mod-6 Intersections
• Goal: find N subsets Ti of [h] such that:
– |Ti| ≡ 1 (mod 6)
– |Ti∩Tj| ∈ {0,3,4} (mod 6) for i≠j
• h = query length; N = database size
• [Frankl83]: h = C(r,2), N = C(r−3,8)
– h ≈ 7·N^{1/4}
• Better asymptotic constructions exist
Big Set System with Limited mod-6 Intersections
(Figure: r-clique construction. Ground set = edges of the r-clique, so
h = C(r,2); each Ti is the edge set of an 11-clique through 3 fixed
vertices, so N = C(r−3,8).)
• |Ti| = C(11,2) = 55 ≡ 1 (mod 6)
• |Ti∩Tj| = C(t,2) for vertex overlap 3 ≤ t ≤ 10,
and C(t,2) ∈ {0,3,4} (mod 6) for that range
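The clique-based construction can be checked numerically; a small Python sketch under my reading of the slide (the "3 fixed vertices" detail is inferred from N = C(r−3,8)):

```python
from itertools import combinations

def frankl_sets(r):
    """Sets over the edges of the r-clique (h = C(r,2)): each T_i is the
    edge set of an 11-clique through 3 fixed vertices, giving
    N = C(r-3, 8) sets with |T_i| = C(11,2) = 55 = 1 (mod 6)."""
    fixed = (0, 1, 2)
    sets = []
    for extra in combinations(range(3, r), 8):
        verts = fixed + extra            # 11 vertices, already sorted
        # Distinct cliques share t in {3,...,10} vertices, and
        # |T_i & T_j| = C(t,2) lands in {0,3,4} mod 6 for that range.
        sets.append(frozenset(combinations(verts, 2)))
    return sets

sets = frankl_sets(13)                   # N = C(10,8) = 45 sets
assert all(len(T) % 6 == 1 for T in sets)
assert all(len(A & B) % 6 in {0, 3, 4} for A, B in combinations(sets, 2))
```

Brute-force verification for r=13 confirms both modular conditions from the previous slide.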
PIR  MPC
• Arbitrary polylogarithmic 3-server PIR ⇒
MPC with poly(|input|) communication [IK04]
• Applications of computationally efficient PIR [BIKK14]
– 2-server PIR ⇒ OT-complexity of secure 2-party computation
– 3-server PIR ⇒ correlated randomness complexity
• Applications of “decomposable” PIR [BIKK14]
– Private simultaneous messages protocols
– Secret-sharing for graph access structures
Open Problems: PIR and LDC
• Understand limitations of current techniques
– Better bounds on matching vectors?
– More powerful share conversions?
• t-private PIR with n^{o(1)} communication
– Known with 3t servers [Barkol-I-Weinreb08]
– Related to locally correctable codes
• Any savings for (classes of) polynomial-time f:{0,1}^n → {0,1}?
• Barriers for strong lower bounds?
– [Dvir10]: strong lower bounds for locally correctable codes
imply explicit rigid matrices and size-depth lower bounds
Open Problems: MPC
• High end: understand the complexity of the “worst” f
– O(2^{n^ε}) upper bounds vs. Ω(n) lower bounds
– Closely related to PIR and LDC
• Mid range: nontrivial savings for “moderately hard” f?
• Low end: bounds on the amortized rate of finite f
– In the honest-majority setting
– Given noisy channels