Transcript slides
The Complexity of
Information-Theoretic
Secure Computation
Yuval Ishai
Technion
2014 European School of Information Theory
Information-Theoretic Cryptography
• Any question in cryptography that makes sense
even if everyone is computationally unbounded
• Typically: unconditional security proofs
• Focus of this talk:
Secure Multiparty Computation (MPC)
Talk Outline
• Gentle introduction to MPC
• Communication complexity of MPC
– PIR, LDC, and related problems
• Open problems
How much do we earn?
[Figure: six parties in a ring holding salaries x1,…,x6]
Goal: compute Σ xi without revealing anything else
A better way?
[Figure: the six parties pass a running total around the ring]
• P1 picks a random pad r, 0 ≤ r < M, and sends m1 = r + x1
• Each subsequent Pi sends mi = m(i-1) + xi:
  m2 = m1 + x2, m3 = m2 + x3, m4 = m3 + x4, m5 = m4 + x5, m6 = m5 + x6
• P1 outputs m6 − r = x1 + … + x6
• Assumption: xi < M (say, M = 10^10)
• (+ and − operations carried out modulo M)
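As a sanity check, the masked ring protocol above can be sketched in a few lines of Python (the function name and interface are mine, not from the talk):

```python
import random

def ring_sum(inputs, M=10**10):
    """Masked ring protocol: P1 adds a random pad r, each party adds
    its input to the running total, and P1 removes the pad at the end
    (all arithmetic mod M)."""
    assert all(0 <= x < M for x in inputs)
    r = random.randrange(M)            # P1's secret pad, 0 <= r < M
    m = (r + inputs[0]) % M            # m1 = r + x1
    for x in inputs[1:]:               # mi = m_{i-1} + xi, sent around the ring
        m = (m + x) % M
    return (m - r) % M                 # P1 announces m_k - r = x1 + ... + xk
```

Each intermediate message mi is uniform mod M, so no single party learns anything beyond the announced total (the result is the sum mod M, hence exact whenever the true sum is below M).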
A security concern
[Figure: P2 receives m1 = r + x1 and sends m2 = m1 + x2; if P1 and P3
collude, they can recover x2 = m2 − m1]
Resisting collusions
[Figure: every pair of parties (i,j) shares a random pad rij,
e.g., r12, r16, r25, r32, r43, r51, r65]
• Each party announces xi + inboxi − outboxi
  (inboxi / outboxi = sum of pads Pi received / sent)
• The pads cancel, so the announcements add up to Σ xi
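A minimal sketch of this pairwise-padding trick, assuming all pads are dealt up front (names are mine):

```python
import random

def pairwise_masked_sum(inputs, M=10**10):
    """Collusion-resistant sum: every ordered pair (i,j) shares a random
    pad r[i][j] that Pi sends to Pj; each party announces its input plus
    received pads minus sent pads, and the pads cancel in the total."""
    k = len(inputs)
    r = [[random.randrange(M) for _ in range(k)] for _ in range(k)]
    announcements = []
    for i in range(k):
        inbox = sum(r[j][i] for j in range(k) if j != i)    # pads Pi received
        outbox = sum(r[i][j] for j in range(k) if j != i)   # pads Pi sent
        announcements.append((inputs[i] + inbox - outbox) % M)
    return sum(announcements) % M
```

Because every announcement is blinded by pads unknown to any proper subset of the other parties, collusions of fewer than k−1 parties learn nothing beyond the sum.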
More generally
• P1,…,Pk want to securely compute f(x1,…,xk)
– Up to t parties can collude
– Should learn (essentially) nothing but the output
• Questions
– When is this at all possible?
– How efficiently?
[Figure: the parties jointly run a secure MPC protocol for f]
(Similar feasibility results hold for security against malicious parties.)
• Information-theoretic (unconditional) security possible when t<k/2
[BGW88,CCD88,RB89]
• Computational security possible for any t
(under standard cryptographic assumptions) [Yao86,GMW87,CLOS02]
Or: information-theoretic security using correlated randomness [Kil88,BG89]
More generally
• P1,…,Pk want to securely compute f(x1,…,xk)
– Up to t parties can collude
– Should learn (essentially) nothing but the output
• Questions
– When is this at all possible?
– How efficiently?
• Several efficiency measures:
communication, randomness, rounds, computation
• Typical assumptions for rest of talk:
* t=1, k = small constant
* information-theoretic security
* “semi-honest” parties, secure channels
Communication Complexity
Fully Homomorphic Encryption
Gentry ‘09
• Settles main communication complexity questions in
complexity-based cryptography
– Even under “nice” assumptions! [BV11]
• Main open questions
– Further improve assumptions
– Improve practical computational overhead
• FHE >> PKE >> SKE >> one-time pad
One-Time Pads for MPC
[IKMOP13]
[Figure: a trusted dealer hands correlated randomness RA to Alice(x)
and RB to Bob(y); both parties end up with f(x,y)]
• Offline:
  – Set G[u,v] = f[u−dx, v−dy] for random shifts dx, dy
  – Pick random GA, GB such that G = GA + GB
  – Alice gets RA = (GA, dx), Bob gets RB = (GB, dy)
• Protocol on inputs (x,y):
  – Alice sends u = x+dx, Bob sends v = y+dy
  – Alice sends zA = GA[u,v], Bob sends zB = GB[u,v]
  – Both output z = zA + zB = G[u,v] = f(x,y)
[Figure: the truth table of f, cyclically shifted by (dx,dy) and
additively shared between the two parties]
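The offline and online phases above can be sketched as follows, assuming f is given as an N×N truth table with outputs in Z_M (function names are mine, not from the talk):

```python
import random

def deal(f_table, M):
    """Trusted dealer: one-time correlated randomness for a single
    evaluation of f, given as an N x N truth table with outputs in Z_M."""
    N = len(f_table)
    dx, dy = random.randrange(N), random.randrange(N)
    # Shifted truth table G[u][v] = f[(u-dx) mod N][(v-dy) mod N]
    G = [[f_table[(u - dx) % N][(v - dy) % N] for v in range(N)]
         for u in range(N)]
    # Additively share G between the two parties
    GA = [[random.randrange(M) for _ in range(N)] for _ in range(N)]
    GB = [[(G[u][v] - GA[u][v]) % M for v in range(N)] for u in range(N)]
    return (GA, dx), (GB, dy)          # RA for Alice, RB for Bob

def evaluate(f_table, x, y, M=2):
    """Online phase: exchange masked inputs, then one share each."""
    (GA, dx), (GB, dy) = deal(f_table, M)
    N = len(f_table)
    u, v = (x + dx) % N, (y + dy) % N  # masked inputs, sent in the clear
    zA, zB = GA[u][v], GB[u][v]        # one additive share from each party
    return (zA + zB) % M               # = G[u][v] = f(x, y)
```

Since u, v are uniformly random and each party sees only one random additive share of G, the online messages reveal nothing beyond f(x,y); the cost is the exponential-size shared tables, exactly the "bad" noted on the next slide.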
3-Party MPC for g(x,y,z)
• Define f((x,zA),(y,zB)) = g(x,y,zA+zB)
[Figure: Carol(z) splits z into random shares zA, zB and sends zA to
Alice(x) and zB to Bob(y); Alice and Bob then run the 2-party protocol
for f with their correlated randomness RA, RB, and all parties learn
g(x,y,z)]
One-Time Pads for MPC
• The good:
– Perfect security
– Great online communication
• The bad:
– Exponential offline communication
• Can we do better?
– Yes if f has small circuit complexity
– Idea: process circuit gate-by-gate
• k=3, t=1: can use one-time pad approach
• k>2t: use “multiplicative” (aka MPC-friendly) codes
• Communication ∝ circuit size, rounds ∝ circuit depth
MPC vs. Communication Complexity
[Figure: three parties holding inputs a, b, c]

              Communication Complexity      MPC
Goal          Each party learns f(a,b,c)    Each party learns only f(a,b,c)
Upper bound   O(n)  (n = input length)      O(size(f))  [BGW88,CCD88]
Lower bound   Ω(n)  (for most f)            Ω(n)  (for most f)

Big open question: poly(n) communication for all f?
(the “fully homomorphic encryption of information-theoretic cryptography”)
Question Reformulated
Is the communication complexity of MPC strongly correlated with
the computational complexity of the function being computed?
[Figure: Venn diagram of “all functions” vs. “efficiently computable
functions”, marking which admit communication-efficient MPC and which
provably do not; timeline 1990–2000 with milestones [KT00], [IK04]]
• The three problems are closely related
Private Information Retrieval
[Chor-Goldreich-Kushilevitz-Sudan95]
[Figure: a client holding an index i queries k servers, each holding a
copy of the database x ∈ {0,1}^n, and retrieves xi]
• Main question: minimize communication (log n vs. n)
• “Information-Theoretic” vs. Computational
A Simple I.T. PIR Protocol
• Arrange the database as an n^{1/2} × n^{1/2} matrix X
• Client picks random q1 and sets q2 so that q1 + q2 = ei (the unit
  vector of i’s column); sends qj to server Sj
• Server Sj replies with aj = X·qj (over F2)
• Client computes a1 + a2 = X·(q1+q2) = X·ei = the column containing xi
• 2-server PIR with O(n^{1/2}) communication
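The scheme can be sketched directly, assuming the database is laid out as an s×s bit matrix (names are mine):

```python
import random

def pir_2server(X, row, col):
    """2-server PIR on an s x s bit matrix X (n = s^2 database bits).
    Each server individually sees a uniformly random query vector."""
    s = len(X)
    q1 = [random.randrange(2) for _ in range(s)]
    q2 = [q1[c] ^ (1 if c == col else 0) for c in range(s)]  # q1 + q2 = e_col
    answer = lambda q: [sum(X[r][c] & q[c] for c in range(s)) % 2
                        for r in range(s)]                   # a = X·q over F2
    a1, a2 = answer(q1), answer(q2)
    column = [b1 ^ b2 for b1, b2 in zip(a1, a2)]             # X·e_col
    return column[row]
```

Total communication is 4s = O(n^{1/2}) bits, and each server's view (a single uniform vector) is independent of the queried index.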
A Simple Computational PIR Protocol
[Kushilevitz-Ostrovsky97]
• Tool: (linear) homomorphic encryption: E(a)·E(b) = E(a+b)
• Protocol:
  – Client sends E(ei) = (E(0), …, E(1), …, E(0)) = (c1, …, c_{n^{1/2}}),
    the encrypted unit vector of i’s column
  – Server, holding X as an n^{1/2} × n^{1/2} bit matrix, replies with
    E(X·ei): for each row it multiplies the cj with Xrow,j = 1
  – Client decrypts to recover the i-th column of X
• 1-server CPIR with ~O(n^{1/2}) communication
Why Information-Theoretic PIR?
Cons:
• Requires multiple servers
• Privacy against limited collusions
• Worse asymptotic complexity (with const. k):
  2^{Õ(√log n)} [Yekhanin07,Efremenko09] vs.
  polylog(n) [Cachin-Micali-Stadler99, Lipmaa05, Gilboa-I14]
Pros:
• Interesting theoretical question
• Unconditional privacy
• Better “real-life” efficiency
• Allows for very short (logarithmic) queries or very short
  (constant-size) answers
• Closely related to locally decodable codes & friends
Locally Decodable Codes
[Figure: x ∈ {0,1}^n is encoded into y ∈ Σ^m; a local decoder reads k
symbols of y to recover xi]
Requirements:
• High robustness & local decoding: if < 1% of y is corrupted,
  xi is recovered w/prob > 0.51 (reading only k symbols of y)
Question: how large should m(n) be in a k-query LDC?
• k=2: 2^{Θ(n)}
• k=3: 2^{2^{Õ(√log n)}} vs. Ω̃(n^2)
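The k=2 extreme is achieved by the Hadamard code; here is a sketch of its encoder and 2-query local decoder (names are mine):

```python
import random

def hadamard_encode(x):
    """k=2 LDC of length 2^n: y[a] = <x, a> mod 2 for every a in {0,1}^n."""
    nbits = len(x)
    y = []
    for a in range(2 ** nbits):
        bit = 0
        for j in range(nbits):
            if (a >> j) & 1:
                bit ^= x[j]
        y.append(bit)
    return y

def decode(y, i):
    """Two queries at a random pair: y[a] + y[a ^ e_i] = <x, e_i> = x_i.
    Each query is individually uniform, so a d-fraction of corruptions
    flips the answer with probability at most 2d."""
    a = random.randrange(len(y))
    return y[a] ^ y[a ^ (1 << i)]
```

With fewer than 1% of positions corrupted, each decoding attempt is correct with probability above 0.98, so majority voting over repeated attempts recovers xi with overwhelming probability.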
From I.T. PIR to LDC
[Katz-Trevisan00]
Simplifying assumptions:
• Servers compute same function of (x,q)
• Each query is uniform over its support set
k-server PIR with α-bit queries and β-bit answers
  ⇒  k-query LDC of length 2^α over Σ = {0,1}^β
      (set y[q] = Answer(x,q))
• Uniform PIR queries ⇒ “smooth” LDC decoder
• PIR with one answer bit per server ⇒ binary LDC
• Smoothness ⇒ robustness
• Arrows can be reversed
Applications of Local Decoding
• Coding
– LDC, Locally Recoverable Codes (robustness)
– Batch Codes (load balancing)
• Cryptography
– Instance Hiding, PIR (secrecy)
– Efficient MPC for “worst” functions
• Complexity theory
– Locally random reductions, PCPs
– Worst-case to average-case reductions,
hardness amplification
Complexity of PIR: Total Communication
• Mainly interesting for k=2
• Upper bound (k=2): O(n^{1/3}) [CGKS95]
– Tight in a restricted model [RY07]
• Lower bound (k=2): 5 log n [Man98,…,WW05]
• No natural coding analogue
Complexity of PIR: Short Answers
• Short answers = O(1) bit from each server
– Closely related to k-query binary LDCs
• k=2
– Simple O(n) upper bound [CGKS95]
• PIR analogue of Hadamard code
– Ω(n) lower bound [GKST02, KdW04]
• k > logn / loglogn
– Simple polylog(n) upper bound [BF90,CGKS95]
• PIR analogue of RM code
– Binary LDCs of length poly(n) and k=polylog(n) queries
Complexity of PIR: Short Answers
• k=3
  – Lower bound: 2 log n [KdW04,…,Woo07]
  – Upper bounds:
    • O(n^{1/2}) [CGKS95]
    • n^{O(1/ log log n)} [Yekhanin07]
      (assuming infinitely many Mersenne primes)
    • n^{Õ(1/√log n)} [Efremenko09]
  – More practical variant [BIKO12]
Complexity of PIR: Short Answers
• k=4,5,6,…
  – Lower bound: c(k)·log n [KdW04,…,Woo07]
  – Upper bounds:
    • O(n^{1/(k-1)}) [CGKS95]
    • n^{O(1/ log log n)} [Yekhanin07]
      (assuming infinitely many Mersenne primes)
    • n^{Õ(1/(log n)^{c’(k)})} [Efremenko09]
Complexity of PIR: Short Queries
• Short queries = O(log n) bits to each server
– Closely related to poly(n)-length LDCs over large Σ
– Application: PIR with preprocessing [BIM00]
• k=2,3,4,…
– Answer length = O(n^{1/k+ε}) [BI01]
– Lower bounds: ???
Complexity of PIR: Low Storage
• Different servers may store different functions of x
– Goal: minimize communication subject to storage rate=1-ε
– Corresponds to binary LDCs with rate 1-ε
• Rate = 1−ε, k = O(n^ε), 1-bit answers
– Multiplicity codes [DGY11]
– Lifting of affine-invariant codes [GKS13]
– Expander codes [HOW13]
Best 2-Server PIR
[CGKS95,BI01]
• Reduce to private polynomial evaluation over F2:
  – Servers: x ↦ p = degree-3 polynomial in m ≈ n^{1/3} vars
  – Client: i ↦ z_i ∈ F2^m
  – Local mappings must satisfy p_x(z_i) = x_i for all x,i
  – Simple implementation: z(i) = i-th weight-3 binary vector
• Privately evaluate p(z):
  – Client:
    • splits z into z = a+b, where a,b are random
    • sends a to S1 and b to S2
  – Servers:
    • write p(z) = p(a+b) as p_a(b) + p_b(a), where deg(p_a), deg(p_b) ≤ 1,
      p_a is known to S1, and p_b is known to S2
    • send descriptions of p_a, p_b to Client, who outputs p_a(b) + p_b(a)
• With degree d = O(log n): O(log n)-bit queries, O(n^{1/2+ε})-bit answers
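The degree-3 reduction above can be exercised end to end for small n. The sketch below assumes the weight-3 encoding of indices and splits the 8 expansion terms of each degree-3 monomial of p(a+b) between the two servers: terms with at most one b-factor go to S1 (a linear form in b), the rest have at most one a-factor and go to S2 (all names are mine):

```python
import random
from itertools import combinations, product
from math import comb

def answer(xbits, sets, known, m):
    """Server's reply: the part of p(known + unknown) with at most one
    'unknown' factor per monomial, as a linear form c0 + <c, unknown>."""
    c0, c = 0, [0] * m
    for xi, S in zip(xbits, sets):
        if not xi:
            continue
        # Expand (known_t + unknown_t) over the 3 coordinates of the monomial
        for choice in product((0, 1), repeat=3):   # 1 = take the unknown factor
            if sum(choice) > 1:                    # the other server's share
                continue
            coeff, unknown_idx = 1, None
            for t, ch in zip(S, choice):
                if ch:
                    unknown_idx = t
                else:
                    coeff &= known[t]
            if not coeff:
                continue
            if unknown_idx is None:
                c0 ^= 1
            else:
                c[unknown_idx] ^= 1
    return c0, c

def pir(xbits, idx):
    """2-server PIR via degree-3 polynomials in m = O(n^{1/3}) variables:
    z(idx) = idx-th weight-3 vector, split as z = a + b over F2."""
    m = 3
    while comb(m, 3) < len(xbits):
        m += 1
    sets = list(combinations(range(m), 3))   # monomial supports
    a = [random.randrange(2) for _ in range(m)]
    b = [a[t] ^ (1 if t in sets[idx] else 0) for t in range(m)]  # a + b = z
    c0A, cA = answer(xbits, sets, a, m)      # S1 knows a, unknown = b
    c0B, cB = answer(xbits, sets, b, m)      # S2 knows b, unknown = a
    valA = c0A
    for t in range(m):
        valA ^= cA[t] & b[t]
    valB = c0B
    for t in range(m):
        valB ^= cB[t] & a[t]
    return valA ^ valB                        # = p(z) = xbits[idx]
```

Queries are m bits and answers m+1 bits, so total communication is O(n^{1/3}), matching the [CGKS95] upper bound; each server sees a single uniform vector.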
Tool: Secret Sharing
• Randomized mapping of secret s to shares (s1,s2,…,sk)
– Linear secret sharing: shares = L(s,r1,…,rm)
• Access structure: a collection A ⊆ 2^[k] specifying the authorized sets
– Sets of shares not in A should reveal nothing about s
– Optimal share complexity for given A is wide open
– Here: k=3, each share hides s, all shares determine s
• Useful examples for linear schemes
  – Additive sharing: s = s1+s2+s3
  – Shamir’s secret sharing: si = p(i), where p(x) = s+rx
  – CNF secret sharing: s = r1+r2+r3; s1=(r2,r3), s2=(r1,r3), s3=(r1,r2)
  – CNF is “maximal”, Additive is “minimal”
• For any linear scheme: [v], x ↦ [<v,x>] (without interaction)
  – PIR with short answers reduces to client sharing [ei] while hiding i
  – Enough to share a multiple of [ei]
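The three example schemes and the local linear evaluation [v] ↦ [<v,x>] can be sketched over a small prime field, for k=3 and t=1 (all names are mine):

```python
import random

P = 101  # a small prime field for the demo

def additive_share(s, k=3):
    """s = s1 + ... + sk over Z_P."""
    sh = [random.randrange(P) for _ in range(k - 1)]
    return sh + [(s - sum(sh)) % P]

def shamir_share(s):
    """Shamir for k=3, t=1: p(x) = s + r·x; party i gets p(i)."""
    r = random.randrange(P)
    return [(s + r * i) % P for i in (1, 2, 3)]

def cnf_share(s):
    """3-party CNF (replicated) sharing: s = r1+r2+r3, party i misses r_i."""
    r1, r2, r3 = additive_share(s, 3)
    return [(r2, r3), (r1, r3), (r1, r2)]

def linear_eval(coord_shares, x):
    """[v], x -> [<v,x>] with no interaction: each party applies the same
    public linear combination x to its shares of v's coordinates."""
    k = len(coord_shares[0])
    return [sum(coord_shares[j][i] * x[j] for j in range(len(x))) % P
            for i in range(k)]
```

For instance, additively sharing each coordinate of v = (2,0,5) and calling `linear_eval` with x = (1,1,1) yields additive shares of <v,x> = 7, with zero communication between the parties.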
Tool: Matching Vectors
[Yek07,Efr09, DGY10]
• Vectors u1,…,un in Z_m^h are S-matching if:
  – <ui,ui> = 0
  – <ui,uj> ∈ S for i ≠ j (0 ∉ S)
• Surprising fact: super-polynomial n(h) when m is composite
  – For instance, n = h^{O(log h)} for m=6, S={1,3,4}
  – Based on large set systems with restricted intersections modulo m [BF80, Gro00]
• Matching vectors can be used to compress a “negated” shared unit vector
  – [v] = [(<ui,u1>, <ui,u2>, …, <ui,un>)]
  – v is 0 only in the i-th entry
• Apply local share conversion to obtain shares of [v’], where v’ is nonzero
  only in the i-th entry
  – Efremenko09: share conversion from Shamir to additive; requires large m
  – Beimel-I-Kushilevitz-Orlov12: share conversions from CNF to additive; m=6,15,…
Matching Vectors & Circuits
Actual dimension wide open; related to the size of:
• Set systems with restricted intersections mod m [BF80, Gro00]
• Matching vector sets [Yek07, Efr09, DGY10]
• Degree of representing “OR” modulo m [BBR92]
[Figure: OR of x1,…,xh computed by depth-2 circuits of mod-6 gates;
the known bounds leave a huge gap, roughly between h^{log h} and 2^{2^h}
(via a VC-dimension argument)]
Share Conversion
• Given: CNF shares of s mod 6
• Goal: (additive) shares of s’ such that
  – s = 0 ⇒ s’ ≠ 0
  – s ∈ {1,3,4} ⇒ s’ = 0
Big Set System with Limited
mod-6 Intersections
• Goal: find N subsets Ti of [h] such that:
  – |Ti| ≡ 1 (mod 6)
  – |Ti∩Tj| ∈ {0,3,4} (mod 6)
• h = query length; N = database size
• [Frankl83]: h = C(r,2), N = C(r−3,8)
  – h ≈ 7·N^{1/4}
• Better asymptotic constructions exist
• Better asymptotic constructions exist
Big Set System with Limited mod-6 Intersections
[Figure: the construction inside an r-clique]
• Take the h = C(r,2) edges of K_r; fix 3 vertices, and let each Ti be
  the edge set spanned by the 3 fixed vertices together with an
  8-subset of the remaining r−3 vertices, giving N = C(r−3,8) sets
• |Ti| = C(11,2) = 55 ≡ 1 (mod 6)
• |Ti∩Tj| = C(t,2) for some 3 ≤ t ≤ 10, and C(t,2) ∈ {0,3,4} (mod 6)
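Under my reading of the slide (each Ti is the edge set spanned by the 3 fixed vertices plus an 8-subset of the remaining r−3 vertices), the construction can be verified mechanically for small r:

```python
from itertools import combinations
from math import comb

def frankl_sets(r):
    """T_i = edges of K_r spanned by 3 fixed vertices plus an 8-subset
    of the remaining r-3 vertices (my reading of the construction)."""
    sets = []
    for extra in combinations(range(3, r), 8):
        verts = sorted({0, 1, 2, *extra})       # 11 vertices in total
        sets.append(frozenset(combinations(verts, 2)))
    return sets

r = 14
S = frankl_sets(r)
assert len(S) == comb(r - 3, 8)                 # N = C(r-3, 8)
assert all(len(T) == comb(11, 2) == 55 for T in S)
assert 55 % 6 == 1                              # |T_i| = 1 (mod 6)
for T1, T2 in combinations(S, 2):
    # shared vertices t satisfy 3 <= t <= 10, so C(t,2) in {0,3,4} mod 6
    assert len(T1 & T2) % 6 in {0, 3, 4}
print("all", len(S), "sets verified over h =", comb(r, 2), "edges")
```

The check passes because any two sets share at least the 3 fixed vertices and at most 10 vertices, and C(t,2) mod 6 for t = 3,…,10 only takes the values 0, 3, 4.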
PIR ⇒ MPC
• Arbitrary polylogarithmic 3-server PIR ⇒
  MPC with poly(|input|) communication [IK04]
• Applications of computationally efficient PIR [BIKK14]
  – 2-server PIR ⇒ OT-complexity of secure 2-party computation
  – 3-server PIR ⇒ correlated randomness complexity
• Applications of “decomposable” PIR [BIKK14]
  – Private simultaneous messages protocols
  – Secret-sharing for graph access structures
Open Problems: PIR and LDC
• Understand limitations of current techniques
– Better bounds on matching vectors?
– More powerful share conversions?
• t-private PIR with n^{o(1)} communication
– Known with 3t servers [Barkol-I-Weinreb08]
– Related to locally correctable codes
• Any savings for (classes of) polynomial-time f: {0,1}^n → {0,1}?
• Barriers for strong lower bounds?
– [Dvir10]: strong lower bounds for locally correctable codes
imply explicit rigid matrices and size-depth lower bounds.
Open Problems: MPC
• High end: understand complexity of “worst” f
– 2^{Õ(√n)} vs. Ω(n)
– Closely related to PIR and LDC
• Mid range: nontrivial savings for “moderately hard” f?
• Low end: bounds on amortized rate of finite f
– In honest-majority setting
– Given noisy channels