
Additive Combinatorics in
Theoretical Computer Science
Shachar Lovett (UCSD)
What is Additive Combinatorics?
• Broad interpretation: study of subsets of algebraic
structures, relations between various notions of
structure (algebraic, combinatorial, statistical)
• Original motivation: number theory
• Recently, becoming an influential tool in theoretical
computer science
Additive Combinatorics in
Theoretical Computer Science
• Why is additive combinatorics useful in computer
science?
• Algebra is very useful: algorithm design, problem analysis, lower bounds
• Additive combinatorics allows us to also analyze approximate algebraic objects
Applications
• List of applications is constantly growing…
• Arithmetic complexity
• Communication complexity
• Cryptography
• Coding theory
• Randomness extractors
• Lower bounds
This talk
• For concreteness, I will focus on one mathematical
theme: structure in inner products
• I will describe applications in 4 domains:
• Communication complexity
• Coding theory
• Cryptography
• Randomness extractors
• Applications introduce new mathematical problems
Communication complexity
Structure of low rank Boolean matrices
Communication Complexity
• Two parties (Alice and Bob), each holding an input,
wish to jointly compute a function of their inputs,
while minimizing communication
x
f(x,y)
y
Communication Complexity
• Let x, y ∈ [n], f(x,y) ∈ {-1,+1}
• Identify the function f(x,y) with an n x n Boolean matrix
[Figure: an n x n matrix of + and - entries; rows indexed by x, columns by y, entry f(x,y)]
Communication Complexity
• Let x, y ∈ [n], f(x,y) ∈ {-1,+1}
• Identify the function f(x,y) with an n x n Boolean matrix
• (Deterministic) Protocol = partition of matrix to
monochromatic rectangles
[Figure: the matrix partitioned into monochromatic rectangles]
c-bit protocol ⇒ partition into 2^c monochromatic rectangles
The log-rank lower bound
• If f(x,y) has a c-bit protocol, the matrix can be partitioned into 2^c monochromatic rectangles
• Monochromatic rectangles have rank 1
• Hence, #rectangles ≥ rank of f
[Mehlhorn-Schmidt ’82]
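The counting argument on this slide can be checked concretely. Below is a small sketch (my own toy example, not from the talk) using the n x n "equality-like" matrix with -1 on the diagonal: every monochromatic rectangle is a rank-≤1 submatrix, so by subadditivity of rank, any partition needs at least rank(M) rectangles.

```python
import numpy as np

# "Equality-like" matrix: -1 on the diagonal, +1 off it (rank n over the reals)
n = 6
M = np.ones((n, n), dtype=int)
np.fill_diagonal(M, -1)

# One explicit partition into monochromatic rectangles:
#   n diagonal singletons (-1), plus each row split into its left/right +1 parts
rects = [([i], [i]) for i in range(n)]
rects += [([i], list(range(i + 1, n))) for i in range(n - 1)]
rects += [([i], list(range(0, i))) for i in range(1, n)]

cover = np.zeros_like(M)
for rows, cols in rects:
    sub = M[np.ix_(rows, cols)]
    assert len(set(sub.flatten())) == 1        # monochromatic => rank <= 1
    cover[np.ix_(rows, cols)] += 1
assert (cover == 1).all()                      # the rectangles partition M

# Log-rank lower bound: any partition needs at least rank(M) rectangles
assert len(rects) >= np.linalg.matrix_rank(M)
```

Here the partition has 3n-2 = 16 rectangles while the rank is n = 6, consistent with the bound.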
The log-rank conjecture
• Log-rank conjecture: any Boolean matrix of rank r (over the reals) can be partitioned into exp(log^c r) monochromatic rectangles
[Lovász-Saks ‘88]
• Sufficient to prove that all low rank matrices
contain at least one large monochromatic rectangle
[Nisan-Wigderson ‘95]
The log-rank conjecture
• Log-rank conjecture: any Boolean matrix of rank r can be partitioned into exp(log^c r) monochromatic rectangles
[Lovász-Saks '88]
• Initial conjecture: c=1. We now know c ≥ 1.63 [Kushilevitz '94]
• Equivalent conjectures formulated in other domains:
• Relation between rank and chromatic number of graphs
[Nuffelen’ 76, Fajtlowicz ’87]
• Relation between rank and positive rank for Boolean matrices
[Yannakakis ‘91]
Open problem 1: log rank conjecture
• A is an n x n real matrix with Boolean entries, rank(A) = r
• Conjecture: A has a monochromatic rectangle of size n^2 · exp(-log^{O(1)} r)
• Equivalently: if u_1,…,u_n, v_1,…,v_n ∈ R^r satisfy
⟨u_i, v_j⟩ ∈ {-1,+1} for all i, j
then there are subsets I, J ⊆ [n] with
⟨u_i, v_j⟩ = const for all i ∈ I, j ∈ J,
where |I|, |J| ≥ n · exp(-log^{O(1)} r)
Progress came from different
directions
• Trivial bound: #partitions ≤ 2^r
• Graph theory: chromatic number ≤ 2^{0.4r}
[Kotlov-Lovász '96, Kotlov '97]
• Additive combinatorics: assuming the polynomial Freiman-Ruzsa conjecture, #partitions ≤ exp(r/log r)
[Ben-Sasson-L-Zewi '12]
• Discrepancy theory: #partitions ≤ exp(r^{1/2} log r)
[L '14]
• Next step: ???
[your name here]
References
• Lovász-Saks '88: Lattices, Möbius Functions and Communication Complexity
• Nisan-Wigderson '95: On Rank vs. Communication Complexity
• Kotlov '97: The rank and size of graphs
• Ben-Sasson-Lovett-Zewi '12: An Additive Combinatorics Approach Relating Rank to Communication Complexity
• Lovett '14: Communication is bounded by root of rank
Coding theory
Locally decodable codes
Traditional error correcting codes
• Classic error correcting codes:
Message → Encoder → Codeword → (NOISE) → Received word → Decoder → Message (hopefully)
• But what if we want just part of the message? We
still need to decode all of it.
Locally decodable code
• Allow to decode parts of the message efficiently
Message → Encoder → Codeword → (NOISE) → Received word → Decoder → Part of message
• Ideally, to decode S bits in message, read only O(S)
bits in codeword
Example: Hadamard code
• Message space = {0,1}^n = F_2^n
• Encode x ∈ F_2^n by all linear functions: e(x) = (⟨a,x⟩ : a ∈ F_2^n)
• Very redundant: encodes n bits to 2^n bits
• BUT! Allows for local decoding: to decode a bit x_i of the message, choose a ∈ {0,1}^n randomly, and output e(x)_a + e(x)_{a+e_i}
• Even if 10% of the bits in the codeword are corrupted, this returns the correct answer with probability 80%, while reading only 2 bits
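The two-query decoder above can be sketched in code. This is an illustrative toy (the parameters, the bit-packing, and the majority-vote wrapper are my choices, not from the talk): encode x ∈ {0,1}^n as all inner products ⟨a,x⟩ mod 2, corrupt 10% of positions, and recover each bit from two queries.

```python
import random

def hadamard_encode(x, n):
    # codeword indexed by all a in {0,1}^n (as integers): bit <a,x> mod 2
    return [bin(a & x).count("1") % 2 for a in range(2 ** n)]

def local_decode(word, i, n):
    # 2-query local decoder for bit x_i: random a, output e(x)_a + e(x)_{a+e_i}
    a = random.randrange(2 ** n)
    return word[a] ^ word[a ^ (1 << i)]

# Demo: encode, corrupt 10% of the positions, decode each bit by majority vote
n = 8
x = 0b10110010
code = hadamard_encode(x, n)
corrupted = code[:]
for pos in random.sample(range(2 ** n), 2 ** n // 10):
    corrupted[pos] ^= 1
for i in range(n):
    votes = sum(local_decode(corrupted, i, n) for _ in range(201))
    # each trial is correct with probability >= 0.8, so the majority wins w.h.p.
    assert (votes > 100) == bool((x >> i) & 1)
```

On an uncorrupted codeword a single pair of queries already recovers x_i exactly; the majority vote is only needed to push the 80% success rate toward certainty.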
Efficient locally decodable codes
• Challenge: locality and efficiency
• Polynomial codes (Reed-Muller) allow encoding n bits to m=O(n) bits, but decoding even a single bit requires reading n^ε bits.
• Recent multiplicity codes can do this while keeping m = n^{1+ε}
• Matching vector codes can achieve O(1) queries per bit, but their length is sub-exponential
• Related to structured low rank matrices
Matching vector codes
• Matching vector family: u_1,…,u_n, v_1,…,v_n ∈ (Z_m)^r such that
(i) ⟨u_i, v_i⟩ = 0
(ii) ⟨u_i, v_j⟩ ≠ 0 if i ≠ j
• Equivalently, the inner product matrix is a rank-r matrix over Z_m with zero diagonal and nonzero off-diagonal entries
( 0 2 3 1 4
  1 0 5 3 1
  2 3 0 1 1
  3 3 4 0 1
  2 1 3 3 0 )
Matching vector codes
• Matching vector family: u_1,…,u_n, v_1,…,v_n ∈ (Z_m)^r such that
(i) ⟨u_i, v_i⟩ = 0
(ii) ⟨u_i, v_j⟩ ≠ 0 if i ≠ j
• Equivalently, the inner product matrix is a rank-r matrix over Z_m with zero diagonal and nonzero off-diagonal entries
• Give codes encoding n symbols to N = m^r symbols, locally decodable with m queries [Yekhanin '08, Efremenko '09, Dvir-Gopalan-Yekhanin '11, …]
• Goal: m small (constant), minimize rank r
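To make the definition concrete, here is a toy matching vector family mod 6 (my own trivial example, with rank n rather than the low rank the codes need): take u_i = e_i and v_j = all-ones minus e_j, so that ⟨u_i, v_j⟩ = 1 - δ_ij.

```python
m, n = 6, 5

def inner(u, v):
    # inner product in (Z_m)^n
    return sum(a * b for a, b in zip(u, v)) % m

u = [[1 if k == i else 0 for k in range(n)] for i in range(n)]   # u_i = e_i
v = [[(1 - u[j][k]) % m for k in range(n)] for j in range(n)]    # v_j = 1 - e_j

for i in range(n):
    for j in range(n):
        if i == j:
            assert inner(u[i], v[j]) == 0    # (i) zero diagonal
        else:
            assert inner(u[i], v[j]) != 0    # (ii) nonzero off-diagonal
```

The whole point of the Grolmusz construction is doing this with r much smaller than n; this sketch only checks the two defining properties.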
Matching vector codes
• Goal: a low rank n x n matrix over Z_m with zero diagonal and nonzero off-diagonal entries
[Figure: matrix with 0s on the diagonal, nonzero entries elsewhere]
• How low can the rank be?
• m=2: rank ≥ n-1
• m=3: rank ≥ n^{1/2}
• m=p prime: rank ≥ n^{1/(p-1)}
BUT
• m=6: rank ≤ exp(log^{1/2} n) ! [Grolmusz '00]
• m has t prime factors: rank ≤ exp(log^{1/t} n)
• Core of sub-exponential codes
Are these constructions tight?
• Fix m=6
• What is the minimal rank of an n x n matrix over Z_6 with zero diagonal and nonzero off-diagonal entries?
• Grolmusz: rank ≤ exp(log^{1/2} n)
• Trivial: rank ≥ log n
• Assuming the PFR conjecture, can show rank ≥ log n · loglog n
[Bhowmick-Dvir-L '12]
• Challenge: bridge the gap!
Open question 2: matching vector
families
• Construct low rank matrices over Z_6 (or any fixed Z_m) with zero diagonal and nonzero off-diagonal entries
• Or prove that the Grolmusz construction is optimal
References
• Grolmusz ‘00: Superpolynomial size set-systems with restricted
intersections mod 6 and explicit Ramsey graphs
• Yekhanin ’08: Towards 3-query locally decodable codes of subexponential
length
• Efremenko ’09: 3-query locally decodable codes of subexponential length
• Dvir-Gopalan-Yekhanin ‘11: Matching vector codes
• Bhowmick-Dvir-Lovett ‘12: New lower bounds for matching vector codes
Cryptography
Non-malleable codes
Error correcting codes
Message → Encoder → Codeword → (NOISE) → Received word → Decoder → Message (hopefully)
• ERRORS are random, caused by stochastic processes
(nature)
Cryptography
Message → Encrypt → Codeword → (NOISE) → Received word → Decrypt → Message (hopefully)
• ERRORS are caused by an adversary
Cryptography
• How to handle adversarial errors?
• Common solution: use computational hardness
• E.g., assume the adversary is computationally bounded, and build encryptions which cannot be broken by such an adversary
• However, these work only under unproven assumptions. Can we build information-theoretic cryptography?
Information-theoretic cryptography
Message → Encrypt → Codeword → (NOISE) → Received word → Decrypt → Message (hopefully)
• If the adversary is not bounded, he can do anything:
• Decode message
• Change to other evil message
• Re-encode
• So, we need to put some information theoretic limitations
Split state model
• Information is encoded by 2 (or more) codewords.
• Assumption: each can be arbitrarily corrupted, but
w/o collaboration (no communication)
Message → Encrypt → (Codeword 1, Codeword 2) → independent tampering → (Received word 1, Received word 2) → Decrypt → Message (hopefully)
Non malleability
• What can we not prevent?
• Adversaries randomly corrupting the message
(decoded message is random, should be rejected – need some CRC on messages)
• Adversaries can agree ahead of time on a target message m*, and replace the codewords with correct codewords for m*
• We will decode m* correctly
• BUT! m* does not depend on m (the message that was sent)
• Ideally: adversaries should NOT be able to make us decode m' = func(m).
Potential construction
• Suggestion: m ∈ F, F a finite field
• Encoding: choose two random vectors x, y ∈ F^n conditioned on ⟨x,y⟩ = m
• Adversaries: replace x with f(x), y with g(y)
• Decoder outputs m' = ⟨f(x), g(y)⟩
• Question: in what ways can ⟨f(x), g(y)⟩ depend on ⟨x,y⟩?
Potential construction
• mF encoded by x,yFn such that <x,y>=m
• Decode: m’=<f(x),g(y)>
• What can the adversaries do:
• Nothing: m’=m
• Random: m’ random (only subset of messages are legal)
• Constant: f(x)=a, g(y)=b, m’=<a,b> indep. of m
• Linear: f(x)=2x, g(y)=y, m’=2m (HMM!)
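The adversary cases above can be simulated directly. A toy sketch (the field size, dimension, and helper names are my choices): encode m as a random pair x, y ∈ F_p^n with ⟨x,y⟩ = m, and observe that the linear tampering f(x)=2x, g(y)=y yields m' = 2m.

```python
import random

p = 101   # small prime field F_p, for illustration
n = 4

def encode(m):
    # random x, y in F_p^n with <x,y> = m: pick x with x[0] != 0, then fix y[0]
    while True:
        x = [random.randrange(p) for _ in range(n)]
        if x[0] != 0:
            break
    y = [random.randrange(p) for _ in range(n)]
    rest = sum(a * b for a, b in zip(x[1:], y[1:]))
    y[0] = (m - rest) * pow(x[0], -1, p) % p   # modular inverse (Python 3.8+)
    return x, y

def decode(x, y):
    return sum(a * b for a, b in zip(x, y)) % p

m = 7
x, y = encode(m)
assert decode(x, y) == m

# Linear tampering f(x) = 2x, g(y) = y maps m to 2m: exactly the affine
# relation ("HMM!" case) that an inner code on top must detect.
fx = [(2 * a) % p for a in x]
assert decode(fx, y) == (2 * m) % p
```

This is why the construction alone is not enough: the theorem on the next slide shows affine tampering is the *only* thing arbitrary adversaries can achieve, and affine relations can then be caught by an inner code.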
Construction is good:
Inner products of functions
• Theorem: arbitrary adversaries reduce to affine transformations (and these can be handled by inner codes)
[Aggarwal-Dodis-L '14]
• F prime field, n large enough (n ≥ poly log |F|)
• x, y ∈ F^n uniform
• For any functions f, g: F^n → F^n,
• the joint distribution (⟨x,y⟩, ⟨f(x), g(y)⟩) ≈ (U, aU+b), where
• U ∈ F uniform
• (a,b) ∈ F^2 some distribution, independent of U
• Conjecture: true already for n ≥ O(1)
More challenges
• In the end, we can encode n bits to ~O(n^7) bits which two non-communicating adversaries cannot corrupt
• Challenges:
• Reduce encoding to O(n) bits
• Handle adversaries with limited communication
References
• Dziembowski-Pietrzak-Wichs '10: Non-malleable codes
• Chabanne-Cohen-Flori-Patey '11: Non-malleable codes from the wiretap channel
• Liu-Lysyanskaya '12: Tamper and leakage resilience in the split-state model
• Dziembowski-Kazana-Obremski '13: Non-malleable codes from two-source extractors
• Aggarwal-Dodis-Lovett '14: Non-malleable codes from additive combinatorics
• Cheraghchi-Guruswami '14: Non-malleable coding against bit-wise and split-state tampering
Randomness extraction
Can you beat Bourgain?
Randomness extraction
• A "randomness extractor" (usually simply called an extractor) is a deterministic function which takes weak random sources and combines them into an almost perfect random source
Weak random sources → Extractor → Perfect randomness
• Applications: derandomization, cryptography
Two source extractor
• A function E: {0,1}^n × {0,1}^n → {0,1}^r is a two-source extractor for min-entropy k if:
• For any two subsets A, B ⊆ {0,1}^n of size |A|, |B| ≥ 2^k,
• with U_A uniform over A, U_B uniform over B, independent,
• then: E(U_A, U_B) ≈ U_{{0,1}^r}
• That is, E maps two independent distributions, each containing k random bits (hidden somewhere), to r bits which are close to uniform. Ideally, r ≈ 2k.
Two source extractor
• Let us focus on extracting just one random bit (r=1)
• Then, two-source extractors are a strengthening of bipartite Ramsey graphs
• Ramsey graph: if |A|, |B| ≥ 2^k then 0 < |E(A,B)| < |A||B|
• Two-source extractor: if |A|, |B| ≥ 2^k then |E(A,B)| ≈ (1/2)|A||B|
The Hadamard extractor
• Define E: {0,1}^n × {0,1}^n → {0,1} by E(x,y) = ⟨x,y⟩ (mod 2)
• Claim: this is an extractor for min-entropy k = n/2
• If |A|, |B| > 2^{n/2} then E(A,B) cannot be constant (hence, it defines a Ramsey graph)
• If |A|, |B| > 100 · 2^{n/2} then E(U_A, U_B) ≈ U_{{0,1}}
• This is tight: if A ⊂ F_2^n is a subspace of dim n/2 and B = A^⊥, then E(A,B) = 0.
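Both halves of this slide can be checked numerically. A small experiment (parameters are mine): for random sets of size well above 2^{n/2} the inner-product bit is nearly unbiased, while on a coordinate subspace and its orthogonal complement it is constantly 0.

```python
import random

n = 10

def ip(x, y):
    # inner product mod 2 of x, y viewed as bit vectors
    return bin(x & y).count("1") % 2

# Random sets of size 2^{n/2 + 2}: output bit should be close to unbiased
A = random.sample(range(2 ** n), 2 ** (n // 2 + 2))
B = random.sample(range(2 ** n), 2 ** (n // 2 + 2))
bias = abs(sum(ip(a, b) for a in A for b in B) / (len(A) * len(B)) - 0.5)
assert bias < 0.2   # guaranteed here: Lindsey's lemma gives bias <= 1/8

# Orthogonal subspaces of dimension n/2: the extractor fails (constant 0)
A0 = list(range(2 ** (n // 2)))                  # span of the low n/2 coords
B0 = [x << (n // 2) for x in range(2 ** (n // 2))]   # span of the high n/2
assert all(ip(a, b) == 0 for a in A0 for b in B0)
```

The pair (A0, B0) is exactly the tight example from the slide, shifted into coordinate subspaces so membership is easy to enumerate.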
Beating the Hadamard extractor
• The Hadamard extractor fails on orthogonal
subspaces
• Idea: find large subsets of {0,1}^n which have small intersection with subspaces
(actually, such that any large subset of them grows with addition)
[Bourgain’05]
Bourgain’s extractor
• Identify {0,1}^n ⊂ F_p
• Extractor: E: F_p × F_p → {0,1},
E(x,y) = (xy + x^2y^2) mod 2
• Thm: this is an extractor for min-entropy k = 0.4999n
[Bourgain '05]
• Reason: the subset {(x, x^2) : x ∈ F_p} ⊂ F_p^2 has the property that all large subsets of it expand under iterated addition
• Proof relies on the sum-product theorem in finite fields
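A toy rendering of the formula above (my reading of the slide: take E(x,y) as the parity of the integer representative of (xy + x^2y^2) mod p; the actual theorem concerns min-entropy 0.4999n sources with quantitative error bounds, which this sketch does not capture):

```python
import random

p = 1009   # a small prime, for illustration only

def E(x, y):
    # parity of (x*y + x^2 * y^2) mod p
    return ((x * y + x * x * y * y) % p) % 2

# Empirically, the output looks close to unbiased on random sources
A = random.sample(range(p), 200)
B = random.sample(range(p), 200)
ones = sum(E(a, b) for a in A for b in B)
bias = abs(ones / (len(A) * len(B)) - 0.5)
assert bias < 0.2   # holds with overwhelming probability on random sets
```

Unlike the Hadamard extractor sketch, there is no simple subspace counterexample to exhibit here; that robustness below min-entropy n/2 is precisely what the sum-product machinery buys.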
Open problem: beat Bourgain
• Give an explicit construction of a two-source
extractor for lower min-entropy
• Additive combinatorics seems like a useful tool
• Constructions are known for Ramsey graphs; maybe
these can be extended?
References
• Barak-Kindler-Shaltiel-Sudakov-Wigderson ‘05: Simulating independence: New
constructions of condensers, Ramsey graphs, dispersers, and extractors.
• Bourgain ’05: More on the sum-product phenomenon in prime fields and its
applications
• Barak-Impagliazzo-Wigderson ‘06: Extracting randomness using few independent
sources
• Rao ‘07: An exposition of Bourgain's 2-source extractor
• Ben-Sasson-Zewi '11: From affine to two-source extractors via approximate duality
Summary
Application of additive combinatorics
• We discussed a specific phenomenon, structure in inner products, and saw 4 applications of it
• Communication complexity: Boolean inner products
• Coding theory: matching vector family
• Cryptography: inner products of arbitrary functions
• Randomness extraction: explicit sets which behave
randomly under inner products
Other applications
• Here are some applications we haven’t discussed:
• Arithmetic complexity: understanding the minimal
number of arithmetic operations (additions,
multiplications), required to compute polynomials (for
example: matrix multiplication, or FFT).
• Tightly related to incidence geometry
Other applications
• Sub-linear algorithms: algorithms which can analyze
global properties of huge objects, based only on local
statistics (and hence, they run in time independent
of the input size)
• Graph algorithms are based on graph regularity
• Algebraic algorithms are based on higher-order
Fourier analysis
Conclusions
• Additive combinatorics provides a useful toolbox to
attack many problems in theoretical computer
science
• Problems in computer science suggest many new
mathematical problems
Thank You!