On necessary and sufficient cryptographic assumptions: the case of memory checking Lecture 3 : Memory Checking, Consecutive Messages Protocols and Learning Distributions Lecturer: Moni.

Download Report

Transcript On necessary and sufficient cryptographic assumptions: the case of memory checking Lecture 3 : Memory Checking, Consecutive Messages Protocols and Learning Distributions Lecturer: Moni.

On necessary and sufficient cryptographic
assumptions: the case of memory checking
Lecture 3 : Memory Checking, Consecutive
Messages Protocols and Learning Distributions
Lecturer: Moni Naor
Weizmann Institute of Science
Web site of lectures: www.wisdom.weizmann.ac.il/~naor/COURSE/ens.html
Recap of Lecture 2
• The authentication problem
• Authentication use universal hash functions
• The Sub-linear authentication problem
– A pretty good authenticator based on one-way functions
• Communication complexity
• Deterministic and Probabilistic protocols
– The equality function
• Simultaneous Message Protocols
– Tight ( sqrt(n)) bound for equality
Authenticators for a large and unreliable memory
s(n) bits
secret
encoding sx
x
{0,1}n
•
E
Encoding Algorithm E:
V
public encoding
public encoding
px
py
accept
reject
t(n) bits
D
x
y
– Receives a vector x 2 {0,1}n, encodes:
• a public encoding px
• a small secret encoding sx. Space complexity: s(n)
•
Decoding Algorithm D:
•
Consistency Verifier V:
•
An adversary sees (only) the public encodings and can change them
– Receives a public encoding p and decodes it into a vector x  {0,1}n
– Receives public px’ and secret sx encodings, verifies whether decoder output = encoder input
– Makes few queries to public encoding: query complexity: t(n)
Pretty Good Authenticator
• Idea: encode X using a good error correcting code C
– Actually erasures are more relevant
– As long as a certain fraction of the symbols of C(X) is available, can decode X
– Good example: Reed Solomon code
• Add to each symbol a tag Fk(a,i), a function of
• secret information k 2 {0,1}s,
• symbol a 2 
• location i
• Verifiers picks random location i reads symbol ’a’ and tag t
– Check whether t=Fk(a,i) and rejects if not
• Decoding process removes all inappropriate tags and uses the
decoding procedure of C
The Complexity of Authentication
• V works many times
– The adversary can modify the public encoding only
between activations of V
• Want both the space complexity s(n) and the
query complexity t(n) to be small
– Much smaller than the file size n
For the pretty good authenticator s(n)¢t(n) ¿ n
Online Memory Checking
• Example: Verify information in your GMail account.
Download the entire 2GB to verify a message?
• Don’t verify the entire 2GB, just what you read:
– How much of the file do you read per bit that you
retrieve and authenticate?
How large a fingerprint do you need?
– Online memory checkers
Memory Checkers
How to check a large and unreliable memory
• Store and retrieve requests to large adversarial memory
– a vector in {0,1}n
• Detects whether the answer to any retrieve was different than
the last store
• Uses small, secret, reliable memory: space complexity s(n)
• Makes its own store and retrieve requests:
query complexity t(n)
t(n) bits
C
U
memory checker
user
S
s(n) bits
secret memory
P
public memory
Types of Checkers
When are errors detected?
Offline vs. Online detection:
– Offline: at the end of a (long) series of requests, the
checkers detects whether there was an error at some
point
– Online: errors are detected whenever they occur
When a retrieve gave a different value than the last store
Computational assumptions and memory checkers
• For offline memory checkers no computational
assumptions are needed:
Probability of detecting errors: ε
Query complexity: t(n)=O(1) (amortized)
Space complexity: s(n)=O(log n + log 1/ε)
[Blum, Evans, Gemmel, Kannan and Naor 1991]
• For online memory checkers with computational
assumptions, good schemes are known:
Query complexity t(n)=O(log n)
Space complexity s(n)=n (for any  > 0)
Probability of not detecting errors: negligible
Are they
necessary?!
Next lecture: yes computational assumptions are necessary
Memory Checker  Authenticator
If there exists an online memory checker with
– space complexity s(n)
– query complexity t(n)
then there exists an authenticator with
– space complexity O(s(n))
– query complexity O(t(n))
Idea: Use a high-distance code
An Upper Bound
An online memory checker:
• Divide the n-bit file into chunks of size t(n)
• Use a hash function from an almost-universal family
h: {0,1}t(n){0,1}c
(size: O(log n) bits)
• Save in private memory a hash of each chunk
• Query Complexity: t(n)
• Space Complexity: s(n) = O(n/t(n))
Improve the Upper Bound?
• For Memory Checkers:
What about Locally Decodable Codes?
• For Authenticators:
What about PCPs of proximity?
An offline memory checker
Idea: reduce to the string/set comparison problem
• The two string
– The bits read define R
– The bits written define W
Have to make sure that under normal circumstances R = W
•
•
•
•
Read position before each write
Read the full string at the end
For each position in memory
Add timestamps
– Need a hash function h that can be computed `on the fly’
Using inner product
• Define R and W as long vectors
• Add time stamps since locations change
– for each 1· i · n
– for each possible time t
– For each possible value v
• Have a bit position in R and W vectors are sparse
1
1
1
1
1
The function h defines a long vector Vh
h(R) = Vh ¢ R
The vector Vh should be such
•
– If R ≠ W then Vh ¢ R ≠ Vh ¢W
– Possible to evaluate Vh(i,v,t) easily
00 1 1 0 1 1 1 0 1 0 1 0 1 00 1 1 0 1 1 0000 1 0 1 0 1 1 0 1 1
Small bias probability spaces
Let  be a probability space
– with N random variables x1, x2,… xN obtaining values in {0,1}.
We say that it is -biased if for any subset S µ {1…N}
|Pr[ ©i 2 S xi =1] - Pr[ ©i2 S xi =0]| · 
A probability space is 0-biased iff it is the uniform distribution on
x1, x2, …, xN
– Size 2N
Much smaller spaces exist for >0
Description of a point can be O(log(K/))
Want an efficient way to compute xi from the representation of the point
in the sample space.
Should be polynomial in log i and the representation of the sample point
A construction based on quadratic
residuousity
Let P be a large prime
Must be À N
Each h is defined by 0 · a · P-1
ha(x) quadratic character of x+a mod P
This is a random shift of the character string
Possible to come up with
contsruction of size
O(log n + log t+log 1/}
Property: the resulting probability space is -biased for  =N2/P
Analysis based on Weil’s Theorem
The down side: computationally expensive
each such function discovers error with probability at most ½
The probability that the Xor of the subset corresponding R-W is 1
Are we done?
A slightly different look: multi-sets
• Define R and W as sets
– R={(i,t,v)| location i was read with value v and timestamp t}
– W={(i,t,v)| location i was written with value v at time t}
• To check that two multi-set are the same:
• Define the polynomial pS of the set S
pS = a 2 S (x-a)
Can compute the polynomial pS on the fly in any order S is given
no polynomials of two different (multi) sets agree on more than their size many points
Choose a random r and compare pR(r) and pW(r)
Probability they are equal if R ≠ W is at mot |T|/Size of Field
Are we done?
Problem: replay attack
• What happens if the adversary gives the reader the
correct values in the wrong order?
• Luckily: time progresses.
Add a check that the time stamp is not larger than the current
time
The Checker
Initialization: set W=R= and h(W) = h(R) = 0
On Read of address i at current
time t
– Get the value v and timestamp t’
– Check that t’ is less than t
– Update the hash function h(R)
On Write to address i with value v
at current time t
flag
– Get the value v’ and timestamp t’
– Check that t’ is less than t
– Update the hash function h(R)
Using the multi-set approach over
GF[p3] with random r:
multiply current register by
vp2+vpi+t-r
– Write value v and timestamp t at
address i
– Update the hash function h(W) with
value (v,a,t)
– Write new value v and timestamp t
at address i
– Update the hash function h(W)
with value (v,a,t)
At the end: checkers reads all the memory cells and updates h(R) accordingly
compares h(W) with h(R)
Analysis
Lemma: if the RAM malfunctions and no flag raised, then R ≠ W
A malfunction occurs if the (v,t) the checker reads from an address are
different from (v,t) of the corresponding write
Let (v,t,i) be the triple with the highest value t with a corresponding
errant read
Claim: no other read operation returns (v,t,i) value
– no read after t can return (v,t,i), since then there is a higher
timestamp than t with an errant read
– no read before t can return (v,t,i), since then it would be flagged
From the lemma and the properties of h we have that a
malfunctioning RAM is caught with probability at least 1-
Invasive Memory Checkers
• A memory checker is called invasive if it changes
anything in the main memory
Ajtai [2003]: Even an offline memory checker must be
invasive must be for any non-trivial result
Main point: handling replay attacks
Proof technique: reduction to element distinctness
On-line memory checkers
• Idea add tags as in the pretty good authenticator
Fk(v,i)
– Fk is a pseudo random function
• Problem: replay attacks
• Solution: tree of tags
Recall: one-way functions exist iff
pseudo-random functions exist
– leaves correspond to entries in the memory
• Store in each internal node the sum of the value of its two
children
– Plus an authentication
• Verification of a leaf: access all node on the path from root
– plus their siblings
s(n) is the key size
t(n) is log n x tag size
Tree of tags
U1
U2
U3
U9=U1+U2,
Fk(U9)
U4
U5
U10
U6
U7
U11
U12
U14
U13
U15
U8
An alternative construction
• Use a tree of hashes
• Store in secret memory the root of the tree
– Does not need to be secret – only reliable
• What property should the hash function have?
• What complexity assumption is needed?
Possible definitions
• A function g:{0,1}2k → {0,1}k where it is hard to
find m’ ≠ m but g(m)=g(m’)
• Problems:
– not good for non-uniform models
– hard to connect to other assumptions
• Want a family of functions from which one is
selected
• Use the advantage we have: the target is known
Possible definitions
• A family of functions
G={g|g:{0,1}k → {0,1}h(k)}
Such that
• Easy to sample g from G and g  G has succinct description
• Given (k, g, x) easy to compute g(x)
•
h(k) < k
• Hard to find collisions:
Alternative 1 – any collision
– Given k and g  G hard to find x, x’  {0,1}k where
x ≠ x’ but g(x)=g(x’)
– Sometimes called collision intractable
– hard to connect to other assumptions
Alternative 2 – target collision
– Given (k,g,x) hard to find x’  {0,1}k where
x ≠ x’ but g(x)=g(x’)
Universal One-Way Hash functions
UOWHFs
•
•
•
When/how is the target x chosen?
Independently of g but want to work for any possible x
– First x is selected by adversary, then g  G is selected at random
Technical point: let ℓ1 , ℓ2 :{0,1}* → {0,1}* be functions mapping n to input and
output sizes. We assume
– ℓ1 (n) > ℓ2 (n) and
– both are bounded by polynomials in n
∞
Definition: A family of functions G= ⋃n=1 Gn where Gn ={g|g:{0,1}ℓ1(n) →{0,1}}ℓ2(n)} is
called (ℓ1, ℓ2 )-universal one-way hash if:
• Given n easy to sample random g from Gn and g  Gn has description polynomial in n
• Given (n, g, x) easy to compute g(x)
• Hard to find target collisions: no polynomial time adversary can on input n
– generate x  {0,1}ℓ1(n)
– given a random g  Gn find x’  {0,1}ℓ1(n) where
x ≠ x’ but g(x) = g(x’)
succeed with non-negligible probability for sufficiently large n
Pair-wise independent permutations
Definition: a family of permutations (1-1 functions)
H= {h| h: {0,1}n → {0,1}n }
is called Strongly Universal2 or pair-wise independent if:
– for all x1, x2 {0,1}n and y1, y2 {0,1}n where x1 ≠ x2 wand
y1 ≠ y2 we have
Prob[h(x1) = y1 and h(x2) = y2 ] = 1/2n ∙ 1/(2n-1)
Where the probability is over a randomly chosen h H
The same as in truly random permutations
In particular Prob[h(x2) = y2 | h(x1) = y1 ] = 1/(2n-1)
Construction: let F be a finite field F (e.g. GF[2n])
H= {ha,b(x) = a∙x + b | a, b  F, a ≠ 0 }
Constructing (n, n-1)-UOWHFs
• Idea: Combine one-way with universal
– Want to match each image of the one-way functions with another
random image
• Let f :{0,1}n → {0,1}n be a one-way permutation
• Let H = {h|h:{0,1}n → {0,1}n} be a Strongly
Universal2 family of permutations
• Let chopn-1 :{0,1}n → {0,1}n-1 be a 2-to-1 function
– E.g. chopping last bit of input
Consider the (n, n-1)-family G where each g G is
defined by h H
g(x) = chopn-1(h(f(x)))
Proof of Security
Want to construct from algorithm A which is target collision finding for G
an inversion algorithm B for f
Algorithm B:
• Input: y=f(z) to invert,
• Run algorithm A to get target x
• Find random h  H such that
chopn-1(h(y))= chopn-1(h(f(x)))
and give corresponding g as a challenge to A
•
y=f(z)
B
x
A
g
x’
– Why does such an h exist and how to find it?
If A finds x’ such that g(x’)=g(x) then
chopn-1(h(f(x))) = chopn-1(h(f(x’))) = chopn-1(h(y))
and y=f(x’) since h is 1-1
x’
What is the probability of success of B?
The same as the simulated collision algorithm A for G
Claim: the probability the simulated algorithm A witnesses is the same as the real A
Why does such an h exist and how to find it?
chopn-1(h(y))= chopn-1(h(f(x)))
•
•
•
•
•
Choose random w{0,1}n
let w’ be such that chopn-1(w)=chopn-1(w’)
Want h(y)=w and h(f(x))=w’
Such an h should exist from pair-wise independence
Easy to find and unique for
H= {ha,b(x) = a∙x + b | a, b  F, a ≠ 0 }
• Open problem(?): what happens to the security of the
construction if H does not have the property
Distribution of simulated A vs. real A
The difference between the simulated and real A:
• Real A gets g defined by random hH
• Simulated A chooses x and gets g defined by
– Choosing random z{0,1}n and computing y=f(z)
• y is uniform in {0,1}n from f being a permutation
– Choosing random w{0,1}n and finding random hH such that
h(y)=w and h(f(x))=w’
– Since both random y and random w are random the result is a
random hH
Simulated A and real A witness the same distribution
The probability that B inverts is the same as A finding a
collision
What about the reverse combination
Let f :{0,1}n → {0,1}n be a one-way permutation
Let H = {h|h:{0,1}n → {0,1}n} be a Strongly Universal2 family of
permutations
Consider the (n, n-1)-family G where each g G is defined by h H
g(x) = chopn-1(f(h(x)))
•
•
Is it a UOWHF?
Not necessarily: if
• h is easy to invert
and
• f does not affect the last bit
– not contradictory to either being one-way or a permutation
Then easy to find collisions: any x the that x’ collides under h will also collide under g
From (n, n-1)-UOWHFs to (n, n/2)UOWHFs
• Idea: composition.
• What happens to the security of the scheme?
– The probability of inverting f given a collision finding
algorithm for H may be small by a factor of 2/n
The Tree Construction
m
g1
g2
Let G be a (2k,k)-UOWHF
Let n= 2 ∙ l ∙ k. and t= log n/k. Each gi is chosen independently from G.
The result is a family of functions {0,1}n → {0,1}k which is (n,k)-UOWHF
Size of representation: t log |G| where t is the number of levels in the tree
g3
General construction (n, k)-UOWHFs
• Use tree composition
• Description length: k log (n/k) (n, n/2)descriptions of hash function
– 2k bits in the example
Can use the same tree for memory checking
Access pattern as before
UOWHF exist iff one-way functions exist
Checking other Data Structures
• Checking queues
• Checking stacks
– Can even be done in an online manner!
• Application: certificate revocation
– Have to check a look-up table
Conclusions
• If one-way functions exist, then on-line memory
checking is possible with
t(n) ¢ s(n) ¿ n
• Major open problem: obtain a more efficient solution
– break the log n factor
Simultaneous Messages Protocols
x
{0,1}n
ALICE
mA
CAROL
y
{0,1}n
BOB
• Suggested by Yao 1979
mB
f(x,y)
The simultaneous messages model:
• Alice receives x and Bob who receives inputs y
• They simultaneously send a message to a referee Carol
who initially get no input
• Carol should compute f(x,y)
Several possible models:
• Deterministic: all lower bounds for deterministic protocols
for f(x,y) are applicable here
• Shared (Public) random coins:
– Equality has a good protocol
• Consider public string as hash functions h,
• Alice sends h(x) Bob sends h(y) and Charlie compares the outcome
• The complexity can be as little as O(1) is after constant probability of error
Provided the random bits are chosen independently than the inputs
Simultaneous Messages Protocols
• For the equality function:
• There exists a protocol where
– |mA| x |mB| = O(n)
– Let C be a good error correcting code
– Alice and bob arrange C(x) and C(y)
in an |mA| x |mB| rectangle
• Alice sends a random row
• Bob send a random columns
• Carol compares the intersection
Simultaneous Messages Protocols
• Lower bounds for the equality function:
– |mA| + |mB| = (√n) [Newman Szegedy 1996]
– |mA| x |mB| = (n) [Babai Kimmel 1997]
• Idea: for each x 2 {0,1}n find a `typical’ multiset of messages
Tx = {w1, w2,…,wt} where
t 2 O(|mB|)
Each wi is a message in the original protocol, |mA| bits
Over
random i,
and
randomness
of Carol
Property: for each message mB the behavior on Tx approximates the real
behavior
Average behavior of Carol on w1, w2,…,wt is close to its average response
during protocol
Over randomness of
Alice and Carol
Simultaneous Messages Protocols
How to find for each x 2 {0,1}n such a `typical’ Tx of size t
• Claim: a random choice of wi’s is good
Proof by Chernoff
– Need to `take care’ of every mB
(2|mB| possibilities )
• Claim: for x  x’ we have Tx  Tx’
– Otherwise behaves the same when y = x for x and x’
• Let Sx be the mB’s for which protocol mostly says ’1’
• Let Wx be the mB’s for which protocol mostly says ’0’
• Then for y=x the distribution should be mostly on Sx
• Conclusion: t ¢ |mA| ¸ n and we get
|mA| x |mB| = (n)
[Babai Kimmel 1997]
General issue
• What do combinatorial lower bounds men when
complexity issues are involved?
• What happens to the pigeon-hole principal when
one-way functions (one-way hashing) are involved?
• Does the simultaneous message lower bound hold
when one-way functions exist
– Issue is complicated by the model
– Can define Consecutive Message Protocol model with
iff results
Consecutive Messages Protocols
ALICE
x
{0,1}n
mP
y
{0,1}n
mA
rp
f(x,y)
CAROL
mB
BOB
Theorem
For any CM protocol that computes the equality function,
If |mP| ≤ n/100 then |mA| x |mB| = (n)
Proof: along the
s(n)
t(n)
Babai Kimmel lines
Complexity considerations
• Suppose that
– One-way functions exist
– All parties are polynomial time
• Including the adversary
– Then sublinear CM protocols for equality exist
• The CM protocol:
Let H be a UOWHF
– Alice’s chooses h 2R H
Sends mA=h(x)
– Alice mp to deliver h
– Bob Sends mB=h(x)
– Carol accepts iff mA = mB
Easy to translate an adversary for the CM protocol to an adversary to the
UOWHFness of H
Sublinear CM protocols imply one-way
function
Theorem: a CM protocol for equality where
– all parties are polynomial time
– t(n) ¢ s(n) 2 o(n) and |mp| 2 o(n)
exists iff one-way function exist
Proof:
Consider the function
f(x,rA,rp,rB1,rB2, …, rBk) = (rp,mp,rB1,rB2, …, rBk ,mB1,mB2, …, mBk)
Where
• mp = Mp(x,rA,rP)
• mBi =MB(x,rBi,mp,rP)
Mp is the function that maps Alice’s
input to the public message mp
MB is the function that maps Bob’s
input to the private message mB
Main Lemma: the function f is distributionally one-way
CM protocol implies one-way functions
• Adversary selects a random x for Alice
• Alice sends public information mp, rpub
• Adversary generates a multiset Tx of s(n) Bob-messages
Claim: W.h.p., for every Alice message, Tx approximates Carol’s
output
• Adversary randomly inverts the function f and w.h.p. finds
x’x s.t.
Tx characterizes Carol when Bob’s input is both x and x’
Why?
Tx is of length much smaller than n
since s(n) ¢ t(n) + |mp| is not too large!
• Since on x and x’ where x’ x Carol’s behavior is similar in
both cases, the protocol cannot have high success probability