Secure Indexes
Author: Eu-Jin Goh
Presented by Yi Cheng Lin
1
Outline
Introduction
Contribution
Index Scheme
Background
Construction
Choosing Suitable Bloom Filter Parameters
2
Outline
Pseudo-Random Functions
IND-CKA
Z-IDX is an IND-CKA index
Conclusion
Comment
3
Introduction
Keyword indexes let us search in
constant time for documents containing
specified keywords
Unfortunately, standard index
constructions such as those using hash
tables are unsuitable for indexing
encrypted documents
4
Introduction
In this paper, they formally define a
secure index that allows a querier with
a “trapdoor” for a word x to test in O(1)
time only if the index contains x
The index reveals no information about
its contents without valid trapdoors
5
Contribution
The first contribution of this paper is defining a secure index and formulating a security model for indexes known as semantic security against adaptive chosen keyword attack (IND-CKA)
[Figure: document D contains n words and has an index. Adversary A knows m of the words; from the index, A can't get any of the n-m unknown words.]
6
Contribution
The second contribution is an efficient
IND-CKA secure index construction
called Z-IDX, which is built using
pseudo-random functions and Bloom
filters
Z-IDX scheme is efficient
7
Contribution
Test collection: 2654 plaintext files (27.4 megabytes)
An index for the average document is roughly 121.4 kilobytes in size
The largest document in this collection is 876.6 kilobytes long and its index is 774.3 kilobytes large
The smallest document is 9 bytes long and its index is 115 bytes large
15151 indexes can be searched in one second on an 866 MHz Pentium 3 machine (Debian Linux)
8
Index Scheme
Keygen (s): Given a security parameter
s, outputs the master private key Kpriv
Trapdoor (Kpriv, w): Given the master
key Kpriv and word w, outputs the
trapdoor Tw for w
9
Index Scheme
BuildIndex (D, Kpriv): Given a document
D and the master key Kpriv, outputs the
index ID
SearchIndex(Tw, ID): Given the trapdoor
Tw for word w and the index ID for
document D, outputs 1 if w ∈ D and 0
otherwise
10
Index Scheme
Storing a document
Alice: Keygen (s): Kpriv
Alice: BuildIndex (D1, Kpriv): ID1
Alice -> Server: Store ID1, E(D1)
Server: keeps the indexes ID1, ID2, … alongside the encrypted data E(D1), E(D2), …
11
Index Scheme
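Searching for a word w
Alice: Keygen (s): Kpriv
Alice: Trapdoor (Kpriv, w): Tw
Alice -> Server: Tw
Server: runs SearchIndex(Tw, IDi) on each stored index, for example ID1 with E(D1) gives 1 and ID2 with E(D2) gives 0
Server -> Alice: E(D1), … (the encrypted documents whose index returned 1)
12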
Background
Pseudo-random function: a keyed function f(x, k) that is computationally indistinguishable from a random function
Given pairs $(x_1, f(x_1, k)), \dots, (x_m, f(x_m, k))$, an adversary cannot predict $f(x_{m+1}, k)$ for any $x_{m+1}$
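As a rough illustration of the keyed interface f(x, k), the sketch below uses HMAC-SHA256 from Python's standard library as a stand-in for the pseudo-random function. That choice, and the 32-byte key length, are assumptions made for illustration; the slides do not name a concrete f.

```python
import hashlib
import hmac
import secrets


def f(x: bytes, k: bytes) -> bytes:
    """Keyed function f(x, k). HMAC-SHA256 is an assumed stand-in for the PRF."""
    return hmac.new(k, x, hashlib.sha256).digest()


k = secrets.token_bytes(32)   # secret key k (length chosen for this sketch)
out1 = f(b"alpha", k)
out2 = f(b"alpha", k)         # same input and key: same output
out3 = f(b"beta", k)          # different input: unrelated-looking output
```

Without k, the outputs should look like random strings, which is the property the slide describes.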
13
Background
Bloom Filter: represents a set $S = \{s_1, \dots, s_n\}$ of n elements by an array of m bits.
All array bits are initially set to 0. The filter uses r independent hash functions $h_1, \dots, h_r$, where $h_i : \{0,1\}^* \to [1, m]$ for $i \in [1, r]$.
For each element $s \in S$, the array bits at positions $h_1(s), \dots, h_r(s)$ are set to 1.
14
To determine if an element a belongs to the set S, check the bits at positions $h_1(a), h_2(a), \dots, h_r(a)$
If all bits are 1's, then $a \in S$ (with some probability of a false positive)
Else $a \notin S$
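The insert and membership steps can be sketched as a small Python class. This is a minimal illustration, not the paper's implementation: deriving the r hash functions from SHA-256 with a counter prefix is an assumption, and positions are 0-based here rather than in [1, m].

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Array of m bits with r hash functions (hash derivation is an assumption)."""

    def __init__(self, m: int, r: int) -> None:
        self.m = m
        self.r = r
        self.bits = [0] * m          # all bits start at 0

    def _positions(self, item: bytes) -> Iterator[int]:
        # h_i(item): SHA-256 of a 4-byte counter i followed by the item, reduced mod m.
        for i in range(self.r):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest, "big") % self.m

    def add(self, item: bytes) -> None:
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item: bytes) -> bool:
        # Member only if every one of the r positions holds a 1.
        return all(self.bits[p] for p in self._positions(item))


bf = BloomFilter(m=1024, r=7)
for word in (b"secure", b"index"):
    bf.add(word)
```

An inserted element is always reported as present; an element that was never inserted is reported as absent unless all r of its positions happen to be set by other elements.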
15
Construction
Keygen(s): Given a security parameter s, choose a pseudo-random function $f : \{0,1\}^n \times \{0,1\}^s \to \{0,1\}^s$ and the master key $K_{priv} = (k_1, \dots, k_r) \xleftarrow{R} \{0,1\}^{sr}$
Trapdoor(Kpriv, w): Given the master key $K_{priv} = (k_1, \dots, k_r) \in \{0,1\}^{sr}$ and word w, output the trapdoor for word w as $T_w = (f(w, k_1), \dots, f(w, k_r)) \in \{0,1\}^{sr}$
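A minimal sketch of Keygen and Trapdoor in Python follows. Several details are assumptions for illustration: HMAC-SHA256 stands in for f, the security parameter s is given in bytes rather than bits, and the number of keys r is passed explicitly even though the slide writes Keygen(s).

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def f(x: bytes, k: bytes) -> bytes:
    """Assumed stand-in for the pseudo-random function f."""
    return hmac.new(k, x, hashlib.sha256).digest()


def keygen(s: int, r: int) -> Tuple[bytes, ...]:
    """Return K_priv = (k_1, ..., k_r), each key s bytes of fresh randomness."""
    return tuple(secrets.token_bytes(s) for _ in range(r))


def trapdoor(k_priv: Tuple[bytes, ...], w: str) -> Tuple[bytes, ...]:
    """Return T_w = (f(w, k_1), ..., f(w, k_r))."""
    return tuple(f(w.encode("utf-8"), k) for k in k_priv)


k_priv = keygen(s=16, r=7)
t_w = trapdoor(k_priv, "bloom")
```

The trapdoor is deterministic for a fixed key, so the same word always yields the same T_w, while someone without Kpriv cannot compute it.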
16
Construction
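BuildIndex(D, Kpriv):
Input: document D, consisting of an identifier $D_{id} \in \{0,1\}^n$ and a list of words $(w_0, \dots, w_t) \in \{0,1\}^{nt}$, and $K_{priv} = (k_1, \dots, k_r) \in \{0,1\}^{sr}$
For each word $w_i$:
Trapdoor: $x_1 = f(w_i, k_1), \dots, x_r = f(w_i, k_r)$
Codeword: $y_1 = f(D_{id}, x_1), \dots, y_r = f(D_{id}, x_r)$
Insert the codeword $(y_1, \dots, y_r)$ into the Bloom filter BF for $D_{id}$
Output: $I_{D_{id}} = (D_{id}, BF)$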
17
Construction
SearchIndex(Tw, IDid):
Input: trapdoor $T_w = (x_1, \dots, x_r) \in \{0,1\}^{sr}$ and index $I_{D_{id}} = (D_{id}, BF)$ for document $D_{id}$
Compute the codeword $y_1 = f(D_{id}, x_1), \dots, y_r = f(D_{id}, x_r)$
Test if BF contains 1's in all r locations denoted by $y_1, \dots, y_r$
If so, output 1;
Otherwise, output 0
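The BuildIndex and SearchIndex steps can be sketched together in Python. This is an illustration under stated assumptions, not the paper's code: HMAC-SHA256 stands in for f, a codeword value y is mapped to a bit position by reducing it modulo the filter size m, and the names `build_index`, `search_index` and `_bit` are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import List, Sequence, Tuple

Index = Tuple[bytes, List[int]]   # (D_id, BF)


def f(x: bytes, k: bytes) -> bytes:
    """Assumed stand-in for the pseudo-random function f."""
    return hmac.new(k, x, hashlib.sha256).digest()


def trapdoor(k_priv: Sequence[bytes], w: str) -> Tuple[bytes, ...]:
    return tuple(f(w.encode("utf-8"), k) for k in k_priv)


def _bit(y: bytes, m: int) -> int:
    # Assumed mapping from a codeword value y to a position in the m-bit array.
    return int.from_bytes(y, "big") % m


def build_index(doc_id: bytes, words: Sequence[str],
                k_priv: Sequence[bytes], m: int) -> Index:
    bf = [0] * m
    for w in words:
        for x in trapdoor(k_priv, w):   # x_j = f(w, k_j)
            y = f(doc_id, x)            # y_j = f(D_id, x_j), keyed by x_j
            bf[_bit(y, m)] = 1
    return doc_id, bf


def search_index(t_w: Sequence[bytes], index: Index) -> int:
    doc_id, bf = index
    m = len(bf)
    # Output 1 only if all r codeword positions are set in BF.
    return int(all(bf[_bit(f(doc_id, x), m)] for x in t_w))


k_priv = tuple(secrets.token_bytes(32) for _ in range(7))
index = build_index(b"doc-1", ["secure", "index", "bloom"], k_priv, m=1024)
hit = search_index(trapdoor(k_priv, "secure"), index)
miss = search_index(trapdoor(k_priv, "absent"), index)
```

Here `hit` is 1 because the word was indexed. `miss` is 0 unless the Bloom filter produces a false positive, which is very unlikely at this size. Because the codeword depends on Did, the same word sets different bits in different documents' filters.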
18
Choosing Suitable Bloom Filter Parameters
Hash functions $h_1, \dots, h_r$
Insert n distinct elements into an array of size m
The probability that bit i in the array is 0 is $(1 - 1/m)^{rn} \approx e^{-rn/m}$
The probability of a false positive is $(1 - (1 - 1/m)^{rn})^r \approx (1 - e^{-rn/m})^r$
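To see how close the approximation is, the short Python sketch below evaluates both forms of the false-positive probability. The function name is hypothetical; the sample values n = 1000, m = 10102, r = 7 are taken from the parameter table later in the slides.

```python
import math
from typing import Tuple


def false_positive_rate(n: int, m: int, r: int) -> Tuple[float, float]:
    """Return (exact, approximate) false-positive probability."""
    exact = (1 - (1 - 1 / m) ** (r * n)) ** r
    approx = (1 - math.exp(-r * n / m)) ** r
    return exact, approx


exact, approx = false_positive_rate(n=1000, m=10102, r=7)
```

For these values both results fall just below 0.01, and they differ only slightly from each other.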
19
Choosing Suitable Bloom Filter Parameters
False positive rate
$fp = (1/2)^r = (1 - e^{-rn/m})^r$
$1/2 = 1 - e^{-rn/m}$
$1/2 = e^{-rn/m}$
$\ln(1/2) = -rn/m$
$\ln 2 = r(n/m)$
$m = rn / \ln 2$
20
Choosing Suitable Bloom Filter Parameters
Choose suitable m
             fp = 0.01 (r = 7)    fp = 0.001 (r = 10)
n = 1000     10102                14431
n = 10000    101011               144301
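The table entries can be reproduced with a few lines of Python. This is a sketch under two assumptions: r is taken as the smallest integer with (1/2)^r ≤ fp, and ln 2 is rounded to 0.693 before m = rn / ln 2 is rounded up, which is the rounding that matches the values on the slide. The function name is hypothetical.

```python
import math
from typing import Tuple


def choose_parameters(fp: float, n: int, ln2: float = 0.693) -> Tuple[int, int]:
    """Return (r, m) for a target false-positive rate fp and n elements."""
    r = math.ceil(math.log2(1 / fp))   # smallest r with (1/2)^r <= fp
    m = math.ceil(r * n / ln2)         # m = rn / ln 2, rounded up
    return r, m


table = {
    (fp, n): choose_parameters(fp, n)
    for fp in (0.01, 0.001)
    for n in (1000, 10000)
}
```

Passing `ln2=math.log(2)` instead gives slightly smaller values of m, since 0.693 underestimates ln 2.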
21
Pseudo-Random Functions
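$f : \{0,1\}^n \times \{0,1\}^s \to \{0,1\}^m$ is a (t, ɛ, q)-pseudo-random function if for any t time oracle algorithm A that makes at most q adaptive queries, A distinguishes f under a random key k from a random function g with advantage less than ɛ:
$|\Pr[A^{f(\cdot, k)} = 0] - \Pr[A^{g} = 0]| < \epsilon$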
22
IND-CKA
Setup: Challenger C creates a set S of q words and gives S to adversary A
A chooses a number of subsets from S; this collection of subsets is called S* and is returned to C
C builds an index for each subset in S* and sends the indexes to A
Queries: A may query C on a word x and receives the trapdoor Tx for x
23
IND-CKA
Challenge: A picks a non-empty subset $V_0 \in S^*$ and generates another non-empty subset $V_1$ from S such that $|V_0 - V_1| \neq 0$, $|V_1 - V_0| \neq 0$, and the total length of words in $V_0$ is equal to that in $V_1$
Next, A gives $V_0$ and $V_1$ to C, who chooses $b \xleftarrow{R} \{0,1\}$, invokes BuildIndex($V_b$, Kpriv) to obtain the index $I_{V_b}$ for $V_b$, and returns $I_{V_b}$ to A
24
IND-CKA
Response: A eventually outputs a bit b’, representing its guess for b
The advantage of A in winning this game is defined as AdvA = |Pr[b = b’] − 1/2|
We say that an adversary A (t, ɛ, q)-breaks an index if AdvA is at least ɛ after A takes at most t time and makes q trapdoor queries to the challenger
We say that I is a (t, ɛ, q)-IND-CKA secure index if no adversary can (t, ɛ, q)-break it, that is, AdvA = |Pr[b = b’] − 1/2| < ɛ for every such adversary
25
Z-IDX is an IND-CKA index
Theorem 3.2.
If f is a (t, ɛ, q)-pseudo-random function, then Z-IDX is a (t, ɛ, q/2)-IND-CKA index
We prove the contrapositive (¬q -> ¬p)
26
Z-IDX is an IND-CKA index
Proof: Suppose Z-IDX is not a (t, ɛ, q/2)-IND-CKA index
Then there is an algorithm A that (t, ɛ, q/2)-breaks Z-IDX
We build an algorithm B that uses A to determine if f is a pseudo-random function or a random function
B is given oracle access to the unknown function f that takes as input $x \in \{0,1\}^n$ and returns $f(x) \in \{0,1\}^s$
27
Z-IDX is an IND-CKA index
Setup: Algorithm B creates a set S of q/2 words and gives S to algorithm A
A chooses a number of subsets from S; this collection of subsets is called S* and is returned to B
B builds an index for each subset in S* and sends the indexes to A
Queries: A may query B on a word x and receives the trapdoor Tx for x
28
Z-IDX is an IND-CKA index
Response: A eventually outputs a bit b’, representing its guess for b. If b’ = b, then B outputs 0, indicating that it guesses that f is a pseudo-random function. Otherwise, B outputs 1
B takes at most t time because A takes at most t time. Furthermore, B makes at most q queries to f because there are only q/2 strings in S and A makes at most q/2 queries
29
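Z-IDX is an IND-CKA index
Claim 1: When f is a pseudo-random function, then B is playing the real IND-CKA game with A, so $|\Pr[B^{f} = 0] - 1/2| \geq \epsilon$
Claim 2: When f is a random function, then the index gives A no information about b, so $\Pr[B^{f} = 0] = 1/2$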
30
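Z-IDX is an IND-CKA index
By Claim 1 and Claim 2, the probability that B outputs 0 differs by at least ɛ between the case where f is pseudo-random and the case where f is a random function
But if f is a (t, ɛ, q)-pseudo-random function, this difference must be less than ɛ, which is a contradiction
Theorem 3.2.
If f is a (t, ɛ, q)-pseudo-random function, then Z-IDX is a (t, ɛ, q/2)-IND-CKA index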
31
Conclusion
Z-IDX is efficient for search indexes
Index and document’s size are
independent
Properties: “hidden queries”, “controlled
searching”, and “query isolation”
32
Comment
A Bloom filter is a probabilistic data structure, so a search can return false positives
The index needs additional space
(index’s size ≈ document’s size)
33