Sketching (1) Alex Andoni (Columbia University) MADALGO Summer School on Streaming Algorithms 2015

Streaming Scenario 1

A stream of packets passes by a router: 131.107.65.14, 18.0.1.12, 131.107.65.14, 80.97.56.20, 18.0.1.12, 80.97.56.20, 131.107.65.14, ...

Challenge: log statistics of the data, using small space.

IP              Frequency
131.107.65.14   3
18.0.1.12       2
80.97.56.20     2
127.0.0.1       9
192.168.0.1     8
257.2.5.7       0
16.09.20.11     1

Streaming statistics

- Let x_i = frequency of IP i
- 1st moment (sum): ∑ x_i
  - Trivial: keep a total counter
- 2nd moment (variance): ∑ x_i² = ||x||²
  - Trivially: n counters → too much space
  - Can't do better (exactly)
  - Better with small approximation! Via dimension reduction in ℓ₂

2nd frequency moment

- Let x_i = frequency of IP i
- 2nd moment: ∑ x_i² = ||x||²

Example:

IP              Frequency
131.107.65.14   3
18.0.1.12       2
80.97.56.20     2

∑ x_i = 7,  ∑ x_i² = 17

Dimension reduction

- Store a sketch of x:
  - S(x) = (G_1·x, G_2·x, …, G_k·x) = Gx, where G is a k×n matrix and each row G_i is an n-dimensional Gaussian vector
- Estimator:
  - (1/k)·||Gx||² = (1/k)·((G_1·x)² + (G_2·x)² + ⋯ + (G_k·x)²)
- Updating the sketch:
  - use linearity of the sketching function S: G(x + e_i) = Gx + G·e_i

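A minimal numpy simulation of this sketch (the universe size n, the sketch size k, and the toy stream are illustrative choices, not from the slides):

```python
import numpy as np

n = 10_000   # universe size (number of distinct IPs)
k = 400      # sketch size, k = O(1/eps^2)

rng = np.random.default_rng(0)
G = rng.standard_normal((k, n))   # each row G_i is an n-dimensional Gaussian vector

sketch = np.zeros(k)              # maintains Gx

def update(i):
    """Stream update x <- x + e_i: by linearity, Gx <- Gx + G e_i."""
    sketch[:] += G[:, i]          # G e_i is just the i-th column of G

def estimate_second_moment():
    """Estimator (1/k) ||Gx||^2 for the 2nd moment sum_i x_i^2."""
    return sketch @ sketch / k

# Toy stream: item 7 three times, item 42 twice  =>  true F_2 = 3^2 + 2^2 = 13.
for item in [7, 42, 7, 42, 7]:
    update(item)
print(estimate_second_moment())   # close to 13; concentrates as k grows
```
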
Correctness

- pdf of a standard Gaussian g: (1/√(2π))·e^(−g²/2), with E[g] = 0, E[g²] = 1
- Theorem [Johnson–Lindenstrauss]: (1/k)·||Gx||² = (1 ± ε)·||x||² with probability 1 − e^(−O(kε²))

Why Gaussian?

- Stability property: G_i·x = ∑_j G_ij·x_j is distributed as ||x||·g, where g is also Gaussian
- Equivalently: G_i is centrally distributed, i.e., it has a random direction, and the projection on a random direction depends only on the length of x
- Rotational invariance follows since the joint density depends only on a² + b²:
  P(a)·P(b) = (1/√(2π))·e^(−a²/2) · (1/√(2π))·e^(−b²/2) = (1/(2π))·e^(−(a²+b²)/2)

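A quick empirical check of the 2-stability property (the vector z and sample count are arbitrary):

```python
import numpy as np

# For any fixed z, the dot product <g, z> with i.i.d. Gaussian g
# is distributed as ||z||_2 * g' for a standard Gaussian g'.
rng = np.random.default_rng(1)
z = np.array([3.0, -1.0, 2.0])
samples = rng.standard_normal((200_000, z.size)) @ z
print(samples.std())          # ~ ||z||_2
print(np.linalg.norm(z))      # 3.7416...
```
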
Proof [sketch]

- Gx is distributed as (||x||·g_1, …, ||x||·g_k), where each g_i is a 1D Gaussian (pdf = (1/√(2π))·e^(−g²/2), E[g] = 0, E[g²] = 1)
- Expectation: E[(G_i·x)²] = E[||x||²·g²] = ||x||²
- Standard deviation: σ[(G_i·x)²] = O(||x||²)
- Estimator: (1/k)·||Gx||² = ||x||² · (1/k)·∑_i g_i²
- Claim: for any x ∈ ℝⁿ, (1/k)·∑_i g_i² is equal to 1 ± ε with probability 1 − e^(−Ω(ε²k))
- Proof: ∑_i g_i² is the chi-squared distribution with k degrees of freedom
  - Fact: chi-squared is very well concentrated (akin to the central limit theorem)

2nd frequency moment: overall

- Correctness:
  - (1/k)·||Gx||² = (1 ± ε)·||x||² with probability 1 − e^(−O(kε²))
  - enough to set k = O(1/ε²) for constant probability of success
- Space requirement:
  - k = O(1/ε²) counters of O(log n) bits
  - what about G: store O(nk) reals?

Storing randomness [AMS'96]

- OK if the g_i are "less random": choose each of them as 4-wise independent
- Also OK if g_i is a random ±1
- Only O(k) counters of O(log n) bits

More efficient sketches?

- Smaller space?
  - No: Ω(ε⁻² log n) bits [JW'11] ← David's lecture
- Faster update time?
  - Yes: Jelani's lecture

Streaming Scenario 2

Two routers each see a stream of packets and keep frequency vectors x and y:

x:  IP              Frequency
    131.107.65.14   1
    18.0.1.12       1
    80.97.56.20     1

y:  IP              Frequency
    131.107.65.14   1
    18.0.1.12       2

Focus: difference in traffic

- 1st moment: ∑ |x_i − y_i| = ||x − y||_1 (here ||x − y||_1 = 2)
- 2nd moment: ∑ |x_i − y_i|² = ||x − y||_2² (here ||x − y||_2² = 2)
- Similar questions: average delay/variance in a network, differential statistics between logs at different servers, etc.

Definition: Sketching

- Sketching:
  - S : objects → short bit-strings
  - given S(x) and S(y), one should be able to estimate some function of x and y
- Here: each router compresses its frequency vector into a short sketch (e.g., bit-strings 010110 and 010101), from which we estimate ||x − y||_2².

Sketching for ℓ₂

- As before, dimension reduction:
  - pick G (using common randomness)
  - S(x) = Gx
  - Estimator: ||S(x) − S(y)||_2² = ||G(x − y)||_2² (by linearity of G)
- Each router applies S to its own frequency vector; the sketches Gx and Gy are then compared.

Sketching for Manhattan distance (ℓ₁)

- Dimension reduction?
  - Essentially no [CS'02, BC'03, LN'04, JN'10, …]
  - For n points and approximation D: the dimension is between n^(Ω(1/D²)) and O(n/D) [BC'03, NR'10, ANN'10, …], even if the map depends on the dataset!
  - In contrast: [JL] gives O(ε⁻² log n) for ℓ₂
- No distributional dimension reduction either
- Weak dimension reduction to the rescue…

Dimension reduction for ℓ₁?

- Can we do the "analog" of Euclidean projections?
- For ℓ₂, we used the Gaussian distribution:
  - it has the stability property: g_1·z_1 + g_2·z_2 + ⋯ + g_d·z_d is distributed as g·||z||, where g is also Gaussian
- Is there something similar for the 1-norm?
  - Yes: the Cauchy distribution!
  - 1-stable: c_1·z_1 + c_2·z_2 + ⋯ + c_d·z_d is distributed as c·||z||_1, where c is Cauchy
- What's wrong then?
  - Cauchys are heavy-tailed…
  - a Cauchy doesn't even have a finite expectation (of its absolute value)
  - pdf(s) = 1/(π·(s² + 1))

Sketching for ℓ₁ [Indyk'00]

- Still, we can consider the same map as before:
  - S(x) = (C_1·x, C_2·x, …, C_k·x) = Cx, where each C_i has i.i.d. Cauchy entries
- Consider S(x) − S(y) = Cx − Cy = C(x − y) = Cz, where z = x − y
  - each coordinate of Cz is distributed as ||z||_1 × Cauchy
- Take the 1-norm ||Cz||_1?
  - it does not have a finite expectation, but…
- Can estimate ||z||_1 by: the median of the absolute values of the coordinates of Cz!
- Correctness claim: for each i,
  - Pr[|C_i·z| > ||z||_1·(1 − ε)] > 1/2 + Ω(ε)
  - Pr[|C_i·z| < ||z||_1·(1 + ε)] > 1/2 + Ω(ε)

Estimator for ℓ₁

- Estimator: median(|C_1·z|, |C_2·z|, …, |C_k·z|)
- Correctness claim: for each i,
  - Pr[|C_i·z| > ||z||_1·(1 − ε)] > 1/2 + Ω(ε)
  - Pr[|C_i·z| < ||z||_1·(1 + ε)] > 1/2 + Ω(ε)
- Proof:
  - |C_i·z| = abs(C_i·z) is distributed as abs(||z||_1·c) = ||z||_1·|c|, where c is Cauchy
  - easy to verify that Pr[|c| > 1 − ε] > 1/2 + Ω(ε) and Pr[|c| < 1 + ε] > 1/2 + Ω(ε)
- Hence, if we take k = O(1/ε²):
  - median(|C_1·z|, …, |C_k·z|) ∈ (1 ± ε)·||z||_1 with probability at least 90%

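A minimal numpy sketch of this ℓ₁ estimator (the vectors and sizes are illustrative):

```python
import numpy as np

n, k = 10_000, 400                      # k = O(1/eps^2) rows
rng = np.random.default_rng(0)
C = rng.standard_cauchy((k, n))         # rows C_i with i.i.d. Cauchy entries

def sketch(v):
    return C @ v

x = np.zeros(n); x[[10, 20, 30]] = [1, 1, 1]
y = np.zeros(n); y[[10, 20]]     = [1, 2]

Cz = sketch(x) - sketch(y)              # = C(x - y) by linearity
# Median of |C_i z| estimates ||z||_1; the mean would not converge,
# since a Cauchy variable has no finite expectation.
print(np.median(np.abs(Cz)))            # ~ ||x - y||_1 = 2
```
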
To finish the ℓ_p norms…

- p-th moment: ∑ x_i^p = ||x||_p^p
- p ≤ 2:
  - works via p-stable distributions [Indyk'00]
- p > 2:
  - can do (and need) O(n^(1−2/p)) counters
  - [AMS'96, SS'02, BYJKS'02, CKS'03, IW'05, BGKS'06, BO'10, AKO'11, G'11, BKSV'14]
  - will see a construction via Precision Sampling

A task: estimate sum

- Given: n quantities a_1, a_2, …, a_n in the range [0,1]
- Goal: estimate S = a_1 + a_2 + ⋯ + a_n "cheaply"
- Standard sampling: pick a random set J = {j_1, …, j_m} of size m
  - Estimator: S̃ = (n/m)·(a_{j_1} + a_{j_2} + ⋯ + a_{j_m})
  - Chebyshev bound: with 90% success probability, (1/2)·S − O(n/m) < S̃ < 2·S + O(n/m)
  - For constant additive error, need m = Ω(n)
- (Figure: compute an estimate S̃ from the sampled terms only, e.g. a_1 and a_3 out of a_1, …, a_4.)

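A quick simulation of the standard sampling estimator (n, m, and the distribution of the a_i are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 100_000, 1_000
a = rng.uniform(0, 1, size=n)               # n quantities in [0, 1]

J = rng.choice(n, size=m, replace=False)    # random sample of m indices
S_hat = (n / m) * a[J].sum()                # unbiased estimator of S
print(S_hat, a.sum())                       # additive error scales like n/sqrt(m)
```
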
Precision Sampling Framework

- Alternative "access" to the a_i's:
  - for each term a_i, we get a (rough) estimate ã_i, up to some precision u_i chosen in advance: |ã_i − a_i| < u_i
- Challenge: achieve a good trade-off between:
  - quality of the approximation to S
  - using only weak precisions u_i (minimize the "cost" of estimating the ã_i)
- (Figure: compute an estimate S̃ from ã_1, ã_2, ã_3, ã_4, where each ã_i approximates a_i within u_i.)

Formalization

A game between the sum estimator and an adversary:

Sum Estimator                                  Adversary
1. fix precisions u_i                          1. fix a_1, a_2, …, a_n
                                               2. fix ã_1, ã_2, …, ã_n s.t. |ã_i − a_i| < u_i
3. given ã_1, ã_2, …, ã_n, output S̃
   s.t. |∑_i a_i − γ·S̃| < 1 (for some small γ)

- What is cost?
  - to achieve precision u_i, use 1/u_i "resources": e.g., if a_i is itself a sum a_i = ∑_j a_ij computed by subsampling, then one needs Θ(1/u_i) samples
  - here, average cost = (1/n)·∑ 1/u_i
  - for example, can choose all u_i = 1/n: average cost ≈ n

Precision Sampling Lemma [A-Krauthgamer-Onak'11]

- Goal: estimate ∑ a_i from {ã_i} satisfying |a_i − ã_i| < u_i
- Precision Sampling Lemma: can get, with 90% success:
  - O(1) additive error and 1.5 multiplicative error: S/1.5 − O(1) < S̃ < 1.5·S + O(1), with average cost O(log n)
  - more generally, ε additive error and 1 + ε multiplicative error, with average cost O(ε⁻³ log n)
- Example: distinguish ∑ a_i = 3 vs ∑ a_i = 0. Consider two extreme cases:
  - if three a_i = 1: enough to have a crude approximation for all (u_i = 0.1)
  - if all a_i = 3/n: only a few need a good approximation u_i = 1/n, and the rest can have u_i = 1

Precision Sampling Algorithm

- Precision Sampling Lemma (restated): with 90% success, ε additive error and 1 + ε multiplicative error, with average cost O(ε⁻³ log n)
- Algorithm:
  - basic version: choose each u_i ∈ [0,1] i.i.d. uniformly; estimator S̃ = count of the i's s.t. ã_i/u_i > 6 (up to a normalization constant)
  - concrete version: u_i distributed as the minimum of O(ε⁻³) uniform r.v.'s; estimator a function of [ã_i/u_i − 4/ε]
- Proof of correctness:
  - we use only those ã_i which are 1.5-approximations to a_i
  - E[S̃] ≈ ∑ Pr[ã_i/u_i > 6] = ∑ a_i/6
  - E[1/u_i] = O(log n) w.h.p.

โ„“๐‘ via precision sampling
๏ฝ
๏ฝ
Theorem: linear sketch for โ„“๐‘ with ๐‘‚(1) approximation,
and ๐‘‚(๐‘›1โˆ’2/๐‘ log ๐‘›) space (90% succ. prob.).
Sketch:
๏ฝ
๏ฝ
๏ฝ
๏ฝ
๏ฝ
Estimator:
๏ฝ
๏ฝ
๏ฝ
Pick random ๐‘Ÿ๐‘–๏ƒŽ{±1}, and ๐‘ข๐‘– as exponential r.v.
1/๐‘
let ๐‘ฆ๐‘– = ๐‘ฅ๐‘– โ‹… ๐‘Ÿ๐‘– /๐‘ข๐‘–
throw into one hash table ๐ป, ๐‘ฅ = ๐‘ฅ1 ๐‘ฅ2 ๐‘ฅ3
๐‘˜ = ๐‘‚(๐‘›1โˆ’2/๐‘ log ๐‘›) cells
max ๐ป ๐‘
๐‘
๐‘
๐‘ฆ๐Ÿ
๐‘ฆ๐Ÿ’
๐ป = + ๐‘ฆ๐Ÿ‘
Linear: works for difference as well
Randomness: bounded independence suffices
๐‘ข โˆผ ๐‘’ โˆ’๐‘ข
๐‘ฅ4
๐‘ฆ๐Ÿ
+ ๐‘ฆ๐Ÿ“
+ ๐‘ฆ๐Ÿ”
๐‘ฅ5
๐‘ฅ6
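A minimal simulation of this sketch for p = 3 (the input vector and sizes are illustrative; the output is only an O(1)-factor approximation, with constant probability):

```python
import numpy as np

# Sketch for ||x||_p^p, p > 2: y_i = x_i * r_i / u_i^(1/p) with u_i ~ Exp(1),
# hashed into k = O(n^{1-2/p} log n) cells; estimator is max_c |H[c]|^p.
rng = np.random.default_rng(0)
p = 3.0
n = 10_000
k = int(n ** (1 - 2 / p) * np.log(n))

r = rng.choice([-1.0, 1.0], size=n)       # random signs r_i
u = rng.exponential(1.0, size=n)          # exponential r.v.'s u_i
h = rng.integers(0, k, size=n)            # hash cell for each coordinate

x = rng.standard_normal(n)                # example input vector
y = x * r / u ** (1 / p)

H = np.zeros(k)
np.add.at(H, h, y)                        # H[c] = sum of y_i with h(i) = c

print(np.max(np.abs(H)) ** p)             # estimator max_c |H[c]|^p
print(np.sum(np.abs(x) ** p))             # true ||x||_p^p, for comparison
```
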
Correctness of ℓ_p estimation

- Sketch: y_i = x_i · r_i / u_i^(1/p), where r_i ∈ {±1} and the u_i are exponential r.v.'s; throw into hash table H
- Theorem: max_c |H[c]|^p is an O(1) approximation with 90% probability, for k = O(n^(1−2/p) log^O(1) n) cells
- Claim 1: max_i |y_i| is a constant approximation to ||x||_p
  - max_i |y_i|^p = max_i |x_i|^p / u_i
  - Fact [max-stability]: max_i λ_i/u_i is distributed as (∑_i λ_i)/u, where u is also an exponential r.v.
  - so max_i |y_i|^p is distributed as ||x||_p^p / u
  - u is Θ(1) with constant probability

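A quick empirical check of the max-stability fact (the weights λ_i and sample count are arbitrary; the median of (∑λ_i)/u with u ~ Exp(1) equals (∑λ_i)/ln 2):

```python
import numpy as np

# max_i lambda_i / u_i with i.i.d. u_i ~ Exp(1) is distributed as (sum_i lambda_i) / u.
rng = np.random.default_rng(2)
lam = np.array([4.0, 1.0, 2.5])
m = np.max(lam / rng.exponential(1.0, size=(200_000, lam.size)), axis=1)
print(np.median(m))                  # ~ sum(lam) / ln 2
print(lam.sum() / np.log(2))         # 10.82...
```
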
Correctness (cont)

- Consider the hash table H and the cell c into which y_{i*} falls, for the i* which maximizes |y_{i*}|
- How much "extra stuff" is there?
  - δ² = (H[c] − y_{i*})² = (∑_{j≠i*} y_j · χ[j→c])²
  - E[δ²] = ∑_{j≠i*} y_j² · E[χ[j→c]] = ∑_{j≠i*} y_j²/k ≤ ||y||²/k
- We have: E_u[||y||²] ≤ ||x||² · E[1/u^(2/p)] = O(log n) · ||x||²
- ||x||² ≤ n^(1−2/p) · ||x||_p²
- By Markov's inequality: δ² ≤ ||x||_p² · n^(1−2/p) · O(log n)/k with probability 0.9
- Then: H[c] = y_{i*} + δ = Θ(1) · ||x||_p
- Need to argue about the other cells too → concentration
