talagrand-ias - IAS Video Lectures
Talagrand's convolution conjecture and
anti-concentration of temperature
James R. Lee
University of Washington
Joint work with Ronen Eldan [MSR, University of Washington, …]
noise and smoothness
f : {−1,1}^n → ℝ

The "heat flow" operator T_ε : L²({−1,1}^n) → L²({−1,1}^n)
is defined, for ε ∈ [0,1], by:

    T_ε f(x) = E[ f(x₁^ε, x₂^ε, …, x_n^ε) ]

where x_i^ε = x_i with probability 1 − ε and x_i^ε = ±1
with probability ε/2 each, independently for each i.
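As a sanity check (illustrative, not from the talk), T_ε can be computed exactly on a small cube by enumeration. The only fact used is that each coordinate is kept with probability 1 − ε and rerandomized with probability ε, so it agrees with x_i with probability 1 − ε/2:

```python
import itertools

def noise_operator(f_vals, cube, eps):
    """Apply the heat-flow operator T_eps exactly by enumerating the cube.

    f_vals: dict mapping each point of {-1,1}^n to f(x).
    A coordinate is kept with prob 1-eps and rerandomized to +-1 with
    prob eps/2 each, so it matches x_i with prob 1 - eps/2.
    """
    out = {}
    for x in cube:
        acc = 0.0
        for y in cube:
            # probability that the rerandomized point equals y, given x
            p = 1.0
            for xi, yi in zip(x, y):
                p *= (1 - eps / 2) if xi == yi else (eps / 2)
            acc += p * f_vals[y]
        out[x] = acc
    return out

n = 3
cube = list(itertools.product([-1, 1], repeat=n))
f = {x: float(sum(x) > 0) for x in cube}   # indicator of majority
mean_f = sum(f.values()) / len(cube)

Tf0 = noise_operator(f, cube, 0.0)   # eps = 0: identity
Tf1 = noise_operator(f, cube, 1.0)   # eps = 1: full rerandomization -> E f
```

At ε = 0 nothing happens, and at ε = 1 every value collapses to the mean E f, the two extreme cases of the definition.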
General principle:
    f smooth ⇒ T_ε f smoother
Many applications:
PCPs, hardness of approximation, social choice,
circuit complexity, information theory,
learning, data structures, ...
noise and smoothness
Hypercontractive inequality [Bonami, Gross, Nelson]:
For ε > 0:

    T_ε : L^p({−1,1}^n) → L^q({−1,1}^n)

is a contraction for some q > p > 1.

    ‖f‖_p small for p > 1  ⇒  ‖T_ε f‖_q small for q > p

If f = indicator of a set, then this encodes
"small set expansion."
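On the cube the sharp exponent relation is classical (Bonami–Beckner): writing ρ = 1 − ε for the correlation of T_ε, ‖T_ε f‖_q ≤ ‖f‖_p whenever q − 1 ≤ (p − 1)/ρ². A quick numerical check on a random f (illustrative, not from the slides):

```python
import itertools
import random

def lp_norm(vals, p):
    # L^p norm with respect to the uniform measure on the cube
    return (sum(abs(v) ** p for v in vals) / len(vals)) ** (1 / p)

def apply_noise(f_vals, cube, eps):
    out = []
    for x in cube:
        acc = 0.0
        for fy, y in zip(f_vals, cube):
            pr = 1.0
            for xi, yi in zip(x, y):
                pr *= (1 - eps / 2) if xi == yi else (eps / 2)
            acc += pr * fy
        out.append(acc)
    return out

n, eps, p = 4, 0.3, 2.0
rho = 1 - eps                   # correlation parameter of T_eps
q = 1 + (p - 1) / rho ** 2      # Bonami-Beckner exponent: rho = sqrt((p-1)/(q-1))

cube = list(itertools.product([-1, 1], repeat=n))
rng = random.Random(0)
f_vals = [rng.uniform(-1, 1) for _ in cube]
Tf = apply_noise(f_vals, cube, eps)
```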
noise and smoothness
Relative entropy: For f : {−1,1}^n → ℝ₊ with E f = 1,

    Ent(f) = E[f log f] = E_{x∼f}[log f(x)]

Gradient:

    E‖∇f‖² = Σ_{i=1}^n E[(f(x ⊕ e_i) − f(x))²]

Log-Sobolev inequality: For every such f : {−1,1}^n → ℝ₊,

    Ent(f) ≤ E‖∇√f‖²

and E‖∇√f‖² is (up to constants) −(∂/∂ε) Ent(T_ε f) |_{ε=0}
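One way to see these quantities interact (an illustrative sketch, not from the slides): relative entropy can only decrease when noise is applied, since T_ε is a Markov operator preserving the uniform measure. The density f below is our own choice:

```python
import itertools
import math

def noise_op(f_vals, cube, eps):
    out = []
    for x in cube:
        acc = 0.0
        for fy, y in zip(f_vals, cube):
            pr = 1.0
            for xi, yi in zip(x, y):
                pr *= (1 - eps / 2) if xi == yi else (eps / 2)
            acc += pr * fy
        out.append(acc)
    return out

def ent(f_vals):
    # Ent(f) = E[f log f] for a density f with E f = 1 (uniform base measure)
    return sum(v * math.log(v) for v in f_vals if v > 0) / len(f_vals)

n = 3
cube = list(itertools.product([-1, 1], repeat=n))
raw = [1.0 + 0.9 * (sum(x) / n) for x in cube]   # a positive function
Z = sum(raw) / len(raw)
f_vals = [v / Z for v in raw]                    # normalize so E f = 1

# entropy along increasing noise: should be nonincreasing, and 0 at eps = 1
ents = [ent(noise_op(f_vals, cube, eps)) for eps in (0.0, 0.25, 0.5, 1.0)]
```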
noise and hypercontractivity
Hypercontractive inequality:
For ε > 0 and p > 1, there is a q > p such that

    ‖T_ε f‖_q ≤ ‖f‖_p

for all f : {−1,1}^n → ℝ.

Log-Sobolev inequality:

    Ent(f) ≤ E‖∇√f‖²

Talagrand (1989): What about smoothing for arbitrary f?
the convolution conjecture
f : {−1,1}^n → ℝ₊ and E f = 1

Markov's inequality: ℙ[f ≥ α] ≤ 1/α
(tight for f = scaled indicator on a set of measure 1/α)

Convolution conjecture [Talagrand 1989, $1,000]:
For every ε > 0, there exists ψ : ℝ₊ → ℝ₊ so that for every f,

    ℙ[T_ε f ≥ α] ≤ ψ(α)/α   and   ψ(α) → 0 as α → ∞

- Best function is probably ψ(α) ∼ 1/√(log α) (achieved for halfspaces)
- Conjecture unresolved for any fixed ε > 0, f = f_n on {−1,1}^n
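The Markov-tight example behaves much better once noise is applied. A small exact computation (illustrative; the choice f = 2^n·1_{x = x₀}, a scaled indicator of a single point, is ours):

```python
import itertools

# f = 2^n * 1_{x = x0}: scaled indicator with E f = 1, tight for Markov.
# T_eps f(x) factorizes: each coordinate matches x0 with prob 1 - eps/2.
n, eps, alpha = 10, 0.5, 10.0
x0 = tuple([1] * n)

def T_f(x):
    val = float(2 ** n)
    for xi, x0i in zip(x, x0):
        val *= (1 - eps / 2) if xi == x0i else (eps / 2)
    return val

cube = list(itertools.product([-1, 1], repeat=n))
tail = sum(1 for x in cube if T_f(x) >= alpha) / len(cube)  # P[T_eps f >= alpha]
markov = 1 / alpha
```

For these parameters the tail is far below the Markov bound 1/α, illustrating the kind of gain the conjecture asks for uniformly in f.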
anti-concentration of temperature
Convolution conjecture [Talagrand 1989]:
For every ε > 0, there exists ψ : ℝ₊ → ℝ₊ so that for every f,

    ℙ[T_ε f ≥ α] ≤ ψ(α)/α   and   ψ(α) → 0 as α → ∞

Equivalent to the conjecture that

    E[ T_ε f · 1_{T_ε f ∈ [α, 2α]} ] ≤ ψ(α)

"Temperature" cannot concentrate at a single (high) level
the Gaussian case
f : ℝ^n → ℝ₊ and E f = ∫ f dγ_n = 1 (Gaussian measure)

Let B_t be an n-dimensional Brownian motion with B₀ = 0.

Brownian semi-group: P_t f(x) = E[f(x + B_t)]

    P₀ = identity map
    P₁ = standard n-dim. Gaussian average
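In one dimension the semigroup can be evaluated by Gauss–Hermite quadrature. A hedged sketch (the quadratic test function is our choice; it has the closed form P_t(x²) = x² + t since E[(x + B_t)²] = x² + t):

```python
import numpy as np

def P_t(f, x, t, deg=40):
    """Heat semigroup P_t f(x) = E[f(x + B_t)], B_t ~ N(0, t), one dimension.

    Gauss-Hermite quadrature; exact for polynomial f of moderate degree.
    """
    nodes, weights = np.polynomial.hermite.hermgauss(deg)
    # hermgauss integrates against exp(-y^2); substituting z = sqrt(2t) y
    # turns the nodes into samples of N(0, t).
    z = np.sqrt(2 * t) * nodes
    return float(np.sum(weights * f(x + z)) / np.sqrt(np.pi))

f = lambda x: x ** 2
x0, t = 1.2, 0.3
val = P_t(f, x0, t)   # closed form: x0^2 + t
```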
the Gaussian case
f : ℝ^n → ℝ₊ and E f = ∫ f dγ_n = 1

Brownian semi-group: P_t f(x) = E[f(x + B_t)]

Gaussian convolution conjecture: For every t < 1, there is a
ψ : ℝ₊ → ℝ₊ so that for every f,

    ℙ[P_{1−t} f(B_t) ≥ α] ≤ ψ(α)/α   and   ψ(α) → 0 as α → ∞

- Special case of discrete cube conjecture
- Previously unknown for any t
- n = 1 is an exercise
- True in any fixed dimension
  [Ball, Barthe, Bednorz, Oleszkiewicz, and Wolff, 2010]
isoperimetric (dual) version
Fix t > 0 and consider a subset S ⊆ ℝ^n.

Consider f : ℝ^n → ℝ₊ supported on S
such that ‖P_t f‖_∞ ≤ 1.

Goal: Maximize ∫ f dγ_n subject to these constraints.

f = 1_S gets ∫ f dγ_n = γ_n(S)

Conjecture [restated]:
One can achieve ∫ f dγ_n ≫ γ_n(S) as γ_n(S) → 0
the Gaussian case
Theorem [Eldan-L 2014]:
If f : ℝ^n → ℝ₊ satisfies E f = 1 and

    ∇² log f(x) ⪰ −β·I_n   for all x ∈ ℝ^n

then for all α ≥ 2:

    ℙ[f ≥ α] ≤ (1/α) · ( C_β log log α / √(log α) )^4
Corollary: If f : ℝ^n → ℝ₊ satisfies E f = 1 then for any
t < 1 and all α ≥ 2,

    ℙ[P_{1−t} f(B_t) ≥ α] ≤ (1/α) · ( C log log α / √((1−t) log α) )^4
the Gaussian case

Fact: For any f with E f = 1,

    ∇² log P_{1−t} f(x) ⪰ −I_n / (1−t)

so the theorem applies with β = 1/(1−t).

Corollary: If f : ℝ^n → ℝ₊ satisfies E f = 1 then for any
t < 1 and all α ≥ 2,

    ℙ[P_{1−t} f(B_t) ≥ α] ≤ (1/α) · ( C log log α / √((1−t) log α) )^4
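A sketch (our reconstruction, not verbatim from the talk) of how the corollary follows from the theorem via the Fact:

```latex
\textbf{Sketch.} Set $g(x) := P_{1-t}f(\sqrt{t}\,x)$, so that
$g(Z) \stackrel{d}{=} P_{1-t}f(B_t)$ for $Z \sim \gamma_n$, and
\[
  \mathbb{E}_{\gamma_n}[g] \;=\; \mathbb{E}\,[P_{1-t}f(B_t)]
  \;=\; \mathbb{E}\,[f(B_1)] \;=\; 1 .
\]
By the Fact (and the chain-rule factor $t \le 1$),
\[
  \nabla^2 \log g(x) \;=\; t\,\nabla^2 \log P_{1-t}f(\sqrt{t}\,x)
  \;\succeq\; -\tfrac{t}{1-t}\, I_n \;\succeq\; -\tfrac{1}{1-t}\, I_n ,
\]
so the theorem applies with $\beta = 1/(1-t)$; tracking the
$\beta$-dependence of $C_\beta$ produces the $\sqrt{(1-t)\log\alpha}$
factor in the corollary.
```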
some difficulties
Corollary: If f : ℝ^n → ℝ₊ satisfies E f = 1 then for any
t < 1 and all α ≥ 2,

    ℙ[P_{1−t} f(B_t) ≥ α] ≤ (1/α) · ( C(t) log log α / √(log α) )^4

What are the difficult functions f?

[figure: half space (Good: noise insensitive; Bad: boundary far from
origin); "dust"]
proof ideas
E f = 1,   ∇² log f(x) ⪰ −β·I_n

M_t = P_{1−t} f(B_t) is a (Doob) martingale

    M₀ = E f = 1
    M₁ = f(B₁)

Goal: ℙ[M₁ > α] ≪ 1/α

arguing about small-probability events = annoying
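A concrete instance of the martingale M_t (our choice, one dimension): for f(x) = exp(λx − λ²/2) one has E_γ f = 1 and the closed form P_{1−t} f(x) = exp(λx − λ²t/2), so M_t = exp(λB_t − λ²t/2) is the classical exponential martingale. A simulated sanity check:

```python
import numpy as np

# Illustrative 1-d example: f(x) = exp(lam*x - lam^2/2), so that
# M_t = P_{1-t} f(B_t) = exp(lam*B_t - lam^2*t/2) with M_0 = 1, M_1 = f(B_1).
rng = np.random.default_rng(0)
lam = 0.5
steps, N = 200, 20_000
dt = 1.0 / steps
t_grid = np.linspace(dt, 1.0, steps)

# simulate N Brownian paths on [0, 1]
B = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(N, steps)), axis=1)
M = np.exp(lam * B - lam ** 2 * t_grid / 2)  # M_t along each path
means = M.mean(axis=0)                       # E[M_t] should stay near M_0 = 1
```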
random measure conditioning

W_t = B_t conditioned on B₁ ∼ f dγ_n

Goal: ℙ[f(B₁) ∈ [α, 2α]] ≪ 1/α

Suffices to prove that:

    ℙ[f(W₁) ∈ [α, 2α]] = o(1)

because

    ℙ[f(W₁) ∈ [α, 2α]] ∼ α · ℙ[f(B₁) ∈ [α, 2α]]
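The equivalence of the two goals is a one-line change of measure (our expansion): W₁ has density f with respect to γ_n, so

```latex
\[
  \mathbb{P}\big[f(W_1)\in[\alpha,2\alpha]\big]
  \;=\; \mathbb{E}\big[f(B_1)\,\mathbf{1}_{\{f(B_1)\in[\alpha,2\alpha]\}}\big]
  \;\in\; [\alpha,\,2\alpha]\cdot
          \mathbb{P}\big[f(B_1)\in[\alpha,2\alpha]\big],
\]
```

which is the asserted comparison up to a factor of 2.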
W_t as an Itô process: Föllmer's drift

Consider a process X_t with X₀ = 0, and

    dX_t = dB_t + v_t dt

where v_t is predictable (deterministic function of {B_s : s ∈ [0, t]})

Integrating: X_t = B_t + ∫₀ᵗ v_s ds = Brownian motion + drift

Among all such drifts satisfying X₁ = B₁ + ∫₀¹ v_t dt ∼ f dγ_n,
let v_t be the one which minimizes

    E ∫₀¹ ‖v_t‖² dt
W_t as an Itô process: Föllmer's drift

Among all such drifts satisfying W₁ = B₁ + ∫₀¹ v_t dt ∼ f dγ_n,
let v_t be the one which minimizes E ∫₀¹ ‖v_t‖² dt.

Lemma: v_t is a martingale.

Explicit form:

    v_t = ∇ log P_{1−t} f(W_t) = ∇P_{1−t} f(W_t) / P_{1−t} f(W_t)

Theorem [Lehec 2010]:

    Ent(f) = (1/2) E ∫₀¹ ‖v_t‖² dt
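An explicit example of the Föllmer drift (our choice, one dimension, f(x) = exp(λx − λ²/2)): then P_{1−t} f(x) = exp(λx − λ²t/2), so v_t = ∇ log P_{1−t} f(W_t) = λ is constant, W₁ = B₁ + λ ∼ N(λ, 1), and Lehec's identity reads Ent(f) = ½ ∫₀¹ λ² dt = λ²/2. A numerical check of both sides:

```python
import numpy as np

# Foellmer drift for f(x) = exp(lam*x - lam^2/2): v_t = lam (constant),
# so the minimal drift energy is (1/2) * int_0^1 lam^2 dt = lam^2 / 2.
lam = 0.7
drift_energy = 0.5 * lam ** 2

# Check Ent(f) = E_gamma[f log f] by Gauss-Hermite quadrature.
nodes, weights = np.polynomial.hermite.hermgauss(80)
z = np.sqrt(2.0) * nodes               # standard Gaussian nodes
fz = np.exp(lam * z - lam ** 2 / 2)    # f at the nodes
ent_f = float(np.sum(weights * fz * (lam * z - lam ** 2 / 2)) / np.sqrt(np.pi))
```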
an optimal coupling
{B_t} n-dim Brownian motion, f : ℝ^n → ℝ₊ with E f = 1

Construct W_t so that W₁ ∼ f dγ_n:

    W₀ = 0
    dW_t = dB_t + v_t dt

    v_t = ∇P_{1−t} f(W_t) / P_{1−t} f(W_t)   is a martingale
an optimal coupling

Multi-granular geometry of f reflected in {v_t}:
For all t ∈ [0,1],

    E‖v_t‖² = ∫ ‖∇P_{1−t} f‖² / P_{1−t} f dμ_t

(μ_t the law of B_t)
proof sketch
Suffices to prove that:

    ℙ[f(W₁) ∈ [α, 2α]] = o(1)

Idea: Suppose that ℙ[f(W₁) ∈ [α, 2α]] ≥ c, then

    ℙ[f(W₁) ∈ [2α, 4α]] ≥ c
    ℙ[f(W₁) ∈ [4α, 8α]] ≥ c
    ⋯   (log α levels)

Making f bigger: We'll use ∇² log f(x) ⪰ −β·I_n:

    log f(W₁ + u) ≥ log f(W₁) + ⟨u, ∇ log f(W₁)⟩ − (β/2)‖u‖²
                  = log f(W₁) + ⟨u, v₁⟩ − (β/2)‖u‖²
proof sketch
Pushing W₁ in the direction of the drift at time t = 1:

    f(W₁ + u) ≥ f(W₁) · exp( ⟨u, v₁⟩ − (β/2)‖u‖² )

Setting u = δv₁ (δ small) multiplies the value of f.

(want to say that W₁ could do it)
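Plugging u = δv₁ into the bound above makes the gain explicit (our expansion):

```latex
\[
  f(W_1 + \delta v_1)
  \;\ge\; f(W_1)\,\exp\!\Big(\delta\|v_1\|^2
          - \tfrac{\beta\delta^2}{2}\|v_1\|^2\Big)
  \;=\; f(W_1)\, e^{\delta\,(1-\beta\delta/2)\,\|v_1\|^2},
\]
```

so for δ < 2/β the push strictly increases f wherever v₁ ≠ 0.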
Girsanov's theorem

Consider any Itô process dX_t = dB_t + v_t dt
(*under suitable conditions)

Let μ be the path measure of B_t and ν be the path measure of X_t.
Then under the change of measure

    dμ/dν = exp( −∫₀¹ ⟨v_t, dB_t⟩ − ½ ∫₀¹ ‖v_t‖² dt )

{X_t : t ∈ [0,1]} has the law of Brownian motion.
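A minimal sanity check of the change of measure (our example: constant drift v, so X₁ = B₁ + v and the density at time 1 is exp(−vB₁ − v²/2)). By quadrature, reweighting X₁ by this density recovers Brownian expectations:

```python
import numpy as np

# Girsanov check at time 1 with constant drift v: for any test function g,
#   E_mu[g(B_1)] = E_nu[g(X_1) * exp(-v*B_1 - v^2/2)],  X_1 = B_1 + v.
nodes, weights = np.polynomial.hermite.hermgauss(80)
z = np.sqrt(2.0) * nodes            # B_1 ~ N(0,1) quadrature nodes
w = weights / np.sqrt(np.pi)

v = 0.8
X1 = z + v                          # X_1 under nu (driving BM increment = z)
density = np.exp(-v * z - v ** 2 / 2)   # dmu/dnu along the path

checks = []
for g in (lambda x: x, lambda x: x ** 2, np.cos):
    lhs = float(np.sum(w * g(z)))              # E_mu[g(B_1)]
    rhs = float(np.sum(w * g(X1) * density))   # E_nu[g(X_1) dmu/dnu]
    checks.append((lhs, rhs))
```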
the masochistic part
dW_t = dB_t + v_t dt,   v_t = ∇ log P_{1−t} f(W_t)

dW_t^δ = dB_t + (1 + δ) v_t dt

Now we can argue that W_t ≈ W_t^δ (Girsanov's theorem).

What about f(W₁^δ) ≫ f(W₁)?

Note that W₁^δ = W₁ + δ ∫₀¹ v_t dt, and recall

    f(W₁ + u) ≥ f(W₁) · exp( ⟨u, v₁⟩ − O(1) )

Big question:
Does ∫₀¹ v_t dt point in the direction of the gradient v₁?
the calculations and concentration
dW_t = dB_t + v_t dt,   v_t = ∇ log P_{1−t} f(W_t)
dW_t^δ = dB_t + (1 + δ) v_t dt

Let ν be the law of W₁ and ν^δ the law of W₁^δ.

Girsanov:

    dν^δ/dν = exp( δ ∫₀¹ ⟨v_t, dW_t⟩ − (δ + δ²/2) ∫₀¹ ‖v_t‖² dt )

Gradient estimate:

    f(W₁^δ) ≥ f(W₁) · exp( δ ⟨∫₀¹ v_t dt, v₁⟩ − O(δ²) ∫₀¹ ‖v_t‖² dt )
the calculations and concentration
Since v_t is a martingale, E[v₁ | ℱ_t] = v_t, so

    E[ δ ⟨∫₀¹ v_t dt, v₁⟩ ] = δ E ∫₀¹ ‖v_t‖² dt

Girsanov:

    dν^δ/dν = exp( δ ∫₀¹ ⟨v_t, dW_t⟩ − (δ + δ²/2) ∫₀¹ ‖v_t‖² dt )

Gradient estimate:

    f(W₁^δ) ≥ f(W₁) · exp( δ ⟨∫₀¹ v_t dt, v₁⟩ − O(δ²) ∫₀¹ ‖v_t‖² dt )
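The first-moment computation in detail (our expansion of the martingale step):

```latex
\[
  \mathbb{E}\Big\langle \int_0^1 v_t\,dt,\; v_1 \Big\rangle
  = \int_0^1 \mathbb{E}\,\langle v_t, v_1\rangle\, dt
  = \int_0^1 \mathbb{E}\,\big\langle v_t,\,
      \mathbb{E}[v_1 \mid \mathcal{F}_t]\big\rangle\, dt
  = \int_0^1 \mathbb{E}\,\|v_t\|^2\, dt .
\]
```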
the calculations and concentration
For δ = j / log α,  j = 1, 2, …, log α:

If ℙ[f(W₁) ∈ [α, 2α]] ≥ c, then

    ℙ[f(W₁) ∈ [2α, 4α]] ≥ c/10
    ℙ[f(W₁) ∈ [4α, 8α]] ≥ c/10  …

Girsanov:

    dν^δ/dν = exp( δ ∫₀¹ ⟨v_t, dW_t⟩ − (δ + δ²/2) ∫₀¹ ‖v_t‖² dt )

Gradient estimate:

    f(W₁^δ) ≥ f(W₁) · exp( δ ⟨∫₀¹ v_t dt, v₁⟩ − O(δ²) ∫₀¹ ‖v_t‖² dt )
conclusion
Process for {−1,1}^n analogous to W_t: Sample coordinates
one by one to have the right marginals conditioned on the past.

Can prove log-Sobolev and Talagrand's entropy-transport inequality
in a few lines based on this. There is an information theory
interpretation: Chain rule ↔ Martingale property.

These proofs use first derivatives of f, while our proof of the
convolution conjecture in Gaussian space uses second derivatives.

Prove for the discrete case? There is a difficulty because the drift
requires additional randomness.

Riemannian setting: Bakry–Émery curvature, gradient flow vs. optimal
transport
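The coordinate-by-coordinate process above rests on the chain rule: the sequential conditional probabilities multiply back to the target mass. A small sanity check on an arbitrary positive density of our choosing:

```python
import itertools

n = 3
cube = list(itertools.product([-1, 1], repeat=n))

# Target measure: a positive density on the cube (our illustrative choice).
raw = {x: 2.0 + x[0] + 0.5 * x[1] * x[2] for x in cube}
total = sum(raw.values())
target = {x: raw[x] / total for x in cube}

def cond_prob(prefix, xi):
    """P[X_i = xi | X_1..X_{i-1} = prefix] under the target measure."""
    num = sum(p for x, p in target.items()
              if x[:len(prefix)] == prefix and x[len(prefix)] == xi)
    den = sum(p for x, p in target.items() if x[:len(prefix)] == prefix)
    return num / den

# Chain rule: sampling coordinates one by one with the right conditional
# marginals reproduces exactly the target probability of every point.
recovered = {}
for x in cube:
    p = 1.0
    for i in range(n):
        p *= cond_prob(x[:i], x[i])
    recovered[x] = p
```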
Kumar-Courtade conjecture
X ∈ {0,1}^n uniformly at random,  Y ∼_ε X

Conjecture: I(b(X) ; Y) is maximized among functions
b : {0,1}^n → {0,1} by b(x) = x₁
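The conjecture can be checked by brute force in small dimension. Below (our illustration) Y is X with each bit flipped independently with probability ε; the dictator achieves the BSC value 1 − h(ε), and for n = 3 it beats majority:

```python
import itertools
import math

n, eps = 3, 0.1
cube = list(itertools.product([0, 1], repeat=n))

def mutual_info(b):
    """I(b(X); Y) in bits, X uniform on {0,1}^n, Y a bitwise eps-flip of X."""
    joint = {}
    for x in cube:
        for y in cube:
            p = 1 / 2 ** n
            for xi, yi in zip(x, y):
                p *= (1 - eps) if xi == yi else eps
            key = (b(x), y)
            joint[key] = joint.get(key, 0.0) + p
    pb, py = {}, {}
    for (bv, y), p in joint.items():
        pb[bv] = pb.get(bv, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (pb[bv] * py[y]))
               for (bv, y), p in joint.items() if p > 0)

def h2(p):
    # binary entropy in bits
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

I_dict = mutual_info(lambda x: x[0])              # dictator b(x) = x_1
I_maj = mutual_info(lambda x: int(sum(x) >= 2))   # majority of 3
```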