Transcript slides

A Faster Algorithm for Linear Programming
and the Maximum Flow Problem I
Yin Tat Lee
(MIT, Simons)
Joint work with Aaron Sidford
THE PROBLEM

Linear Programming
Consider the linear program (LP)
    min_{Ax ≥ b} c^T x
where A is an m × n matrix.
• m is the number of constraints.
• n is the number of variables.
[Figures: n = 2, m = 6 and n = 2, m = ∞.]
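Since n = 2 here, the feasible region is a polygon and the optimum of min c^T x is attained at a vertex. The following self-contained Python check illustrates the problem format only (it is not any algorithm from the talk; the polytope and cost vector are made-up): it enumerates the vertices of a hypothetical m = 6, n = 2 instance by intersecting pairs of constraints.

```python
import itertools
import numpy as np

# Hypothetical instance with n = 2 variables and m = 6 constraints Ax >= b:
# 0 <= x1 <= 1, 0 <= x2 <= 1, x1 + x2 <= 1.5, x1 - x2 >= -1.
A = np.array([[1., 0.], [0., 1.], [-1., 0.], [0., -1.], [-1., -1.], [1., -1.]])
b = np.array([0., 0., -1., -1., -1.5, -1.])
c = np.array([-1., -1.])            # minimize -x1 - x2

best_val, best_x = np.inf, None
for i, j in itertools.combinations(range(len(b)), 2):
    Ap = A[[i, j]]
    if abs(np.linalg.det(Ap)) < 1e-12:
        continue                    # parallel constraints: no vertex here
    x = np.linalg.solve(Ap, b[[i, j]])
    if np.all(A @ x >= b - 1e-9):   # keep only feasible intersections
        if c @ x < best_val:
            best_val, best_x = c @ x, x
# best_val == -1.5, attained at a vertex such as (1, 0.5)
```

Enumerating all pairs costs O(m²) 2×2 solves, which is exactly the kind of exhaustive approach the iterative methods in the next slides avoid.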
Previous Results
• All of them are iterative methods.
• Start with some initial point x.
• While x is not optimal:
  – Improve x.
• Time = (# iter) × (cost per iter)
• This talk focuses on # iter.
We call a step efficient if it
• runs in polynomial time, and
• doesn't use an LP solver.
Previous Results (Selected)
(m = # of constraints, n = # of variables; log(ε⁻¹) factors are omitted)

Year | Author                | # of iter  | Cost per iter        | Efficient steps
1947 | Dantzig               | 2^n        | Pivot                | Yes
1979 | Khachiyan             | n²         | Update the ellipsoid | Yes
1984 | Karmarkar             | m          | Solve linear systems | Yes
1986 | Renegar               | √m         | Solve linear systems | Yes
1989 | Vaidya                | (mn)^{1/4} | Matrix inverse       | Yes
1994 | Nesterov, Nemirovskii | O(√n)      | Compute volume       | No
2013 | Lee, Sidford          | O(√rank)   | Solve linear systems | Yes

Remark: In 2013, Mądry showed how to obtain m^{3/7} iters for certain LPs!
Outline
(The results table above is repeated.)
LP AND CENTER

A general framework
We can solve a linear program by maintaining a center.
• Somehow, get a "center" first.
• Put the cost constraint there and move it.
• Say we can move a θ portion closer each time.
• After O(θ⁻¹) steps, we are done.
Why center?
What if we don't try to maintain a center?
• It is just like the simplex method.
  "It is good now." "Still good." … "Oh, it touches. What to do?"
• Avoid bad decisions by using global information!
A general framework
Formally, we have (say OPT = 0):
• t = 2^2014. Find the center of {c^T x ≤ t, Ax ≥ b}.
• While t is large:
  – t ← t(1 − θ) for some fixed θ > 0.
  – Update the center of {c^T x ≤ t, Ax ≥ b}.
This is called an interior point method.
The initial point is easy:
    min_{(Ax)_i + d ≥ b_i, d ≥ 0} 2^2014 · d + c^T x
A general way to define a center
Let p be a smooth convex function on Ω such that
• p(x) → +∞ as x → ∂Ω.
For example, the standard log barrier:
    p_s(x) = −Σ_i ln((Ax)_i − b_i) = −Σ_i ln(s_i)
Center = argmin_x p(x).
[Figure: barrier function.]
QUALITY OF A CENTER

Rounding
• Assume the center is induced by some barrier function p.
• Look at the ellipsoid E induced by p at the center x.
• Call E a λ-rounding if cE ⊂ Ω ⊂ λcE for some c.

Self-concordant barrier
• p is a λ-self-concordant barrier function for Ω if
  – p is smooth.
  – p gives a λ-rounding.
[Figures: "p is not smooth enough"; "Bad rounding."]

Rounding Algorithm
For a general barrier function p:
• Repeat:
  – Tighten the cost constraint.
  – Maintain the rounding ellipsoid induced by p.
Why ๐œ† iterations?
Why ๐œ† iterations?
Think ๐‘ ๐‘ฅ = โˆ’ ln ๐‘ ๐‘– .
โ€ข Newton Method (Using smoothness)
Given ๐‘ ๐‘ฆ โˆ’ min ๐‘ (๐‘ฅ) < 0.5, we can find the center in ๐‘‚(1)
steps.
x
Why ๐œ† iterations?
Let ๐‘ฆ be the old center. Using the smoothness, we have
๐‘ก๐‘›๐‘’๐‘ค โˆ’ ๐‘ ๐‘‡ ๐‘ฆ 1
๐‘๐‘›๐‘’๐‘ค ๐‘ฆ โˆ’ min ๐‘๐‘›๐‘’๐‘ค ๐‘ฅ โ‰ค
โ‰ค .
๐‘‡
x
๐‘กโˆ’๐‘ ๐‘ฆ
2
Why ๐œ† iterations?
So, we need
๐‘ก๐‘›๐‘’๐‘ค โˆ’ ๐‘ ๐‘‡ ๐‘ฆ 1
โ‰ค .
๐‘‡
๐‘กโˆ’๐‘ ๐‘ฆ
2
It takes ๐‘‚ ๐œ† iters.
Why
๐œ† iterations?
โ€ข We can reduce the gap by 1/ ๐œ†.
Roughly Speaking:
Smoothness + ๐œ† rounding gives
๐œ†log(๐œ– โˆ’1 ) iterations LP solvers.
Quality of the analytic center is arbitrarily bad in n!
• Recall the standard log barrier function
    p_s(x) = −Σ_{i=1}^m ln((Ax)_i − b_i) = −Σ_{i=1}^m ln s_i.
• The center x = argmin_y p(y) is called the analytic center.
Is it tight?
• In practice, it takes ≤ 60 steps.
• Mizuno, Todd, and Ye showed it is "usually" correct on the first step.
• In 2014, Mut and Terlaky showed an example that really takes
  Ω(√m log ε⁻¹) iterations, where m is exponential in n.
UNIVERSAL BARRIER FUNCTION

Universal Barrier Function
Theorem [NN94]: For any convex set Ω ⊂ R^n,
    p(x) = −log vol((Ω − x)°)
is an O(n)-self-concordant barrier function.
A "smaller" set has a larger polar. Hence, p → ∞ as x → ∂Ω.
Note that ∇²p ~ second moment of (Ω − x)°.
Kannan–Lovász–Simonovits Lemma: For any convex set Ω, the second moment matrix
    M(Ω) = ∫_Ω x x^T dx
gives an O(n) rounding of Ω.
The cost of the Universal Barrier
• To get the second moment matrix, you need n^O(1) samples.
• To get 1 sample, you need to do n^O(1) iters of a Markov chain.
• To do 1 iter of the Markov chain, you need to implement a
  separation oracle for (Ω − x)°.
• If Ω = {Ax ≥ b}, one needs to solve an LP.
Hence, one iteration requires solving n^O(1) many LPs.

The problem:
Get an efficient O(n)-self-concordant barrier function.
VOLUMETRIC BARRIER FUNCTION

Volumetric Barrier Function
In 1989, Vaidya showed
    p_v(x) = (1/2) log det(∇² p_s(x)) + (n/m) p_s(x)
where p_s(x) = −Σ_i ln s_i. Why is it volumetric?
It is a (mn)^{1/2} barrier.
[Figures: Volumetric Barrier; Log Barrier.]
Why is the Volumetric Barrier good?
    p_v(x) = (1/2) log det(∇² p_s(x)) + (n/m) p_s(x)
Around y, we have
    p_v(x) ~ −Σ_i [σ_i(S_y⁻¹ A) + n/m] log s_i(x)
where
    σ_i(B) = e_i^T B (B^T B)⁻¹ B^T e_i.
Example: B = [[1, 0], [3, 0], [0, 2]]. Then
    B^T B = [[1² + 3², 0], [0, 2²]] = [[10, 0], [0, 4]],
so σ_1 = 1/10, σ_2 = 9/10, σ_3 = 1.
In general, Σ_i σ_i = n and 0 ≤ σ_i ≤ 1; if the i-th row is repeated, σ_i is halved.
For the [0,1] interval with 0 repeated k times:
    S⁻¹A = (1, 1, …, 1)ᵀ.
[Figure: the comparison picks up a k^{1/3} factor.]
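The leverage scores σ_i above are easy to compute directly. This small numpy check (an illustration, not code from the talk) reproduces the worked example and the two stated properties, including the halving under row repetition:

```python
import numpy as np

def leverage_scores(B):
    # sigma_i(B) = e_i^T B (B^T B)^{-1} B^T e_i: the i-th diagonal entry
    # of the orthogonal projection onto the column space of B.
    P = B @ np.linalg.inv(B.T @ B) @ B.T
    return np.diag(P).copy()

B = np.array([[1., 0.], [3., 0.], [0., 2.]])
sigma = leverage_scores(B)        # -> [0.1, 0.9, 1.0], summing to n = 2

# Repeating the third row halves its leverage score.
B2 = np.vstack([B, [0., 2.]])
sigma2 = leverage_scores(B2)      # last two entries are 0.5 each
```

(For large sparse B one would never form the projection matrix explicitly; this dense formula is only for illustration.)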
OUR BARRIER FUNCTION

Repeated Volumetric Barrier Function
    p^(1)(x) = (1/2) log det(∇² p_s(x)) + (n/m) p_s(x)
How about
    p^(k+1)(x) = (1/2) log det(∇² p^(k)(x)) + (n/m) p^(k)(x)?
Suppose p^(k)(x) = −Σ_i w_i^(k) log s_i. Around y, we have
    p^(k+1)(x) ~ −Σ_i [σ_i(√(W^(k)) S_y⁻¹ A) + (n/m) w_i^(k)] log s_i(x).
So, we have
    w_i^(k+1) = σ_i(√(W^(k)) S_y⁻¹ A) + (n/m) w_i^(k).
We call p^(∞)(x) = −Σ_i w_i^(∞)(x) log s_i, where w_i^(∞) satisfies
    w_i^(∞)(x) = σ_i(√(W^(∞)) S_x⁻¹ A).
What is that?
What is that weight?
• Let τ_i = σ_i(W^{1/2} A)/w_i, where
    w_i^(∞)(x) = σ_i(√(W^(∞)) S_x⁻¹ A).
If τ_i ≤ 1 for all i, the ellipsoid is inside.
The w_i^(∞) represents the John ellipsoid of {‖S_x⁻¹ A h‖_∞ ≤ 1}.
Our Condition (John Ellipsoid): τ_i = 1 if w_i ≠ 0.
Repeated Volumetric Barrier Function
• Recall
    p^(∞)(x) ~ ln det(∇² p^(∞)(x)) ~ ln det(A^T S_x⁻¹ W^(∞) S_x⁻¹ A).
We get
    p^(∞)(x) ~ −ln vol(JohnEllipsoid(Ω ∩ (2x − Ω))).
[Figure: symmetrize, then find the John ellipsoid.]
The barrier function is not perfect!
• The path is only piecewise smooth, because the ellipsoid may not touch every constraint.
• p^(∞) = max_{Σw_i = n, w ≥ 0} ln det(A^T S⁻¹ W S⁻¹ A)
Our Barrier Function
• Standard Log Barrier:
    p_s(x) = −Σ ln s_i
• Volumetric Barrier:
    p_v(x) = (1/2) log det(∇² p_s(x)) + (n/m) p_s(x)
• John Ellipsoid Barrier:
    p^(∞)(x) = max_{Σw_i = n, w ≥ 0} ln det(A^T S⁻¹ W S⁻¹ A)
• Regularized John Ellipsoid Barrier (1):
    max_{w ≥ 0} ln det(A^T S⁻¹ W^{1 − log⁻¹(m)} S⁻¹ A) + (n/m) Σ (ln w_i − w_i)
• Regularized John Ellipsoid Barrier (2):
    max_{w ≥ 0} ln det(A^T S⁻¹ W S⁻¹ A) − (n/m) Σ w_i ln w_i − (n/m) Σ ln s_i
ℓ_p Lewis Weight
We call w the ℓ_p Lewis weight for B if
    w_i = σ_i(W^{1/2 − 1/p} B).
Thanks to Cohen and Peng, we know:
• Let C be an O(d^{max(1, p/2)})-row sample of B according to the w_i; then
    ‖Cz‖_p ~ ‖Bz‖_p for all z.
• Q = B^T W^{1 − 2/p} B is the maximizer of
    −log det Q subject to 𝔼_{x^T Q x ≤ 1} ‖Bx‖_p^p ≤ 1,
i.e., the maximum ellipsoid that is "inside" the polytope.
• For p = ∞, {x^T Q x ≤ 1} is the John ellipsoid for {‖Bx‖_∞ ≤ 1}.
Computing ℓ_p Lewis Weights
• Cohen and Peng showed how to compute it when p < 4.
• The repeated volumetric barrier: w_i^(1) = n/m,
    w_i^(k+1) = σ_i(√(W^(k)) B) + (n/m) w_i^(k).
• After renormalization, w_i^(log m) gives the "ℓ_∞ Lewis weight":
    w_i(x) ~ σ_i(W^{1/2} B).
• Cohen, L., Peng, and Sidford show that, in fact, a similar algorithm finds a constant-approximate ℓ_p Lewis weight for p > 2 in O(1) iterations.
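A minimal numpy sketch of a Cohen–Peng style fixed-point iteration (my illustrative reading of their rule, not code from the talk): since σ_i(W^{1/2−1/p} B) = w_i^{1−2/p} · b_i^T (B^T W^{1−2/p} B)⁻¹ b_i, the Lewis-weight condition w_i = σ_i(W^{1/2−1/p} B) rearranges to the update below, which contracts for p < 4.

```python
import numpy as np

def lewis_weights(B, p, iters=100):
    # Fixed-point iteration for the l_p Lewis weights (contractive for p < 4):
    #   w_i <- ( b_i^T (B^T W^{1-2/p} B)^{-1} b_i )^{p/2}
    m, d = B.shape
    w = np.ones(m)
    for _ in range(iters):
        M = np.linalg.inv(B.T @ (w[:, None] ** (1.0 - 2.0 / p) * B))
        w = np.einsum('ij,jk,ik->i', B, M, B) ** (p / 2.0)  # b_i^T M b_i
    return w

B = np.array([[1., 0.], [3., 0.], [0., 2.], [1., 1.]])
w3 = lewis_weights(B, p=3)

# For p = 2 the exponent 1 - 2/p vanishes, so a single step already
# returns the ordinary leverage scores sigma_i(B).
w2 = lewis_weights(B, p=2, iters=1)
```

At the fixed point, Σ_i w_i equals the rank d and each w_i lies in (0, 1], mirroring the Σ_i σ_i = n property of leverage scores.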
CONCLUSION

Our Barrier
(m = # of constraints, n = # of variables)
Given any polytope {Ax ≥ b}, let
    p(x) = max_{w ≥ 0} ln det(A^T S⁻¹ W S⁻¹ A) − (n/m) Σ w_i ln w_i − (n/m) Σ ln s_i.
Theorem: The barrier function p gives an O(√n log ε⁻¹)-iteration algorithm for LPs of the form
    min_{Ax ≥ b} c^T x.
Algorithm:
• While t is large:
  – Move the cost constraint.
  – Maintain the regularized John ellipsoid.
(m = # of constraints, n = # of variables)
However…
• My goal is to design a general LP algorithm fast enough to beat the best maxflow algorithm!
• We obtained an O(√n log ε⁻¹)-iteration algorithm for
    min_{Ax ≥ b} c^T x,
where each iteration computes (A^T D_k A)⁻¹.