Difference of Convex (DC) Decomposition of Nonconvex Polynomials with Algebraic Techniques
Georgina Hall
Princeton, ORFE
Joint work with
Amir Ali Ahmadi
Princeton, ORFE
7/13/2015
MOPTA 2015
DC Decomposition of Nonconvex Polynomials with Algebraic Techniques
Difference of Convex (DC) programming
• Problems of the form
$\min_x f_0(x)$
s.t. $f_i(x) \le 0,\ i = 1, \dots, m,$
where:
• $f_i(x) := g_i(x) - h_i(x),\ i = 0, \dots, m,$
• $g_i : \mathbb{R}^n \to \mathbb{R}$ and $h_i : \mathbb{R}^n \to \mathbb{R}$ are convex.
Concave-Convex Computational Procedure (CCCP)
• Heuristic for solving DC programming problems.
• Has been used extensively in:
• machine learning (sparse support vector machines (SVM), transductive SVMs,
sparse principal component analysis)
• statistical physics (minimizing Bethe and Kikuchi free energies).
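• Idea:
Input: $k := 0$; initial point $x_0$; $f_i = g_i - h_i$, $i = 0, \dots, m$.
Convexify by linearizing $h$:
$f_i^k(x) = g_i(x) - \left(h_i(x_k) + \nabla h_i(x_k)^T (x - x_k)\right)$,
where $g_i$ is convex and the linearization of $h_i$ is affine, so $f_i^k$ is convex.
Solve convex subproblem: take $x_{k+1}$ to be the solution of
$\min_x f_0^k(x)$
s.t. $f_i^k(x) \le 0,\ i = 1, \dots, m$.
Set $k := k + 1$ and repeat.
[Figure: plot of $f_i(x)$ and its convexification $f_i^k(x)$.]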
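A short derivation, sketched here for the unconstrained case and assuming $h$ is differentiable (true for polynomials), of why a CCCP step cannot increase the objective. The linearization of a convex $h$ lies below $h$, so the convexified function $f^k$ lies above $f$ and touches it at $x_k$:

```latex
\begin{align*}
h(x) &\ge h(x_k) + \nabla h(x_k)^T (x - x_k) \quad \text{for all } x
  && \text{(convexity of } h\text{)} \\
\Rightarrow\quad f(x) &\le f^k(x) \ \text{ for all } x,
  \qquad f(x_k) = f^k(x_k), \\
\Rightarrow\quad f(x_{k+1}) &\le f^k(x_{k+1}) \le f^k(x_k) = f(x_k).
\end{align*}
```

The middle inequality in the last line uses only that $x_{k+1}$ minimizes $f^k$.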
Concave-Convex Computational Procedure (CCCP)
• Toy example: $\min_x f(x)$, where $f(x) := g(x) - h(x)$.
Initial point: $x_0 = 2$.
Convexify $f(x)$ to obtain $f^0(x)$.
Minimize $f^0(x)$ and obtain $x_1$.
Reiterate.
[Figure: iterates $x_0, x_1, x_2, x_3, x_4, \dots, x_\infty$ marked on the $x$-axis.]
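A minimal Python sketch of these iterations. The slide does not state which $f$ it plots, so this sketch assumes the quartic $f(x) = x^4 - 3x^2 + 2x - 2$ that appears as an example later in the deck, with $g(x) = x^4$ and $h(x) = 3x^2 - 2x + 2$. The function names are illustrative. With this choice the convex subproblem has a closed-form solution.

```python
import math

def f(x):
    # Assumed example quartic, written as f = g - h with g = x^4
    return x**4 - 3 * x**2 + 2 * x - 2

def h_prime(x):
    # h(x) = 3x^2 - 2x + 2 is convex; its derivative is 6x - 2
    return 6 * x - 2

def cccp_step(x_k):
    # Minimize g(x) - h'(x_k) * x with g(x) = x^4.
    # Stationarity gives 4 x^3 = h'(x_k), i.e. x = cbrt(h'(x_k) / 4).
    c = h_prime(x_k) / 4
    return math.copysign(abs(c) ** (1 / 3), c)

def cccp(x0, iterations=60):
    # Collect the iterates x_0, x_1, ..., x_iterations.
    xs = [x0]
    for _ in range(iterations):
        xs.append(cccp_step(xs[-1]))
    return xs

iterates = cccp(2.0)  # initial point x_0 = 2, as on the slide
print(iterates[:4], iterates[-1])
```

Starting from $x_0 = 2$, the iterates of this particular example approach the stationary point $x = 1$, which is a local minimizer of $f$ but not necessarily the global one.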
CCCP for nonconvex polynomial optimization problems (1/2)
CCCP relies on input functions being given as a difference of convex
functions.
What if we don’t have access to such a decomposition?
We will consider polynomials in 𝑛 variables and of degree 𝑑.
• Any polynomial can be written as a difference of convex polynomials.
• Proof by Wang, Schwing and Urtasun
• Alternative proof given later in this presentation, as a corollary of a stronger
theorem
CCCP for nonconvex polynomial optimization problems (2/2)
$f(x) = g(x) - h(x)$
• In fact, for any polynomial, there exist infinitely many decompositions.
Example
$f(x) = x^4 - 3x^2 + 2x - 2$
Possible decompositions
$g(x) = x^4$, $h(x) = 3x^2 - 2x + 2$
$g(x) = x^4 + x^2$, $h(x) = 3x^2 + x^2 - 2x + 2$
etc.
Which one would be a natural choice for CCCP?
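To get a feel for why the choice matters, the two decompositions listed above can be run through CCCP side by side. The sketch below is illustrative only: the helper names, the bisection bracket, and the stopping tolerance are assumptions, not part of the talk. Each step solves $g'(x) = h'(x_k)$, which is the stationarity condition of the convex subproblem.

```python
def solve_increasing(fn, target, lo=-10.0, hi=10.0, tol=1e-14):
    # Bisection for fn(x) = target, assuming fn is increasing on [lo, hi].
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fn(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def cccp_iterations(g_prime, h_prime, x0, tol=1e-10, max_iter=1000):
    # Each step minimizes g(x) - h'(x_k) * x, i.e. solves g'(x) = h'(x_k).
    x = x0
    for k in range(1, max_iter + 1):
        x_next = solve_increasing(g_prime, h_prime(x))
        if abs(x_next - x) < tol:
            return k, x_next
        x = x_next
    return max_iter, x

# Decomposition 1: g = x^4,       h = 3x^2 - 2x + 2
n1, x1 = cccp_iterations(lambda x: 4 * x**3, lambda x: 6 * x - 2, x0=2.0)
# Decomposition 2: g = x^4 + x^2, h = 4x^2 - 2x + 2
n2, x2 = cccp_iterations(lambda x: 4 * x**3 + 2 * x, lambda x: 8 * x - 2, x0=2.0)
print(n1, x1, n2, x2)
```

Near the common limit point $x = 1$, the error contracts by roughly $h''(1)/g''(1)$ per step, which is $6/12$ for the first decomposition and $8/14$ for the second, so the decomposition with the flatter $h$ should need fewer iterations here.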
Picking the “best” decomposition (1/2)
Algorithm
Linearize 𝒉 𝒙 around a point 𝑥𝑘 to obtain convexified version of 𝒇(𝒙)
Idea
Pick ℎ 𝑥 such that it is as close as possible to affine
Mathematical translation
Minimize curvature of $h$ ($H_h$ is the Hessian of $h$)
At a point $a$:
$\min_{g,h} \lambda_{\max}(H_h(a))$
s.t. $f = g - h$,
$g, h$ convex
Over a region $\Omega$:
$\min_{g,h} \max_{x \in \Omega} \lambda_{\max}(H_h(x))$
s.t. $f = g - h$,
$g, h$ convex
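As a small worked instance of the curvature criterion, take the example quartic $f(x) = x^4 - 3x^2 + 2x - 2$ from the earlier slide and, purely as a simplifying assumption, restrict $h$ to quadratics with constant second derivative $c$. In one variable $\lambda_{\max}(H_h)$ is just $h''$:

```latex
\begin{align*}
h''(x) &= c \ge 0, \\
g''(x) &= f''(x) + h''(x) = 12x^2 - 6 + c \ge 0 \ \ \forall x
  \iff c \ge 6 .
\end{align*}
```

Within this restricted class the smallest feasible curvature is $c = 6$, which is attained by the first decomposition listed earlier, $h(x) = 3x^2 - 2x + 2$. Allowing $h$ of higher degree enlarges the feasible set, so the optimum at a given point can be smaller.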
Picking the “best” decomposition (2/2)
Theorem: Finding the “best” decomposition of a degree-4 polynomial
over a box is NP-hard.
Proof idea: Reduction from testing convexity of quartic polynomials, which is
NP-hard (Ahmadi, Olshevsky, Parrilo, Tsitsiklis).
The same is likely to hold for the point version, but we have been unable
to prove it.
How can we efficiently find such a decomposition?
Convex relaxations for DC decompositions (1/6)
SOS, DSOS, SDSOS polynomials (Ahmadi, Majumdar)
• Families of nonnegative polynomials.
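Type, characterization, and how membership is tested:
• Sum of squares (sos): $\exists$ polynomials $q_i$ s.t. $p(x) = \sum_i q_i^2(x)$. Testing membership: SDP.
• Scaled diagonally dominant sum of squares (sdsos): $p = \sum_i \alpha_i m_i^2 + \sum_{i,j} \left[(\beta_i^+ m_i + \gamma_j^+ m_j)^2 + (\beta_i^- m_i - \gamma_j^- m_j)^2\right]$, with $m_i, m_j$ monomials and $\alpha_i \ge 0$. Testing membership: SOCP.
• Diagonally dominant sum of squares (dsos): $p = \sum_i \alpha_i m_i^2 + \sum_{i,j} \left[\beta_{i,j}^+ (m_i + m_j)^2 + \beta_{i,j}^- (m_i - m_j)^2\right]$, with $m_i, m_j$ monomials and $\alpha_i, \beta_{i,j}^{+,-} \ge 0$. Testing membership: LP.
Implications: dsos $\Rightarrow$ sdsos $\Rightarrow$ sos.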
Convex relaxations for DC decompositions (2/6)
DSOS-convex, SDSOS-convex, SOS-convex polynomials
$p(x)$ convex $\Leftrightarrow H_p(x) \succeq 0,\ \forall x \Leftrightarrow y^T H_p(x) y \ge 0,\ \forall x, y \in \mathbb{R}^n \Leftarrow y^T H_p(x) y$ sos/sdsos/dsos
Definitions:
• $p$ is dsos-convex if $y^T H_p(x) y$ is dsos. (LP)
• $p$ is sdsos-convex if $y^T H_p(x) y$ is sdsos. (SOCP)
• $p$ is sos-convex if $y^T H_p(x) y$ is sos. (SDP)
Convex relaxations for DC decompositions (3/6)
Comparison of these sets on a parametric family of polynomials:
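$p(x_1, x_2) = 2x_1^4 + 2x_2^4 + a x_1^3 x_2 + b x_1^2 x_2^2 + c x_1 x_2^3$
[Figure: three panels, for $c = -0.5$, $c = 0$, and $c = 1$, showing in the $(a, b)$ plane the regions where $p$ is dsos-convex, sdsos-convex, and sos-convex = convex.]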
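For readers who want to probe this family numerically, here is a rough sampling-based sanity check of convexity. It is not the LP/SOCP/SDP test from the slides: it only evaluates the smallest Hessian eigenvalue at sampled points, so a pass is evidence rather than a certificate. The function names, sample count, and tolerance are illustrative assumptions.

```python
import math

def hessian(a, b, c, x1, x2):
    # Second derivatives of
    # p = 2 x1^4 + 2 x2^4 + a x1^3 x2 + b x1^2 x2^2 + c x1 x2^3
    h11 = 24 * x1**2 + 6 * a * x1 * x2 + 2 * b * x2**2
    h12 = 3 * a * x1**2 + 4 * b * x1 * x2 + 3 * c * x2**2
    h22 = 24 * x2**2 + 6 * c * x1 * x2 + 2 * b * x1**2
    return h11, h12, h22

def min_eig(h11, h12, h22):
    # Smallest eigenvalue of the symmetric 2x2 matrix [[h11, h12], [h12, h22]]
    mean = 0.5 * (h11 + h22)
    return mean - math.hypot(0.5 * (h11 - h22), h12)

def passes_sampled_convexity(a, b, c, samples=720, tol=1e-9):
    # The Hessian entries are homogeneous of degree 2 in x and H(x) = H(-x),
    # so sampling half of the unit circle is enough.
    for k in range(samples):
        theta = math.pi * k / samples
        x1, x2 = math.cos(theta), math.sin(theta)
        if min_eig(*hessian(a, b, c, x1, x2)) < -tol:
            return False
    return True

print(passes_sampled_convexity(0, 0, 0))    # p = 2 x1^4 + 2 x2^4
print(passes_sampled_convexity(0, -20, 0))  # strongly negative b
```

A failed check does prove nonconvexity, since it exhibits a point where the Hessian has a negative eigenvalue.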
Convex relaxations for DC decompositions (4/6)
How to use these concepts to do DC decomposition at a point 𝑎?
Original problem:
$\min \lambda_{\max}(H_h(a))$
s.t. $f = g - h$,
$g, h$ convex
$\Leftrightarrow$
$\min t$
s.t. $H_h(a) \preceq tI$,
$f = g - h$,
$g, h$ convex
Relaxation 1: sos-convex (SDP)
$\min t$
s.t. $H_h(a) \preceq tI$,
$f = g - h$,
$g, h$ sos-convex
Relaxation 2: sdsos-convex (SOCP + “small” SDP)
$\min t$
s.t. $H_h(a) \preceq tI$,
$f = g - h$,
$g, h$ sdsos-convex
Relaxation 3: dsos-convex (LP + “small” SDP)
$\min t$
s.t. $H_h(a) \preceq tI$,
$f = g - h$,
$g, h$ dsos-convex
Relaxation 4: sdsos-convex + sdd (SOCP)
$\min t$
s.t. $tI - H_h(a)$ sdd (**),
$f = g - h$,
$g, h$ sdsos-convex
Relaxation 5: dsos-convex + dd (LP)
$\min t$
s.t. $tI - H_h(a)$ dd (*),
$f = g - h$,
$g, h$ dsos-convex
(*) $Q$ is diagonally dominant (dd) $\Leftrightarrow \sum_{j \ne i} |q_{ij}| < q_{ii},\ \forall i$
(**) $Q$ is sdd $\Leftrightarrow \exists D > 0$ diagonal, s.t. $DQD$ is dd.
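The two matrix conditions are easy to check by hand on small examples. The sketch below reads the dd condition in the usual way, as $\sum_{j \ne i} |q_{ij}| < q_{ii}$ for every row $i$, and verifies sdd by exhibiting one positive diagonal scaling. The matrix `Q` and scaling `d` are hypothetical values chosen for illustration, not data from the talk.

```python
def is_dd(Q):
    # Row diagonal dominance: sum_{j != i} |q_ij| < q_ii for every row i.
    return all(
        sum(abs(q) for j, q in enumerate(row) if j != i) < row[i]
        for i, row in enumerate(Q)
    )

def scale(Q, d):
    # Return D Q D for the diagonal matrix D = diag(d).
    n = len(Q)
    return [[d[i] * Q[i][j] * d[j] for j in range(n)] for i in range(n)]

Q = [[1.0, 2.0], [2.0, 5.0]]  # hypothetical example: not dd (2 > 1 in row 0)
d = [2.25, 1.0]               # hypothetical positive scaling

print(is_dd(Q))            # dd test on Q itself
print(is_dd(scale(Q, d)))  # dd test on D Q D, which certifies that Q is sdd
```

Here $DQD$ has rows $(5.0625, 4.5)$ and $(4.5, 5)$, so $Q$ is sdd without being dd, which illustrates why the sdd-based relaxation is less restrictive than the dd-based one.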
Convex relaxations for DC decompositions (5/6)
Can any polynomial be written as the difference of two dsos/sdsos/sos
convex polynomials?
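Lemma about cones: Let $K \subseteq E$ be a full-dimensional cone ($E$ any vector
space). Then any $v \in E$ can be written as $v = k_1 - k_2$, with $k_1, k_2 \in K$.
Proof sketch:
[Figure: cone $K$ inside $E$, with a point $k$ in the interior of $K$, a point $v$ outside $K$, and $k'$ on the segment between them.]
For $k$ in the interior of $K$, $\exists\, \alpha \in (0, 1)$ such that $k' := (1 - \alpha) v + \alpha k \in K$
$\Leftrightarrow v = \frac{1}{1-\alpha} k' - \frac{\alpha}{1-\alpha} k$,
with $k_1 := \frac{1}{1-\alpha} k' \in K$ and $k_2 := \frac{\alpha}{1-\alpha} k \in K$.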
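The construction in the proof sketch can be traced on a concrete cone. The sketch below assumes, purely for illustration, that $K$ is the nonnegative orthant in $\mathbb{R}^2$, and it uses exact rational arithmetic so the identity $v = k_1 - k_2$ holds without rounding. The vectors and the value of $\alpha$ are hypothetical.

```python
from fractions import Fraction

def in_orthant(u):
    # K is taken to be the nonnegative orthant, a full-dimensional cone.
    return all(c >= 0 for c in u)

def decompose(v, k, alpha):
    # k' = (1 - alpha) v + alpha k must land in K for the construction to work.
    k_prime = [(1 - alpha) * vi + alpha * ki for vi, ki in zip(v, k)]
    if not in_orthant(k_prime):
        raise ValueError("alpha too small: k' is outside K")
    k1 = [c / (1 - alpha) for c in k_prime]   # k1 = k' / (1 - alpha)
    k2 = [alpha * c / (1 - alpha) for c in k]  # k2 = alpha k / (1 - alpha)
    return k1, k2

v = [Fraction(-3), Fraction(2)]  # a vector outside K
k = [Fraction(1), Fraction(1)]   # an interior point of K
k1, k2 = decompose(v, k, Fraction(9, 10))
print(k1, k2)
```

With these values $k' = (3/5, 11/10)$, so $k_1 = (6, 11)$ and $k_2 = (9, 9)$, and indeed $k_1 - k_2 = (-3, 2) = v$.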
Convex relaxations for DC decompositions (6/6)
Theorem: Any polynomial can be written as the difference of two dsos-convex polynomials.
Corollary: Same holds for sdsos-convex, sos-convex and convex.
Proof idea:
• Need to show that the dsos-convex polynomials form a full-dimensional cone.
• “Obvious” choices (i.e., $p(x) = (\sum_i x_i^2)^{d/2}$) do not work.
Induction on $n$: for $n = 2$, take
$p(x_1, x_2) = a_0 x_1^d + a_1 x_1^{d-2} x_2^2 + \dots + a_{d/4} x_1^{d/2} x_2^{d/2} + \dots + a_1 x_1^2 x_2^{d-2} + a_0 x_2^d$
with
$a_1 = 1$,
$a_{k+1} = \frac{d - 2k}{2k + 2} a_k,\ k = 1, \dots, \frac{d}{4} - 1$,
$a_0 > \frac{2(d-2)}{d(d-1)} + \frac{d}{4(d-1)} a_{d/4}$.
Comparing the different relaxations (1/4)
• Impact of relaxations on solving
$\min_{t,g,h} t$
s.t. $tI - H_h(a)$ psd/sdd/dd,
$f = g - h$,
$g, h$ sos/sdsos/dsos-convex
for random $f$ ($d = 4$).

Type of relaxation          n = 6                  n = 10                 n = 16
                            Time (s)   Opt value   Time (s)   Opt value   Time (s)   Opt value
dsos-convex + dd            1.05       17578.54    2.79       21191.55    20.80      168327.89
dsos-convex + psd           1.19       15855.77    3.19       19426.13    25.36      146847.73
sdsos-convex + sdd          1.21       1089.41     5.17       1962.64     34.66      7936.57
sdsos-convex + psd          1.21       1069.79     5.29       1957.03     39.43      7935.72
sos-convex + psd (MOSEK)    2.02       193.07      93.74      317.63      +∞         ---
sos-convex + psd (SEDUMI)   11.48      193.06      10324.12   317.63      +∞         ---

Computer: 8 GB RAM, 2.40 GHz processor
Comparing the different relaxations (2/4)
• Iterative decomposition algorithm implemented for unconstrained $f$:
Decompose $f = g - h$, using one of the relaxations at point $x_k$.
Minimize convexified $f^k$, using an SDP subroutine [Lasserre; de Klerk and Laurent].
Repeat.
[Figure: bar chart of the objective value reached with each relaxation (DSOS DD, DSOS PSD, SDSOS SDD, SDSOS PSD, SOS PSD); vertical axis from 0 down to -250000.]
• Value of the objective after 3 mins.
• Algorithm given above.
• 5 different relaxations used
• $f$ random with $n = 9$, $d = 4$
• Average over 25 iterations
• Solver: Mosek
Comparing the different relaxations (3/4)
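• Constrained case: $\min_{x \in B} f(x)$, where $B = \{x \mid \sum_i x_i^2 \le R^2\}$.
Single decomposition vs. iterative decomposition vs. one min-max decomposition
Single decomposition: decompose $f = g - h$ once at $x_0$, then minimize convexified $f^k$.
Relaxation: $\min t$ s.t. $H_h(a) \preceq tI$, $f = g - h$, $g, h$ sdsos-convex.
Iterative decomposition: decompose $f = g - h$ at a point $x_k$, then minimize convexified $f^k$.
Relaxation: $\min t$ s.t. $H_h(a) \preceq tI$, $f = g - h$, $g, h$ sdsos-convex.
One min-max decomposition: decompose $f = g - h$ over $B$, then minimize convexified $f^k$.
What relaxation to use?
Original problem (with $\Omega = B$):
$\min_{g,h} \max_{x \in \Omega} \lambda_{\max}(H_h(x))$
s.t. $f = g - h$,
$g, h$ convex
Equivalent formulation:
$\min_{t,g,h} t$
s.t. $x \in B \Rightarrow tI - H_h(x) \succeq 0$,
$f = g - h$,
$g, h$ convex
First relaxation:
$\min_{t,g,h} t$
s.t. $x \in B \Rightarrow tI - H_h(x) \succeq 0$,
$f = g - h$,
$g, h$ sdsos-convex
Second relaxation:
$\min_{t,g,h} t$
s.t. $tI - H_h(x) - (R^2 - \sum_i x_i^2)\, \tau(x) \succeq 0$,
$y^T \tau(x) y$ sos,
$f = g - h$,
$g, h$ sdsos-convex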
Comparing the different relaxations (4/4)
• Constrained case: single decomposition vs. iterative decomposition
vs. min-max decomposition
[Figure: bar chart of the objective value reached by each method (Single decomp, Iter decomp, Min max); vertical axis from 4000 down to -16000.]
• Value of the objective after 3 mins.
• Algorithms described above.
• $f$ random with $n = 10$, $d = 4$
• Radius $R$ random integer between 100 and 400.
• Average over 200 iterations
Main messages
• To apply CCCP to polynomial optimization, a DC decomposition is
needed. Choice of decomposition impacts convergence speed.
• Not computationally tractable to find “best” decomposition.
• Efficient convex relaxations based on the concepts of dsos-convex
(LP), sdsos-convex (SOCP), and sos-convex (SDP) polynomials.
• Dsos-convex and sdsos-convex scale to a larger number of variables.
Thank you for listening
Questions?