Difference of Convex (DC) Decomposition of Nonconvex Polynomials with Algebraic Techniques

Georgina Hall (Princeton, ORFE)
Joint work with Amir Ali Ahmadi (Princeton, ORFE)
MOPTA 2015, 7/13/2015
DC Decomposition of Nonconvex Polynomials with Algebraic Techniques
Difference of Convex (DC) programming
• Problems of the form
    $\min_x f_0(x)$
    s.t. $f_i(x) \le 0, \quad i = 1, \dots, m,$
  where:
  • $f_i(x) := g_i(x) - h_i(x)$, $i = 0, \dots, m$,
  • $g_i : \mathbb{R}^n \to \mathbb{R}$ and $h_i : \mathbb{R}^n \to \mathbb{R}$ are convex.
Concave-Convex Computational Procedure (CCCP)
• Heuristic for solving DC programming problems.
• Has been used extensively in:
  • machine learning (sparse support vector machines (SVMs), transductive SVMs, sparse principal component analysis),
  • statistical physics (minimizing Bethe and Kikuchi free energies).
• Idea:
  • Input: $k := 0$; an initial point $x_0$; decompositions $f_i = g_i - h_i$, $i = 0, \dots, m$.
  • Convexify by linearizing $h$:
      $f_i^k(x) := g_i(x) - \big( h_i(x_k) + \nabla h_i(x_k)^T (x - x_k) \big)$.
    Here $g_i$ is convex and the linearization of $h_i$ is affine, so $f_i^k$ is convex.
  • Solve the convex subproblem: take $x_{k+1}$ to be the solution of
      $\min_x f_0^k(x)$
      s.t. $f_i^k(x) \le 0, \quad i = 1, \dots, m$.
  • Set $k := k + 1$ and convexify again.
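The CCCP loop can be sketched in a few lines for an unconstrained one-variable problem. This is a minimal illustration, not the authors' implementation. It assumes $f(x) = x^4 - 3x^2 + 2x - 2$ with the decomposition $g(x) = x^4$, $h(x) = 3x^2 - 2x + 2$, and the starting point $x_0 = 2$; the function name `cccp_1d` is hypothetical. With this choice the convex subproblem has a closed-form minimizer.

```python
def f(x):
    """Objective f = g - h with g(x) = x**4 and h(x) = 3*x**2 - 2*x + 2."""
    return x**4 - 3 * x**2 + 2 * x - 2


def cccp_1d(x0, num_iters=60):
    """Run CCCP on f, linearizing h at the current iterate (assumed setup)."""

    def h_prime(x):
        return 6.0 * x - 2.0

    def real_cbrt(v):
        # Real cube root that also handles negative arguments.
        return v ** (1.0 / 3.0) if v >= 0 else -((-v) ** (1.0 / 3.0))

    xs = [x0]
    for _ in range(num_iters):
        slope = h_prime(xs[-1])
        # Convexified objective: x**4 - slope * x + constant.
        # It is strictly convex; its minimizer solves 4 * x**3 = slope.
        xs.append(real_cbrt(slope / 4.0))
    return xs


iterates = cccp_1d(x0=2.0)
values = [f(x) for x in iterates]
print(iterates[:4], iterates[-1])
```

In this sketch the iterates decrease toward $x = 1$, where $f'(1) = 0$. That point is a local minimizer, not the global one, which is consistent with CCCP being a heuristic.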
Concave-Convex Computational Procedure (CCCP)
• Toy example: $\min_x f(x)$, where $f(x) := g(x) - h(x)$.
  • Initial point: $x_0 = 2$.
  • Convexify $f(x)$ to obtain $f^0(x)$.
  • Minimize $f^0(x)$ and obtain $x_1$.
  • Reiterate.
[Figure: the iterates $x_0, x_1, x_2, x_3, x_4, \dots$ approaching $x_\infty$.]
CCCP for nonconvex polynomial optimization problems (1/2)
CCCP relies on the input functions being given as differences of convex functions.
What if we don't have access to such a decomposition?
We will consider polynomials in $n$ variables and of degree $d$.
• Any polynomial can be written as a difference of convex polynomials.
  • Proof by Wang, Schwing and Urtasun.
  • An alternative proof is given later in this presentation, as a corollary of a stronger theorem.
CCCP for nonconvex polynomial optimization problems (2/2)
$f(x) = g(x) - h(x)$
• In fact, any polynomial admits infinitely many such decompositions.
Example: $f(x) = x^4 - 3x^2 + 2x - 2$
Possible decompositions:
• $g(x) = x^4$, $h(x) = 3x^2 - 2x + 2$
• $g(x) = x^4 + x^2$, $h(x) = 3x^2 + x^2 - 2x + 2$
• etc.
Which one would be a natural choice for CCCP?
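To see that the choice matters, the two decompositions can be run through CCCP side by side. The sketch below is only an illustration under stated assumptions: it parametrizes the family as $g_c(x) = x^4 + c x^2$, $h_c(x) = (3 + c) x^2 - 2x + 2$ (so $c = 0$ and $c = 1$ give the two decompositions listed), starts from $x_0 = 2$, and solves each convex subproblem by bisection. The helper names `solve_increasing` and `cccp_iterations` are hypothetical.

```python
def solve_increasing(phi, rhs, lo, hi, steps=200):
    """Bisection for phi(x) = rhs, assuming phi is increasing on [lo, hi]."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if phi(mid) < rhs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cccp_iterations(c, x0=2.0, tol=1e-8, max_iters=1000):
    """Count CCCP steps for g = x^4 + c*x^2, h = (3 + c)*x^2 - 2*x + 2, c >= 0."""

    def g_prime(x):
        return 4 * x**3 + 2 * c * x

    def h_prime(x):
        return 2 * (3 + c) * x - 2

    x = x0
    for k in range(1, max_iters + 1):
        slope = h_prime(x)
        # The convexified objective g(x) - slope * x is minimized where
        # g'(x) = slope; [-bound, bound] always brackets that root.
        bound = max(1.0, abs(slope))
        x_next = solve_increasing(g_prime, slope, -bound, bound)
        if abs(x_next - x) < tol:
            return k, x_next
        x = x_next
    return max_iters, x


steps_c0, limit_c0 = cccp_iterations(c=0.0)
steps_c1, limit_c1 = cccp_iterations(c=1.0)
print(steps_c0, steps_c1)
```

Both runs approach the same stationary point $x = 1$, but the decomposition with the extra $x^2$ term in $h$ needs more iterations in this setup, which matches the intuition that added curvature in $h$ makes the linearization a worse model of $f$.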
Picking the “best” decomposition (1/2)
Algorithm: linearize $h(x)$ around a point $x_k$ to obtain a convexified version of $f(x)$.
Idea: pick $h(x)$ such that it is as close as possible to affine.
Mathematical translation: minimize the curvature of $h$ ($H_h$ is the Hessian of $h$).
• At a point $a$:
    $\min_{g,h} \lambda_{\max}(H_h(a))$
    s.t. $f = g - h$,
         $g, h$ convex.
• Over a region $\Omega$:
    $\min_{g,h} \max_{x \in \Omega} \lambda_{\max}(H_h(x))$
    s.t. $f = g - h$,
         $g, h$ convex.
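In one variable the point version is easy to explore by hand, since $\lambda_{\max}(H_h(a))$ is just $h''(a)$. The check below is a hand-derived illustration, not taken from the slides: for $f(x) = x^4 - 3x^2 + 2x - 2$ and $a = 2$, it assumes the candidate $h(x) = \tfrac{1}{7}(x - 2)^4$ with $g = f + h$, and verifies numerically that both are convex while $h''(2) = 0$. For comparison, the decomposition $h(x) = 3x^2 - 2x + 2$ has $h''(2) = 6$.

```python
def f_second(x):
    """Second derivative of f(x) = x^4 - 3x^2 + 2x - 2."""
    return 12 * x**2 - 6


def h_second(x):
    """Second derivative of the assumed h(x) = (x - 2)^4 / 7."""
    return (12 / 7) * (x - 2) ** 2


def g_second(x):
    """Second derivative of g = f + h; algebraically (6/7) * (4x - 1)^2."""
    return f_second(x) + h_second(x)


# Numerical convexity check on a grid over [-5, 5].
grid = [i / 100 for i in range(-500, 501)]
min_g_second = min(g_second(x) for x in grid)
min_h_second = min(h_second(x) for x in grid)

curvature_at_a = h_second(2.0)   # curvature of the assumed h at a = 2
naive_curvature = 6.0            # h''(x) for h(x) = 3x^2 - 2x + 2
print(curvature_at_a, naive_curvature, min_g_second)
```

The grid check is only a sanity test; the closed form $g''(x) = \tfrac{6}{7}(4x - 1)^2 \ge 0$ in the docstring is what actually shows convexity of $g$.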
Picking the “best” decomposition (2/2)
Theorem: Finding the “best” decomposition of a degree-4 polynomial over a box is NP-hard.
Proof idea: Reduction from the problem of testing convexity of quartic polynomials, which is NP-hard (Ahmadi, Olshevsky, Parrilo, Tsitsiklis).
The same is likely to hold for the point version, but we have been unable to prove it.
How can we efficiently find such a decomposition?
Convex relaxations for DC decompositions (1/6)
SOS, DSOS, SDSOS polynomials (Ahmadi, Majumdar)
• Families of nonnegative polynomials, with dsos ⇒ sdsos ⇒ sos.

Type | Characterization | Testing membership
Sum of squares (sos) | $\exists\, q_i$ polynomials s.t. $p(x) = \sum_i q_i^2(x)$ | SDP
Scaled diagonally dominant sum of squares (sdsos) | $p = \sum_i \alpha_i m_i^2 + \sum_{i,j} \big[ (\beta_i^+ m_i + \gamma_j^+ m_j)^2 + (\beta_i^- m_i - \gamma_j^- m_j)^2 \big]$, with $m_i, m_j$ monomials and $\alpha_i \ge 0$ | SOCP
Diagonally dominant sum of squares (dsos) | $p = \sum_i \alpha_i m_i^2 + \sum_{i,j} \big[ \beta_{i,j}^+ (m_i + m_j)^2 + \beta_{i,j}^- (m_i - m_j)^2 \big]$, with $m_i, m_j$ monomials and $\alpha_i, \beta_{i,j}^{+}, \beta_{i,j}^{-} \ge 0$ | LP
Convex relaxations for DC decompositions (2/6)
DSOS-convex, SDSOS-convex, SOS-convex polynomials
$p(x)$ convex $\iff H_p(x) \succeq 0 \ \ \forall x \iff y^T H_p(x)\, y \ge 0 \ \ \forall x, y \in \mathbb{R}^n \Longleftarrow y^T H_p(x)\, y$ is sos/sdsos/dsos.
Definitions:
• $p$ is dsos-convex if $y^T H_p(x)\, y$ is dsos (LP).
• $p$ is sdsos-convex if $y^T H_p(x)\, y$ is sdsos (SOCP).
• $p$ is sos-convex if $y^T H_p(x)\, y$ is sos (SDP).
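A small worked instance may help fix the definitions. The polynomial $p(x_1, x_2) = x_1^4 + x_2^4$ is chosen here purely for illustration and does not appear in the slides.

```latex
\begin{align*}
p(x_1, x_2) &= x_1^4 + x_2^4, \\
H_p(x) &= \begin{pmatrix} 12 x_1^2 & 0 \\ 0 & 12 x_2^2 \end{pmatrix}, \\
y^T H_p(x)\, y &= 12\,(x_1 y_1)^2 + 12\,(x_2 y_2)^2 .
\end{align*}
```

The last expression has the form $\sum_i \alpha_i m_i^2$ with monomials $m_1 = x_1 y_1$, $m_2 = x_2 y_2$ and $\alpha_1 = \alpha_2 = 12 \ge 0$, so it is dsos. Hence this $p$ is dsos-convex, and therefore also sdsos-convex, sos-convex, and convex.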
Convex relaxations for DC decompositions (3/6)
Comparison of these sets on a parametric family of polynomials:
$p(x_1, x_2) = 2x_1^4 + 2x_2^4 + a x_1^3 x_2 + b x_1^2 x_2^2 + c x_1 x_2^3$
[Figure: three panels, for $c = -0.5$, $c = 0$ and $c = 1$, with axes $a$ and $b$, showing the regions where $p$ is dsos-convex, sdsos-convex, and sos-convex (= convex).]
Convex relaxations for DC decompositions (4/6)
How to use these concepts to do DC decomposition at a point $a$?
Original problem:
    $\min_{g,h} \lambda_{\max}(H_h(a))$  s.t. $f = g - h$; $g, h$ convex
  $\iff$
    $\min_{t,g,h} t$  s.t. $H_h(a) \preceq tI$; $f = g - h$; $g, h$ convex.
Relaxations (each keeps the objective $\min t$ and the constraint $f = g - h$):
• Relaxation 1 (sos-convex): $H_h(a) \preceq tI$; $g, h$ sos-convex. Problem class: SDP.
• Relaxation 2 (sdsos-convex): $H_h(a) \preceq tI$; $g, h$ sdsos-convex. Problem class: SOCP + “small” SDP.
• Relaxation 3 (dsos-convex): $H_h(a) \preceq tI$; $g, h$ dsos-convex. Problem class: LP + “small” SDP.
• Relaxation 4 (sdsos-convex + sdd): $tI - H_h(a)$ sdd (**); $g, h$ sdsos-convex. Problem class: SOCP.
• Relaxation 5 (dsos-convex + dd): $tI - H_h(a)$ dd (*); $g, h$ dsos-convex. Problem class: LP.
(*) $Q$ is diagonally dominant (dd) $\iff q_{ii} \ge \sum_{j \ne i} |q_{ij}|$ for all $i$.
(**) $Q$ is scaled diagonally dominant (sdd) $\iff \exists\, D > 0$ diagonal such that $DQD$ is dd.
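The dd and sdd conditions are simple enough to check directly. The sketch below uses the standard definition $q_{ii} \ge \sum_{j \ne i} |q_{ij}|$; the function names and the example matrix are hypothetical. It also computes the smallest $t$ for which $tI - H$ is dd, which by the Gershgorin circle theorem is an upper bound on $\lambda_{\max}(H)$. This is why replacing the psd constraint by a dd constraint yields a relaxation that can only increase the optimal $t$.

```python
import math


def is_dd(Q):
    """True if the symmetric matrix Q (list of rows) is diagonally dominant."""
    n = len(Q)
    return all(
        Q[i][i] >= sum(abs(Q[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )


def scale(Q, d):
    """Return D Q D for the diagonal matrix D = diag(d)."""
    n = len(Q)
    return [[d[i] * Q[i][j] * d[j] for j in range(n)] for i in range(n)]


def smallest_dd_shift(H):
    """Smallest t such that t*I - H is diagonally dominant."""
    n = len(H)
    return max(
        H[i][i] + sum(abs(H[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )


Q = [[1.0, 2.0], [2.0, 5.0]]

# Q itself is not dd, but D Q D with D = diag(2, 1) is, so Q is sdd.
q_is_dd = is_dd(Q)
q_scaled_is_dd = is_dd(scale(Q, [2.0, 1.0]))

# Largest eigenvalue of a symmetric 2x2 matrix in closed form.
mean = 0.5 * (Q[0][0] + Q[1][1])
lambda_max = mean + math.sqrt((0.5 * (Q[0][0] - Q[1][1])) ** 2 + Q[0][1] ** 2)
dd_bound = smallest_dd_shift(Q)
print(q_is_dd, q_scaled_is_dd, lambda_max, dd_bound)
```

In this example the dd bound is strictly larger than $\lambda_{\max}$, which illustrates the gap a cheaper LP-based relaxation can introduce.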
Convex relaxations for DC decompositions (5/6)
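Can any polynomial be written as the difference of two dsos/sdsos/sos-convex polynomials?
Lemma about cones: Let $K \subseteq E$ be a full-dimensional cone ($E$ any vector space). Then any $v \in E$ can be written as $v = k_1 - k_2$, with $k_1, k_2 \in K$.
Proof sketch: take $k$ in the interior of $K$. Then $\exists\, \alpha$ with $0 < \alpha < 1$ such that $k' := (1 - \alpha) v + \alpha k \in K$
$\iff v = \frac{1}{1 - \alpha} k' - \frac{\alpha}{1 - \alpha} k$, where $k_1 := \frac{1}{1 - \alpha} k' \in K$ and $k_2 := \frac{\alpha}{1 - \alpha} k \in K$.
[Figure: the cone $K$ inside $E$, with $k$ and $k'$ inside $K$ and $v$ outside it.]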
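The construction in the proof sketch can be traced on a concrete cone. The example below is an illustration only: it takes $K$ to be the nonnegative orthant in $\mathbb{R}^3$ (not the cone of dsos-convex polynomials), uses the all-ones vector as the interior point $k$, and picks one particular valid $\alpha$. The function name `split_in_orthant` is hypothetical. Exact rational arithmetic avoids rounding issues on the boundary of the cone.

```python
from fractions import Fraction


def split_in_orthant(v):
    """Write v as k1 - k2 with k1, k2 in the nonnegative orthant."""
    k = [Fraction(1)] * len(v)                 # interior point of the cone
    m = max([Fraction(0)] + [-vi for vi in v])  # largest violation of v >= 0
    alpha = m / (1 + m)                         # one choice with 0 <= alpha < 1
    # Pull v toward k until the combination lands in the cone.
    k_prime = [(1 - alpha) * vi + alpha * ki for vi, ki in zip(v, k)]
    k1 = [kp / (1 - alpha) for kp in k_prime]
    k2 = [alpha / (1 - alpha) * ki for ki in k]
    return alpha, k_prime, k1, k2


v = [Fraction(3), Fraction(-2), Fraction(1, 2)]
alpha, k_prime, k1, k2 = split_in_orthant(v)
difference = [a - b for a, b in zip(k1, k2)]
print(alpha, k1, k2)
```

Any $\alpha$ closer to 1 would work as well; the one used here is simply the smallest value that brings the combination into this particular cone.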
Convex relaxations for DC decompositions (6/6)
Theorem: Any polynomial can be written as the difference of two dsos-convex polynomials.
Corollary: The same holds for sdsos-convex, sos-convex and convex polynomials.
Proof idea:
• Need to show that the set of dsos-convex polynomials is a full-dimensional cone.
• “Obvious” choices (i.e., $p(x) = (\sum_i x_i^2)^{d/2}$) do not work.
• Induction on $n$: for $n = 2$, take
    $p(x_1, x_2) = a_0 x_1^d + a_1 x_1^{d-2} x_2^2 + \dots + a_{d/4} x_1^{d/2} x_2^{d/2} + \dots + a_1 x_1^2 x_2^{d-2} + a_0 x_2^d$,
  with
    $a_1 = 1$,
    $a_{k+1} = \frac{d - 2k}{2k + 2}\, a_k, \quad k = 1, \dots, \frac{d}{4} - 1$,
    $a_0 > \frac{2(d - 2)}{d(d - 1)} + \frac{d}{4(d - 1)}\, a_{d/4}$.
Comparing the different relaxations (1/4)
• Impact of relaxations on solving
    $\min_{t,g,h} t$
    s.t. $tI - H_h(a)$ psd/sdd/dd,
         $f = g - h$,
         $g, h$ s/d/sos-convex
  for random $f$ ($d = 4$).

Type of relaxation | n = 6: Time (s) | n = 6: Opt value | n = 10: Time (s) | n = 10: Opt value | n = 16: Time (s) | n = 16: Opt value
dsos-convex + dd | 1.05 | 17578.54 | 2.79 | 21191.55 | 20.80 | 168327.89
dsos-convex + psd | 1.19 | 15855.77 | 3.19 | 19426.13 | 25.36 | 146847.73
sdsos-convex + sdd | 1.21 | 1089.41 | 5.17 | 1962.64 | 34.66 | 7936.57
sdsos-convex + psd | 1.21 | 1069.79 | 5.29 | 1957.03 | 39.43 | 7935.72
sos-convex + psd (MOSEK) | 2.02 | 193.07 | 93.74 | 317.63 | +∞ | ---
sos-convex + psd (SEDUMI) | 11.48 | 193.06 | 10324.12 | 317.63 | +∞ | ---

Computer: 8 GB RAM, 2.40 GHz processor.
Comparing the different relaxations (2/4)
• Iterative decomposition algorithm implemented for unconstrained $f$:
  • Decompose $f = g - h$, using one of the relaxations, at the point $x_k$.
  • Minimize the convexified $f^k$, using an SDP subroutine [Lasserre; de Klerk and Laurent].
[Figure: bar chart of the objective value for the relaxations DSOS DD, DSOS PSD, SDSOS SDD, SDSOS PSD and SOS PSD; vertical axis from 0 to -250000.]
• Value of the objective after 3 mins.
• Algorithm given above.
• 5 different relaxations used.
• $f$ random with $n = 9$, $d = 4$.
• Average over 25 iterations.
• Solver: Mosek.
Comparing the different relaxations (3/4)
• Constrained case: $\min_{x \in B} f(x)$, where $B = \{ x \mid \sum_i x_i^2 \le R^2 \}$.
• Single decomposition vs. iterative decomposition vs. one min-max decomposition:
  • Single decomposition: decompose $f = g - h$ once, at $x_0$; then minimize the convexified $f^k$.
  • Iterative decomposition: decompose $f = g - h$ at the current point $x_k$; then minimize the convexified $f^k$.
  • One min-max decomposition: decompose $f = g - h$ over $B$; then minimize the convexified $f^k$. What relaxation to use?
• Relaxation for the single and iterative decompositions (at a point $a$):
    $\min t$
    s.t. $H_h(a) \preceq tI$,
         $f = g - h$,
         $g, h$ sdsos-convex.
• Min-max decomposition:
  • Original problem:
      $\min_{g,h} \max_{x \in B} \lambda_{\max}(H_h(x))$
      s.t. $f = g - h$; $g, h$ convex.
  • Equivalent formulation:
      $\min_{t,g,h} t$
      s.t. $tI - H_h(x) \succeq 0 \ \ \forall x \in B$; $f = g - h$; $g, h$ convex.
  • First relaxation:
      $\min_{t,g,h} t$
      s.t. $x \in B \Rightarrow tI - H_h(x) \succeq 0$; $f = g - h$; $g, h$ sdsos-convex.
  • Second relaxation:
      $\min_{t,g,h} t$
      s.t. $tI - H_h(x) - \tau(x) \big( R^2 - \sum_i x_i^2 \big) \succeq 0$; $y^T \tau(x)\, y$ sos; $f = g - h$; $g, h$ sdsos-convex.
Comparing the different relaxations (4/4)
• Constrained case: single decomposition vs. iterative decomposition vs. min-max decomposition.
[Figure: bar chart of the objective value for Single decomp, Iter decomp and Min max; vertical axis from 4000 down to -16000.]
• Value of the objective after 3 mins.
• Algorithms described above.
• $f$ random with $n = 10$, $d = 4$.
• Radius $R$: random integer between 100 and 400.
• Average over 200 iterations.
Main messages
• To apply CCCP to polynomial optimization, a DC decomposition is needed. The choice of decomposition impacts convergence speed.
• It is not computationally tractable to find the “best” decomposition.
• Efficient convex relaxations exist, based on the concepts of dsos-convex (LP), sdsos-convex (SOCP), and sos-convex (SDP) polynomials.
• Dsos-convex and sdsos-convex relaxations scale to a larger number of variables.
Thank you for listening
Questions?