Linear Programming (Optimization)


Chapter 9. Interior Point Methods

Three major variants:

- Affine scaling algorithm: easy concept, good performance
- Potential reduction algorithm: polynomial time
- Path following algorithm: polynomial time, good performance, theoretically elegant

Linear Programming 2013

9.4 The primal path following algorithm

min c′x        max p′b
s.t. Ax = b    s.t. p′A + s′ = c′
     x ≥ 0          s ≥ 0

Nonnegativity makes the problem difficult, hence use a barrier function in the objective and consider the unconstrained problem (in the affine space Ax = b, resp. p′A + s′ = c′).

- Barrier function: B_μ(x) = c′x − μ Σ_{j=1}^n log x_j, μ > 0, with B_μ(x) ≡ +∞ if x_j ≤ 0 for some j.

Solve min B_μ(x), s.t. Ax = b. (9.15)

B_μ(x) is strictly convex, hence it has a unique minimum point if a minimum exists.


ex) min x, s.t. x ≥ 0. Here B_μ(x) = x − μ log x, and B_μ′(x) = 1 − μ/x = 0 gives the minimum at x = μ.

[Figure: graphs of B_μ(x) and −log x for x > 0]
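The one-variable example can be checked numerically; a minimal sketch (the value μ = 0.3 and the grid are illustrative assumptions):

```python
import numpy as np

# One-dimensional barrier function from the example: B_mu(x) = x - mu*log(x), x > 0.
def B(x, mu):
    return x - mu * np.log(x)

mu = 0.3                                # assumed barrier parameter
xs = np.linspace(1e-4, 2.0, 200_001)    # fine grid on (0, 2]
x_min = xs[np.argmin(B(xs, mu))]        # numerical minimizer, expected near x = mu
```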

- min B_μ(x) = c′x − μ Σ_{j=1}^n log x_j, s.t. Ax = b

Let x(μ) be the optimal solution for a given μ > 0. The set of points x(μ), as μ varies, is called the central path (hence the name path following). It can be shown that lim_{μ→0} x(μ) = x*, an optimal solution to the LP. The limit of x(μ) as μ → ∞ is called the analytic center.

- For the dual problem, the barrier problem is

max p′b + μ Σ_{j=1}^n log s_j, s.t. p′A + s′ = c′ (9.16)

(equivalent to min −p′b − μ Σ_{j=1}^n log s_j, i.e. minimizing a convex function).

[Figure 9.4: The central path and the analytic center — the points x(10), x(1), x(0.1), x(0.01) move from the analytic center toward x* as μ decreases; the arrow c shows the objective direction.]

- Results from nonlinear programming:

(NLP) min f(x)
s.t. g_i(x) ≤ 0, i = 1, …, m
     h_i(x) = 0, i = 1, …, p

f, g_i, h_i: R^n → R, all twice continuously differentiable (the gradient is given as a column vector).

- Thm (Karush 1939, Kuhn-Tucker 1951, first order necessary optimality condition): If x* is a local minimum for (NLP) and some conditions (called constraint qualification) hold at x*, then there exist u ∈ R^m and v ∈ R^p such that

(1) ∇f(x*) + Σ_{i=1}^m u_i ∇g_i(x*) + Σ_{i=1}^p v_i ∇h_i(x*) = 0, u_i ≥ 0, i = 1, …, m
(2) u_i g_i(x*) = 0, i = 1, …, m
(3) h_i(x*) = 0, i = 1, …, p

- Remark:
  - (2) is the complementary slackness condition; it implies u_i = 0 for each non-active constraint g_i.
  - (1) says ∇f(x*) is a nonnegative linear combination of the −∇g_i(x*) for the active constraints, plus a linear combination of the ∇h_i(x*) (compare to the strong duality theorem on p. 173 and its Figure).
  - The CS conditions for LP are the KKT conditions.
  - The KKT conditions are necessary for optimality, but they are also sufficient in some situations. One case is when the objective function is convex and the constraints are linear, which includes our barrier problem.


- Deriving KKT for the barrier problem:

min B_μ(x) = c′x − μ Σ_{j=1}^n log x_j, s.t. Ax = b (x > 0)

∇f(x) = c − μX^{−1}e, ∇h_i(x) = a_i

(a_i is the i-th row vector of A, expressed as a column vector; X^{−1} = diag(1/x_1, …, 1/x_n); e is the vector having 1 in all components. Note that h_i(x) = a_i′x − b_i and ∇h_i(x) = a_i, i.e. h(x) = Ax − b: R^n → R^m.)

Using (Lagrangian) multipliers p_i (ignoring the sign of p) for the ∇h_i(x), we get

c − μX^{−1}e = A′p.

If we define s = μX^{−1}e, the KKT conditions become

A′p + s = c, Ax = b, XSe = μe (x > 0, s > 0),

where S = diag(s_1, …, s_n).
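On a tiny instance these conditions can be solved by hand and checked numerically. A sketch, assuming the two-variable LP min x_1 + 2x_2 s.t. x_1 + x_2 = 1, x ≥ 0 (an illustrative example, not from the slides):

```python
import numpy as np

mu = 0.5
# Assumed example: A = [1 1], b = 1, c = (1, 2).
# Eliminating p from c - mu*X^{-1}e = A'p gives x1^2 + (2*mu - 1)*x1 - mu = 0.
x1 = ((1 - 2*mu) + np.sqrt((2*mu - 1)**2 + 4*mu)) / 2
x = np.array([x1, 1 - x1])            # Ax = b holds by construction
p = 1 - mu / x1                       # first component of c - mu*X^{-1}e = A'p
s = np.array([1.0, 2.0]) - p          # s = c - A'p, with A' = (1, 1)
```

The resulting x, p, s satisfy all three conditions A′p + s = c, Ax = b, XSe = μe, i.e. this is the central path point x(0.5) of the assumed LP.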


- For the dual barrier problem, min −p′b − μ Σ_{j=1}^n log s_j, s.t. p′A + s′ = c′ (s > 0):

∇f(p, s) = (−b, −μS^{−1}e), ∇h_j(p, s) = (A_j, e_j)

(A_j is the j-th column vector of A and e_j is the j-th unit vector.)

Using (Lagrangian) multipliers −x_j for the ∇h_j(p, s), we get

(−b, −μS^{−1}e) = Σ_{j=1}^n −x_j (A_j, e_j).

The first block gives b = Σ_j x_j A_j = Ax, and since Σ_j x_j e_j = Xe, the second block gives μS^{−1}e = Xe, i.e. XSe = μe. Hence we have the conditions

A′p + s = c, Ax = b, XSe = μe (x > 0, s > 0),

which are the same conditions we obtained from the primal barrier function.


- The conditions are given in the text as

Ax(μ) = b, x(μ) ≥ 0
A′p(μ) + s(μ) = c, s(μ) ≥ 0
X(μ)S(μ)e = μe (9.17)

where X(μ) = diag(x_1(μ), …, x_n(μ)) and S(μ) = diag(s_1(μ), …, s_n(μ)). Note that when μ = 0, these are the primal feasibility, dual feasibility, and complementary slackness conditions.

- Lemma 9.5: If x*, p*, and s* satisfy conditions (9.17), then they are optimal solutions to problems (9.15) and (9.16), respectively.

- Pf) Let x*, p*, and s* satisfy (9.17), and let x be an arbitrary vector that satisfies x ≥ 0 and Ax = b. Then

B_μ(x) = c′x − μ Σ_{j=1}^n log x_j
       = c′x − p*′(Ax − b) − μ Σ_{j=1}^n log x_j
       = s*′x + p*′b − μ Σ_{j=1}^n log x_j
       ≥ μn + p*′b − μ Σ_{j=1}^n log(μ/s_j*),

since each term s_j*x_j − μ log x_j attains its minimum at x_j = μ/s_j*. Equality holds iff x_j = μ/s_j* = x_j* for all j. Hence B_μ(x*) ≤ B_μ(x) for all feasible x, and in particular x* is the unique optimal solution, so x* = x(μ). The argument for p* and s* in the dual barrier problem is similar. ∎

Primal path following algorithm

- Starting from some μ^0 and primal and dual feasible x^0 > 0, s^0 > 0, p^0, solve the barrier problem iteratively while driving μ → 0.

- To solve the barrier problem, we use a quadratic approximation (2nd order Taylor expansion) of the barrier function and take the minimum of the approximate function as the next iterate.

The Taylor expansion is

B_μ(x + d) ≈ B_μ(x) + Σ_{i=1}^n (∂B_μ(x)/∂x_i) d_i + (1/2) Σ_{i,j=1}^n (∂²B_μ(x)/∂x_i∂x_j) d_i d_j
           = B_μ(x) + (c′ − μe′X^{−1}) d + (1/2) μ d′X^{−2} d.

We also need to satisfy A(x + d) = b, i.e. Ad = 0.

- Using KKT, the solution to this problem is

d(μ) = (I − X²A′(AX²A′)^{−1}A)(Xe − (1/μ)X²c)
p(μ) = (AX²A′)^{−1}A(X²c − μXe).

The duality gap is c′x − p′b = (p′A + s′)x − p′Ax = s′x. Hence stop the algorithm when s^k′x^k < ε.

A scheme is needed to obtain an initial feasible solution (see text).

- The primal path following algorithm

1. (Initialization) Start with some primal and dual feasible x^0 > 0, s^0 > 0, p^0, and set k = 0.
2. (Optimality test) If s^k′x^k < ε, stop; else go to Step 3.
3. Let X_k = diag(x_1^k, …, x_n^k) and μ_{k+1} = αμ_k (0 < α < 1).
4. (Computation of directions) Solve the linear system μ_{k+1}X_k^{−2}d − A′p = μ_{k+1}X_k^{−1}e − c, Ad = 0, for p and d.
5. (Update of solutions) Let x^{k+1} = x^k + d, p^{k+1} = p, s^{k+1} = c − A′p.
6. Let k := k + 1 and go to Step 2.
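The six steps above can be sketched directly in code. This is a minimal illustration, not production code; it takes the full Newton step as in Step 5, with a small damping safeguard (an added assumption, not part of the algorithm as stated) in case positivity would otherwise fail:

```python
import numpy as np

def primal_path_following(A, b, c, x, p, mu, alpha=0.9, eps=1e-4, max_iter=500):
    """Sketch of the primal path following algorithm for min c'x s.t. Ax = b, x >= 0."""
    s = c - A.T @ p
    for _ in range(max_iter):
        if s @ x < eps:                                  # Step 2: duality gap = s'x
            break
        mu *= alpha                                      # Step 3: mu_{k+1} = alpha * mu_k
        X2 = np.diag(x * x)                              # X_k^2
        M = A @ X2 @ A.T                                 # A X^2 A'
        v = x - (X2 @ c) / mu                            # Xe - (1/mu) X^2 c
        d = v - X2 @ A.T @ np.linalg.solve(M, A @ v)     # Step 4: direction d, Ad = 0
        p = np.linalg.solve(M, A @ (X2 @ c - mu * x))    # p(mu)
        s = c - A.T @ p                                  # Step 5: s^{k+1} = c - A'p
        t = 1.0                                          # damping safeguard (assumed)
        while np.any(x + t * d <= 0):
            t *= 0.5
        x = x + t * d
    return x, p, s
```

On the assumed instance min x_1 + 2x_2 s.t. x_1 + x_2 = 1 with x^0 = (0.5, 0.5), p^0 = 0, μ^0 = 0.75, the iterates track the central path toward x* = (1, 0).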


9.5 The primal-dual path following algorithm

- Find Newton directions in both the primal and dual spaces.

Instead of finding the minimum of a quadratic approximation of the barrier function, it finds the solution of the KKT system

Ax(μ) = b, x(μ) ≥ 0
A′p(μ) + s(μ) = c, s(μ) ≥ 0
X(μ)S(μ)e = μe (9.26)

This is a system of nonlinear equations because of the last equations.

- Let F: R^r → R^r. We want z* such that F(z*) = 0. We use the first order Taylor approximation around z^k:

F(z^k + d) ≈ F(z^k) + J(z^k)d.

Here J(z^k) is the r × r Jacobian matrix whose (i, j)-th element is given by ∂F_i(z)/∂z_j evaluated at z = z^k.
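Generic Newton iteration for F(z) = 0 can be sketched as follows; the toy system (unit circle intersected with the line z_1 = z_2) is an assumed example, not from the text:

```python
import numpy as np

def newton(F, J, z, tol=1e-10, max_iter=50):
    """Solve F(z) = 0 by Newton's method: J(z_k) d = -F(z_k), z_{k+1} = z_k + d."""
    for _ in range(max_iter):
        d = np.linalg.solve(J(z), -F(z))   # Newton direction
        z = z + d
        if np.linalg.norm(F(z)) < tol:
            break
    return z

# Assumed toy system: F(z) = (z1^2 + z2^2 - 1, z1 - z2).
F = lambda z: np.array([z[0]**2 + z[1]**2 - 1.0, z[0] - z[1]])
J = lambda z: np.array([[2*z[0], 2*z[1]], [1.0, -1.0]])
z = newton(F, J, np.array([1.0, 0.5]))     # converges to (sqrt(2)/2, sqrt(2)/2)
```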

- Try to find d that satisfies F(z^k) + J(z^k)d = 0. We then set z^{k+1} = z^k + d. The direction d is called a Newton direction.

Here F(z) is given by

F(z) = (Ax − b, A′p + s − c, XSe − μe),

and the Newton system J(z^k)d = −F(z^k) reads

[ A    0    0   ] [d_x^k]     [ Ax^k − b          ]
[ 0    A′   I   ] [d_p^k] = − [ A′p^k + s^k − c   ]
[ S_k  0    X_k ] [d_s^k]     [ X_k S_k e − μ_k e ]

For feasible x^k, p^k, s^k this is equivalent to

A d_x^k = 0 (9.28)
A′d_p^k + d_s^k = 0 (9.29)
S_k d_x^k + X_k d_s^k = μ_k e − X_k S_k e (9.30)

- The solution to the previous system is

d_x^k = D_k(I − P_k)v(μ_k)
d_p^k = −(A D_k² A′)^{−1} A D_k v(μ_k)
d_s^k = D_k^{−1} P_k v(μ_k),

where D_k² = X_k S_k^{−1}, P_k = D_k A′(A D_k² A′)^{−1} A D_k, and v(μ_k) = X_k^{−1} D_k (μ_k e − X_k S_k e).
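These closed-form directions can be checked against (9.28)–(9.30); a sketch on an assumed small instance (A = [1 1] and an arbitrary interior point):

```python
import numpy as np

A = np.array([[1.0, 1.0]])
x = np.array([0.5, 0.5]); s = np.array([1.0, 2.0]); mu = 0.4   # assumed interior point
X, S, e = np.diag(x), np.diag(s), np.ones(2)
D = np.diag(np.sqrt(x / s))                       # D_k, with D_k^2 = X_k S_k^{-1}
M = A @ D @ D @ A.T                               # A D_k^2 A'
P = D @ A.T @ np.linalg.solve(M, A @ D)           # P_k
v = np.linalg.solve(X, D @ (mu * e - X @ S @ e))  # v(mu_k)
dx = D @ (np.eye(2) - P) @ v                      # d_x^k
dp = -np.linalg.solve(M, A @ D @ v)               # d_p^k
ds = np.linalg.solve(D, P @ v)                    # d_s^k
```

The three residuals of (9.28)–(9.30) vanish for these directions.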

Also limit the step length to ensure 𝑥 𝑘+1 > 0, 𝑠 𝑘+1 > 0.


- The primal-dual path following algorithm

1. (Initialization) Start with some feasible x^0 > 0, s^0 > 0, p^0, and set k = 0.

2. (Optimality test) If 𝑠 𝑘 ′𝑥 𝑘 < 𝜀 , stop; else go to Step 3.

3. (Computation of Newton directions) Let μ_k = s^k′x^k/n, X_k = diag(x_1^k, …, x_n^k), S_k = diag(s_1^k, …, s_n^k). Solve the linear system (9.28)–(9.30) for d_x^k, d_p^k, and d_s^k.
4. (Find step lengths) Let

β_P^k = min(1, α min_{i : (d_x^k)_i < 0} (−x_i^k/(d_x^k)_i)),
β_D^k = min(1, α min_{i : (d_s^k)_i < 0} (−s_i^k/(d_s^k)_i)), where 0 < α < 1.

5. (Solution update) Update the solution vectors according to

x^{k+1} = x^k + β_P^k d_x^k, p^{k+1} = p^k + β_D^k d_p^k, s^{k+1} = s^k + β_D^k d_s^k.

6. Let k := k + 1 and go to Step 2.
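A compact sketch of Steps 1–6, assuming a feasible starting point. One deviation, flagged as such: the slides set μ_k = s^k′x^k/n, while this sketch shrinks it by a centering factor σ < 1 (a common practical choice, not from the slides) so that the duality gap is forced toward zero:

```python
import numpy as np

def primal_dual_path_following(A, b, c, x, p, s, alpha=0.95, sigma=0.1,
                               eps=1e-6, max_iter=500):
    """Sketch of the primal-dual path following algorithm (feasible variant)."""
    m, n = A.shape
    for _ in range(max_iter):
        if s @ x < eps:                                  # Step 2: optimality test
            break
        mu = sigma * (s @ x) / n                         # Step 3 (with centering factor)
        # Newton system (9.28)-(9.30) assembled as one block linear system
        K = np.block([
            [A,                np.zeros((m, m)), np.zeros((m, n))],
            [np.zeros((n, n)), A.T,              np.eye(n)],
            [np.diag(s),       np.zeros((n, m)), np.diag(x)],
        ])
        rhs = np.concatenate([np.zeros(m), np.zeros(n), mu * np.ones(n) - x * s])
        d = np.linalg.solve(K, rhs)
        dx, dp, ds = d[:n], d[n:n + m], d[n + m:]
        # Step 4: ratio tests keep x and s strictly positive
        bP = min(1.0, alpha * min((-x[i] / dx[i] for i in range(n) if dx[i] < 0),
                                  default=np.inf))
        bD = min(1.0, alpha * min((-s[i] / ds[i] for i in range(n) if ds[i] < 0),
                                  default=np.inf))
        x, p, s = x + bP * dx, p + bD * dp, s + bD * ds  # Step 5
    return x, p, s
```

On the same assumed two-variable instance as before, the duality gap shrinks geometrically and the iterates approach the optimal vertex.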


Infeasible primal-dual path following methods

- A variation of primal-dual path following. It starts from x^0 > 0, s^0 > 0, p^0, which is not necessarily feasible for either the primal or the dual, i.e. Ax^0 ≠ b and/or A′p^0 + s^0 ≠ c.

The iteration is the same as in primal-dual path following, except that feasibility is not maintained at each iteration.

Excellent performance in practice.


Self-dual method

- An alternative method to find an initial feasible solution without using big-M.

Given an initial, possibly infeasible point (x^0, p^0, s^0) with x^0 > 0 and s^0 > 0, consider the problem

minimize (x^0′s^0 + 1)θ
subject to
  Ax − bτ + b̄θ = 0
  −A′p + cτ − c̄θ − s = 0
  b′p − c′x + z̄θ − κ = 0
  −b̄′p + c̄′x − z̄τ = −(x^0′s^0 + 1)
  x ≥ 0, τ ≥ 0, s ≥ 0, κ ≥ 0, (9.33)

where b̄ = b − Ax^0, c̄ = c − A′p^0 − s^0, and z̄ = c′x^0 + 1 − b′p^0.

This LP is self-dual. Note that (x, p, s, τ, θ, κ) = (x^0, p^0, s^0, 1, 1, 1) is a feasible interior solution to (9.33).
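The claimed feasibility of (x^0, p^0, s^0, 1, 1, 1) is easy to verify numerically; a sketch on a randomly generated instance (the dimensions and starting point are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 2, 4
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
c = rng.standard_normal(n)
x0, s0, p0 = np.ones(n), np.ones(n), np.zeros(m)   # any x0 > 0, s0 > 0 work

b_bar = b - A @ x0
c_bar = c - A.T @ p0 - s0
z_bar = c @ x0 + 1 - b @ p0
tau = theta = kappa = 1.0

r1 = A @ x0 - b * tau + b_bar * theta              # first constraint of (9.33)
r2 = -A.T @ p0 + c * tau - c_bar * theta - s0      # second constraint
r3 = b @ p0 - c @ x0 + z_bar * theta - kappa       # third constraint
r4 = -b_bar @ p0 + c_bar @ x0 - z_bar * tau        # fourth: equals -(x0's0 + 1)
```

The definitions of b̄, c̄, z̄ make the residuals cancel identically, for any A, b, c and any x^0, s^0 > 0.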

- Since both the primal and dual are feasible, they have optimal solutions and the optimal value is 0.

- The primal-dual path following method finds an optimal solution (x*, p*, s*, τ*, θ*, κ*) that satisfies θ* = 0, x* + s* > 0, τ* + κ* > 0, s*′x* = 0, τ*κ* = 0 (strict complementarity).

- We can find an optimal solution or detect infeasibility/unboundedness depending on the values of τ*, κ*. (See Thm 9.8.)

- Running time: worst case O(√n log(ε^0/ε)); observed O(log n log(ε^0/ε)).