Avoiding numerical cancellation in the interior point method


Improving Performance of
The Interior Point Method
by Preconditioning
Project Proposal by: Ken Ryals
For: AMSC 663-664
Fall 2007-Spring 2008
Background
The Interior Point Method (IPM) solves a sequence of
constrained optimization problems such that the sequence
of solutions approaches the true solution from within
the “valid” region. As the constraints are “relaxed”
and the problem re-solved, the numerical properties
of the problem often become more “interesting”.
Application
Why is the IPM of interest?
– It applies to a wide range of problem types:
• Linear Constrained Optimization
• Semidefinite Problems
• Second Order Cone Problems
– Once in the “good region” of a solution to the set
of problems in the solution path (of µ’s):
• Convergence properties are great (“quadratic”).
• It keeps the iterates in the “valid” region.
Specific Research Problem:
– Optimization of Distributed Command and Control
Optimization Problem
The linear optimization problem can be formulated as follows:
inf{ cᵀx | Ax = b, x ≥ 0 }.
The search direction is implicitly defined by the system:
Δx + π Δz = r
A Δx = 0
Aᵀ Δy + Δz = 0.
x is the unknown
y is the “dual” of x
z is the “slack”
Eliminating Δz = −Aᵀ Δy and applying A to the first equation
(using A Δx = 0) gives the Reduced Equation:
A π Aᵀ Δy = −Ar (= b)
From Δy we can get Δx = r − π (−Aᵀ Δy).
Note: The Nesterov-Todd direction corresponds to π = D⊗D, where π z = x;
i.e., D is the metric geometric mean of X and Z⁻¹: D = X½(X½ZX½)⁻½X½.
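As a concrete illustration, here is a minimal Matlab sketch of one
search-direction solve, assuming the LP case where π is the diagonal
Nesterov-Todd scaling XZ⁻¹ (A, x, z, and r are assumed given; this is an
illustration, not the project’s code):

  dsq = x ./ z;              % LP Nesterov-Todd scaling: pi = diag(x./z)
  M   = A * diag(dsq) * A';  % reduced matrix A*pi*A'
  dy  = M \ (-A*r);          % Reduced Equation: A*pi*A' dy = -A*r
  dz  = -A' * dy;            % from A' dy + dz = 0
  dx  = r - dsq .* dz;       % from dx + pi dz = r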
The Problem
From these three equations, the Reduced Equation for Δy is:
A π Aᵀ Δy = −Ar (= b)
The optimization problem is “reduced” to solving a system of
linear equations to generate the next solution estimate.
If π cannot be evaluated with sufficient accuracy, solving these
equations becomes pointless due to error propagation.
Aside:
Numerically, evaluating ‘r − π Δz’ can also be a challenge. Namely, if the
iterates converge, then Δx (= r − π Δz) approaches zero, but r does not;
hence Δx is the small difference of two nearly equal quantities, and its
accuracy suffers from cancellation.
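A toy illustration of the cancellation (hypothetical numbers, chosen only
to show the digit loss):

  r   = 1.23456789;
  pdz = 1.23456780;   % hypothetical value of pi*dz near convergence
  dx  = r - pdz       % about 9.0e-08: most leading digits have cancelled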
Example: A Poorly-Conditioned Problem
Consider a simple problem:

Constraint matrix:

  A   = [  2  1  1  0  0
           1  2  0  1  0
          −1  2  0  0  1 ]

  AAᵀ = [ 6  4  0
          4  6  3
          0  3  6 ]

  λ(AAᵀ) = (1, 6, 11)
  Condition(AAᵀ) = 11

Let’s change A₁,₁ to make it ill-conditioned:

  A   = [ 2×10⁷  1  1  0  0
            1    2  0  1  0
           −1    2  0  0  1 ]

  AAᵀ = [ 400000000000002   20000002   −19999998
              20000002          6           3
             −19999998          3           6    ]

  λ(AAᵀ) ≈ (400000000000004, 9.03137361014904, 1.03112946485009)
  Condition(AAᵀ) ≈ 387924129447855 ≈ 3.88e+14
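These numbers can be checked directly; a Matlab sketch (the sign pattern
of A is reconstructed here from the printed AAᵀ entries):

  A = [2 1 1 0 0; 1 2 0 1 0; -1 2 0 0 1];
  cond(A*A')      % = 11
  A(1,1) = 2e7;
  cond(A*A')      % ~ 3.88e+14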
[Figure: “Solution Parameter Paths”: parameter values vs. iteration number,
plus log-scale plots of the D² condition number and the AD²Aᵀ condition
number vs. iteration number.]
Observations
• The AD²Aᵀ condition number exhibits an interesting dip
around iterations 4-5, when the solution enters the
region of the answer.
• How can we exploit whatever caused the dip?
– The standard approach is to use factorization to improve
numerical performance.
• The Cholesky factorization is: UᵀU = A π Aᵀ
– Factoring a matrix into two components often trades one matrix with
a condition of “M” for two matrices with conditions of ≈ “√M”.
– My conjecture is that AAᵀ and D² interacted to lessen the
impact of the ill-conditioning in A.
⇒ Can we precondition with AAᵀ somehow?
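The factorization effect is easy to see in a sketch (Matlab; M is the
symmetric positive definite reduced matrix A π Aᵀ from before, with A and
r assumed given):

  U = chol(M);             % Cholesky factor: U'*U = M
  [cond(M), cond(U)^2]     % in the 2-norm, cond(U) = sqrt(cond(M))
  dy = U \ (U' \ (-A*r));  % two triangular solves replace M \ (-A*r)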
Conjecture - Math
We are solving: A π Aᵀ Δy = −Ar
A is not square, so it isn’t invertible; but AAᵀ is…
What if we pre-multiplied by its inverse?
(AAᵀ)⁻¹ A π Aᵀ Δy = −(AAᵀ)⁻¹ Ar
Note: Conceptually, we have:
(AAᵀ)⁻¹ (A π Aᵀ) Δy = −(AAᵀ)⁻¹ Ar (= (AAᵀ)⁻¹ b)
Since this looks like a similarity transform,
it might have “nice” properties…
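One “nice” property can be checked directly (a Matlab sketch, not from the
slides): with S = sqrtm(AAᵀ), the matrix (AAᵀ)⁻¹(AπAᵀ) is similar to the
symmetric positive definite matrix S⁻¹(AπAᵀ)S⁻¹, so its eigenvalues are
real and positive:

  S  = sqrtm(A*A');      % symmetric square root of A*A'
  P1 = (A*A') \ M;       % preconditioned matrix, M = A*pi*A'
  P2 = S \ M / S;        % similar to P1, and symmetric positive definite
  [sort(real(eig(P1))), sort(eig(P2))]   % the spectra agree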
Conjecture - Numbers
Revisit the ill-conditioned simple problem:
Condition of A π Aᵀ (π is D²) = 4.2e+14
Condition of AAᵀ = 4.0e+14
Condition of (AAᵀ)⁻¹ A D² Aᵀ = 63.053
(which is a little less than 10¹⁴)
How much would it cost?
AAᵀ is m by m (m = # constraints);
neither AAᵀ nor (AAᵀ)⁻¹ is likely to be sparse…
Let’s try it anyway…
(If it behaves nicely, it might be worth figuring out how to do it efficiently.)
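The check itself is short (Matlab sketch; d2 holds the diagonal of D² at
the iterate in question, whose values are not listed on the slide):

  M = A * diag(d2) * A';   % A*D^2*A'
  P = (A*A') \ M;          % (A*A')^(-1) * A*D^2*A'
  [cond(M), cond(A*A'), cond(P)]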
Experiment - Results
It does work ☺
• The condition number stays low
(<1000) instead of hovering in the
10¹⁴ range.
It costs more ☹
• Need the inverse of AAᵀ once.
• (AAᵀ)⁻¹ gets used every iteration.
The inverse is needed later, rather
than early in the process; thus, it
could be developed iteratively as
the iterations proceed…

[Figure: “Solution Parameter Paths”: parameter values vs. iteration number,
plus a log-scale plot of the (AAᵀ)⁻¹AD²Aᵀ condition number vs. iteration
number, annotated where the solution enters the region of “Newton”
convergence.]
Project Proposal
1. Develop a system for preconditioned IPM.
   • Create a Matlab version to define the structure of the system.
     • Permits validation against SeDuMi and SDPT3.
   • Use C++ to transition from Matlab to pure C++.
     • Create MEX modules from C++ code to use in Matlab.
2. Apply the technique to the Distributed C2 problem.
3. Modify the system to develop (AAᵀ)⁻¹ iteratively or to solve the system
   of equations iteratively.
   • Can we use something like the Sherman-Morrison-Woodbury formula?
     (A − ZVᵀ)⁻¹ = A⁻¹ + A⁻¹Z(I − VᵀA⁻¹Z)⁻¹VᵀA⁻¹
   • Can the system be solved using the Preconditioned Conjugate Gradient
     Method? (see the sketch below)
4. Time permitting, “parallel-ize” the system:
   – Inverse generation branch, and
   – Iteration branch using the current version of the inverse.
Testing: Many test optimization problems can be found online.
“AFIRO” is a good start (used in AMSC 607 – Advanced Numerical Optimization).
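For the PCG option in item 3, a minimal sketch with AAᵀ as the
preconditioner (Matlab’s pcg; M = A π Aᵀ and r as in the earlier slides,
tolerance and iteration cap are placeholder values):

  P  = A*A';                          % preconditioner for M dy = -A*r
  dy = pcg(M, -A*r, 1e-10, 200, P);   % pcg applies P \ (residual) internally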