Multiple Instance Classification

Exact Differentiable Exterior Penalty for Linear Programming
February 12, 2008
Olvi Mangasarian
UW Madison & UCSD La Jolla
Edward Wild
UW Madison
Dedication
• This talk is dedicated to the memory of Herb Keller 1925-2008
• World renowned scholar and numerical analyst
• Author of several texts, research monographs, and more than
140 research papers.
• Directed the dissertations of 25 PhD students
• Oldest bicyclist to negotiate Torrey Pines incline
• Splendid and enjoyable colleague and office mate
Preliminaries
• Exterior penalty functions convert constrained optimization
problems to unconstrained optimization problems
• Two types of penalty functions
– Nondifferentiable penalty functions
• Exact: Penalty parameter remains finite
– Differentiable penalty functions
• Asymptotic: Penalty parameter approaches infinity
• Are there exact exterior penalty functions that are differentiable?
– Yes for linear programs
• Which is the topic of this talk
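The exact versus asymptotic distinction can be seen on a one-variable toy problem that is not from the talk: minimize x subject to x ≥ 1. The sketch below assumes a quadratic (differentiable) penalty and a 1-norm (nondifferentiable) penalty, and finds their minimizers by brute force on a grid. The function names are illustrative only.

```python
def plus(t):
    """Plus function: max(t, 0)."""
    return max(t, 0.0)

def quadratic_penalty(x, alpha):
    # Differentiable exterior penalty for: minimize x subject to x >= 1.
    return x + 0.5 * alpha * plus(1.0 - x) ** 2

def one_norm_penalty(x, alpha):
    # Nondifferentiable exterior penalty for the same toy problem.
    return x + alpha * plus(1.0 - x)

def grid_argmin(penalty, alpha, lo=-1.0, hi=3.0, steps=40000):
    # Brute-force minimizer over a uniform grid with spacing 1e-4.
    grid = (lo + (hi - lo) * k / steps for k in range(steps + 1))
    return min(grid, key=lambda x: penalty(x, alpha))

for alpha in (10.0, 100.0, 1000.0):
    x_quad = grid_argmin(quadratic_penalty, alpha)
    x_one = grid_argmin(one_norm_penalty, alpha)
    print(f"alpha={alpha:g}: quadratic -> {x_quad:.4f}, 1-norm -> {x_one:.4f}")
```

In this toy problem the quadratic-penalty minimizer is 1 - 1/alpha, so it reaches the constrained solution x = 1 only in the limit, while the 1-norm penalty returns x = 1 for any alpha > 1.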
Outline
• Sufficient primal LP solution exactness condition based on
differentiable dual exterior penalty function
• Exact primal LP solution computation from inexact dual
exterior penalty function
• Independence of the dual penalty function from the penalty parameter
• Generalized Newton algorithm & its convergence
• DLE: Direct Linear Equation algorithm & its convergence
• Computational results
• Conclusion & outlook
The Primal & Dual Linear Programs
Primal linear program: min_y d'y subject to By ≥ b
Dual linear program: max_u b'u subject to B'u = d, u ≥ 0
The Dual Exterior Penalty Problem
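min_u -ε b'u + ½(||B'u - d||² + ||(-u)_+||²)
where (-u)_+ = max{-u, 0} and ε > 0 is the penalty parameter
Divide by ε² and let u/ε → u, α = 1/ε.
Penalty problem becomes:
min_u f(u) = -b'u + ½(||B'u - αd||² + ||(-u)_+||²)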
Exact Primal Solution Computation
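Any solution u of the dual penalty problem:
min_u f(u) = -b'u + ½(||B'u - αd||² + ||(-u)_+||²)
generates an exact solution y of the primal LP:
min_y d'y subject to By ≥ b
for sufficiently large but finite penalty parameter α as follows:
y = B'u - αd
In addition this solution minimizes:
||y||² + ||By - b||²
over the solution set of the primal LP.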
Ref: OLM,Journal of Machine Learning Research 2006, 1517-1530
Optimality Condition for Dual Exterior Penalty Problem
& Exact Primal LP Solution
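A necessary and sufficient condition for solving the dual penalty problem:
∇f(u) = -b + B(B'u - αd) + Pu = 0
where P ∈ R^{m×m} is a diagonal matrix of ones and zeros defined
as follows:
P = diag(sign((-u)_+)), i.e. P_ii = 1 if u_i < 0 and P_ii = 0 otherwise
Solving for u gives:
u = (BB'+P) \ (b + αBd)
which gives the following exact primal solution,
y = B'u - αd = B'((BB'+P) \ b) + α(B'((BB'+P) \ (Bd)) - d)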
Sufficient Exactness Condition for Penalty Parameter α
• Note that in
y = B'((BB'+P) \ b) + α(B'((BB'+P) \ (Bd)) - d)
y depends on α only through
– The implicit dependence of P on u
– The explicit dependence on α above
• Thus, if α is sufficiently large to ensure y is an exact solution of the linear program, then
– P (i.e., the active constraint set) does not change with increasing α (assumed to hold)
– B'((BB'+P) \ (Bd)) - d = 0 (ensured computationally)
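A minimal sketch of the exactness check, assuming NumPy and a three-constraint toy LP that is not from the talk. The two P matrices are picked by hand to stand for an exact and an inexact active set; in practice P comes from the penalty minimizer u. The function name is hypothetical.

```python
import numpy as np

def primal_from_active_set(B, b, d, P, alpha):
    # y = B'((BB'+P)\b) + alpha*(B'((BB'+P)\(Bd)) - d), split into its two parts.
    H = B @ B.T + P
    y_fixed = B.T @ np.linalg.solve(H, b)            # part that does not involve alpha
    residual = B.T @ np.linalg.solve(H, B @ d) - d   # part multiplied by alpha
    return y_fixed + alpha * residual, residual

# Toy LP (assumed for illustration): min y1 + 2*y2  s.t.  y1 + y2 >= 1, y1 >= 0, y2 >= 0.
# Its solution is y = (1, 0).
B = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 0.0, 0.0])
d = np.array([1.0, 2.0])

P_exact = np.diag([0.0, 1.0, 0.0])    # hand-picked: only the second multiplier is negative
P_inexact = np.diag([0.0, 1.0, 1.0])  # hand-picked: second and third multipliers negative

for alpha in (10.0, 1000.0):
    y, r = primal_from_active_set(B, b, d, P_exact, alpha)
    print("exact P:", alpha, y, np.linalg.norm(r))

y_bad, r_bad = primal_from_active_set(B, b, d, P_inexact, 1.0)
print("inexact P:", y_bad, np.linalg.norm(r_bad))
```

With the first P the residual vanishes and y no longer changes with alpha; with the second P the residual is nonzero, which is what the termination test is meant to detect.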
Generalized Newton Algorithm
• Solve the unconstrained problem
f(u) = -b'u + ½(||B'u - αd||² + ||(-u)_+||²)
using a generalized Newton method
• Ordinary Newton method requires gradient and
Hessian to compute the Newton direction
-(∇²f(u))⁻¹∇f(u), but f is not twice differentiable
• Instead of ordinary Hessian, we use the
generalized Hessian, ∂²f(u) and the generalized
Newton direction -(∂²f(u))⁻¹∇f(u)
– ∇f(u) = -b + B(B'u - αd) - (-u)_+
– ∂²f(u) = BB' + diag(sign((-u)_+))
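The gradient and generalized Hessian above can be written down directly. The sketch below assumes NumPy, uses illustrative function names, and checks the gradient against central finite differences at a point with no zero components (where f is smooth). The test data are made up for the check.

```python
import numpy as np

def plus(v):
    """Componentwise plus function: max(v, 0)."""
    return np.maximum(v, 0.0)

def f(u, B, b, d, alpha):
    # f(u) = -b'u + 0.5*(||B'u - alpha*d||^2 + ||(-u)_+||^2)
    r = B.T @ u - alpha * d
    neg = plus(-u)
    return -b @ u + 0.5 * (r @ r + neg @ neg)

def grad_f(u, B, b, d, alpha):
    # Gradient: -b + B(B'u - alpha*d) - (-u)_+
    return -b + B @ (B.T @ u - alpha * d) - plus(-u)

def gen_hessian(u, B):
    # Generalized Hessian: BB' + diag(sign((-u)_+))
    return B @ B.T + np.diag(np.sign(plus(-u)))

def fd_grad(u, B, b, d, alpha, h=1e-6):
    # Central finite-difference gradient, used only as a sanity check.
    g = np.zeros_like(u)
    for j in range(u.size):
        e = np.zeros_like(u)
        e[j] = h
        g[j] = (f(u + e, B, b, d, alpha) - f(u - e, B, b, d, alpha)) / (2.0 * h)
    return g

B = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 0.0, 0.0])
d = np.array([1.0, 2.0])
u = np.array([0.5, -1.5, 2.0])
print(grad_f(u, B, b, d, 3.0), fd_grad(u, B, b, d, 3.0))
```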
Generalized Newton Algorithm (JMLR 2006)
minimize f(u) = -b'u + ½(||B'u - αd||² + ||(-u)_+||²)
1) u^{i+1} = u^i + λ_i t^i
• t^i = -(∂²f(u^i))⁻¹∇f(u^i) (generalized Newton direction)
• λ_i = max{1, ½, ¼, …} such that
f(u^i) - f(u^i + λ_i t^i) ≥ -(λ_i/4) ∇f(u^i)'t^i
(Armijo stepsize)
2) Stop if ||∇f(u^i)|| ≤ tol and ||B'((BB'+P_i) \ (Bd)) - d|| ≤ tol
• P_i = diag(sign((-u^i)_+))
3) If i = imax then α ← 10α, imax ← 2·imax
4) i ← i + 1 and go to (1)
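The steps above can be sketched as follows. This is an illustrative NumPy version, not the authors' MATLAB code: it assumes BB' + P_i is nonsingular, starts from u = 0, checks the stopping test at the top of each pass, and caps the backtracking so the loop cannot stall. The function name, defaults and the two-constraint test LP are assumptions.

```python
import numpy as np

def plus(v):
    return np.maximum(v, 0.0)

def newton_lp(B, b, d, alpha=100.0, tol=1e-8, imax=50, max_iter=500):
    def f(u):
        r = B.T @ u - alpha * d
        return -b @ u + 0.5 * (r @ r + plus(-u) @ plus(-u))

    u = np.zeros(B.shape[0])
    for i in range(max_iter):
        if i == imax:                                   # step 3: raise the penalty parameter
            alpha, imax = 10.0 * alpha, 2 * imax
        P = np.diag(np.sign(plus(-u)))
        H = B @ B.T + P                                 # generalized Hessian
        g = -b + B @ (B.T @ u - alpha * d) - plus(-u)   # gradient
        exact = np.linalg.norm(B.T @ np.linalg.solve(H, B @ d) - d)
        if np.linalg.norm(g) <= tol and exact <= tol:   # step 2: stopping test
            break
        t = -np.linalg.solve(H, g)                      # generalized Newton direction
        lam, fu, slope = 1.0, f(u), g @ t
        # Armijo backtracking: halve lam until the sufficient-decrease test holds.
        while lam > 1e-10 and fu - f(u + lam * t) < -0.25 * lam * slope:
            lam *= 0.5
        u = u + lam * t                                 # step 1
    return B.T @ u - alpha * d, u, alpha

# Toy LP (assumed): min 1.5*y1 + 0.5*y2  s.t.  y1 + y2 >= 1, y1 - y2 >= -3; solution (-1, 2).
B = np.array([[1.0, 1.0], [1.0, -1.0]])
b = np.array([1.0, -3.0])
d = np.array([1.5, 0.5])
y, u, alpha_final = newton_lp(B, b, d)
print(y, alpha_final)
```

Starting instead from alpha=1.0 with imax=5, the exactness test fails at the first stationary point, so the sketch raises alpha to 10 before it stops at the same y.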
Generalized Newton Algorithm
Convergence
• Assume tol = 0
• Assume B'((BB'+P) \ (Bd)) - d = 0 implies that α is large
enough that an exact solution to the primal is obtained
• Then either
– The Generalized Newton Algorithm terminates at u^i
such that y = B'u^i - αd is an exact solution to the
primal, or
– For any accumulation point ū of the sequence of
iterates {u^i}, y = B'ū - αd is an exact solution to the
primal
• Exactness condition is incorporated as a termination
criterion
Direct Linear Equation Algorithm
• f(u) = -b'u + ½(||B'u - αd||² + ||(-u)_+||²)
• ∇f(u) = -b + B(B'u - αd) - (-u)_+
= -b + B(B'u - αd) + Pu
• ∇f(u) = 0 ⇔ u = (BB' + P)⁻¹(αBd + b)
• Successively solve ∇f(u) = 0 for
updated values of the diagonal matrix
P = diag(sign((-u)_+))
Direct Linear Equation Algorithm
minimize f(u) = -b'u + ½(||B'u - αd||² + ||(-u)_+||²)
1) P_i = diag(sign((-u^i)_+))
2) u^{i+1} = (BB' + P_i) \ (b + αBd)
3) u^{i+1} ← u^i + λ_i (u^{i+1} - u^i)
• λ_i is the Armijo stepsize
4) Stop if ||u^{i+1} - u^i|| ≤ tol and ||B'((BB'+P_i) \ (Bd)) - d|| ≤ tol
5) If i = imax then α ← 10α, imax ← 2·imax
6) i ← i + 1 and go to (1)
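A minimal NumPy sketch of the direct linear equation iteration, under the same assumptions as the convergence slide (each BB' + P_i nonsingular). The function name, starting point u = 0, backtracking cap and toy LP are illustrative choices, not taken from the talk.

```python
import numpy as np

def plus(v):
    return np.maximum(v, 0.0)

def dle_lp(B, b, d, alpha=100.0, tol=1e-8, imax=50, max_iter=500):
    def f(u):
        r = B.T @ u - alpha * d
        return -b @ u + 0.5 * (r @ r + plus(-u) @ plus(-u))

    u = np.zeros(B.shape[0])
    for i in range(max_iter):
        if i == imax:                                        # step 5
            alpha, imax = 10.0 * alpha, 2 * imax
        P = np.diag(np.sign(plus(-u)))                       # step 1
        H = B @ B.T + P
        u_full = np.linalg.solve(H, b + alpha * (B @ d))     # step 2
        t = u_full - u
        g = -b + B @ (B.T @ u - alpha * d) - plus(-u)
        lam, fu, slope = 1.0, f(u), g @ t
        # Armijo backtracking along t, capped so a zero step cannot loop forever.
        while lam > 1e-10 and fu - f(u + lam * t) < -0.25 * lam * slope:
            lam *= 0.5
        u_next = u + lam * t                                 # step 3
        exact = np.linalg.norm(B.T @ np.linalg.solve(H, B @ d) - d)
        done = np.linalg.norm(u_next - u) <= tol and exact <= tol   # step 4
        u = u_next
        if done:
            break
    return B.T @ u - alpha * d, u, alpha

# Toy LP (assumed): min 1.5*y1 + 0.5*y2  s.t.  y1 + y2 >= 1, y1 - y2 >= -3; solution (-1, 2).
B = np.array([[1.0, 1.0], [1.0, -1.0]])
b = np.array([1.0, -3.0])
d = np.array([1.5, 0.5])
y, u, alpha_final = dle_lp(B, b, d)
print(y, alpha_final)
```

Since ∇f(u) = (BB' + P)u - b - αBd, the full step u_full - u equals the generalized Newton direction; in this sketch the two methods differ only in their stopping tests.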
Direct Linear Equation Algorithm
Convergence
• Assume tol = 0
• Assume B'((BB'+P) \ (Bd)) - d = 0 implies that α is
large enough that an exact solution to the primal is
obtained, and that each matrix in the sequence
{BB'+P_i} is nonsingular
• Then either
– The Direct Linear Equation Algorithm terminates at u^i such
that y = B'u^i - αd is an exact solution to the primal, or
– For any accumulation point ū of the sequence of iterates {u^i},
y = B'ū - αd is an exact solution to the primal
• Exactness condition is incorporated as a termination
criterion
Solving Primal LPs with More
Constraints than Variables
• Difficulty: factoring BB'
• Solution: get exact solution to the dual which requires
factoring a smaller matrix the size of B'B
• Given an exact solution of the dual, find the exact
solution of the primal by solving an unconstrained minimization
problem in which B1 and B2 correspond to u1 > 0 and u2 = 0
– Requires factoring matrices only of size B'B
More-Constraints-than-Variables Case
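Dual LP: max_u b'u subject to B'u = d, u ≥ 0
Primal LP: min_y d'y subject to By ≥ b
Primal exterior penalty problem with penalty parameter α: min_y d'y + ½||(-By + αb)_+||²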
Exact Solution of Dual LP from
Primal LP Penalty
For sufficiently large but finite penalty parameter α, a solution y
of the primal LP penalty function gives an exact solution
u = (-By + αb)_+ to the dual LP.
Furthermore, this exact dual solution minimizes ||u||² over the
solution set of the dual linear program.
From this exact dual solution u an exact solution of the primal
LP is found by solving an unconstrained minimization problem in
which B1 and B2 correspond to u1 > 0 and u2 = 0 respectively.
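The talk's minimization problem for this step is not reproduced here. As a simplified stand-in, the sketch below uses complementary slackness: rows with u1 > 0 must hold with equality (B1 y = b1), and the remaining rows must satisfy B2 y ≥ b2. It solves B1 y = b1 by least squares and then checks the inequality, which is only adequate when the active rows determine y. The function name, tolerances and toy data are assumptions.

```python
import numpy as np

def primal_from_dual(B, b, u, tol=1e-10):
    active = u > tol                       # rows with positive multipliers (B1, b1)
    B1, b1 = B[active], b[active]
    B2, b2 = B[~active], b[~active]        # rows with zero multipliers (B2, b2)
    y, *_ = np.linalg.lstsq(B1, b1, rcond=None)   # solve B1 y = b1 in the least-squares sense
    feasible = bool(np.all(B2 @ y >= b2 - 1e-9))  # check the remaining constraints
    return y, feasible

# Toy LP (assumed): min y1 + 2*y2  s.t.  y1 + y2 >= 1, y1 >= 0, y2 >= 0, with dual solution (1, 0, 1).
B = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 0.0, 0.0])
d = np.array([1.0, 2.0])
u = np.array([1.0, 0.0, 1.0])

y, feasible = primal_from_dual(B, b, u)
print(y, feasible, d @ y, b @ u)   # equal objective values confirm optimality
```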
Computational Details
• Cholesky factorization used for both methods
– Ensure factorizability by adding a small
multiple of the identity matrix
– For example, BB' + P + δI for some small δ
– Other approaches left to future work
• Start with α = 100 for both methods
– Newton method: α occasionally increased to 1000
– Direct method: α not increased in our examples
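A small sketch of the regularized Cholesky solve, assuming NumPy; the value of delta and the function name are illustrative. Plain `np.linalg.solve` is used for the two triangular systems to avoid extra dependencies; a dedicated triangular solver would exploit the structure.

```python
import numpy as np

def solve_regularized(H, rhs, delta=1e-8):
    # Factor H + delta*I = L L' and solve with the two triangular factors.
    L = np.linalg.cholesky(H + delta * np.eye(H.shape[0]))
    z = np.linalg.solve(L, rhs)        # forward solve  L z = rhs
    return np.linalg.solve(L.T, z)     # backward solve L' u = z

# Toy data (assumed): three constraints, two variables, so BB' alone is singular.
B = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 0.0, 0.0])
d = np.array([1.0, 2.0])
P = np.diag([0.0, 1.0, 0.0])
alpha = 100.0

u = solve_regularized(B @ B.T + P, b + alpha * (B @ d))
print(u)   # close to (102, -1, 98), the solution of the unregularized system

# With P = 0 the matrix BB' is singular, but BB' + delta*I still factors.
L0 = np.linalg.cholesky(B @ B.T + 1e-8 * np.eye(3))
print(np.all(np.isfinite(L0)))
```

The shift delta trades a small perturbation of u (of order delta) for a factorization that always exists.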
Computational Results
• When B'((BB'+P) \ (Bd)) - d = 0, optimal solution
obtained
– Tested on randomly generated linear programs
– We know the optimal objective values
– This condition is used as a stopping criterion
– Relative difference from the true objective value and
maximum constraint violation less than 1e-3, and often
smaller than 1e-6
• B'((BB'+P) \ (Bd)) - d = 0 satisfied efficiently
– Our algorithms are compared against the commercial
LP package CPLEX 9.0 (simplex and barrier methods)
– Our algorithms are implemented using MATLAB 7.3
[Figure: Running Time Versus Linear Program Size, for problems with the same number of variables and constraints. Horizontal axis: number of variables (= number of constraints). Vertical axis: average seconds to solution.]
Average Seconds to Solve 10 Random Linear Programs with 100 Variables and Increasing Numbers of Constraints

Constraints    CPLEX    Newton LP    DLE
1,000            0.1          0.1    0.1
10,000           0.3          1.0    0.5
100,000          3.0         13.0    5.5
1,000,000       44.8        173.5   70.9
Average Seconds to Solve 10 Random Linear Programs with 100 Constraints and Increasing Numbers of Variables

Variables      CPLEX    Newton LP    DLE
1,000           0.02         0.04   0.03
10,000          0.05         0.20   0.90
100,000         0.84         3.22   0.91
1,000,000       17.9         29.1    9.3
Conclusion
• Presented sufficient conditions for obtaining an
exact solution to a primal linear program from a
classical dual exterior penalty function
• Precise termination condition given for
– Newton algorithm for linear programming (JMLR
2006)
– Direct method based on solving the optimality
condition of the differentiable convex penalty function
• Algorithms efficiently obtain optimal solutions
using the precise termination condition
Future Work
• Deal with larger linear programs
• Application to real-world linear programs
• Direct methods for other optimization
problems, e.g. linear complementarity
problems
• Further improvements to performance and
robustness
Links to Talk & Papers
• http://www.cs.wisc.edu/~olvi
• http://www.cs.wisc.edu/~wildt
Optimality Condition for the Primal Exterior
Penalty Problem & Exact Dual LP Solution
A necessary and sufficient condition for solving the primal penalty problem:
min_y d'y + ½||(-By + αb)_+||²
is:
d - B'Q(-By + αb) = 0
where Q ∈ R^{ℓ×ℓ} is a diagonal matrix of ones and zeros defined
as follows:
Q = diag(sign((-By + αb)_+))
Solving for y gives: y = (B'QB) \ (αB'Qb - d)
which gives the following exact dual solution, u = (-By + αb)_+,
u = (B((B'QB) \ d) - α(B((B'QB) \ (B'Qb)) - b))_+
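A minimal NumPy sketch of these formulas on a toy LP that is not from the talk. The matrix Q is picked by hand and then checked for consistency with the y it produces; in practice Q comes from minimizing the primal penalty function. The function name is hypothetical.

```python
import numpy as np

def plus(v):
    return np.maximum(v, 0.0)

def dual_from_active_set(B, b, d, Q, alpha):
    M = B.T @ Q @ B
    y = np.linalg.solve(M, alpha * (B.T @ Q @ b) - d)     # y = (B'QB)\(alpha*B'Qb - d)
    u = plus(alpha * b - B @ y)                           # u = (-By + alpha*b)_+
    # Exactness quantity: diag(sign(u)) (B((B'QB)\(B'Qb)) - b)
    check = np.diag(np.sign(u)) @ (B @ np.linalg.solve(M, B.T @ Q @ b) - b)
    return y, u, check

# Toy LP (assumed): min y1 + 2*y2  s.t.  y1 + y2 >= 1, y1 >= 0, y2 >= 0; dual solution (1, 0, 1).
B = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 0.0, 0.0])
d = np.array([1.0, 2.0])
Q = np.diag([1.0, 0.0, 1.0])   # hand-picked active set

for alpha in (1.0, 100.0):
    y, u, check = dual_from_active_set(B, b, d, Q, alpha)
    consistent = np.array_equal(np.sign(plus(alpha * b - B @ y)), np.diag(Q))
    print(alpha, u, np.linalg.norm(check), consistent)
```

In this example u stays at the dual solution for both values of alpha and the exactness quantity is zero.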
Sufficient Condition for Penalty Parameter α
• Note that in
u = (B((B'QB) \ d) - α(B((B'QB) \ (B'Qb)) - b))_+
u depends on α only through
– Q, which depends on α through y and α
– The explicit dependence on α above
• Thus, α is sufficiently large to ensure u is an exact
solution of the linear program if
– Q does not change with increasing α
– diag(sign(u))(B((B'QB) \ (B'Qb)) - b) = 0 (the subgradient with respect to α)
