Mathematical Programming in Support Vector Machines

A Newton Method for Linear Programming
Olvi L. Mangasarian
University of California at San Diego
Outline
Fast Newton method for class of linear programs
Large number of constraints (millions)
Moderate number of variables (hundreds)
Method is based on an overlooked fact
Dual of asymptotic exterior penalty problem of primal LP
Provides an exact least 2-norm solution of the dual LP
Use a finite value of the penalty parameter
Exact least 2-norm dual solution generates:
Highly accurate primal solution when it is unique
Globally convergent Newton method established
Eleven lines of MATLAB code
Method solves LP with 2-million constraints & 100 variables
14-place accuracy in 17.4 minutes on a 400 MHz machine
CPLEX 6.5 ran out of memory on same problem
Dual method for million-variable LP
Primal-Dual LPs
Primal Exterior Penalty & Dual Least 2-Norm Problem
Primal LP:
Dual LP:
Asymptotic Primal Exterior Penalty:
Exact Dual Least 2-Norm:
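The formulas on this slide were lost in extraction. The block below is a reconstruction, not the original slide: the primal and dual LPs follow the comments in the lpnewt1 listing (min c'x s.t. Ax <= b, and max -b'v s.t. -A'v = c, v >= 0), while the variable name y and the scaling by the penalty parameter are assumptions read off that code.

```latex
% Reconstruction under stated assumptions; A is m-by-n, b in R^m, c in R^n.
\begin{align*}
&\text{Primal LP:} && \min_{x}\; c^{\top}x \quad \text{s.t.}\quad Ax \le b \\
&\text{Dual LP:} && \max_{v}\; -b^{\top}v \quad \text{s.t.}\quad -A^{\top}v = c,\; v \ge 0 \\
&\text{Asymptotic primal exterior penalty:} &&
   \min_{y}\; \varepsilon\, c^{\top}y + \tfrac12 \,\|(Ay - b)_+\|^2, \qquad \varepsilon \to 0 \\
&\text{Exact dual least 2-norm:} &&
   \max_{v}\; -b^{\top}v - \tfrac{\varepsilon}{2}\, v^{\top}v
   \quad \text{s.t.}\quad -A^{\top}v = c,\; v \ge 0, \qquad \varepsilon \in (0, \bar{\varepsilon}\,]
\end{align*}
```

The point made in the outline is that the second pair are duals of each other, so a finite value of the penalty parameter already yields the exact least 2-norm dual solution.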
The Plus Function
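The definition on this slide did not survive extraction. A definition consistent with the pl function in the MATLAB listings, (abs(x)+x)/2, would be:

```latex
% Plus function, applied componentwise to a vector x
(x)_+ \;=\; \max\{x,\,0\} \;=\; \tfrac12\,\big(|x| + x\big),
\qquad \big((x)_+\big)_i = \max\{x_i,\,0\}
```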
Karush-Kuhn-Tucker Optimality Conditions
for Least 2-Norm Dual
Dual Least 2-Norm:
Lagrangian:
KKT Conditions:
Equivalently:
Karush-Kuhn-Tucker Optimality Conditions
for Least 2-Norm Dual (Continued)
We have:
Thus:
Hence:
Thus:
solves the exterior penalty problem for the primal LP
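The equations on these two slides are missing from the transcript. The derivation below is a sketch of how the argument presumably runs, assuming the dual least 2-norm problem is written in minimization form; the multiplier names y (for the equality constraint) and w (for nonnegativity) are assumptions, chosen so that y matches the variable in the lpnewt1 listing.

```latex
\begin{align*}
&\text{Dual least 2-norm:} && \min_{v}\; b^{\top}v + \tfrac{\varepsilon}{2}\,v^{\top}v
   \quad \text{s.t.}\quad A^{\top}v + c = 0,\; v \ge 0 \\
&\text{Lagrangian:} && L(v,y,w) = b^{\top}v + \tfrac{\varepsilon}{2}\,v^{\top}v
   - y^{\top}(A^{\top}v + c) - w^{\top}v \\
&\text{KKT conditions:} && \varepsilon v + b - Ay - w = 0,\quad A^{\top}v + c = 0,\quad
   v \ge 0,\; w \ge 0,\; v^{\top}w = 0 \\
&\text{Equivalently:} && 0 \le v \;\perp\; \varepsilon v - (Ay - b) \ge 0,\quad A^{\top}v + c = 0 \\
&\text{Complementarity gives:} && v = \tfrac{1}{\varepsilon}\,(Ay - b)_+ \\
&\text{Substituting into } A^{\top}v + c = 0\text{:} && A^{\top}(Ay - b)_+ + \varepsilon c = 0 \\
&\text{That is:} && \nabla_y \Big( \varepsilon\, c^{\top}y + \tfrac12\,\|(Ay - b)_+\|^2 \Big) = 0
\end{align*}
```

Since the penalty function is convex, a zero gradient is sufficient for a minimum, which is the sense in which y solves the exterior penalty problem.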
Equivalence of Exact Least 2-Norm Dual Solution
to Finite-Parameter Primal Penalty Minimization
Primal Exterior Penalty
Primal Exterior Penalty(exact least 2-norm dual solution) :
Gradient:
Generalized Hessian:
where:
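The expressions on this slide are missing. The following reconstruction is read off the df, d2f and v lines of the lpnewt1 listing; the notation (the starred step function in particular) is an assumption.

```latex
\begin{align*}
&\text{Primal exterior penalty:} && \min_{y}\; f(y) = \varepsilon\, c^{\top}y + \tfrac12\,\|(Ay - b)_+\|^2 \\
&\text{Exact least 2-norm dual solution:} && v = \tfrac{1}{\varepsilon}\,(Ay - b)_+ \\
&\text{Gradient:} && \nabla f(y) = A^{\top}(Ay - b)_+ + \varepsilon c \\
&\text{Generalized Hessian:} && \partial^2 f(y) = A^{\top}\,\mathrm{diag}\big((Ay - b)_*\big)\,A \\
&\text{where:} && \big((z)_*\big)_i = 1 \text{ if } z_i > 0, \quad 0 \text{ otherwise}
\end{align*}
```

In the listing the step function appears as sign(pl(A*y-b)), and delta*speye(n) is added so that the linear system stays solvable when few constraints are violated.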
LP Newton Algorithm (LPN)
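The algorithm statement itself is missing from the transcript. A sketch of the iteration, as implemented in the lpnewt1 listing, is given below. The Armijo rule is the standard backtracking form, and its constant sigma is an assumption rather than a value taken from the talk.

```latex
\begin{align*}
&\text{Newton direction:} && d^{i} = -\big(\partial^2 f(y^{i}) + \delta I\big)^{-1}\,\nabla f(y^{i}),
   \qquad f(y) = \varepsilon\, c^{\top}y + \tfrac12\,\|(Ay - b)_+\|^2 \\
&\text{Update:} && y^{i+1} = y^{i} + \lambda_i\, d^{i} \\
&\text{Stepsize without Armijo:} && \lambda_i = 1 \\
&\text{Stepsize with Armijo:} && \lambda_i = \max\{1, \tfrac12, \tfrac14, \dots\} \text{ such that }
   f(y^{i}) - f(y^{i} + \lambda_i d^{i}) \ge -\sigma\,\lambda_i\,\nabla f(y^{i})^{\top} d^{i},
   \quad \sigma \in (0, \tfrac12) \\
&\text{Stop when:} && \|y^{i+1} - y^{i}\|_\infty \le \text{tol}
\end{align*}
```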
LPN Convergence
Remarks on LPN
lpgen: Random Solvable LP Generator
%lpgen: Generate random solvable LP: min c'x s.t. Ax <= b; A: m-by-n
%Input: m,n,d(ensity), set in the workspace; Output: A,b,c; (x,u): primal-dual solution
pl=@(x)(abs(x)+x)/2;%pl(us) function (anonymous function instead of deprecated inline)
tic;A=sprand(m,n,d);A=100*(A-0.5*spones(A));%nonzeros uniform in (-50,50)
u=sparse(10*pl(rand(m,1)-(m-3*n)/m));%dual solution, about 3n positive entries
x=10*spdiags((sign(pl(rand(n,1)-rand(n,1)))),0,n,n)*(rand(n,1)-rand(n,1));%about half zeros
c=-A'*u;b=A*x+spdiags((ones(m,1)-sign(pl(u))),0,m,m)*10*ones(m,1);toc0=toc;%b=Ax where u>0, Ax+10 elsewhere
format short e;[m n d toc0]
Nonzero elements of A uniformly distributed between –50 and +50
Primal random solution x with elements in [-10,10], approximately half of which are zero
Dual random solution u in [0,10], approximately 3n of which are positive
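Why the generated LP is solvable can be checked directly: the construction makes x primal feasible, u dual feasible, and the pair complementary, which together certify optimality. The sketch below repeats the lpgen construction on small, hypothetical sizes (m = 200, n = 10, d = 0.5) and tests those three conditions plus the duality gap; the tolerances are assumptions.

```matlab
% Sketch: verify that the lpgen construction yields an optimal primal-dual pair.
m = 200; n = 10; d = 0.5;                  % hypothetical small problem sizes
pl = @(x) (abs(x) + x) / 2;                % plus function
A = sprand(m, n, d); A = 100 * (A - 0.5 * spones(A));
u = sparse(10 * pl(rand(m, 1) - (m - 3*n) / m));
x = 10 * spdiags(sign(pl(rand(n,1) - rand(n,1))), 0, n, n) * (rand(n,1) - rand(n,1));
c = -A' * u;
b = A * x + spdiags(ones(m,1) - sign(pl(u)), 0, m, m) * 10 * ones(m, 1);

slack = full(b - A * x);                   % zero where u > 0, about 10 elsewhere
primal_feasible = all(slack >= -1e-9);                                  % A*x <= b
dual_feasible   = full(all(u >= 0)) && full(norm(A' * u + c, inf)) <= 1e-9;  % -A'u = c, u >= 0
complementary   = abs(full(u' * slack)) <= 1e-9;                        % u'(b - A*x) = 0
gap = abs(full(c' * x + b' * u));          % primal objective c'x minus dual objective -b'u
```

If all three flags are true and the gap is at rounding level, x solves the primal LP and u solves the dual.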
lpnewt1: MATLAB LPN Algorithm
without Armijo
%lpnewt1: Solve primal LP: min c'x s.t. Ax<=b
%Via Newton for least 2-norm dual LP: max -b'v s.t. -A'v=c, v>=0
%Input: c,A,b,epsi,delta,tol,itmax;Output:v(l2norm dual sol),z primal sol
epsi=1e-3;tol=1e-12;delta=1e-4;itmax=100;%default inputs
pl=@(x)(abs(x)+x)/2;%pl(us) function (anonymous function instead of deprecated inline)
tic;[m,n]=size(A);i=0;z=0;y=((A(1:n,:))'*A(1:n,:)+epsi*eye(n))\((A(1:n,:))'*b(1:n));%y0
while (i<itmax && norm(y-z,inf)>tol && toc<1800)
df=A'*pl((A*y-b))+epsi*c;%gradient of the penalty function
d2f=A'*spdiags(sign(pl(A*y-b)),0,m,m)*A+delta*speye(n);%regularized generalized Hessian
z=y;y=y-d2f\df;%full Newton step, no Armijo stepsize
i=i+1;
end
toc1=toc;v=pl(A*y-b)/epsi;t=find(v);z=A(t,:)\b(t);toc2=toc;%dual v; primal z from active constraints
format short e;[epsi delta tol i-1 toc1 toc2 norm(x-y,inf) norm(x-z,inf)]%x: known solution from lpgen
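To see the recovery of the dual and primal solutions on a problem small enough to check by hand, the same iteration can be run on a three-constraint LP. The data below are a hypothetical example, not from the talk: min -x1 - 2*x2 subject to x1 >= 0, x2 >= 0, x1 + x2 <= 1, whose solution is x = (0,1) with dual solution v = (1,0,2). The start y0 = 0 and the use of dense diag and eye are simplifying assumptions.

```matlab
% Sketch: the lpnewt1 iteration on a tiny hand-checkable LP (hypothetical data).
A = [-1 0; 0 -1; 1 1]; b = [0; 0; 1]; c = [-1; -2];   % min c'x s.t. A*x <= b
[m, n] = size(A);
epsi = 1e-3; delta = 1e-4; tol = 1e-12; itmax = 100;  % defaults from lpnewt1
pl = @(x) (abs(x) + x) / 2;                           % plus function
y = zeros(n, 1); z = inf(n, 1); i = 0;                % assumed start y0 = 0
while i < itmax && norm(y - z, inf) > tol
    r   = pl(A*y - b);                                % violated part of A*y <= b
    df  = A' * r + epsi * c;                          % gradient of the penalty function
    d2f = A' * diag(sign(r)) * A + delta * eye(n);    % regularized generalized Hessian
    z = y; y = y - d2f \ df;                          % full Newton step, no Armijo
    i = i + 1;
end
v = pl(A*y - b) / epsi;     % least 2-norm dual solution
t = find(v);                % constraints with positive multiplier
z = A(t, :) \ b(t);         % primal solution from the active constraints
disp([v' z'])
```

For this example the penalty minimizer is y = (-epsi, 1 + 3*epsi), which is not feasible for the LP, yet v comes out as (1,0,2) and solving the two active constraints gives z = (0,1).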
LPN & CPLEX 6.5 Comparison
(LPN without Armijo vs. Dual Simplex)
(400 MHz Pentium II, 2 GB) (oom = out of memory)
What About the Case of: n>> m ?
LPN is not appropriate
LPN needs to invert a very large n-by-n matrix
Use instead DLPN (Dual LPN):
Find least 2-norm primal solution
By solving the dual exterior penalty problem with a finite penalty parameter
Equivalence of Exact Least 2-Norm Primal Problem
and Asymptotic Dual Penalty Problem
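The formulas for this slide are missing. A sketch of the equivalence, derived here from the primal LP min c'x s.t. Ax <= b used in the MATLAB listings rather than copied from the slide, is:

```latex
\begin{align*}
&\text{Least 2-norm primal:} && \min_{x}\; c^{\top}x + \tfrac{\varepsilon}{2}\,x^{\top}x
   \quad \text{s.t.}\quad Ax \le b \\
&\text{Minimizing the Lagrangian over } x\text{:} && \varepsilon x + c + A^{\top}v = 0
   \;\;\Longrightarrow\;\; x = -\tfrac{1}{\varepsilon}\,(A^{\top}v + c) \\
&\text{Dual exterior penalty:} && \min_{v \ge 0}\; \varepsilon\, b^{\top}v + \tfrac12\,\|A^{\top}v + c\|^2
\end{align*}
```

The last problem is an exterior penalty formulation of the dual LP (max -b'v s.t. -A'v = c, v >= 0) in which only the equality constraint is penalized. Its unknown v has m components, which is what makes this route attractive when n >> m.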
Difficulty: Nonnegativity Constrained Dual Penalty
Use an exterior penalty to handle the nonnegativity constraint
Replace:
By:
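The two problems on this slide were lost. Assuming the dual penalty problem has the form min over v >= 0 of epsilon*b'v + (1/2)*||A'v + c||^2 (derived from the primal LP min c'x s.t. Ax <= b), the replacement presumably reads as follows; the second penalty parameter alpha is an assumed name.

```latex
\begin{align*}
&\text{Replace:} && \min_{v \ge 0}\; \varepsilon\, b^{\top}v + \tfrac12\,\|A^{\top}v + c\|^2 \\
&\text{By:} && \min_{v}\; \varepsilon\, b^{\top}v
   + \tfrac12\,\Big( \|A^{\top}v + c\|^2 + \alpha\,\|(-v)_+\|^2 \Big), \qquad \alpha > 0
\end{align*}
```

The second problem is unconstrained and piecewise quadratic in the m-vector v, so the same Newton iteration applies with an m-by-m generalized Hessian.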
DLPN & CPLEX 7.5 Comparison
(DLPN without Armijo vs. Primal Simplex )
(1.8 GHz Pentium 4, 1 GB) (oom = out of memory; error 1001)
(infs/unbd = presolve determines infeasibility/unboundedness; error 1101)
Conclusion
LPN & DLPN: Fast new methods for solving linear programs
LPN capable of solving problems with millions of constraints and hundreds of variables
DLPN capable of solving problems with hundreds of constraints and millions of variables
Eleven lines of MATLAB code for each of LPN & DLPN
Competitive with state-of-the-art CPLEX
Very suitable for classification problems
Armijo stepsize not needed in many applications
Typical termination in 3 to 30 steps
Future Work
Investigate & establish finite termination for LPN
Extend to convex quadratic programming
Establish convergence/termination under no assumptions for LPN
Generate self-contained codes for classification and datamining problems
Paper & MATLAB Code Available
www.cs.wisc.edu/~olvi