Computer Mountain Climbing - School of Mathematics


Optimization with Big Data = Extreme* Mountain Climbing
* in a billion-dimensional space on a foggy day
Peter Richtarik
School of Mathematics
BIG DATA
BIG Volume
BIG Velocity
BIG Variety
Sources
• digital images & videos
• transaction records
• government records
• health records
• defence
• internet activity (social media, wikipedia, ...)
• scientific measurements (physics, climate models, ...)
Arup (Truss Topology Design)
Western General Hospital (Creutzfeldt-Jakob Disease)
Royal Observatory (Optimal Planet Growth)
Ministry of Defence, dstl lab (Algorithms for Data Simplicity)
GOD’S Algorithm = Teleportation
If you are not a God...
[Figure: successive iterates x0, x1, x2, x3]
Optimization as
Lock Breaking
x = (x1, x2, x3, x4)
F(x) = F(x1, x2, x3, x4): a number representing the “quality” of a combination
Setup: Combination maximizing F opens the lock
Optimization Problem: Find combination maximizing F
Optimization Algorithm
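The most naive optimization algorithm for the four-dial lock is to try every combination. The sketch below assumes ten positions per dial and a made-up quality function `F` built around a hypothetical `SECRET` combination; neither appears in the slides. It is only meant to show what "find the combination maximizing F" means in code.

```python
from itertools import product

# Hypothetical winning combination, chosen for illustration only.
SECRET = (3, 1, 4, 1)

def F(x):
    """Toy quality score: larger is better, and the maximum (0) is at SECRET."""
    return -sum((xi - si) ** 2 for xi, si in zip(x, SECRET))

def brute_force(num_dials=4, positions=10):
    """Evaluate F on every combination and return the best one."""
    return max(product(range(positions), repeat=num_dials), key=F)

best = brute_force()
print(best, F(best))
```

With 4 dials this costs 10^4 evaluations of F. The cost grows as 10^n, so exhaustive search is hopeless long before n reaches a billion.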
How to Open a Lock with a Billion
Interconnected Dials?
# variables/dials = n = 10^9
F : R^n → R
Assumption:
F = F1 + F2 + ... + Fn
Fj depends on the neighbours of xj only
[Figure: interconnected dials x1, x2, x3, x4, ..., xn]
Example:
F1 depends on x1, x2, x3 and x4
F2 depends on x1 and x2, ...
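The practical payoff of this assumption is that turning one dial changes only a few of the terms Fj, so the effect of a move can be evaluated without touching the other n - 1 dials. A minimal sketch, assuming a hypothetical ring structure in which Fj reads only x[j] and its right neighbour (the slides use more general neighbourhoods, and the formula for `term` is invented for illustration):

```python
def term(j, x):
    """Hypothetical F_j: depends only on x[j] and its right neighbour."""
    n = len(x)
    return -(x[j] - 1.0) ** 2 - 0.5 * (x[j] - x[(j + 1) % n]) ** 2

def F(x):
    """Full objective F = F_0 + F_1 + ... + F_{n-1}; costs O(n)."""
    return sum(term(j, x) for j in range(len(x)))

def delta_F(x, j, new_value):
    """Change in F if dial j is set to new_value; evaluates 2 terms, not n."""
    n = len(x)
    affected = [(j - 1) % n, j]  # the only terms that read x[j] (needs n >= 2)
    old_value = x[j]
    before = sum(term(k, x) for k in affected)
    x[j] = new_value             # try the move ...
    after = sum(term(k, x) for k in affected)
    x[j] = old_value             # ... and undo it
    return after - before

x = [0.0, 2.0, 1.0, 3.0, 0.5, 1.5]
cheap = delta_F(x, 2, 5.0)

# Cross-check against recomputing the whole sum.
y = list(x)
y[2] = 5.0
full = F(y) - F(x)
print(cheap, full)
```

The cost of `delta_F` does not depend on n, which is what makes one-dial-at-a-time methods viable at a billion variables.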
Optimization Methods
• Effectivity
• Efficiency
• Scalability
• Parallelism
• Distribution
• Asynchronicity
• Randomization
Computing Architectures
• Multicore CPUs
• GPGPU accelerators
• Clusters / Clouds
Optimization Methods for Big Data
• Randomized Coordinate Descent
– P. R. and M. Takac: Parallel coordinate descent methods for big data optimization, ArXiv:1212.0873
[can solve a problem with 1 billion variables in 2 hours using 24 processors]
• Stochastic (Sub) Gradient Descent
– P. R. and M. Takac: Randomized lock-free methods for minimizing partially separable convex functions
[can be applied to optimize an unknown function]
• Both of the above
– M. Takac, A. Bijral, P. R. and N. Srebro: Mini-batch primal and dual methods for SVMs, ArXiv:1302.xxxx
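To make the first method in the list concrete, here is a serial toy sketch of randomized coordinate descent. It is not the parallel algorithm of the cited papers. It assumes a small convex quadratic f(x) = 0.5 x^T A x - b^T x with A = tridiag(-1, 4, -1), chosen here because each variable interacts only with its two neighbours; minimizing f plays the role of maximizing F = -f in the lock picture. The names `coordinate_descent` and `residual` are illustrative.

```python
import random

def coordinate_descent(b, iters=5000, seed=0):
    """Minimise f(x) = 0.5 x^T A x - b^T x for A = tridiag(-1, 4, -1)."""
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        j = rng.randrange(n)                      # pick one dial at random
        left = x[j - 1] if j > 0 else 0.0
        right = x[j + 1] if j < n - 1 else 0.0
        # Exact minimisation over x[j] alone: solve 4*x[j] - left - right = b[j].
        x[j] = (b[j] + left + right) / 4.0
    return x

def residual(x, b):
    """Largest entry of |A x - b|, i.e. of |grad f(x)|; zero at the minimiser."""
    n = len(x)
    worst = 0.0
    for j in range(n):
        left = x[j - 1] if j > 0 else 0.0
        right = x[j + 1] if j < n - 1 else 0.0
        worst = max(worst, abs(4.0 * x[j] - left - right - b[j]))
    return worst

b = [1.0] * 20
x = coordinate_descent(b)
print(residual(x, b))
```

Each step reads three numbers and writes one, regardless of n. Updates to dials that are not neighbours do not interfere with each other, which is one intuition for why such steps can be run in parallel.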
Theory vs Reality
Parallel Coordinate Descent
TOOLS: Probability, HPC, Matrix Theory, Machine Learning