
Distributed Lasso for In-Network Linear Regression
Juan Andrés Bazerque, Gonzalo Mateos and Georgios B. Giannakis
March 16, 2010
Acknowledgements: ARL/CTA grant DAAD19-01-2-0011, NSF grants CCF-0830480 and ECCS-0824007
Distributed sparse estimation
Data $y_j$ acquired by J agents; agent j also holds its local regression matrix $X_j$.
Linear model with sparse common parameter $s_0 \in \mathbb{R}^p$: $y_j = X_j s_0 + \varepsilon_j$, $j = 1, \dots, J$
(P1) Lasso estimator: $\hat{s} := \arg\min_{s} \tfrac{1}{2} \sum_{j=1}^{J} \|y_j - X_j s\|_2^2 + \lambda \|s\|_1$
Zou, H., "The Adaptive Lasso and its Oracle Properties," Journal of the American Statistical Association, vol. 101, no. 476, pp. 1418-1429, 2006.
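To make the model concrete, here is a minimal sketch that generates synthetic data obeying the linear model above and solves (P1) centrally, as the benchmark any distributed solver should match. It assumes NumPy and scikit-learn; note that scikit-learn's Lasso minimizes (1/2n)||y - Xs||^2 + alpha*||s||_1, so alpha corresponds to the slide's lambda only up to that 1/n rescaling.

import numpy as np
from sklearn.linear_model import Lasso

# Synthetic instance: J agents, each with n_j observations of a common
# p-dimensional parameter s0 that is sparse (5 nonzero entries).
rng = np.random.default_rng(0)
J, n_j, p = 5, 20, 50
s0 = np.zeros(p)
s0[rng.choice(p, size=5, replace=False)] = rng.normal(size=5)
X = [rng.normal(size=(n_j, p)) for _ in range(J)]
y = [Xj @ s0 + 0.1 * rng.normal(size=n_j) for Xj in X]

# Centralized benchmark: stack all agents' data and solve (P1).
s_hat = Lasso(alpha=0.05, fit_intercept=False).fit(np.vstack(X), np.hstack(y)).coef_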
Network structure
Two architectures for solving (P1):
Centralized: all data is forwarded to a fusion center.
Decentralized (ad-hoc): agents keep their data and cooperate only with neighbors.
Advantages of the ad-hoc topology: scalability, robustness, no need for infrastructure.
Problem statement
Given data $y_j$ and regression matrices $X_j$ available locally at agents $j = 1, \dots, J$, solve (P1) with local communications among neighbors (in-network processing).
Motivating application
Scenario: wireless communications.
Goal: find the power spectral density (PSD) map across space and frequency (spectrum cartography).
Specification: a coarse approximation suffices.
Approach: basis expansion of the PSD map.
[Figure: PSD map over space; frequency axis in MHz.]
J.-A. Bazerque and G. B. Giannakis, "Distributed Spectrum Sensing for Cognitive Radio Networks by Exploiting Sparsity," IEEE Transactions on Signal Processing, vol. 58, no. 3, pp. 1847-1862, March 2010.
Modeling
[Figure: candidate source locations, sensing radios deployed over the area, and frequency bases spanning the sensed frequencies.]
Sparsity is present in space (few active sources among the candidate locations) and in frequency (few occupied bands).
Space-frequency basis expansion
Superimposed Tx spectra measured at sensing radio $R_j$:
$\Phi_j(f) = \sum_{s=1}^{N_s} g_{sj} \sum_{\nu=1}^{N_b} \theta_{s\nu} b_\nu(f)$
where $g_{sj}$ is the average path loss from candidate location $s$ to $R_j$, and $\{b_\nu(f)\}_{\nu=1}^{N_b}$ are the frequency bases.
Linear model in the coefficients $\theta = \{\theta_{s\nu}\}$, which inherit the space-frequency sparsity.
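A sketch of how agent j's regression matrix could be assembled under this expansion; `regression_matrix` and its argument names are hypothetical, and the Gaussian bumps below merely stand in for whatever frequency bases are actually used.

import numpy as np

def regression_matrix(gains_j, bases, freqs):
    # gains_j: (N_s,) average path-loss gains g_sj from each candidate source to R_j
    # bases:   list of N_b callables b_nu(f); freqs: (m,) sensed frequencies
    B = np.stack([b(freqs) for b in bases], axis=1)   # (m, N_b) basis samples
    # column (s, nu) holds g_sj * b_nu(f): a Kronecker structure over theta
    return np.kron(gains_j[None, :], B)               # (m, N_s * N_b)

# Toy usage: 3 candidate locations, 4 Gaussian-bump bases, 16 sensed frequencies
bases = [lambda f, c=c: np.exp(-0.5 * ((f - c) / 0.1) ** 2) for c in np.linspace(0, 1, 4)]
X_j = regression_matrix(np.array([1.0, 0.5, 0.1]), bases, np.linspace(0, 1, 16))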
Consensus-based optimization
(P1) $\min_{s} \tfrac{1}{2} \sum_{j=1}^{J} \|y_j - X_j s\|_2^2 + \lambda \|s\|_1$
Consider local copies $s_j$ and enforce consensus: $s_j = s_{j'}$ for all neighbors $j' \in \mathcal{N}_j$.
Introduce auxiliary variables $z_j^{j'}$ for decomposition:
(P2) $\min_{\{s_j\}, \{z_j^{j'}\}} \tfrac{1}{2} \sum_{j=1}^{J} \|y_j - X_j s_j\|_2^2 + \tfrac{\lambda}{J} \sum_{j=1}^{J} \|s_j\|_1$ s.t. $s_j = z_j^{j'}$, $z_j^{j'} = s_{j'}$, $j' \in \mathcal{N}_j$
For a connected network, (P1) is equivalent to (P2), which lends itself to distributed implementation.
Towards closed-form iterates
Introduce additional variables $\gamma_j$ to split off the $\ell_1$-term:
(P3) $\min \tfrac{1}{2} \sum_{j=1}^{J} \|y_j - X_j s_j\|_2^2 + \tfrac{\lambda}{J} \sum_{j=1}^{J} \|\gamma_j\|_1$ s.t. $\gamma_j = s_j$, $s_j = z_j^{j'}$, $z_j^{j'} = s_{j'}$, $j' \in \mathcal{N}_j$
Idea: reduce the $\ell_1$-minimization to one with an orthonormal (identity) regression matrix, which is solvable in closed form.
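The closed form in question is the standard soft-thresholding (lasso proximal) operator; a minimal sketch, stated here for reuse later:

import numpy as np

def soft_threshold(v, tau):
    # elementwise minimizer of tau*||g||_1 + 0.5*||g - v||_2^2
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)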
Alternating-direction method of multipliers
Augmented Lagrangian with variables $\{s_j\}$, $\{\gamma_j\}$, $\{z_j^{j'}\}$ and multipliers attached to the constraints $\gamma_j = s_j$, $s_j = z_j^{j'}$, $z_j^{j'} = s_{j'}$.
AD-MoM 1st step: minimize w.r.t. $\{s_j\}$ (local least squares).
AD-MoM 2nd step: minimize w.r.t. $\{\gamma_j\}$ (soft thresholding).
AD-MoM 3rd step: minimize w.r.t. $\{z_j^{j'}\}$ (closed form).
AD-MoM 4th step: update the multipliers (dual ascent).
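The transcript does not preserve the Lagrangian itself; a plausible reconstruction, following the generic AD-MoM recipe for (P3) with multipliers $u_j$, $\bar v_j^{j'}$, $\tilde v_j^{j'}$ for the three constraint sets, is:

\begin{aligned}
\mathcal{L}_c = \sum_{j=1}^{J} \Big[ &\tfrac{1}{2}\|y_j - X_j s_j\|_2^2 + \tfrac{\lambda}{J}\|\gamma_j\|_1 + u_j^{\top}(s_j - \gamma_j) + \tfrac{c}{2}\|s_j - \gamma_j\|_2^2 \\
&+ \sum_{j' \in \mathcal{N}_j} \Big( \bar v_j^{j'\top}(s_j - z_j^{j'}) + \tilde v_j^{j'\top}(z_j^{j'} - s_{j'}) + \tfrac{c}{2}\|s_j - z_j^{j'}\|_2^2 + \tfrac{c}{2}\|z_j^{j'} - s_{j'}\|_2^2 \Big) \Big]
\end{aligned}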
D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, 2nd ed. Athena Scientific, 1999.
D-Lasso algorithm
Agent j initializes its local estimates and multipliers to zero, and locally runs:
FOR k = 1, 2, …
  Exchange $s_j(k)$ with agents in the neighborhood $\mathcal{N}_j$
  Update the local multipliers, $s_j(k+1)$, and $\gamma_j(k+1)$
END FOR
The only matrix inversion required is $N_j \times N_j$ and can be computed offline.
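A runnable sketch of the resulting iterations, in the spirit of D-Lasso: the splitting, constants, and update order follow the generic consensus-ADMM recipe above rather than the paper's exact iterates, and `d_lasso` with its arguments is a hypothetical interface. Per agent, only $s_j(k)$ crosses the network; everything else, including the precomputed inverse, stays local.

import numpy as np

def d_lasso(X, y, neighbors, lam, c=1.0, iters=300):
    """Consensus-ADMM lasso sketch (hypothetical interface).
    X, y: per-agent data lists; neighbors[j]: single-hop neighbors of agent j;
    lam: lasso weight; c > 0: constant AD-MoM step size."""
    st = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)  # soft threshold
    J, p = len(X), X[0].shape[1]
    s = np.zeros((J, p))   # local copies s_j
    g = np.zeros((J, p))   # splitting variables gamma_j
    u = np.zeros((J, p))   # multipliers for gamma_j = s_j
    q = np.zeros((J, p))   # aggregated consensus multipliers
    # These p x p inverses are fixed across iterations, hence computable offline;
    # when N_j < p, the matrix-inversion lemma reduces the cost to the slide's
    # N_j x N_j inversion.
    inv = [np.linalg.inv(X[j].T @ X[j] + c * (1 + 2 * len(neighbors[j])) * np.eye(p))
           for j in range(J)]
    for _ in range(iters):
        s_prev = s.copy()
        for j in range(J):  # step 1: per-agent least squares (closed form)
            nbr = sum(s_prev[j] + s_prev[i] for i in neighbors[j])
            s[j] = inv[j] @ (X[j].T @ y[j] - u[j] + c * g[j] - q[j] + c * nbr)
        for j in range(J):  # steps 2-4: soft thresholding and dual updates
            g[j] = st(s[j] + u[j] / c, lam / (c * J))
            u[j] += c * (s[j] - g[j])
            q[j] += c * sum(s[j] - s[i] for i in neighbors[j])
    return s  # row j holds agent j's estimate; rows agree at convergence

On the synthetic data of the first sketch with a connected topology (e.g., neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]] for a 5-agent chain), the rows of the returned array should agree with each other, and with the centralized benchmark up to the solver's lambda scaling and the finite iteration count.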
D-Lasso: Convergence
Proposition: For every constant step size $c > 0$, the local estimates generated by D-Lasso satisfy $\lim_{k \to \infty} s_j(k) = \hat{s}$ for all $j = 1, \dots, J$, where $\hat{s}$ is the solution of (P1).
Attractive features:
Consensus achieved across the network
Affordable communication of sparse $s_j(k)$ with neighbors
Network-wide data percolates through local exchanges
Fully distributed numerical operation
Power spectrum cartography
Setup: 5 sources, $N_s = 121$ candidate locations, J = 50 sensing radios, p = 969 unknowns.
[Figures: error evolution vs. iteration index; aggregate spectrum map.]
Convergence to the centralized counterpart
D-Lasso localizes all sources through variable selection
Conclusions and future directions
Sparse linear model with distributed data
Lasso estimator
Ad-hoc network topology
D-Lasso
Guaranteed convergence for any constant step-size
Linear operations per iteration
Application: Spectrum cartography
Map of interference across space and frequency
Multi-source localization as a byproduct
Future directions
Online distributed version
Asynchronous updates
Thank You!
D. Angelosante, J.-A. Bazerque, and G. B. Giannakis, "Online Adaptive Estimation of Sparse Signals: Where RLS meets the $\ell_1$-norm," IEEE Transactions on Signal Processing, vol. 58, 2010 (to appear).
Leave-one-agent-out cross-validation
Agent j is set aside in round-robin fashion:
  the remaining agents estimate $\hat{s}_{-j}(\lambda)$
  agent j computes the validation error $e_j(\lambda) = \|y_j - X_j \hat{s}_{-j}(\lambda)\|_2^2$
Repeat for $\lambda = \lambda_1, \dots, \lambda_N$ and select $\lambda_{\min}$ to minimize the error.
[Figures: cross-validation error vs. $\lambda$; path of solutions.]
Requires a sample mean to be computed in a distributed fashion.
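A sketch of the round-robin procedure on top of the `d_lasso` sketch above; `loo_agent_cv` is a hypothetical helper, and it assumes the network remains connected after removing any single agent.

import numpy as np

def loo_agent_cv(X, y, neighbors, lam_grid):
    """Leave-one-agent-out cross-validation over a grid of lambda values."""
    J = len(X)
    errs = []
    for lam in lam_grid:
        e = 0.0
        for j in range(J):                             # round robin over agents
            keep = [i for i in range(J) if i != j]
            idx = {i: t for t, i in enumerate(keep)}   # re-index the topology
            nbrs = [[idx[i] for i in neighbors[k] if i != j] for k in keep]
            S = d_lasso([X[i] for i in keep], [y[i] for i in keep], nbrs, lam)
            s_hat = S.mean(axis=0)                     # consensus estimate
            e += np.sum((y[j] - X[j] @ s_hat) ** 2)
        # In-network, this average over agents is itself computed by a
        # distributed averaging step; the sketch does it centrally.
        errs.append(e / J)
    return lam_grid[int(np.argmin(errs))], errs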
Test case: prostate cancer antigen
67 patients organized into J = 7 groups.
$y_{jn}$ measures the level of antigen for patient n in group j.
p = 8 factors: lcavol, lweight, age, lbph, svi, lcp, gleason, pgg45.
Rows of $X_j$ store the factors measured for the patients in group j.
[Figure: coefficient profiles, centralized Lasso vs. D-Lasso.]
Centralized and distributed solutions coincide.
The volume of cancer (lcavol) predominantly affects the level of antigen.
Distributed elastic net
A quadratic term regularizes the solution; the centralized (adaptive) elastic net was studied in [Zou-Zhang'09].
Ridge regression: $\min_s \sum_{j=1}^{J} \|y_j - X_j s\|_2^2 + \lambda_2 \|s\|_2^2$
Elastic net: $\min_s \sum_{j=1}^{J} \|y_j - X_j s\|_2^2 + \lambda_1 \|s\|_1 + \lambda_2 \|s\|_2^2$
The elastic net achieves variable selection even on ill-conditioned problems.
H. Zou and H. H. Zhang, "On the Adaptive Elastic-Net with a Diverging Number of Parameters," Annals of Statistics, vol. 37, no. 4, pp. 1733-1751, 2009.
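One way to obtain a distributed elastic net from D-Lasso is the standard data-augmentation identity of Zou-Hastie'05: an elastic net on (X_j, y_j) equals a lasso on suitably augmented data. Whether the distributed elastic net here proceeds this way or modifies the iterates directly is not shown in the transcript; the sketch below takes the augmentation route, with `elastic_net_augment` a hypothetical helper.

import numpy as np

def elastic_net_augment(X, y, lam2):
    """Append sqrt(lam2/J)*I rows to each X_j and zeros to each y_j, so that
    sum_j ||y_j - X_j s||^2 + lam2*||s||^2 becomes a plain sum of squares and
    D-Lasso with weight lam1 then solves the elastic net on the new data."""
    J, p = len(X), X[0].shape[1]
    Xa = [np.vstack([Xj, np.sqrt(lam2 / J) * np.eye(p)]) for Xj in X]
    ya = [np.hstack([yj, np.zeros(p)]) for yj in y]
    return Xa, ya

# usage: Xa, ya = elastic_net_augment(X, y, lam2); S = d_lasso(Xa, ya, neighbors, lam1)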