msa2015 12502

Graphical Modeling of Multiple
Sequence Alignment
Jinbo Xu
Toyota Technological Institute at Chicago
Computational Institute, The University of Chicago
Two applications of MSA
• Predict inter-residue interaction network (i.e.,
protein contact map) from MSA using joint
graphical lasso
– An important subproblem of protein folding
• Align two MSAs through alignment of two
Markov Random Fields (MRFs)
– Homology detection and fold recognition
– Merge two MSAs into a larger one
Modeling MSA
by Markov Random Fields
The generating probability of a sequence S:
$$P(S) = \frac{1}{Z} \prod_{i} \phi(X_i) \prod_{(i,k)} \psi(X_i, X_k)$$
Infer 𝜙, 𝜓 by maximum-likelihood
𝜓 encodes residue correlation relationship
A special case is Gaussian Graphical Model
Numeric Representation of MSA
Represent a sequence in the MSA as an L×21 binary vector
21 elements for each column in the MSA: a 1 marks the residue type observed in that column, and the other 20 elements are 0
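The encoding above can be sketched in a few lines. This is a minimal illustration, not the speaker's code: the 21-letter alphabet (20 amino acids plus a gap symbol), its ordering, and the names `encode_sequence` and `encode_msa` are assumptions made for the example.

```python
import numpy as np

# Assumed 21-letter alphabet: 20 standard amino acids plus '-' for a gap.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}

def encode_sequence(seq):
    """Return a flat binary vector of length 21 * len(seq)."""
    vec = np.zeros((len(seq), len(ALPHABET)), dtype=np.int8)
    for col, aa in enumerate(seq):
        # Unknown letters (e.g. 'X') are treated as gaps in this sketch.
        vec[col, INDEX.get(aa, INDEX["-"])] = 1
    return vec.reshape(-1)

def encode_msa(msa):
    """Stack the encoded rows of an MSA into an (N, 21 * L) matrix."""
    return np.stack([encode_sequence(s) for s in msa])

# Toy MSA with N = 3 sequences and L = 4 columns.
msa = ["GK-V", "GRLV", "ARLV"]
X = encode_msa(msa)
```

Each row of `X` has exactly L ones, one per MSA column, which is what makes the sample covariance of `X` a 21L×21L matrix.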
Gaussian Graphical Model (GGM)
• $S^k$: a multiple sequence alignment (MSA)
• Assume $S^k$ has Gaussian distribution $N(u^k, \Sigma^k)$, where $\Sigma^k$ is the covariance matrix
• $\Omega^k$ (inverse of $\Sigma^k$): the precision matrix, implying the residue interaction pattern among all MSA columns
Covariance and Precision Matrix
The precision matrix has dimension 21L×21L: an L×L grid of 21×21 submatrices, one submatrix per residue pair
Larger values indicate stronger interaction
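One way to turn the 21L×21L precision matrix into one score per residue pair is to collapse each 21×21 submatrix to a single number. The sketch below uses the sum of absolute values of the block; that choice of norm and the name `pair_scores` are assumptions for illustration, and the input here is a random stand-in rather than a real estimate.

```python
import numpy as np

def pair_scores(precision, L, q=21):
    """Collapse a (q*L, q*L) precision matrix into an (L, L) score matrix.

    Each score is the sum of absolute values (L1 norm) of one q x q block.
    The L1 norm is an assumption of this sketch; other norms are possible.
    """
    # blocks[i, a, j, b] is entry (a, b) of the submatrix for residue pair (i, j).
    blocks = np.abs(precision).reshape(L, q, L, q)
    scores = blocks.sum(axis=(1, 3))
    np.fill_diagonal(scores, 0.0)  # a column does not pair with itself
    return scores

rng = np.random.default_rng(0)
L = 5
A = rng.normal(size=(21 * L, 21 * L))
omega = A @ A.T            # symmetric stand-in for an estimated precision matrix
scores = pair_scores(omega, L)
```

Ranking the off-diagonal entries of `scores` from largest to smallest then gives a ranked list of candidate contacts.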
Today’s talk
• Predict inter-residue interaction network (i.e.,
protein contact map) from MSA using joint
graphical lasso
– An important subproblem of protein folding
• Align two MSAs through alignment of two
Markov Random Fields (MRFs)
– Homology detection and fold recognition
– Merge two MSAs into a larger one
Protein Contact Map
(residue interaction network)
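Two residues are in contact if their Cα or Cβ distance is less than 8 Å
[Figure: four residues (1 to 4) labeled with pairwise distances in Å (5.9, 8.1, 3.8, 6.0); shorter distance means stronger interaction]
Contact matrix:
     1  2  3  4
1    0  1  1  0
2    1  0  1  1
3    1  1  0  1
4    0  1  1  0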
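The definition above translates directly into code. This is a small sketch assuming one 3D coordinate per residue (for example its Cβ atom); the coordinates below are made up for illustration and are not those of the figure.

```python
import math

def contact_map(coords, cutoff=8.0):
    """Return a 0/1 matrix: 1 where two residues are closer than `cutoff` Å."""
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap

# Made-up Cβ coordinates (in Å) for four residues, for illustration only.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (5.0, 5.0, 0.0), (12.0, 5.0, 0.0)]
cmap = contact_map(coords)
```

The result is symmetric with a zero diagonal, like the contact matrix on the slide.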
Contact Matrix is Sparse
#contacts is linear w.r.t. sequence length
short range: 6-12 AAs apart along primary sequence
medium range: 12-24 AAs apart
long range: >24 AAs apart
Protein Contact Prediction
Input:
MEKVNFLKNGVLRLPPGFRFRPTDEELVVQYLKRKVFSFPLPASIIPEVEVYKSDPWDLPGDMEQEKYFFSTK
EVKYPNGNRSNRATNSGYWKATGIDKQIILRGRQQQQQLIGLKKTLVFYRGKSPHGCRTNWIMHEYRLAN
LESNYHPIQGNWVICRIFLKKRGNTKNKEENMTTHDEVRNREIDKNSPVVSVKMSSRDSEALASANSELKK
KASIIFYDFMGRNNSNGVAASTSSSGITDLTTTNEESDDHEESTSSFNNFTTFKRKIN
Output:
With L/12 long-range
native contacts, the fold of
a protein can be roughly
determined [Baker group]
Contact Prediction Methods
• Evolutionary coupling analysis (unsupervised learning)
– Identify co-evolved residues from a multiple sequence alignment
– No solved protein structures used at all
– High-throughput sequencing makes this method promising
– e.g., mutual information, Evfold, PSICOV, plmDCA, GREMLIN
• Supervised machine learning
– Input features: sequence (profile) similarity, chemical property similarity, mutual information
– (Implicitly) learn information from solved structures
– Examples: NNcon, SVMcon, CMAPpro, PhyCMAP
Evolutionary Coupling (EC) Analysis
Observation: two residues in contact tend to co-evolve,
so two co-evolved residues are likely to form a contact
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0028766
Evolutionary Coupling (EC) Analysis (Cont’d)
• Local statistical methods: examine the correlation between two residues independently of the others
– Mutual information (MI): two residues in contact are likely to have large MI
– Not all residue pairs with large MI are in contact, due to indirect evolutionary coupling: if A~B and B~C, then likely A~C
• Global statistical methods: examine the correlation between two residues conditioned on the others
– Need a large number of sequences
– Maximum-entropy: Evfold
– Graphical lasso: PSICOV
– Pseudo-likelihood: plmDCA, GREMLIN
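As a concrete instance of a local statistic, mutual information between two MSA columns can be computed from simple counts. This sketch uses raw frequencies with no pseudocounts or sequence weighting, which real pipelines typically add; the function name and toy alignment are illustrative.

```python
import math
from collections import Counter

def mutual_information(col_a, col_b):
    """MI (in nats) between two MSA columns given as equal-length strings."""
    n = len(col_a)
    pa, pb = Counter(col_a), Counter(col_b)
    pab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in pab.items():
        p_joint = count / n
        # p_joint / (p_a * p_b), written with counts to avoid extra divisions.
        mi += p_joint * math.log(p_joint * n * n / (pa[a] * pb[b]))
    return mi

msa = ["AKL", "AKV", "GRL", "GRV"]
columns = ["".join(seq[i] for seq in msa) for i in range(len(msa[0]))]
mi_01 = mutual_information(columns[0], columns[1])  # perfectly coupled columns
mi_02 = mutual_information(columns[0], columns[2])  # independent columns
```

Because MI looks at one pair of columns at a time, it cannot tell a direct coupling from one that is mediated by a third column, which is the motivation for the global methods.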
Single MSA-based Contact Prediction
• Given a protein sequence under prediction, run PSI-BLAST to detect its homologs and build an MSA
• Calculate the sample covariance matrix $\Sigma^k$ from the MSA
• $\Sigma^k$ is singular, so the precision matrix $\Omega^k$ cannot be calculated as $(\Sigma^k)^{-1}$
• Calculate $\Omega^k$ by maximum likelihood, i.e., maximize the probability of the observed sequences
$$\max_{\Omega^k} \log P(S^k \mid \Omega^k)$$
$$\max_{\Omega^k} \; \log|\Omega^k| - \mathrm{tr}(\Omega^k \Sigma^k) - \lambda_1 \|\Omega^k\|_1$$
The last term enforces a sparse precision matrix
Why?
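A small numerical sketch of the two points above: with fewer sequences than variables the sample covariance is rank-deficient, and the penalized objective can still be evaluated for any positive-definite candidate. The function name, the random toy data and the ridge-shifted candidate are assumptions for illustration; this evaluates the objective and does not implement a graphical lasso solver.

```python
import numpy as np

def penalized_loglik(omega, sigma, lam):
    """log|Omega| - tr(Omega Sigma) - lam * ||Omega||_1 for one family."""
    try:
        chol = np.linalg.cholesky(omega)   # fails unless omega is positive definite
    except np.linalg.LinAlgError:
        return -np.inf
    logdet = 2.0 * np.log(np.diag(chol)).sum()
    return logdet - np.trace(omega @ sigma) - lam * np.abs(omega).sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))            # 5 "sequences", 8 variables
sigma = np.cov(X, rowvar=False)        # 8 x 8 sample covariance
rank = np.linalg.matrix_rank(sigma)    # below 8, so sigma has no inverse

# Illustrative only: a ridge-shifted inverse as one candidate precision matrix.
candidate = np.linalg.inv(sigma + 0.5 * np.eye(8))
value = penalized_loglik(candidate, sigma, lam=0.1)
```

A solver would search over `omega` to maximize this value; larger `lam` pushes more entries of the maximizer to zero.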
Issues with Existing Methods
• Evolutionary coupling (EC) analysis works for proteins with a large number of sequence homologs
• Existing work focuses on improving the statistical methods (e.g., relaxing the Gaussian assumption, taking the consensus of a few EC methods) instead of using extra biological information/insight
• Information is used mostly from a single protein family
• Physical constraints other than sparsity are not used
Our Work: contact prediction using
multiple MSAs
Goal: focus on proteins without many sequence homologs
Strategy: increase statistical power by information aggregation
• Jointly predict contacts for related families of similar folds, i.e., predict contacts using multiple MSAs
– These MSAs share an inter-residue interaction network to some degree
• Integrate evolutionary coupling (EC) analysis with supervised learning
– EC analysis makes use of residue co-evolution information
– Supervised learning makes use of sequence (profile) similarity
Observation: different protein families
share similar contact maps
Red: shared; Blue: unique to PF00116; Green: unique to PF13473
Joint evolutionary coupling (EC) analysis
Jointly predict contacts for a set of related protein families
• Predict contacts for a protein family using information in other related families
• Enforce contact map consistency among related families
• Do not lose family-specific information
Joint graphical lasso for joint evolutionary
coupling analysis
1. Given a protein family and its MSA, find related families $S = \{S^1, S^2, \dots, S^K\}$ and the corresponding MSAs
Let $\Omega = \{\Omega^1, \Omega^2, \dots, \Omega^K\}$ be the precision matrices
2. Estimate $\Omega$ by joint log-likelihood as follows
$$\sum_{k=1}^{K} \left( \log|\Omega^k| - \mathrm{tr}(\Omega^k \Sigma^k) \right) - \sum_{k=1}^{K} \lambda_k \|\Omega^k\|_1$$
where the last term enforces sparse precision matrices
How to enforce contact map consistency?
Residue Pair/Submatrix Grouping
In total ≤L(L-1)/2 groups where L is the seq length
Enforce Contact Map Consistency
by Group Penalty
𝐺: the number of groups
𝐾: the number of families
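Using group lasso to model family consistency:
$$\max_{\Omega} \sum_{k=1}^{K} \left( \log|\Omega^k| - \mathrm{tr}(\Omega^k \Sigma^k) \right) - \lambda_1 \sum_{k=1}^{K} \|\Omega^k\|_1 - \sum_{g=1}^{G} \lambda_g \|\Omega_g\|_2$$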
𝜆𝑔 is defined as 𝜆𝑔 = 𝛼 𝑁 − 1
𝑁−1
𝑁−1
𝑛=1 𝑃𝑛
Group conservation level
Supervised Machine Learning
• Input features: sequence profile, amino acid
chemical properties, mutual information
power series, context-specific statistical
potential
• Mutual information power series:
– Local info: mutual information matrix (MI)
– Partially global info: MI2, MI3, …, MI11
– Can be calculated much faster than PSICOV
• Random Forests trained on 800-900 proteins
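A sketch of the mutual information power series, under the assumption that MI2, MI3, …, MI11 denote matrix powers of the L×L mutual information matrix; the slide does not spell this out, so treat that reading, the function name and the toy matrix as illustrative.

```python
import numpy as np

def mi_power_series(mi, max_power=11):
    """Return {k: MI^k} for k = 1..max_power using repeated matrix products."""
    powers = {1: mi}
    for k in range(2, max_power + 1):
        powers[k] = powers[k - 1] @ mi
    return powers

# Toy symmetric MI matrix for a chain A - B - C: A and C have no direct coupling.
mi = np.array([[0.0, 0.8, 0.0],
               [0.8, 0.0, 0.6],
               [0.0, 0.6, 0.0]])
series = mi_power_series(mi)
indirect_ac = series[2][0, 2]   # picks up the two-step path A -> B -> C
```

Each power needs only one L×L matrix product, which is consistent with the remark that these features are much cheaper than running PSICOV.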
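Joint EC Analysis with Supervised
Prediction as Prior
$$\max_{\Omega} \; \sum_{k=1}^{K} \left( \log|\Omega^k| - \mathrm{tr}(\Omega^k \Sigma^k) \right) - \lambda_1 \sum_{k=1}^{K} \|\Omega^k\|_1 - \sum_{g=1}^{G} \lambda_g \|\Omega_g\|_2 - \lambda_2 \sum_{k=1}^{K} \sum_{ij} \frac{\|\Omega^k_{ij}\|_1}{\max(P^k_{ij},\, 0.3)}$$
• First term: log-likelihood of K families
• Second term: sparsity
• Third term: contact map consistency among families
• Fourth term: similarity with supervised prediction
ADMM can solve this optimization problem to a suboptimal solution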
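To make the consistency term concrete, the sketch below evaluates a group penalty of the form Σ_g λ_g ||Ω_g||_2 for a toy case. It assumes, as a simplification, that all families already share the same column indexing, so each group is the set of 21×21 submatrices for one residue pair (i, j) across families; the function name and the uniform weights are illustrative.

```python
import numpy as np

def group_penalty(omegas, L, lam_g, q=21):
    """Sum over residue pairs (i < j) of lam_g[i, j] * ||Omega_g||_2.

    omegas: list of K precision matrices, each (q*L, q*L), assumed to share
            the same column indexing (a simplification of this sketch).
    lam_g:  (L, L) array of per-group weights.
    """
    # blocks[k, i, a, j, b] is entry (a, b) of the (i, j) submatrix of family k.
    blocks = np.stack(omegas).reshape(len(omegas), L, q, L, q)
    # Squared norm of each group, pooled over families and over (a, b).
    sq = (blocks ** 2).sum(axis=(0, 2, 4))
    upper = np.triu_indices(L, k=1)
    return float((lam_g[upper] * np.sqrt(sq[upper])).sum())

L, q = 3, 21
omega1 = np.zeros((q * L, q * L))
omega1[0:q, 2 * q:3 * q] = 1.0          # only pair (0, 2) interacts in family 1
omega2 = omega1.copy()                   # family 2 shares that interaction
penalty = group_penalty([omega1, omega2], L, np.ones((L, L)))
```

Here the shared group costs sqrt(2 × 441) ≈ 29.7, less than the 21 + 21 = 42 it would cost if each family's block were penalized as its own group, so the penalty favors interactions that appear in the same position across families.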
Accuracy on 98 Pfam families
             Medium-range              Long-range
Method       L/10    L/5     L/2       L/10    L/5     L/2
CoinDCA      0.496   0.435   0.312     0.561   0.502   0.391
PSICOV       0.375   0.312   0.213     0.446   0.400   0.311
PSICOV_b     0.388   0.306   0.199     0.462   0.400   0.294
plmDCA       0.433   0.354   0.233     0.484   0.443   0.343
plmDCA_h     0.433   0.339   0.211     0.480   0.413   0.292
GREMLIN      0.401   0.332   0.225     0.447   0.423   0.329
GREMLIN_h    0.391   0.316   0.204     0.428   0.400   0.301
Merge_p      0.303   0.246   0.178     0.370   0.328   0.253
Merge_m      0.276   0.223   0.169     0.355   0.309   0.232
Voting       0.405   0.280   0.168     0.337   0.353   0.275
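The L/10, L/5 and L/2 columns report the fraction of top-ranked predicted pairs that are native contacts. A minimal sketch of that evaluation follows; the exact handling of the range boundaries (here: long-range means more than 24 residues apart) and the function name are assumptions, and the scores are made up.

```python
def top_accuracy(scores, native, seq_len, k=10, min_sep=24, max_sep=None):
    """Fraction of the top seq_len // k predicted pairs that are native contacts.

    scores: dict {(i, j): predicted score} with i < j
    native: set of (i, j) pairs that are true contacts
    Pairs are kept only if min_sep < j - i (and j - i <= max_sep, if given).
    """
    candidates = [
        (s, pair) for pair, s in scores.items()
        if pair[1] - pair[0] > min_sep
        and (max_sep is None or pair[1] - pair[0] <= max_sep)
    ]
    candidates.sort(reverse=True)
    top = [pair for _, pair in candidates[: max(1, seq_len // k)]]
    if not top:
        return 0.0
    return sum(pair in native for pair in top) / len(top)

# Made-up predictions for a protein of length 40 (so L/10 = 4 pairs).
scores = {(0, 30): 0.9, (1, 35): 0.8, (2, 39): 0.7, (5, 31): 0.6,
          (3, 10): 0.95, (0, 26): 0.1}
native = {(0, 30), (2, 39), (3, 10)}
acc_long = top_accuracy(scores, native, seq_len=40, k=10)
```

Pair (3, 10) has the highest score but is ignored for the long-range evaluation because its sequence separation is only 7.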
Accuracy vs. # Sequence Homologs
(A) Medium-range
(B) Long-range
X-axis: ln of the number of non-redundant sequence homologs
Y-axis: L/10 accuracy
Accuracy on 123 CASP10 targets
             Medium-range              Long-range
Method       L/10    L/5     L/2       L/10    L/5     L/2
CoinDCA      0.500   0.440   0.340     0.412   0.351   0.279
Evfold       0.294   0.249   0.188     0.257   0.225   0.171
PSICOV       0.310   0.259   0.192     0.276   0.225   0.168
plmDCA       0.344   0.289   0.214     0.326   0.280   0.213
GREMLIN      0.343   0.280   0.229     0.320   0.278   0.159
NNcon        0.393   0.334   0.226     0.239   0.188   0.001
CMAPpro      0.414   0.363   0.276     0.336   0.297   0.227
Accuracy vs. # sequence homologs
(CASP10)
X-axis: ln of # non-redundant sequence homologs
Y-axis: L/10 long-range prediction accuracy
Accuracy vs. Contact Conservation
Level
(A)Medium-range; (B) long-range
X-axis: conservation level, the larger, the more conserved
Today’s Talk
• Predict inter-residue interaction network (i.e.,
protein contact map) from MSA using joint
graphical lasso
– An important subproblem of protein folding
• Align two MSAs through alignment of two
Markov Random Fields (MRFs)
– Remote homology detection and fold recognition
– Merge two MSAs into a larger one
Homology Detection & Fold
Recognition
• Primary sequence comparison
– Similar sequences -> very likely homologous
– Sequence alignment method, e.g., BLAST, FASTA
– works only for close homologs
• Profile-based method
– Compare two protein families instead of primary sequences, using
evolutionary information in a family
– Sequence-profile alignment & profile-profile alignment
– Profile can be represented as a matrix (e.g., FFAS) or a HMM (e.g.,
HHpred, HMMER)
– Sometimes works for remote homologs, but not sensitive enough
MSA to Sequence Profile
Two popular profile representations: (1) Position-specific
scoring matrix (PSSM); (2) Hidden Markov Model (HMM)
Position-Specific Scoring Matrix
(PSSM)
Taken from http://carrot.mcb.uconn.edu/~olgazh/bioinf2010/class10.html
Hidden Markov Model (HMM)
http://www.biopred.net/eddy.html
Our Work: Markov Random Fields
(MRF) Representation
1) An MRF encodes long-range residue interaction patterns while an HMM does not;
2) Long-range interaction patterns encode global information about a protein,
so an MRF can deal with proteins of similar folds but divergent sequences
Protein alignment by aligning two MRFs
[Figure: the MSAs of Family 1 and Family 2 are represented as MRF1 and MRF2, and the two MRFs are aligned]
Scoring function for MRF alignment
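[Figure: MRF1 and MRF2 with two aligned position pairs, $Z^M_{i,j} = 1$ and $Z^M_{k,l} = 1$. Each aligned pair contributes a local alignment potential ($\theta^M_{i,j}$, $\theta^M_{k,l}$); a pairwise alignment potential couples the two aligned pairs]
NP-hard due to
1) Gaps allowed
2) Pairwise potential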
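Alternating Direction Method of Multipliers (ADMM)
Make a copy of z to y
$$\max_{z,y} \sum_{i,j,u} \theta^{u}_{i,j} z^{u}_{i,j} + \frac{1}{L} \sum_{i,j,k,l,u,v} \theta^{uv}_{i,j,k,l} z^{u}_{i,j} y^{v}_{k,l} \quad \text{s.t. } \forall k,l,v:\; z^{v}_{k,l} = y^{v}_{k,l}$$
Add a penalty term to obtain an augmented problem
$$\max_{z,y} \sum_{i,j,u} \theta^{u}_{i,j} z^{u}_{i,j} + \frac{1}{L} \sum_{i,j,k,l,u,v} \theta^{uv}_{i,j,k,l} z^{u}_{i,j} y^{v}_{k,l} - \frac{\rho}{2} \sum_{i,j,u} \left( z^{u}_{i,j} - y^{u}_{i,j} \right)^2 \quad \text{s.t. } \forall i,j,u:\; z^{u}_{i,j} = y^{u}_{i,j}$$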
ADMM (Cont’d)
Use a Lagrange multiplier λ to relax the original problem
and obtain an upper bound
$$\min_{\lambda} \max_{z,y} \sum_{i,j,u} \theta^{u}_{i,j} z^{u}_{i,j} + \frac{1}{L} \sum_{i,j,k,l,u,v} \theta^{uv}_{i,j,k,l} z^{u}_{i,j} y^{v}_{k,l} - \frac{\rho}{2} \sum_{i,j,u} \left( z^{u}_{i,j} - y^{u}_{i,j} \right)^2 + \sum_{i,j,u} \lambda^{u}_{i,j} \left( z^{u}_{i,j} - y^{u}_{i,j} \right)$$
Solve the above problem iteratively as follows:
Step 1: Solve the optimization problem for a fixed λ
Step 2: Update λ by subgradient and repeat Step 1 until convergence
ADMM(Cont’d)
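For a fixed λ, split the relaxation problem into
two subproblems and solve them alternately
(SP1)
$$y^* = \arg\max_{y} \sum_{k,l,v} y^{v}_{k,l} C^{v}_{k,l}$$
where
$$C^{v}_{k,l} = \frac{1}{L} \sum_{i,j,u} \theta^{uv}_{i,j,k,l} z^{u}_{i,j} - \lambda^{v}_{k,l} - \frac{\rho}{2} \left( 1 - 2 z^{v}_{k,l} \right)$$
(SP2)
$$z^* = \arg\max_{z} \sum_{i,j,u} z^{u}_{i,j} D^{u}_{i,j}$$
where
$$D^{u}_{i,j} = \theta^{u}_{i,j} + \frac{1}{L} \sum_{k,l,v} \theta^{uv}_{i,j,k,l} y^{v*}_{k,l} + \lambda^{u}_{i,j} - \frac{\rho}{2} \left( 1 - 2 y^{u*}_{i,j} \right)$$
Both subproblems can be solved efficiently by dynamic programming!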
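A short derivation of why each subproblem is linear in its own variable, assuming (as their role as alignment indicators suggests) that every z and y takes values in {0, 1}. Indices are dropped for readability; the bracketed factors are the λ- and ρ-dependent parts of the coefficients C and D.

```latex
\begin{align*}
(z - y)^2 &= z^2 - 2zy + y^2 = z + y - 2zy
  \qquad \text{since } z^2 = z,\ y^2 = y \text{ for } z, y \in \{0, 1\} \\[4pt]
-\tfrac{\rho}{2}(z - y)^2 + \lambda (z - y)
  &= y \left[ -\lambda - \tfrac{\rho}{2}(1 - 2z) \right] + \left( \lambda - \tfrac{\rho}{2} \right) z
  \qquad \text{(linear in } y \text{ for fixed } z\text{)} \\[4pt]
  &= z \left[ \lambda - \tfrac{\rho}{2}(1 - 2y) \right] - \left( \lambda + \tfrac{\rho}{2} \right) y
  \qquad \text{(linear in } z \text{ for fixed } y\text{)}
\end{align*}
```

With the other variable fixed, the remaining terms are constants, so each subproblem reduces to picking the alignment that maximizes a sum of per-position scores, which is the setting where dynamic programming applies.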
Superfamily & Fold Recognition Rate
Superfamily level detection
Fold level detection
Conclusion
• Joint evolutionary coupling analysis +
supervised learning can significantly improve
protein contact prediction by using
information in multiple MSAs
• Long-range residue interaction encoded in an
MSA helpful for remote homolog detection
Acknowledgements
• RaptorX servers at http://raptorx.uchicago.edu
• Students: Jianzhu Ma, Zhiyong Wang, Sheng Wang
• Funding
– NIH R01GM0897532
– NSF CAREER award and NSF ABI
– Alfred P. Sloan Research Fellowship
• Computational resources
– University of Chicago Beagle team
– TeraGrid and Open Science Grid
Protein Structure Prediction
Input:
MEKVNFLKNGVLRLPPGFRFRPTDEELVVQYLKRKVFSFPLPASIIPEVEVYKSDPWDLPGDMEQEKYFFSTK
EVKYPNGNRSNRATNSGYWKATGIDKQIILRGRQQQQQLIGLKKTLVFYRGKSPHGCRTNWIMHEYRLAN
LESNYHPIQGNWVICRIFLKKRGNTKNKEENMTTHDEVRNREIDKNSPVVSVKMSSRDSEALASANSELKK
KASIIFYDFMGRNNSNGVAASTSSSGITDLTTTNEESDDHEESTSSFNNFTTFKRKIN
Output:
1. One of the most challenging problems
in computational biology!
2. Improved due to better algorithms
and large databases
3. Knowledge-based methods outperform
physics-based methods
4. Big demand: our server processes
> 800 jobs/week, >12k users in 3yrs
Performance in CASP9 (2010)
A blind test for protein structure prediction
Server ranking tested on the 50 hardest TBM targets
Adapted from
http://predictioncenter.org/casp9/doc/presentations/CASP9_TBM.pdf
Performance in CASP10 (2012)
A blind test for protein structure prediction
The top 10 performing human/server groups
on the hardest TBM targets
The only server group among top 10
Adapted from
http://predictioncenter.org/casp10/doc/presentations/CASP10_TBM_GM.pdf
My Work
Analyze large-scale biological data and build predictive models
• Protein sequence and structure alignment
• Homology detection and fold recognition
• Protein structure prediction
• Protein function prediction (e.g., interaction and binding site prediction)
• Biological network construction and analysis
Study computational methods that have applications beyond
bioinformatics
• Machine learning (e.g. probabilistic graphical model)
• Optimization (discrete, combinatorial and continuous)
Homology Detection & Fold Recognition
Two proteins are homologous if they have shared ancestry.
Two proteins have the same fold if their 3D structures are similar.
• Homology detection & fold recognition
– Determine the relationship between two proteins
– Given a query, search for all homologs in a database
• Homology search/fold recognition useful for
– Study protein evolutionary relationship
– Functional transfer
– Homology modeling (i.e., template-based modeling)
Structure Prediction (Cont’d)
• Template-based modeling (TBM)
– Using solved protein structures as template, e.g.,
homology modeling and protein threading
– Most reliable, but fails when no good templates
• Template-free modeling (FM) or ab initio folding
– Not using solved protein structures as template
– Mostly works only on some small proteins
• Subproblems
– Loop modeling
– Inter-residue contact prediction
Residue Pair Grouping
Precision Submatrix Grouping
Suppose that residue pair (2,4) in Family 1 is aligned to pair (3,5) in Family 2
𝛺1 for Family 1
𝛺2 for Family 2
In total ≤L(L-1)/2 groups where L is the seq length
Performance on the 31 Pfam families with
only distantly-related auxiliary families
             Medium-range              Long-range
Method       L/10    L/5     L/2       L/10    L/5     L/2
CoinFold     0.457   0.400   0.267     0.558   0.524   0.416
PSICOV       0.413   0.360   0.252     0.494   0.465   0.377
PSICOV_p     0.320   0.295   0.212     0.396   0.355   0.290
PSICOV_v     0.400   0.320   0.179     0.396   0.375   0.261
CoinFold: our work
PSICOV: single-family method
PSICOV_p: merge multiple families and apply single-family method
PSICOV_v: single-family method for each family and then consensus
Performance on the 13 Pfam families with
closely-related auxiliary families
             Medium-range              Long-range
Method       L/10    L/5     L/2       L/10    L/5     L/2
CoinFold     0.501   0.395   0.251     0.462   0.413   0.293
PSICOV       0.433   0.351   0.231     0.398   0.331   0.234
PSICOV_p     0.335   0.220   0.175     0.322   0.276   0.194
PSICOV_v     0.423   0.320   0.188     0.386   0.384   0.301
CoinFold: our work
PSICOV: single-family method
PSICOV_p: merge multiple families and apply single-family method
PSICOV_v: single-family method for each family and then consensus
Our method vs. PSICOV
Our method vs. GREMLIN
L/10 top predicted long-range contacts are evaluated
Performance vs. family size
CoinFold: our work
PSICOV: single-family method
PSICOV_p: merge multiple families and apply single-family method
PSICOV_v: single-family method for each family and then consensus
Multiple Sequence Alignment (MSA) of
One Protein Family
Top L/10 long-range prediction
accuracy on 15 large Pfam families
PFAM ID    MEFF     CoinFold   PSICOV   PSICOV_p
PF00041    2981     0.767      0.767    0.667
PF00595    3026     0.556      0.444    0.233
PF03061    3334     0.375      0.250    0.500
PF01522    3519     0.615      0.462    0.077
PF00578    3733     0.308      0.308    0.231
PF00059    3744     0.455      0.455    0.182
PF07686    3801     0.917      0.583    0.667
PF00034    4060     0.600      0.500    0.200
PF00989    4596     0.583      0.250    0.500
PF00144    4684     0.272      0.212    0.242
PF00085    5075     0.636      0.545    0.182
PF00168    6735     0.667      0.667    0.556
PF00515    7230     0.500      0.500    0.250
PF00089    9045     0.783      0.783    0.739
PF00550    11476    0.857      0.857    0.714
Running Time
Time (in seconds)
Average protein sequence length
Performance: Alignment Accuracy
Performance: Homology Detection
Performance: Alignment Accuracy
TMalign, Matt and DeepAlign provide three different ground-truth alignments
Joint Graphical Lasso Formulation
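Rewrite the original problem as
$$\max_{\Omega} \; f(\Omega) - P(\Omega)$$
Log-likelihood:
$$f(\Omega) = \sum_{k=1}^{K} \left( \log|\Omega^k| - \mathrm{tr}(\Omega^k \Sigma^k) \right)$$
Regularization:
$$P(\Omega) = \lambda_1 \sum_{k=1}^{K} \|\Omega^k\|_1 + \sum_{g=1}^{G} \lambda_g \|\Omega_g\|_2$$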
 Unconstrained optimization problem
 Both 𝑓 and 𝑃 are convex, so the objective is
the difference of two convex functions
 Can be solved by the Convex-Concave Procedure [A. Yuille]
Alternating Direction Method of Multipliers (ADMM)
Make a copy of Ω to Z, without changing the solution space
$$\max_{\Omega, Z} \; f(\Omega) - P(Z) \quad \text{s.t. } \Omega = Z$$
Add a penalty term to obtain an augmented problem,
which has the same solution but converges faster.
$$\max_{\Omega, Z} \; f(\Omega) - P(Z) - \frac{\rho}{2} \sum_{k=1}^{K} \|\Omega^k - Z^k\|_F^2 \quad \text{s.t. } \Omega = Z$$
Lagrangian Relaxation
Use a Lagrange multiplier $U^k$ for each constraint $\Omega^k = Z^k$
Obtain a dual problem to upper bound the augmented problem
$$\min_{U} \max_{\Omega, Z} \; f(\Omega) - P(Z) - \sum_{k=1}^{K} \left( \rho\, (U^k)^T (\Omega^k - Z^k) + \frac{\rho}{2} \|\Omega^k - Z^k\|_F^2 \right)$$
Solve the dual problem iteratively by subgradient as follows.
Step 1) Fix U and solve
$$\max_{\Omega, Z} \; f(\Omega) - P(Z) - \frac{\rho}{2} \sum_{k=1}^{K} \|\Omega^k - Z^k + U^k\|_F^2$$
Step 2) Update U by U + ρ(Ω − Z) and repeat Step 1) until convergence
ADMM (Cont’d)
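For a fixed U, split the relaxation problem into two
subproblems and solve them alternately
(SP1)
$$\Omega^* = \arg\max_{\Omega} \left\{ f(\Omega) - \frac{\rho}{2} \sum_{k=1}^{K} \|\Omega^k - Z^k + U^k\|_F^2 \right\}$$
(SP2)
$$Z^* = \arg\min_{Z} \left\{ \frac{\rho}{2} \sum_{k=1}^{K} \|\Omega^k - Z^k + U^k\|_F^2 + P(Z) \right\}$$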
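The slides do not say how (SP1) is solved. Since f separates over families, one possibility is the standard closed form for a log-determinant proximal step, applied to each family independently. The sketch below shows it for a single family; the function name, the toy data and the choice ρ = 1 are assumptions, and this is not presented as the authors' implementation.

```python
import numpy as np

def update_omega(sigma, z, u, rho):
    """Maximize log|Omega| - tr(Omega Sigma) - (rho/2) * ||Omega - Z + U||_F^2.

    Setting the gradient to zero gives rho*Omega - inv(Omega) = rho*(Z - U) - Sigma,
    which is solved eigenvalue by eigenvalue.
    """
    rhs = rho * (z - u) - sigma
    rhs = (rhs + rhs.T) / 2.0                      # guard against round-off asymmetry
    eigvals, eigvecs = np.linalg.eigh(rhs)
    # Positive root of rho*w^2 - eigval*w - 1 = 0 keeps Omega positive definite.
    w = (eigvals + np.sqrt(eigvals ** 2 + 4.0 * rho)) / (2.0 * rho)
    return (eigvecs * w) @ eigvecs.T

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 10))
sigma = np.cov(X, rowvar=False)        # singular 10 x 10 sample covariance
z = np.eye(10)
u = np.zeros((10, 10))
rho = 1.0
omega = update_omega(sigma, z, u, rho)
# Stationarity check: the gradient of the SP1 objective should vanish.
grad = np.linalg.inv(omega) - sigma - rho * (omega - z + u)
```

The update returns a positive-definite matrix even though `sigma` is singular, which is one reason the splitting is convenient: the log-likelihood and the non-smooth penalty P(Z) are handled in separate steps.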