Basic principles of probability theory


Procrustes analysis

• Purpose of procrustes analysis
• Algorithm
• R code
• Various modifications
Purpose of procrustes analysis
In general, different dimension reduction techniques produce different
configurations. For example, metric and non-metric scaling applied to the same
data may generate different configurations, and metric scaling with different
proximity matrices can also produce different configurations. Moreover, most of
these techniques produce configurations that are defined only up to rotation.
Since the techniques are applied to the same multivariate observations, each
observation in one configuration corresponds to exactly one observation in the
other. In these cases it is interesting to compare the results with each other
and, if the original data matrix is available, with the original data as well.
There are other situations where comparison of configurations is needed. For
example, in macromolecular biology the three-dimensional structures of different
proteins are derived. One interesting question is whether two proteins are
similar and, if they are, how similar they are. That is the problem of
configuration matching.
All these questions can be addressed using procrustes analysis. Procrustes
analysis finds the best match between two configurations, the rotation matrix
and translation vector necessary for the match, and the distance between the
configurations after matching.
Procrustes analysis: problem
Suppose we have two configurations (data matrices) X = (x_1, x_2, ..., x_n) and
Y = (y_1, y_2, ..., y_n), where the x-s and y-s are vectors in p-dimensional
space. We want to find an orthogonal matrix A and a vector b so that:

M^2 = \sum_{i=1}^{n} \|x_i - A y_i - b\|^2 = \sum_{i=1}^{n} \sum_{j=1}^{p} \left(x_{ij} - (A y_i + b)_j\right)^2 \to \min
It can be shown that finding the translation (b) and the rotation matrix (A)
can be considered separately. The translation can easily be found if we centre
each configuration: if the rotation is already known, then we can find the
translation. Let us denote z_i = A y_i + b. Then we can write:

\sum_{i=1}^{n} \|x_i - z_i\|^2 = \sum_{i=1}^{n} \|(x_i - \bar{x}) - (z_i - \bar{z})\|^2 + n\,\|\bar{x} - \bar{z}\|^2

Since only the last term depends on the translation vector, this function is
minimised when that term is equal to 0, i.e. when \bar{x} = \bar{z} = A\bar{y} + b. Thus:

b = \bar{x} - A\bar{y}

The first step in procrustes analysis is therefore to centre the matrices X and
Y and remember the mean values of the columns of X and Y.
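The decomposition above is easy to verify numerically. The following is a minimal sketch with assumed random configurations, checking that the cross term vanishes after centring:

```r
# Numeric check of the identity
#   sum_i ||x_i - z_i||^2 = sum_i ||(x_i - xbar) - (z_i - zbar)||^2 + n ||xbar - zbar||^2
# X and Z are assumed random configurations stored row-wise.
set.seed(1)
n = 10; p = 2
X = matrix(rnorm(n * p), n, p)
Z = matrix(rnorm(n * p), n, p)

lhs = sum((X - Z)^2)                      # total squared distance
xbar = colMeans(X); zbar = colMeans(Z)    # centroids
Xc = sweep(X, 2, xbar)                    # centred configurations
Zc = sweep(Z, 2, zbar)
rhs = sum((Xc - Zc)^2) + n * sum((xbar - zbar)^2)

all.equal(lhs, rhs)   # TRUE: the identity holds
```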

Procrustes analysis: rotation matrix
Once we have subtracted from each column its corresponding mean, the remaining
problem is to find the orthogonal matrix (the matrix of rotation, possibly
combined with reflection). We can write:

M^2 = \sum_{i=1}^{n} \|x_i - A y_i\|^2 = tr\left((X - YA)(X - YA)^T\right) = tr(XX^T) + tr(YAA^TY^T) - 2\,tr(X^TYA) = tr(XX^T) + tr(YY^T) - 2\,tr(X^TYA)

Here we used the fact that the trace is invariant under cyclic permutation of
matrices and that A is an orthogonal matrix:

AA^T = I

Since in the expression for M^2 only the last term depends on A, the problem
reduces to the constrained maximisation:

tr(X^TYA) \to \max \quad \text{subject to} \quad AA^T = I

It can be done using the technique of Lagrange multipliers.
Rotation matrix using SVD
Let us denote the symmetric matrix of Lagrange multipliers for the constraints
by \Lambda/2. Then we want to maximise:

V = tr\left(X^TYA - \frac{1}{2}\Lambda\,(AA^T - I)\right) \to \max

If we take the first derivatives of this expression with respect to the matrix
A and equate them to 0, then we get:

Y^TX = \Lambda A \qquad (1)
Here we used the following facts:

tr(BA) = \sum_{j}\sum_{i} b_{ji} a_{ij}, \qquad \frac{\partial\,tr(BA)}{\partial a_{ij}} = b_{ji}, \qquad \frac{\partial\,tr(BA)}{\partial A} = B^T

and, using the fact that the matrix \Lambda of the constraints is symmetric:

tr(\Lambda AA^T) = \sum_{m=1}^{p}\sum_{i=1}^{p} \lambda_{mi} \sum_{k=1}^{p} a_{ik} a_{mk}

\frac{\partial\,tr(\Lambda AA^T)}{\partial a_{ij}} = \sum_{k=1}^{p} \lambda_{ik} a_{kj} + \sum_{m=1}^{p} \lambda_{mi} a_{mj} = 2(\Lambda A)_{ij}, \qquad \frac{\partial\,tr(\Lambda AA^T)}{\partial A} = 2\Lambda A
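These matrix derivative identities can be checked by finite differences. A small sketch with assumed random matrices (Lambda symmetric, as the derivation requires):

```r
# Finite-difference check of d tr(BA)/dA = B^T and d tr(L A A^T)/dA = 2 L A
set.seed(2)
p = 3
B = matrix(rnorm(p * p), p, p)
A = matrix(rnorm(p * p), p, p)
L = crossprod(matrix(rnorm(p * p), p, p))   # symmetric matrix (Lambda)

num_grad = function(f, A, eps = 1e-6) {     # elementwise central differences
  G = A * 0
  for (i in 1:nrow(A)) for (j in 1:ncol(A)) {
    Ap = A; Ap[i, j] = Ap[i, j] + eps
    Am = A; Am[i, j] = Am[i, j] - eps
    G[i, j] = (f(Ap) - f(Am)) / (2 * eps)
  }
  G
}

g1 = num_grad(function(A) sum(diag(B %*% A)), A)
max(abs(g1 - t(B)))         # ~0: gradient of tr(BA) is B^T
g2 = num_grad(function(A) sum(diag(L %*% A %*% t(A))), A)
max(abs(g2 - 2 * L %*% A))  # ~0: gradient of tr(L A A^T) is 2 L A
```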
Thus we have the necessary equations to find the required orthogonal matrix. To
solve equation (1), let us use the SVD of Y^TX:

Y^TX = UDV^T

where U and V are p×p orthogonal matrices and D is the diagonal matrix of the
singular values.
Rotation matrix and SVD
If we use the fact that A is orthogonal and \Lambda is symmetric, then we can write:

Y^TX = \Lambda A \;\Rightarrow\; Y^TXX^TY = \Lambda AA^T\Lambda = \Lambda^2 \;\Rightarrow\; (UDV^T)(VDU^T) = \Lambda^2 \;\Rightarrow\; UD^2U^T = \Lambda^2 \;\Rightarrow\; \Lambda = UDU^T

and

Y^TX = \Lambda A \;\Rightarrow\; UDV^T = UDU^TA \;\Rightarrow\; A = UV^T

This gives the solution for the rotation (orthogonal) matrix. Now we can
calculate the least-squares distance between the two configurations:

M_0^2 = tr(XX^T) + tr(YY^T) - 2\,tr(X^TYA) = tr(XX^T) + tr(YY^T) - 2\,tr(VDU^TUV^T) = tr(XX^T) + tr(YY^T) - 2\,tr(VDV^T) = tr(XX^T) + tr(YY^T) - 2\,tr(D)
Thus we have the expressions for the rotation matrix and for the distance
between the configurations after matching. It is interesting to note that, to
find the distance between the configurations, it is not necessary to actually
rotate one of them.
One more useful expression is:

M_0^2 = tr(X^TX) + tr(Y^TY) - 2\,tr\left((X^TYY^TX)^{1/2}\right)

This expression shows that it is not even necessary to compute the SVD to find
the distance between the configurations.
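This can be checked numerically: tr((X^TYY^TX)^{1/2}) equals the sum of the singular values of Y^TX, because X^TYY^TX = (Y^TX)^T(Y^TX). A sketch with assumed centred random configurations:

```r
# Check that tr((X^T Y Y^T X)^{1/2}) = sum of singular values of Y^T X
set.seed(3)
n = 15; p = 3
X = scale(matrix(rnorm(n * p), n, p), scale = FALSE)  # centred configurations
Y = scale(matrix(rnorm(n * p), n, p), scale = FALSE)

C = crossprod(Y, X)                     # Y^T X
s = svd(C)

M = crossprod(C)                        # X^T Y Y^T X = C^T C, symmetric
e = eigen(M, symmetric = TRUE)
tr_sqrt = sum(sqrt(pmax(e$values, 0)))  # trace of the symmetric square root

all.equal(tr_sqrt, sum(s$d))   # TRUE
```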
Algorithm
Problem: Given two configuration matrices X and Y with the same dimensions,
find rotation and translation that would bring Y as close as possible to X.
1) Find the centroids (mean values of columns) of X and Y. Call them xmean and
ymean.
2) Remove from each column the corresponding mean. Call the new matrices Xn and
Yn.
3) Find (Yn)^T(Xn) and its SVD: (Yn)^T(Xn) = UDV^T.
4) Find the rotation matrix A = UV^T.
5) Find the translation vector b = xmean - A ymean.
6) Find the distance between the configurations:
d^2 = tr((Xn)^T(Xn)) + tr((Yn)^T(Yn)) - 2 tr(D). That is the square of the
required distance.
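The steps above can be verified end to end on synthetic data. In this self-contained sketch the configurations are stored row-wise, so the recovered rotation acts by right multiplication (Yn %*% A); the rotation angle and translation used to build the test data are arbitrary assumptions:

```r
# Apply a known rotation R and translation b0 to a configuration Y,
# then check that the algorithm recovers them.
set.seed(42)
n = 20; p = 3
Y = matrix(rnorm(n * p), n, p)         # configuration to be matched
theta = 0.7                            # assumed rotation angle
R = diag(p)                            # rotation in the first two coordinates
R[1:2, 1:2] = matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
b0 = c(1, -2, 0.5)                     # assumed translation
X = sweep(Y %*% R, 2, b0, "+")         # X is Y rotated and translated

# Steps 1-2: centroids and centred matrices
xmean = colMeans(X); ymean = colMeans(Y)
Xn = sweep(X, 2, xmean); Yn = sweep(Y, 2, ymean)
# Steps 3-4: SVD of (Yn)^T (Xn), rotation A = U V^T
s = svd(crossprod(Yn, Xn))
A = s$u %*% t(s$v)
# Step 5: translation (row-wise convention); step 6: squared distance
b = xmean - drop(ymean %*% A)
d2 = sum(diag(crossprod(Xn))) + sum(diag(crossprod(Yn))) - 2 * sum(s$d)

max(abs(A - R))    # ~0: rotation recovered
max(abs(b - b0))   # ~0: translation recovered
d2                 # ~0: exact match after the transformation
```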
R code
procrustes = function(X, Y){
  # Simple procrustes analysis: find the rotation A and translation b
  # that bring Y as close as possible to X (configurations stored row-wise).
  rmmean = function(M){             # remove column means and keep them
    m = colMeans(M)
    list(matr = sweep(M, 2, m), mean = m)
  }
  tr = function(M) sum(diag(M))     # trace of a matrix
  x1 = rmmean(X)
  y1 = rmmean(Y)
  Xn = x1$matr
  Yn = y1$matr
  xmean = x1$mean
  ymean = y1$mean
  rm(x1); rm(y1)
  s = svd(crossprod(Yn, Xn))        # SVD of (Yn)^T (Xn)
  A = s$u %*% t(s$v)                # rotation matrix A = U V^T
  d = sqrt(tr(crossprod(Xn)) + tr(crossprod(Yn)) - 2 * sum(s$d))
  # with row-wise configurations Y is matched as Yn %*% A,
  # so the translation is
  b = xmean - drop(ymean %*% A)
  list(matr = A, trans = b, dist = d)
}
Some modifications
There are some situations where the straightforward use of procrustes analysis
may not be appropriate:
1) The dimensions of the configurations can be different. There are two ways of
handling this problem. The first way is to pad the low dimensional
(k-dimensional) configuration with 0-s to make it p-dimensional. In this way we
assume that the first k dimensions coincide, i.e. that the k-dimensional
configuration lies in a k-dimensional subspace of the p-dimensional space. The
second way is to collapse the high dimensional configuration to the low
dimensional space. For this we need to project the p-dimensional configuration
onto the k-dimensional space.
2) The second problem arises when the scales of the configurations are
different. In this case we can add a scale factor to the function we minimise:

z_i = cAy_i + b \;\Rightarrow\; M^2 = tr(XX^T) + c^2\,tr(YY^T) - 2c\,tr(X^TYA)

If we find the orthogonal matrix as before, then we can find an expression for
the scale factor:

c = tr\left((X^TYY^TX)^{1/2}\right) / tr(YY^T)

As a result, the distance between the configurations, M, is no longer symmetric
with respect to X and Y.
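A short sketch of the scale factor (assumed random configurations, with Y deliberately on a different scale). Note that tr((X^TYY^TX)^{1/2}) is simply the sum of the singular values of Y^TX:

```r
# Scale factor for matching Y to X, and a check that the scaled
# match is not symmetric in X and Y.
set.seed(4)
n = 10; p = 2
X = scale(matrix(rnorm(n * p), n, p), scale = FALSE)
Y = scale(2.5 * matrix(rnorm(n * p), n, p), scale = FALSE)  # different scale

s = svd(crossprod(Y, X))                    # SVD of Y^T X
cXY = sum(s$d) / sum(diag(tcrossprod(Y)))   # c = tr((X^T Y Y^T X)^{1/2}) / tr(YY^T)
cYX = sum(s$d) / sum(diag(tcrossprod(X)))   # scale factor for matching X to Y

cXY * cYX   # < 1 in general: the two directions are not inverses of each other
```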
3) Sometimes it is necessary to weight some observations (points) down and
others up. In these cases procrustes analysis can be performed using weights.
We want to minimise the function:

M^2 = tr\left(W(X - AY)(X - AY)^T\right)

This modification can be taken into account by computing the SVD of Y^TWX
instead of Y^TX.
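A minimal sketch of the weighted variant (assumed positive weights w_i on the diagonal of W, one per point); the only change is the matrix whose SVD is taken:

```r
# Weighted procrustes rotation: SVD of Y^T W X instead of Y^T X.
set.seed(5)
n = 12; p = 2
theta = 0.4                               # assumed rotation used to build test data
R = matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), p, p)
Y = matrix(rnorm(n * p), n, p)
X = Y %*% R                               # X is an exactly rotated copy of Y
w = runif(n, 0.5, 2)                      # assumed positive weights, one per point
W = diag(w)

s = svd(t(Y) %*% W %*% X)                 # SVD of Y^T W X
A = s$u %*% t(s$v)                        # weighted rotation matrix
max(abs(A - R))   # ~0: with an exact match, any positive weights recover R
```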