Basic principles of probability theory

Transcript Basic principles of probability theory

Procrustes analysis
• Purpose of Procrustes analysis
• Algorithm
• Various modifications
Purpose of procrustes analysis
There are many situations in which slightly different techniques produce different configurations of the same objects. For example, metric and non-metric scaling may generate different configurations. Even when only metric scaling is used, different proximity (dissimilarity) matrices can produce different configurations. Since these techniques are applied to the same multivariate observations, each observation in one configuration corresponds to exactly one observation in the other configuration. Most of these techniques produce configurations whose orientation is not determined, i.e. they are defined only up to rotation.
Scores in factor analysis can also be considered as one of the possible configurations.
There are other situations in which configurations need to be compared. For example, in macromolecular biology the 3-dimensional structures of different proteins are derived. One interesting question is whether two different proteins are similar and, if they are, how similar they are. To measure similarity it is necessary to match the configurations of the two protein structures.
All these questions can be addressed using Procrustes analysis.
Suppose we have two configurations $X = (x_1, x_2, \ldots, x_n)$ and $Y = (y_1, y_2, \ldots, y_n)$, where each $x_i$ and $y_i$ is a vector in p-dimensional space. We want to find an orthogonal matrix $A$ and a vector $b$ such that:
$$M^2 = \sum_{i=1}^{n} \| x_i - A y_i - b \|^2 = \sum_{i=1}^{n} \sum_{j=1}^{p} \left( x_{ij} - (A y_i + b)_j \right)^2 \to \min$$
Procrustes analysis: vector and matrix
It can be shown that finding the translation ($b$) and the rotation matrix ($A$) can be considered separately. The translation can easily be found if we centre each configuration. If the rotation is already known then we can find the translation. Let us denote $z_i = A y_i + b$. Then we can write:
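$$\sum_{i=1}^{n} \| x_i - z_i \|^2 = \sum_{i=1}^{n} \| (x_i - \bar{x}) - (z_i - \bar{z}) \|^2 + n \| \bar{x} - \bar{z} \|^2$$
The first term does not depend on $b$, because $z_i - \bar{z} = A(y_i - \bar{y})$. The sum is therefore minimised when the centres of x and z coincide, i.e.
$$\bar{x} = \bar{z}, \quad \text{so that} \quad b = \bar{x} - A\bar{y}$$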
We want the centroids of the configurations to match. This can be achieved by subtracting from x and y their respective centroids. The remaining problem is to find the orthogonal matrix (a matrix of rotation or inversion). We can write:
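$$M^2 = \sum_{i=1}^{n} \| x_i - A y_i \|^2 = \mathrm{tr}\left( (X - YA)(X - YA)^T \right) = \mathrm{tr}(XX^T) + \mathrm{tr}(YAA^TY^T) - 2\,\mathrm{tr}(X^TYA) = \mathrm{tr}(XX^T) + \mathrm{tr}(YY^T) - 2\,\mathrm{tr}(X^TYA)$$
Here we used the fact that matrices can be cyclically permuted under the trace and that $A$ is an orthogonal matrix:
$$AA^T = I$$
In the matrix expressions $X$ and $Y$ are the $n \times p$ matrices whose rows are the centred points. In this row form the orthogonal matrix multiplies $Y$ from the right (row $i$ of $YA$ is $y_i^T A$), so the $A$ of the matrix expressions is the transpose of the matrix acting on the column vectors $y_i$. Since the transpose of an orthogonal matrix is also orthogonal, the minimisation problem is unchanged.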
Minimising $M^2$ is then equivalent to the constrained maximisation:
$$\mathrm{tr}(X^TYA) \to \max \quad \text{subject to} \quad AA^T = I$$
We can solve it using the technique of Lagrange multipliers.
Rotation matrix using SVD
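Let us introduce a symmetric matrix of Lagrange multipliers $\Lambda$ for the constraints, scaled by 1/2 for convenience. Then we want to maximise:
$$V = \mathrm{tr}\left( X^TYA - \tfrac{1}{2} \Lambda (AA^T - I) \right) \to \max$$
If we take the derivatives of this expression with respect to the matrix $A$ and equate them to 0 then we get:
$$Y^TX = \Lambda A$$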
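Here we used the following facts:
$$\mathrm{tr}(BA) = \sum_{j=1}^{p} \sum_{i=1}^{p} b_{ji} a_{ij}, \qquad \frac{\partial\, \mathrm{tr}(BA)}{\partial a_{pq}} = b_{qp} \;\Rightarrow\; \frac{\partial\, \mathrm{tr}(BA)}{\partial A} = B^T$$
and, remembering that the matrix of constraints $\Lambda$ is symmetric,
$$\mathrm{tr}(\Lambda AA^T) = \sum_{m=1}^{p} \sum_{i=1}^{p} \sum_{k=1}^{p} \lambda_{mi} a_{ik} a_{mk}$$
$$\frac{\partial\, \mathrm{tr}(\Lambda AA^T)}{\partial a_{pq}} = \sum_{i=1}^{p} \lambda_{pi} a_{iq} + \sum_{m=1}^{p} \lambda_{mp} a_{mq} \;\Rightarrow\; \frac{\partial\, \mathrm{tr}(\Lambda AA^T)}{\partial A} = 2 \Lambda A$$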
We now have the equations needed to find the required orthogonal matrix. Let us use the singular value decomposition (SVD) of $Y^TX$:
$$Y^TX = UDV^T$$
$V$ and $U$ are $p \times p$ orthogonal matrices. $D$ is the diagonal matrix of the singular values.
Rotation matrix and SVD
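If we use the fact that $A$ is orthogonal then we can write:
$$Y^TX = \Lambda A \;\Rightarrow\; Y^TXX^TY = \Lambda AA^T \Lambda \;\Rightarrow\; (UDV^T)(VDU^T) = \Lambda^2 \;\Rightarrow\; UD^2U^T = \Lambda^2 \;\Rightarrow\; \Lambda = UDU^T$$
and
$$Y^TX = \Lambda A \;\Rightarrow\; UDV^T = UDU^TA \;\Rightarrow\; UV^T = A$$
The last step cancels $UD$ on both sides, which assumes that all singular values are non-zero.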
This gives the solution for the rotation (orthogonal) matrix. Now we can calculate the least-squares difference between the configurations:
$$M_0^2 = \mathrm{tr}(XX^T) + \mathrm{tr}(YY^T) - 2\,\mathrm{tr}(X^TYA) = \mathrm{tr}(XX^T) + \mathrm{tr}(YY^T) - 2\,\mathrm{tr}(VDU^TUV^T) = \mathrm{tr}(XX^T) + \mathrm{tr}(YY^T) - 2\,\mathrm{tr}(VDV^T)$$
Thus we have expressions for the rotation matrix and for the difference between the configurations after matching. It is interesting to note that to find the difference between configurations it is not necessary to rotate one of them. Since $\mathrm{tr}(VDV^T) = \mathrm{tr}(D)$, this expression can also be written:
$$M_0^2 = \mathrm{tr}(XX^T) + \mathrm{tr}(YY^T) - 2\,\mathrm{tr}(D)$$
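The steps derived so far (centre both configurations, take the SVD of YᵀX, form A = UVᵀ, evaluate the residual from the singular values) can be sketched in NumPy. This is a minimal illustration and not a reference implementation: the function name `procrustes_rotation` and the synthetic data are assumptions, and points are stored as rows so that the fit has the form X ≈ YA + b.

```python
import numpy as np

def procrustes_rotation(X, Y):
    """Fit X ~ Y @ A + b with A orthogonal; rows of X and Y are matched points.

    Returns (A, b, M0_sq), where M0_sq is the residual sum of squares.
    """
    x_bar = X.mean(axis=0)
    y_bar = Y.mean(axis=0)
    Xc = X - x_bar                        # centring removes the translation
    Yc = Y - y_bar
    U, d, Vt = np.linalg.svd(Yc.T @ Xc)   # Y^T X = U D V^T
    A = U @ Vt                            # orthogonal matrix A = U V^T
    b = x_bar - y_bar @ A                 # translation in the row convention
    # tr(XX^T) + tr(YY^T) - 2 tr(D); np.sum(Xc**2) equals tr(Xc Xc^T)
    M0_sq = np.sum(Xc ** 2) + np.sum(Yc ** 2) - 2.0 * d.sum()
    return A, b, M0_sq

# Synthetic check: X is an exactly rotated and shifted copy of Y
rng = np.random.default_rng(2)
Y = rng.normal(size=(8, 3))
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix
t = np.array([1.0, -2.0, 0.5])
X = Y @ R + t

A, b, M0_sq = procrustes_rotation(X, Y)
print(np.allclose(A, R), np.allclose(Y @ A + b, X), M0_sq)
```

Because the residual is computed from the singular values alone, `M0_sq` is available without ever applying the rotation to `Y`.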
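One more useful expression is:
$$M_0^2 = \mathrm{tr}(X^TX) + \mathrm{tr}(Y^TY) - 2\,\mathrm{tr}\left( (X^TYY^TX)^{1/2} \right)$$
This expression shows that it is not even necessary to do the SVD to find the difference between configurations. Note that the square root here is the symmetric square root of the matrix, which can be obtained, for example, from its eigendecomposition. A Cholesky factor is a different matrix, and its trace is in general not equal to $\mathrm{tr}(D)$.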
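As a hedged illustration of the last expression, the trace of the symmetric square root of XᵀYYᵀX can be computed as the sum of the square roots of its eigenvalues. The sketch below uses made-up random data and compares that route with the SVD-based value; the variable names are assumptions for this example only.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(7, 3))
Y = rng.normal(size=(7, 3))
Xc = X - X.mean(axis=0)              # centred configurations, points as rows
Yc = Y - Y.mean(axis=0)

# Route 1: eigenvalues of the symmetric positive semi-definite matrix X^T Y Y^T X
C = Xc.T @ Yc @ Yc.T @ Xc
eigvals = np.linalg.eigvalsh(C)
# Clip tiny negative values caused by round-off before taking square roots
trace_sqrt = np.sum(np.sqrt(np.clip(eigvals, 0.0, None)))
M0_sq_eig = np.sum(Xc ** 2) + np.sum(Yc ** 2) - 2.0 * trace_sqrt

# Route 2: singular values of Y^T X, for comparison
d = np.linalg.svd(Yc.T @ Xc, compute_uv=False)
M0_sq_svd = np.sum(Xc ** 2) + np.sum(Yc ** 2) - 2.0 * d.sum()

print(M0_sq_eig, M0_sq_svd)
```

The two values should agree up to round-off, since the eigenvalues of XᵀYYᵀX are the squared singular values of YᵀX.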
Some modifications
There are some situations where problems can occur:
1) The dimensions of the configurations can be different. There are two ways of handling this problem. The first way is to pad the low-dimensional (k) configuration with zeros to make it high-dimensional (p). Here we assume that the k-dimensional configuration lies in a k-dimensional subspace of the p-dimensional space, so that the first k dimensions coincide. The second way is to collapse the high-dimensional configuration to the low-dimensional space. For this we need to project the p-dimensional configuration onto a k-dimensional space.
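The first option, padding with zeros, can be sketched as follows. This is an illustrative example on synthetic data, assuming points stored as rows; the helper names `pad_with_zeros` and `fit_rotation` are hypothetical. A 2-dimensional configuration is embedded in 3 dimensions and matched to a 3-dimensional configuration that happens to lie in a plane.

```python
import numpy as np

def pad_with_zeros(Y, p):
    """Embed an n x k configuration in p dimensions by appending zero columns."""
    n, k = Y.shape
    return np.hstack([Y, np.zeros((n, p - k))])

def fit_rotation(X, Y):
    """Return (A, M0_sq) for centred row-wise configurations of equal dimension."""
    U, d, Vt = np.linalg.svd(Y.T @ X)
    return U @ Vt, np.sum(X ** 2) + np.sum(Y ** 2) - 2.0 * d.sum()

rng = np.random.default_rng(4)
Y2 = rng.normal(size=(10, 2))                  # k = 2 dimensional configuration
Y2 = Y2 - Y2.mean(axis=0)
Yp = pad_with_zeros(Y2, 3)                     # now p = 3 dimensional

# Build a planar 3-D configuration by rotating the padded points
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix
X = Yp @ R

A, M0_sq = fit_rotation(X, Yp)
print(Yp.shape, M0_sq, np.allclose(Yp @ A, X))
```

In this constructed case the padded configuration can be matched exactly, so the residual is zero up to round-off; with real data the residual also reflects how far the p-dimensional configuration departs from a k-dimensional subspace.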
2) The second problem arises when the scales of the configurations are different. In this case we can add a scale factor $c$ to the function we want to minimise:
$$z_i = cAy_i + b \;\Rightarrow\; M^2 = \mathrm{tr}(XX^T) + c^2\,\mathrm{tr}(YY^T) - 2c\,\mathrm{tr}(X^TYA)$$
If we find the orthogonal matrix as before, then setting the derivative with respect to $c$ to zero gives the expression for the scale factor:
$$c = \mathrm{tr}\left( (X^TYY^TX)^{1/2} \right) / \mathrm{tr}(YY^T)$$
As a result $M^2$ is no longer symmetric with respect to $X$ and $Y$.
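The scale factor can be added to the SVD-based fit with one extra line, because the trace of the symmetric square root equals the sum of the singular values of YᵀX. The following is a minimal sketch on synthetic data, with points as rows; the function name `procrustes_with_scale` and the true scale of 2.5 are assumptions for illustration.

```python
import numpy as np

def procrustes_with_scale(X, Y):
    """Fit X ~ c * Y @ A + b with A orthogonal and c a scalar (points as rows)."""
    x_bar = X.mean(axis=0)
    y_bar = Y.mean(axis=0)
    Xc = X - x_bar
    Yc = Y - y_bar
    U, d, Vt = np.linalg.svd(Yc.T @ Xc)
    A = U @ Vt
    c = d.sum() / np.sum(Yc ** 2)          # c = tr(D) / tr(Y Y^T)
    b = x_bar - c * (y_bar @ A)
    # M^2 = tr(XX^T) + c^2 tr(YY^T) - 2 c tr(D) at the optimum
    M_sq = np.sum(Xc ** 2) + c ** 2 * np.sum(Yc ** 2) - 2.0 * c * d.sum()
    return A, b, c, M_sq

rng = np.random.default_rng(5)
Y = rng.normal(size=(9, 3))
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix
X = 2.5 * (Y @ R) + np.array([0.3, 0.0, -1.0])

A, b, c, M_sq = procrustes_with_scale(X, Y)
print(c, M_sq, np.allclose(c * (Y @ A) + b, X))
```

Only `Y` is rescaled here, which is why the criterion treats the two configurations differently once a scale factor is included.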
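3) Sometimes it is necessary to weight some variables down and others up. In this case Procrustes analysis can be performed using weights. We want to minimise the function:
$$M^2 = \mathrm{tr}\left( W (X - AY)(X - AY)^T \right)$$
This expression is dimensionally consistent if $X$ and $Y$ are written here with the points as columns, so that the weight matrix $W$ is $p \times p$. The analysis can be modified to take the weights into account, and it becomes easy when the weight matrix is diagonal.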
