Medical Image Registration: Concepts and Implementation

Download Report

Transcript Medical Image Registration: Concepts and Implementation

Medical Image Registration:
Concepts and Implementation
Feb 28, 2006
Jen Mercer
Registration

Spatial transform that maps points from
one image to corresponding points in
another image
Registration Criteria

Landmark-based
– Features selected by the user

Segmentation-based
– Rigidly or deformably align binary
structures

Intensity-based
– Minimize intensity difference over entire
image
Spatial Transformation

Rigid
– Rotations and translations

Affine
– Also, skew and scaling

Deformable
– Free-form mapping
Registration Framework
Transforms
 x' cos
x'     
 y'  sin 
 sin    x  t x 
    

cos   y  t y 
x’=T(x|p)=T(x,y|tx,ty,θ)
 Goal: Find parameter values that
optimize image similarity metric

Optimizer

Often require derivative of image
similarity metric (S)
S (p | F , M , )
S (p | F , M , ) x' j


pi
x' j
p j
j
 x'1
 p
 1
 x'2
J   p1
 
 x'
 n
 p1
x'1
p2
x'2
p2

x'1
p1
x'1 
pm 

x'2 

pm 

 
x'n 


pm 

Jacobian and Image Gradient
Identity Transform
Maps every point to itself
 Only used for testing
 Fixed set (C): set of points that remain
unchanged by transform

Translation Transform

Fixed set is an empty set
Scaling Transform
Isotropic vs. anisotropic
 Fixed set is the origin of the coordinates

Scaling and Translation
 x'1 
 x1  C1   C1 
 x1 
 C1 
 x' 
 x  C  C 
x 
C 
2
2
2 
2
2
2





x' 
D

D
 (1  D)
  
     
 
  
 

  
 
 
 x' N 
 xN  C N  C N 
 xN 
C N 
 x1   T1 
 x  T 
2
2


D

   
   
 x N  TN 
Rotation Transform

Fixed set is the origin
 x' cos
x'     
 y'  sin 
cos

 sin 
 sin    x  C x  C x 

 


cos   y  C y  C y 
 sin    x  1  cos
sin   Cx 
   
C 


cos   y    sin  1  cos   y 
cos

 sin 
 sin    x  Tx 
    

cos   y  Ty 
Rotations in Polar Coordinates
 x' cos
x'     
 y'  sin 
 sin    x 
 

cos   y 
i
( x, y )  re  (r cos , ir sin  )
( x' , y ' )  re
i (  )
i i
 re e
Optimization
Search for value of θ that minimizes cost
function S
 Gradient descent algorithm

– Update of parameter
S
 '  


e
i '
i
e e
i
S


i
 e exp(G )

– G is the variation from the gradient of the cost
function
–  is step length of algorithm
Combined Scaling and Rotation
M
M
GD
i
D


D=scaling factor
 M=cost function
 Apply transform to a point as:
i (  )
( x' , y' )  Dre
i
i
 re  De
Add Translation
Find fixed point of transformation
 Translation (d) is result of scaling and
rotation

Scaling, Rotation,Translation
i
P'  T ( P)  ( P  C) De  C







P=arbitrary point
C=fixed point of transformation
D=scaling factor
Θ=rotation angle
P and C are complex numbers (x+iy) or reiθ
Store derivates of P in Jacobian matrix for
optimizer
Rigid if D=1, otherwise similarity transform
Affine Transformation
Collinearity is preserved
 x’=A x + T
 A is a complex matrix of coefficients
 With fixed point

– x’=A (x–C) + C

A is optimized similar to the scaling
factor
Quaternions

Quotient of two vectors
– Q= A / B

Operator that produces second vector
– A= Q  B

Represents orientation of one vector
with respect to another, as well as ratio
of their magnitudes
– Versor-rotates vector
– Tensor-changes vector magnitude
Scalars and Versors

Quaternion represented by 4 numbers
– Versor
• Direction – parallel to axis of rotation
• Rotation angle
• Norm – function of rotation angle
– Tensor
• Magnitude
Unit Sphere Versor Representation
Versor Composition

Versor definition (vector quotient)
– VCB = B / C
– VBA = A / B
– VCA = A / C

Versor composition
– VCA = VBA ◊ VCB
– Not communative
Versor Addition
Optimization of Versors

Versor
exponentiation
–
–
–
–

V2 = V ◊ V
V = V1/2 ◊ V1/2
Θ(V) = θ
Θ(Vn) = nθ
Versor Increment
 S (V ) 
dV  


V



Rigid Transform in 3D





Use quaternions instead of phasors
P’=V*(P-C)+C
P’=V*P+T, T=C-V*C
P=point, V=Versor, T=Translation, C=fixed
point
Transform represented by 6 parameters
– Three numbers representing versor
– Three components of fixed coordinate system
Numerical Representation of a
Versor

Right versor
Numerical Representation of a
Versor
-i = k ◊ j
 -j = i ◊ k
 -k = j ◊ i
 Set of elementary quaternions
= [i,j,k]= [eiπ/2 , ejπ/2, ekπ/2]

Numerical Representation of a
Versor

Any right versor can be represented as
– v=xi+yj+zk
– x2+y2+z2=1

Any generic versor can be represented
in terms of the right versor parallel to its
axis and the rotation angle as
– V=evθ
Similarity Transform in 3-D
Replace versor of rigid transform with
quaternion to represent rotation and
scale changes
 x’=Q*(x-C)+C
 x’=Q*x+T, T=C-Q*C

Image Interpolators

2 functions
– Compute interpolated intensity at
requested position
– Detect whether or not requested position
lies within moving-image domain
Nearest Neighbor
Uses intensity of nearest grid position
 Computationally cheap
 Doesn’t require floating point
calculations

Linear Interpolation
Computed as the weighted sum of 2n-1
neighbors
 n=dimensionality of image
 Weighting is based on distance
between requested position and
neighbors

B-spline Interpolation
Intensity calculated by multiplying Bspline coefficients with shifted B-spline
kernels
 Higher spline orders require more pixels
to computer interpolated value
 Third-order B-spline kernels typically
used because good tradeoff between
smoothness and computational burden

Metrics
Scalar function of the set of transform
parameters for a given fixed image,
moving image, and transformation type
 Typically samples points within fixed
image to compute the measure

Mean Squares

Mean squared difference over all the
pixels in an image
1
S ( p | F , M , T) 
N

N
i
[ F ( xi )  M (T( xi , p))] 2
Intensities are interpolated for the
moving image
 For gradient-based optimization,
derivative of metric is also required

Mean Squares
Optimal value of zero
 Interpolator will affect computation time
and smoothness of metric plot
 Assumes intensity representing the
same homologous point is in both
images
 Images must be from same modality

Mean Squares
Smoothness affected by interpolator
Normalized Correlation
Computes pixel-wise cross-correlation
between the intensity of the two images,
normalized by the square root of the
autocorrelation of each image
 For two identical images, metric =1
 Misalignment, metric <1

Normalized Correlation
N
S ( p | F , M , T)  1
 F ( x )  M (T ( x ))
i
i
i
N
N
2
F
(
x
)

M
 i  (T ( xi ))
2
i

i
-1 added for minimum-seeking
optimizers
Normalized Correlation
Difference Density

Each pixel’s contribution is calculated
using bell-shaped function
1
f (d ) 
2
1  (d /  )
f(d) has a maximum of 1 at d=0 and
minimum of zero at d=+/-infinity
 d is difference in intensity b/w F and M

Difference Density

λ controls the rate of drop off
– Corresponds to the difference in intensity
where f(d) has dropped by 50%
Difference Density
S ( p | F , M , T) 
1
1
2
[ F ( xi )  M (T ( xi )) ]
2
Optimal value is N
 Poor matches = small measure values
 Approximates the probability density
function of the difference image and
maximizes its value at zero

Difference Density

Width of peak controlled by λ
Multi-modal Volume Registration by
Maximization of Mutual Information
Wells W, Viola P, Atsumi H, Nakajima S, Kikinis R
Registering Images from Same
Modality
Typical measure of error is sum of
squared differences between voxels
values
 This measure is directly proportional to
the likelihood that the images are
correctly registered
 Same measure is NOT effective for
images of different modalities

Relationship Between Images of
Different Modalities

Example: We should be able to construct a
function F() that predicts CT voxel value from
corresponding MRI value
 Registration could be evaluated by computing
F(MR) and comparing it to the CT image
– Via sum of squared differences (or correlation)

In practice, this is a difficult and underdetermined problem
Mutual Information

Theory: Since MR and CT both describe the
underlying anatomy, there will be mutual
information between the two images
 Find the best registration by maximizing the
information that one image provides about the
other
 Requires no a priori model of the relationship
 Assumes that max. info. is provided when the
images are correctly registered
Notation
Reference (fixed) volume: u(x)
 Test (moving) volume: v(x)
 x: coordinates of the volume
 T: transformation from coordinate frame
of reference volume to test volume
 v(T(x)): test volume voxel associated
with reference volume voxel u(x)

Mutual Information

Defined in terms of entropies
H (A)    p A (a) log p A (a)da
H (B)    pB (b) log pB (b)db
H (A, B)    p AB (a, b) log p AB (a, b)da
I (A, B)  H (A)  H (B)  H (A, B)

If there are any dependencies,
H(A,B)<H(A)+H(B)
Maximizing Mutual Information
Tˆ  arg max I (u ( x), v(T ( x)))
T
I (u ( x), v(T ( x)))  h(u ( x))  h(v(T ( x)))  h(u ( x), v(T ( x)))

h(v(T(x))) encourages transformations that
project u into complex parts of v
 Last term of MI eqn contributes when u and v
are functionally related
 Together, last two terms of MI eqn identify
transforms that find complexity and explain it
well
Parzen Windowing
Used to estimate probability density
P*(z)
 Entropy estimated based on P*(z)

Finding Maximum of I(T)

To find maximum of mutual information,
calculate its derivative:
d
d
d
d
I (T ) 
h * (u ( x)) 
h * (v(T ( x))) 
h * (u ( x), v(T ( x)))
dT
dT
dT
dT
Derivative of reference volume is 0, b/c
not a function of T
 Entropies depend on covariance of
Parzen window functions

Stochastic Maximization of
Mutual Information

Similar to gradient descent
 Steps are taken that are proportional to dI/dT
 Repeat:
– A  {sample of size NA drawn from x}
– B  {sample of size NB drawn from x}
– T  T+λ(dI/dT)
λ is the learning rate
 Repeated a fixed # of times, or until
convergence

Stochastic Approximation

Uses noisy derivative estimates instead of the
true derivative for optimizing a function
 Authors have found that technique always
converges to a transformation estimate that is
close to locally optimal
– NA=NB=50 has been successful

The noise introduced by the sampling can
effectively penetrate small local minima
MRI-CT Example





Coarse to fine registration
Images were smoothed by convolving with
binomial kernel
Rigid transform represented by displacement
vectors and quaternions
Images were sampled and tri-linear
interpolation was used
5 levels of resolution
– 10000, 5000(*4) iterations
Initial Condition of MR-CT
Registration
Final Configuration for MR-CT
Registration
Initial Condition of MR-PET
Registration
Final Configuration for MR-PET
Registration
Application

Register 2 MRIs of brain (SPGR and
T2-weighted) to visualize anatomy and
tumor
– Create at 3-D model for surgical planning
and visualization
3-D Model
Tumor(green), Vessels(red), Ventricles(blue), Edema (orange)
Correlation
Conventional correlation aligns two
signals by minimizing a summed
quadratic difference between their
intensities
 If intensity of one signal is negated, then
intensities no longer agree, and
alignment by correlation will fail
 Mutual information is not affected by
negation of either signal

Occlusion
Correlation is significantly affected by
occlusion because intensity is
substantially different
 Occlusion will reduce mutual
information at alignment

– But “mutual information measure degrades
gracefully when subject to partially
occluded imagery”
Comparison to Other Methods

Many researchers use surface-based methods
to register MRI and PET imagery
– Need for reliable segmentation is a drawback

Others use joint entropy to characterize
registration
– “not robust”: difficulty describing partial overlap
– Mutual Information is better because it has a larger
capture range
• Additional influence from term that rewards for complexity
in portion of test volume into which reference volume is
transformed
Comparison to Other Methods

Woods et al. register MR and PET by
minimizing range of PET values associated
with a particular MR intensity value
– Effective when test volume distribution is
Gaussian
– Mutual Information can handle data that is multimodal
– Woods’ measure is sensitive to noise and outliers
Conclusions
Intensity based techniques work directly
with volumetric data (vs. segmentation)
 Mutual information does not rely on
assumptions about nature of imaging
modalities
 Have also used this technique to
register 3D volumetric images to video
images of patients
