Medical Image Registration: Concepts and Implementation
Medical Image Registration:
Concepts and Implementation
Feb 28, 2006
Jen Mercer
Registration
Spatial transform that maps points from
one image to corresponding points in
another image
Registration Criteria
Landmark-based
– Features selected by the user
Segmentation-based
– Rigidly or deformably align binary
structures
Intensity-based
– Minimize intensity difference over entire
image
Spatial Transformation
Rigid
– Rotations and translations
Affine
– Also, skew and scaling
Deformable
– Free-form mapping
Registration Framework
Transforms
$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} t_x \\ t_y \end{pmatrix}$$

x' = T(x|p) = T(x, y | tx, ty, θ)
Goal: Find parameter values that
optimize image similarity metric
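As a concrete illustration, a minimal sketch in Python/NumPy (function name and example values are mine, not from the slides) of applying the 2-D rigid transform T(x|p) with parameters p = (tx, ty, θ):

import numpy as np

def rigid_transform_2d(points, tx, ty, theta):
    """Apply x' = R(theta) x + t to an (N, 2) array of points."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s],
                  [s,  c]])
    t = np.array([tx, ty])
    return points @ R.T + t

# Example: rotate two points by 90 degrees and shift by (1, 0)
pts = np.array([[1.0, 0.0], [0.0, 1.0]])
print(rigid_transform_2d(pts, tx=1.0, ty=0.0, theta=np.pi / 2))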
Optimizer
Often require derivative of image
similarity metric (S)
$$\frac{\partial S(\mathbf{p} \mid F, M, T)}{\partial p_i} = \sum_j \frac{\partial S}{\partial x'_j} \, \frac{\partial x'_j}{\partial p_i}$$

$$J = \begin{pmatrix} \dfrac{\partial x'_1}{\partial p_1} & \dfrac{\partial x'_1}{\partial p_2} & \cdots & \dfrac{\partial x'_1}{\partial p_m} \\ \dfrac{\partial x'_2}{\partial p_1} & \dfrac{\partial x'_2}{\partial p_2} & \cdots & \dfrac{\partial x'_2}{\partial p_m} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial x'_n}{\partial p_1} & \dfrac{\partial x'_n}{\partial p_2} & \cdots & \dfrac{\partial x'_n}{\partial p_m} \end{pmatrix}$$
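A hedged sketch of how this Jacobian is used: for the 2-D rigid transform with p = (tx, ty, θ), the Jacobian can be written analytically and combined with the image-gradient term through the chain rule (helper names and sample values are illustrative):

import numpy as np

def rigid_jacobian_2d(x, y, theta):
    """Jacobian d x'_j / d p_i of the 2-D rigid transform, for p = (tx, ty, theta)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, -s * x - c * y],
                     [0.0, 1.0,  c * x - s * y]])

# Chain rule: dS/dp = J^T @ dS/dx', where dS/dx' involves the interpolated image gradient
dS_dxprime = np.array([0.2, -0.1])   # illustrative values standing in for the image-gradient term
dS_dp = rigid_jacobian_2d(3.0, 4.0, 0.1).T @ dS_dxprime
print(dS_dp)                         # derivative of the metric w.r.t. (tx, ty, theta)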
Jacobian and Image Gradient
Identity Transform
Maps every point to itself
Only used for testing
Fixed set (C): set of points that remain
unchanged by transform
Translation Transform
Fixed set is an empty set
Scaling Transform
Isotropic vs. anisotropic
Fixed set is the origin of the coordinates
Scaling and Translation
$$\begin{pmatrix} x'_1 \\ x'_2 \\ \vdots \\ x'_N \end{pmatrix} = D \begin{pmatrix} x_1 - C_1 \\ x_2 - C_2 \\ \vdots \\ x_N - C_N \end{pmatrix} + \begin{pmatrix} C_1 \\ C_2 \\ \vdots \\ C_N \end{pmatrix} = D \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix} + (1 - D) \begin{pmatrix} C_1 \\ C_2 \\ \vdots \\ C_N \end{pmatrix} = D \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix} + \begin{pmatrix} T_1 \\ T_2 \\ \vdots \\ T_N \end{pmatrix}$$
Rotation Transform
Fixed set is the origin
$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x - C_x \\ y - C_y \end{pmatrix} + \begin{pmatrix} C_x \\ C_y \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 1 - \cos\theta & \sin\theta \\ -\sin\theta & 1 - \cos\theta \end{pmatrix} \begin{pmatrix} C_x \\ C_y \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} T_x \\ T_y \end{pmatrix}$$
Rotations in Polar Coordinates
$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$

$$(x, y) = r e^{i\phi} = r\cos\phi + i\, r\sin\phi$$

$$(x', y') = r e^{i(\phi + \theta)} = r e^{i\phi}\, e^{i\theta}$$
Optimization
Search for value of θ that minimizes cost
function S
Gradient descent algorithm
– Update of parameter θ:

$$e^{i\theta'} = e^{i\theta}\, e^{iG\lambda} = e^{i\theta} \exp(iG\lambda)$$

– G is the variation obtained from the gradient ∂S/∂θ of the cost function
– λ is the step length of the algorithm
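A toy sketch of this gradient-descent update on θ (the cost function, starting value, and step length are illustrative assumptions, not from the slides):

import numpy as np

def gradient_descent_theta(dS_dtheta, theta0=0.0, step=0.1, n_iter=100):
    """theta' = theta - step * dS/dtheta, iterated a fixed number of times."""
    theta = theta0
    for _ in range(n_iter):
        theta -= step * dS_dtheta(theta)
    return theta

# Toy cost S(theta) = 1 - cos(theta - 0.3), whose derivative is sin(theta - 0.3);
# the minimum is at theta = 0.3
print(gradient_descent_theta(lambda th: np.sin(th - 0.3)))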
Combined Scaling and Rotation
The gradient now has one component for the scaling factor and one for the angle:

$$G_D = \frac{\partial M}{\partial D}, \qquad G_\theta = \frac{\partial M}{\partial \theta}$$

D = scaling factor
M = cost function
Apply transform to a point as:

$$(x', y') = D\, r e^{i(\phi + \theta)} = r e^{i\phi} \cdot D e^{i\theta}$$
Add Translation
Find fixed point of transformation
Translation (d) is result of scaling and
rotation
Scaling, Rotation, Translation

$$P' = T(P) = (P - C)\, D e^{i\theta} + C$$

P = arbitrary point
C = fixed point of transformation
D = scaling factor
θ = rotation angle
P and C are complex numbers (x + iy) or re^{iθ}
Store derivatives of P' in the Jacobian matrix for the optimizer
Rigid if D = 1, otherwise a similarity transform
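Because P and C are complex numbers, this transform maps directly onto Python's built-in complex type; a small sketch, assuming the parameterization above:

import cmath

def similarity_transform_2d(P, C, D, theta):
    """P' = (P - C) * D * e^{i theta} + C, with P and C as complex numbers x + iy."""
    return (P - C) * D * cmath.exp(1j * theta) + C

# Rotate the point (2, 0) by 90 degrees about the fixed point (1, 0), with scale D = 2
print(similarity_transform_2d(2 + 0j, 1 + 0j, D=2.0, theta=cmath.pi / 2))
# -> approximately (1+2j), i.e. the point (1, 2); D = 1 would make this a rigid transform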
Affine Transformation
Collinearity is preserved
x’=A x + T
A is a complex matrix of coefficients
With fixed point
– x’=A (x–C) + C
A is optimized similar to the scaling
factor
Quaternions
Quotient of two vectors
– Q= A / B
Operator that produces second vector
– A= Q B
Represents orientation of one vector
with respect to another, as well as ratio
of their magnitudes
– Versor: rotates the vector
– Tensor: changes the vector's magnitude
Scalars and Versors
Quaternion represented by 4 numbers
– Versor
• Direction – parallel to axis of rotation
• Rotation angle
• Norm – function of rotation angle
– Tensor
• Magnitude
Unit Sphere Versor Representation
Versor Composition
Versor definition (vector quotient)
– VCB = B / C
– VBA = A / B
– VCA = A / C
Versor composition
– VCA = VBA ◊ VCB
– Not commutative
Versor Addition
Optimization of Versors
Versor
exponentiation
– V² = V ◊ V
– V = V^{1/2} ◊ V^{1/2}
– Θ(V) = θ
– Θ(V^n) = nθ
Versor Increment
The increment dV is driven by the derivative of the cost function with respect to the versor:

$$\frac{\partial S(V)}{\partial V}$$
Rigid Transform in 3D
Use quaternions instead of phasors
P’=V*(P-C)+C
P’=V*P+T, T=C-V*C
P=point, V=Versor, T=Translation, C=fixed
point
Transform represented by 6 parameters
– Three numbers representing versor
– Three components of fixed coordinate system
Numerical Representation of a
Versor
Right versor
Numerical Representation of a
Versor
-i = k ◊ j
-j = i ◊ k
-k = j ◊ i
Set of elementary quaternions
= [i, j, k] = [e^{iπ/2}, e^{jπ/2}, e^{kπ/2}]
Numerical Representation of a
Versor
Any right versor can be represented as
– v = xi + yj + zk
– x² + y² + z² = 1
Any generic versor can be represented
in terms of the right versor parallel to its
axis and the rotation angle as
– V = e^{vθ}
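A hedged sketch of building a versor (unit quaternion) from a rotation axis and angle and applying it as P' = V*(P - C) + C; it uses the standard half-angle quaternion convention for rotation by θ, and the helper names are mine:

import numpy as np

def versor_from_axis_angle(axis, theta):
    """Unit quaternion (w, x, y, z) = (cos(theta/2), v*sin(theta/2)) for rotation by theta about v."""
    v = np.asarray(axis, dtype=float)
    v /= np.linalg.norm(v)
    return np.concatenate(([np.cos(theta / 2)], np.sin(theta / 2) * v))

def quat_multiply(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([aw*bw - ax*bx - ay*by - az*bz,
                     aw*bx + ax*bw + ay*bz - az*by,
                     aw*by - ax*bz + ay*bw + az*bx,
                     aw*bz + ax*by - ay*bx + az*bw])

def rotate_about_fixed_point(V, P, C):
    """P' = V (P - C) V^-1 + C for a unit quaternion (versor) V."""
    p = np.concatenate(([0.0], np.asarray(P, dtype=float) - C))
    V_conj = V * np.array([1.0, -1.0, -1.0, -1.0])
    return quat_multiply(quat_multiply(V, p), V_conj)[1:] + C

V = versor_from_axis_angle([0.0, 0.0, 1.0], np.pi / 2)            # 90 degrees about z
print(rotate_about_fixed_point(V, [1.0, 0.0, 0.0], np.zeros(3)))  # -> approximately (0, 1, 0)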
Similarity Transform in 3-D
Replace versor of rigid transform with
quaternion to represent rotation and
scale changes
x’=Q*(x-C)+C
x’=Q*x+T, T=C-Q*C
Image Interpolators
2 functions
– Compute interpolated intensity at
requested position
– Detect whether or not requested position
lies within moving-image domain
Nearest Neighbor
Uses intensity of nearest grid position
Computationally cheap
Doesn’t require floating point
calculations
Linear Interpolation
Computed as the weighted sum of the 2^n
neighbors
n = dimensionality of the image
Weighting is based on distance
between requested position and
neighbors
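A minimal 2-D (bilinear) sketch of this weighted sum over the 2^n = 4 neighbors, with no bounds checking and the image indexed as [row, column]:

import numpy as np

def bilinear_interpolate(image, x, y):
    """Distance-weighted sum of the 4 grid neighbors of the continuous position (x, y)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    dx, dy = x - x0, y - y0
    return ((1 - dx) * (1 - dy) * image[y0,     x0    ] +
            dx       * (1 - dy) * image[y0,     x0 + 1] +
            (1 - dx) * dy       * image[y0 + 1, x0    ] +
            dx       * dy       * image[y0 + 1, x0 + 1])

img = np.array([[ 0.0, 10.0],
                [20.0, 30.0]])
print(bilinear_interpolate(img, 0.5, 0.5))   # -> 15.0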
B-spline Interpolation
Intensity calculated by multiplying B-spline coefficients with shifted B-spline
kernels
Higher spline orders require more pixels
to compute the interpolated value
Third-order B-spline kernels are typically
used because they offer a good tradeoff between
smoothness and computational burden
Metrics
Scalar function of the set of transform
parameters for a given fixed image,
moving image, and transformation type
Typically samples points within fixed
image to compute the measure
Mean Squares
Mean squared difference over all the
pixels in an image
$$S(\mathbf{p} \mid F, M, T) = \frac{1}{N} \sum_{i=1}^{N} \left[ F(x_i) - M(T(x_i, \mathbf{p})) \right]^2$$
Intensities are interpolated for the
moving image
For gradient-based optimization,
derivative of metric is also required
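A hedged sketch of the mean-squares metric; transform_fn and interpolate_fn are placeholders standing in for the transform and interpolator components described on the earlier slides:

import numpy as np

def mean_squares_metric(fixed_values, fixed_points, moving_image,
                        transform_fn, interpolate_fn):
    """S(p) = (1/N) * sum_i [F(x_i) - M(T(x_i, p))]^2."""
    diffs = []
    for F_xi, x_i in zip(fixed_values, fixed_points):
        x_mapped = transform_fn(x_i)                       # T(x_i, p)
        M_Txi = interpolate_fn(moving_image, *x_mapped)    # interpolated moving-image intensity
        diffs.append((F_xi - M_Txi) ** 2)
    return np.mean(diffs)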
Mean Squares
Optimal value of zero
Interpolator will affect computation time
and smoothness of metric plot
Assumes intensity representing the
same homologous point is in both
images
Images must be from same modality
Mean Squares
Smoothness affected by interpolator
Normalized Correlation
Computes pixel-wise cross-correlation
between the intensity of the two images,
normalized by the square root of the
autocorrelation of each image
For two identical images, metric =1
Misalignment, metric <1
Normalized Correlation
$$S(\mathbf{p} \mid F, M, T) = -1 \cdot \frac{\displaystyle\sum_{i=1}^{N} F(x_i)\, M(T(x_i))}{\sqrt{\displaystyle\sum_{i=1}^{N} F(x_i)^2 \;\displaystyle\sum_{i=1}^{N} M(T(x_i))^2}}$$
-1 added for minimum-seeking
optimizers
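A small sketch of this measure, assuming the fixed intensities and the interpolated moving intensities have already been sampled into two equal-length arrays:

import numpy as np

def normalized_correlation(fixed_vals, moving_vals):
    """S = -1 * sum(F*M) / sqrt(sum(F^2) * sum(M^2)); equals -1 for two identical images."""
    F = np.asarray(fixed_vals, dtype=float)
    M = np.asarray(moving_vals, dtype=float)
    return -np.sum(F * M) / np.sqrt(np.sum(F ** 2) * np.sum(M ** 2))

print(normalized_correlation([1, 2, 3], [1, 2, 3]))   # -> -1.0
print(normalized_correlation([1, 2, 3], [3, 2, 1]))   # > -1.0, i.e. a worse (misaligned) match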
Normalized Correlation
Difference Density
Each pixel’s contribution is calculated
using bell-shaped function
$$f(d) = \frac{1}{1 + \dfrac{d^2}{\lambda^2}}$$

f(d) has a maximum of 1 at d = 0 and a
minimum of zero at d = ±∞
d is the difference in intensity between F and M
Difference Density
λ controls the rate of drop off
– Corresponds to the difference in intensity
where f(d) has dropped by 50%
Difference Density
$$S(\mathbf{p} \mid F, M, T) = \sum_{i=1}^{N} \frac{1}{1 + \dfrac{1}{\lambda^2} \left[ F(x_i) - M(T(x_i)) \right]^2}$$
Optimal value is N
Poor matches = small measure values
Approximates the probability density
function of the difference image and
maximizes its value at zero
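A small sketch of the difference-density measure over pre-sampled intensity arrays; lam stands for the λ parameter above:

import numpy as np

def difference_density(fixed_vals, moving_vals, lam):
    """S = sum_i 1 / (1 + (F(x_i) - M(T(x_i)))^2 / lam^2); optimal value is N."""
    d = np.asarray(fixed_vals, dtype=float) - np.asarray(moving_vals, dtype=float)
    return np.sum(1.0 / (1.0 + (d / lam) ** 2))

print(difference_density([1, 2, 3], [1, 2, 3], lam=10.0))   # -> 3.0 (N = 3, perfect match)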
Difference Density
Width of peak controlled by λ
Multi-modal Volume Registration by
Maximization of Mutual Information
Wells W, Viola P, Atsumi H, Nakajima S, Kikinis R
Registering Images from Same
Modality
Typical measure of error is the sum of
squared differences between voxel
values
This measure is directly proportional to
the likelihood that the images are
correctly registered
Same measure is NOT effective for
images of different modalities
Relationship Between Images of
Different Modalities
Example: We should be able to construct a
function F() that predicts CT voxel value from
corresponding MRI value
Registration could be evaluated by computing
F(MR) and comparing it to the CT image
– Via sum of squared differences (or correlation)
In practice, this is a difficult and underdetermined problem
Mutual Information
Theory: Since MR and CT both describe the
underlying anatomy, there will be mutual
information between the two images
Find the best registration by maximizing the
information that one image provides about the
other
Requires no a priori model of the relationship
Assumes that max. info. is provided when the
images are correctly registered
Notation
Reference (fixed) volume: u(x)
Test (moving) volume: v(x)
x: coordinates of the volume
T: transformation from coordinate frame
of reference volume to test volume
v(T(x)): test volume voxel associated
with reference volume voxel u(x)
Mutual Information
Defined in terms of entropies
$$H(A) = -\int p_A(a) \log p_A(a)\, da$$

$$H(B) = -\int p_B(b) \log p_B(b)\, db$$

$$H(A, B) = -\int p_{AB}(a, b) \log p_{AB}(a, b)\, da\, db$$

$$I(A, B) = H(A) + H(B) - H(A, B)$$
If there are any dependencies,
H(A,B)<H(A)+H(B)
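As a simple illustration of these definitions, mutual information can be estimated from a joint histogram of the two intensity samples (the paper itself uses Parzen windowing, covered below); a hedged sketch:

import numpy as np

def mutual_information(a, b, bins=32):
    """I(A, B) = H(A) + H(B) - H(A, B), estimated from a joint intensity histogram."""
    p_ab, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab /= p_ab.sum()
    p_a, p_b = p_ab.sum(axis=1), p_ab.sum(axis=0)

    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    return entropy(p_a) + entropy(p_b) - entropy(p_ab.ravel())

rng = np.random.default_rng(0)
a = rng.normal(size=2000)
print(mutual_information(a, a))                        # large: each signal fully predicts the other
print(mutual_information(a, rng.normal(size=2000)))    # near zero: independent signals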
Maximizing Mutual Information
$$\hat{T} = \arg\max_{T}\; I\big(u(x),\, v(T(x))\big)$$

$$I\big(u(x), v(T(x))\big) = h\big(u(x)\big) + h\big(v(T(x))\big) - h\big(u(x), v(T(x))\big)$$
h(v(T(x))) encourages transformations that
project u into complex parts of v
Last term of MI eqn contributes when u and v
are functionally related
Together, last two terms of MI eqn identify
transforms that find complexity and explain it
well
Parzen Windowing
Used to estimate probability density
P*(z)
Entropy estimated based on P*(z)
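A hedged sketch of a Parzen density estimate P*(z) with Gaussian kernels, and an entropy estimate built from it in the spirit of Wells et al.; the kernel width sigma and sample sizes are assumed values:

import numpy as np

def parzen_density(z, sample, sigma=1.0):
    """P*(z): average of Gaussian kernels centered on the sample points."""
    z = np.atleast_1d(np.asarray(z, dtype=float))[:, None]
    sample = np.asarray(sample, dtype=float)[None, :]
    k = np.exp(-0.5 * ((z - sample) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return k.mean(axis=1)

def entropy_estimate(sample_a, sample_b, sigma=1.0):
    """h*(z) ~ -(1/N_B) * sum over sample B of log P*(z), with P* built from sample A."""
    return -np.mean(np.log(parzen_density(sample_b, sample_a, sigma)))

rng = np.random.default_rng(0)
data = rng.normal(size=200)
print(entropy_estimate(data[:100], data[100:]))   # rough entropy estimate of a unit Gaussian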
Finding Maximum of I(T)
To find maximum of mutual information,
calculate its derivative:
$$\frac{d}{dT} I(T) = \frac{d}{dT} h^*\big(u(x)\big) + \frac{d}{dT} h^*\big(v(T(x))\big) - \frac{d}{dT} h^*\big(u(x), v(T(x))\big)$$
Derivative of the reference-volume term is 0,
because it is not a function of T
Entropies depend on covariance of
Parzen window functions
Stochastic Maximization of
Mutual Information
Similar to gradient descent
Steps are taken that are proportional to dI/dT
Repeat:
– A ← {sample of size NA drawn from x}
– B ← {sample of size NB drawn from x}
– T ← T + λ(dI/dT)
λ is the learning rate
Repeated a fixed # of times, or until
convergence
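A schematic sketch of this loop; dI_dT_estimate is a placeholder for the noisy derivative of mutual information computed from the two random samples:

import numpy as np

def stochastic_maximize(T0, dI_dT_estimate, x_coords, rate=0.01,
                        n_iter=500, NA=50, NB=50, seed=0):
    """Repeat: draw samples A and B from x, then step T by rate * (noisy dI/dT)."""
    rng = np.random.default_rng(seed)
    T = np.asarray(T0, dtype=float)
    for _ in range(n_iter):
        idx_a = rng.choice(len(x_coords), size=NA, replace=False)
        idx_b = rng.choice(len(x_coords), size=NB, replace=False)
        T = T + rate * dI_dT_estimate(T, x_coords[idx_a], x_coords[idx_b])
    return T   # gradient ascent, since mutual information is being maximized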
Stochastic Approximation
Uses noisy derivative estimates instead of the
true derivative for optimizing a function
Authors have found that technique always
converges to a transformation estimate that is
close to locally optimal
– NA=NB=50 has been successful
The noise introduced by the sampling can
effectively penetrate small local minima
MRI-CT Example
Coarse to fine registration
Images were smoothed by convolving with
binomial kernel
Rigid transform represented by displacement
vectors and quaternions
Images were sampled and tri-linear
interpolation was used
5 levels of resolution
– 10000, 5000(*4) iterations
Initial Condition of MR-CT
Registration
Final Configuration for MR-CT
Registration
Initial Condition of MR-PET
Registration
Final Configuration for MR-PET
Registration
Application
Register 2 MRIs of brain (SPGR and
T2-weighted) to visualize anatomy and
tumor
– Create a 3-D model for surgical planning
and visualization
3-D Model
Tumor (green), Vessels (red), Ventricles (blue), Edema (orange)
Correlation
Conventional correlation aligns two
signals by minimizing a summed
quadratic difference between their
intensities
If intensity of one signal is negated, then
intensities no longer agree, and
alignment by correlation will fail
Mutual information is not affected by
negation of either signal
Occlusion
Correlation is significantly affected by
occlusion because intensity is
substantially different
Occlusion will reduce mutual
information at alignment
– But “mutual information measure degrades
gracefully when subject to partially
occluded imagery”
Comparison to Other Methods
Many researchers use surface-based methods
to register MRI and PET imagery
– Need for reliable segmentation is a drawback
Others use joint entropy to characterize
registration
– “not robust”: difficulty describing partial overlap
– Mutual Information is better because it has a larger
capture range
• Additional influence comes from the term that rewards complexity
in the portion of the test volume into which the reference volume is
transformed
Comparison to Other Methods
Woods et al. register MR and PET by
minimizing range of PET values associated
with a particular MR intensity value
– Effective when test volume distribution is
Gaussian
– Mutual Information can handle data that is multimodal
– Woods’ measure is sensitive to noise and outliers
Conclusions
Intensity-based techniques work directly
with volumetric data (vs. segmentation-based approaches)
Mutual information does not rely on
assumptions about nature of imaging
modalities
Have also used this technique to
register 3D volumetric images to video
images of patients