From Pixels to Features:
Review of Part 1
COMP 4900C
Winter 2008
Topics in part 1 – from pixels to features
• Introduction
• what is computer vision? its applications.
• Linear Algebra
• vectors, matrices, points, linear transformations, eigenvalues,
eigenvectors, least squares methods, singular value decomposition.
• Image Formation
• camera lens, pinhole camera, perspective projection.
• Camera Model
• coordinate transformation, homogeneous coordinate,
intrinsic and extrinsic parameters, projection matrix.
• Image Processing
• noise, convolution, filters (average, Gaussian, median).
• Image Features
• image derivatives, edge, corner, line (Hough transform), ellipse.
General Methods
• Mathematical formulation
• Camera model, noise model
• Treat images as functions: I = f(x, y)
• Model intensity changes with derivatives: ∇f = [I_x, I_y]^T
• Approximate derivatives with finite differences.
• First-order approximation:
  I(i+u, j+v) ≈ I(i, j) + I_x u + I_y v = I(i, j) + [u v] ∇f
• Parameter fitting – solving an optimization problem
Vectors and Points
We use vectors to represent points in 2 or 3 dimensions.
Given two points P(x1, y1) and Q(x2, y2), the vector from P to Q is

v = Q − P = [x2 − x1, y2 − y1]^T

The distance between the two points:

D = ||Q − P|| = sqrt( (x2 − x1)^2 + (y2 − y1)^2 )
Homogeneous Coordinates
Go one dimension higher:

[x, y]^T → [wx, wy, w]^T
[x, y, z]^T → [wx, wy, wz, w]^T

w is an arbitrary non-zero scalar; usually we choose w = 1.

From homogeneous coordinates to Cartesian coordinates:

[x1, x2, x3]^T → (x1/x3, x2/x3)
[x1, x2, x3, x4]^T → (x1/x4, x2/x4, x3/x4)
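The two conversions above can be sketched in a few lines of numpy (the function names are mine, introduced for illustration):

```python
import numpy as np

def to_homogeneous(p, w=1.0):
    """Scale a Cartesian point by w and append w as the extra coordinate."""
    return np.append(np.asarray(p, dtype=float) * w, w)

def to_cartesian(ph):
    """Divide by the last component to recover Cartesian coordinates."""
    ph = np.asarray(ph, dtype=float)
    return ph[:-1] / ph[-1]

p = [3.0, 4.0]
print(to_homogeneous(p, w=2.0))       # [6. 8. 2.] -- any non-zero w represents the same point
print(to_cartesian([6.0, 8.0, 2.0]))  # [3. 4.]
```

Note that [6, 8, 2] and [3, 4, 1] are the same homogeneous point; the representation is only defined up to a non-zero scale.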
2D Transformation with Homogeneous Coordinates
2D coordinate transformation:

p' = [cos θ  −sin θ; sin θ  cos θ] p + [Tx; Ty]

2D coordinate transformation using homogeneous coordinates:

[p'_x; p'_y; 1] = [cos θ  −sin θ  Tx; sin θ  cos θ  Ty; 0  0  1] [p_x; p_y; 1]
Eigenvalue and Eigenvector
We say that x is an eigenvector of a square matrix A if

A x = λ x

λ is called an eigenvalue and x is called an eigenvector.
The transformation defined by A changes only the magnitude of the vector x.

Example:

[3 2; 1 4] [1; 1] = [5; 5] = 5 [1; 1]   and   [3 2; 1 4] [2; −1] = [4; −2] = 2 [2; −1]

5 and 2 are eigenvalues, and [1, 1]^T and [2, −1]^T are eigenvectors.
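The example is easy to verify with numpy (np.linalg.eig returns unit-norm eigenvectors, so they match the slide's vectors only up to scale):

```python
import numpy as np

A = np.array([[3.0, 2.0],
              [1.0, 4.0]])

# A x = lambda x holds for every eigenpair returned by eig
vals, vecs = np.linalg.eig(A)
for lam, x in zip(vals, vecs.T):
    assert np.allclose(A @ x, lam * x)

print(sorted(vals))               # the eigenvalues 2 and 5
print(A @ np.array([1.0, 1.0]))   # [5. 5.] = 5 * [1, 1]
print(A @ np.array([2.0, -1.0]))  # [4. -2.] = 2 * [2, -1]
```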
Symmetric Matrix
We say matrix A is symmetric if

A^T = A

Example: B^T B is symmetric for any B, because

(B^T B)^T = B^T (B^T)^T = B^T B

A symmetric matrix has to be a square matrix.
Properties of a symmetric matrix:
• it has real eigenvalues;
• its eigenvectors can be chosen to be orthonormal;
• B^T B has non-negative eigenvalues.
Orthogonal Matrix
A matrix A is orthogonal if

A^T A = I    or    A^T = A^{−1}

The columns of A are orthonormal.

Example:

A = [cos θ  −sin θ; sin θ  cos θ],   A^{−1} = A^T = [cos θ  sin θ; −sin θ  cos θ]
Least Square
When m > n for an m-by-n matrix A,

A x = b

has no solution in general.
In this case, we look for an approximate solution: the vector x such that

||A x − b||^2

is as small as possible.
This is the least squares solution.
Least Square
Least squares solution of the linear system of equations

A x = b

Normal equation:

A^T A x = A^T b

A^T A is square and symmetric.
The least squares solution

x = (A^T A)^{−1} A^T b

makes ||A x − b||^2 minimal.
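Both routes to the least squares solution can be compared on a small overdetermined system (the data below are made up for illustration):

```python
import numpy as np

# Overdetermined system: 3 equations, 2 unknowns (m > n), so Ax = b
# generally has no exact solution; minimise ||Ax - b||^2 instead.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Normal-equation solution: x = (A^T A)^{-1} A^T b
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Library solution (uses a more numerically stable factorisation)
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(x_normal)  # both routes give the same minimiser
print(x_lstsq)
```

In practice np.linalg.lstsq (or a QR/SVD factorisation) is preferred over forming A^T A explicitly, because A^T A squares the condition number of the problem.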
SVD: Singular Value Decomposition
An m×n matrix A can be decomposed into:

A = U D V^T

U is m×m, V is n×n, and both have orthonormal columns:

U^T U = I,   V^T V = I

D is an m×n diagonal matrix.

Example:

[2 0; 0 3; 0 0] = [1 0 0; 0 1 0; 0 0 1] [2 0; 0 3; 0 0] [1 0; 0 1]^T
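numpy's SVD reproduces the example (it returns V^T directly and lists the singular values in descending order, so the factors may differ from the slide's by a permutation and signs):

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 3.0],
              [0.0, 0.0]])

# full_matrices=True gives U as m x m and Vt as n x n
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild the m x n "diagonal" matrix D from the singular values
D = np.zeros_like(A)
D[:len(s), :len(s)] = np.diag(s)

assert np.allclose(U @ D @ Vt, A)          # A = U D V^T
assert np.allclose(U.T @ U, np.eye(3))     # U^T U = I
assert np.allclose(Vt @ Vt.T, np.eye(2))   # V^T V = I
print(s)  # singular values in descending order: [3. 2.]
```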
Pinhole Camera
Why Lenses?
Gather more light from each scene point.
Four Coordinate Frames
The camera model involves four coordinate frames: the world frame (Xw, Yw, Zw),
the camera frame (Xc, Yc, Zc) centred at the optical center, the image plane
frame (x, y) centred at the principal point, and the pixel frame (xim, yim).
The camera model is a transformation matrix that maps a world point Pw to its
pixel coordinates pim.
Perspective Projection
A world point P = (X, Y, Z) projects to the image point p = (x, y):

x = f X / Z,   y = f Y / Z

These equations are nonlinear. Using homogeneous coordinates, we have a linear relation:

[u; v; w] = [f 0 0 0; 0 f 0 0; 0 0 1 0] [X; Y; Z; 1]

with x = u/w and y = v/w.

(The principal axis passes through the optical center and the principal point,
perpendicular to the image plane.)
World to Camera Coordinate
Transformation between the camera and world coordinates:

X_c = R X_w + T

or, in homogeneous coordinates:

[Xc; Yc; Zc; 1] = [R T; 0 1] [Xw; Yw; Zw; 1]
Image Coordinates to Pixel Coordinates
x = (xim − ox) s_x,   y = (yim − oy) s_y

s_x, s_y : the pixel sizes; (ox, oy) : the principal point in pixel coordinates.

In homogeneous coordinates:

[xim; yim; 1] = [1/s_x 0 ox; 0 1/s_y oy; 0 0 1] [x; y; 1]
Put All Together – World to Pixel
[x1; x2; x3] = [1/s_x 0 ox; 0 1/s_y oy; 0 0 1] [f 0 0 0; 0 f 0 0; 0 0 1 0] [R T; 0 1] [Xw; Yw; Zw; 1]

             = [f/s_x 0 ox; 0 f/s_y oy; 0 0 1] [1 0 0 0; 0 1 0 0; 0 0 1 0] [R T; 0 1] [Xw; Yw; Zw; 1]

             = K [R T] [Xw; Yw; Zw; 1]

with xim = x1/x3 and yim = x2/x3.
Camera Intrinsic Parameters
K = [f/s_x 0 ox; 0 f/s_y oy; 0 0 1]

K is a 3x3 upper triangular matrix, called the Camera Calibration Matrix.
There are five intrinsic parameters:
(a) the pixel sizes in the x and y directions, s_x and s_y;
(b) the focal length f;
(c) the principal point (ox, oy), which is the point
where the optical axis intersects the image plane.
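Putting the intrinsic and extrinsic pieces together in numpy, with made-up parameter values (focal length, pixel size, principal point and pose below are purely illustrative):

```python
import numpy as np

# Hypothetical intrinsics: f = 0.05 m, square 10-micron pixels, principal point (320, 240)
f, sx, sy, ox, oy = 0.05, 1e-5, 1e-5, 320.0, 240.0
K = np.array([[f / sx, 0.0,    ox],
              [0.0,    f / sy, oy],
              [0.0,    0.0,    1.0]])

# Hypothetical extrinsics: identity rotation, world origin 2 m in front of the camera
R = np.eye(3)
T = np.array([0.0, 0.0, 2.0])

M = K @ np.hstack([R, T.reshape(3, 1)])   # 3x4 projection matrix M = K [R | T]

Pw = np.array([0.1, 0.2, 0.0, 1.0])       # world point in homogeneous coordinates
x1, x2, x3 = M @ Pw
xim, yim = x1 / x3, x2 / x3               # divide by the third coordinate -> pixels
print(xim, yim)                            # 570.0 740.0
```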
Extrinsic Parameters
p_im = [x1; x2; x3] = K [R T] [Xw; Yw; Zw; 1] = M [Xw; Yw; Zw; 1]
[R|T] defines the extrinsic parameters.
The 3x4 matrix M = K[R|T] is called the projection matrix.
Image Noise
Additive and random noise:
Î(x, y) = I(x, y) + n(x, y)
I(x,y) : the true pixel values
n(x,y) : the (random) noise at pixel (x,y)
Gaussian Distribution
Single variable:

p(x) = (1 / (sqrt(2π) σ)) e^{−(x − μ)^2 / (2σ^2)}
Gaussian Distribution
Bivariate, with zero mean and variance σ^2:

G(x, y) = (1 / (2π σ^2)) exp( −(x^2 + y^2) / (2σ^2) )
Gaussian Noise
Gaussian noise is used to model additive random noise:
• the probability of n(x, y) is proportional to e^{−n^2 / (2σ^2)};
• each n(x, y) has zero mean;
• the noise at each pixel is independent.
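A quick numpy illustration of the model (the image size and σ = 5 are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

I = np.full((64, 64), 100.0)           # constant "true" image
sigma = 5.0
n = rng.normal(0.0, sigma, I.shape)    # zero-mean, i.i.d. at every pixel
I_noisy = I + n                        # additive model: I^ = I + n

print(n.mean())   # close to 0
print(n.std())    # close to sigma
```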
Impulsive Noise
• Alters random pixels
• Makes their values very different from the true ones
Salt-and-Pepper Noise:
• Is used to model impulsive noise
I_sp(h, k) = I(h, k)                           if x(h, k) < l
             i_min + y(h, k)(i_max − i_min)    otherwise

x, y are uniformly distributed random variables in [0, 1];
l, i_min, i_max are constants.
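A sketch of this noise model in numpy (the constants l, i_min, i_max below are arbitrary example values):

```python
import numpy as np

rng = np.random.default_rng(1)

def salt_and_pepper(I, l=0.95, imin=0.0, imax=255.0, rng=rng):
    """With probability 1 - l, replace a pixel by a uniform value in [imin, imax]."""
    x = rng.uniform(size=I.shape)   # decides which pixels are corrupted
    y = rng.uniform(size=I.shape)   # value given to the corrupted pixels
    return np.where(x < l, I, imin + y * (imax - imin))

I = np.full((100, 100), 128.0)
I_sp = salt_and_pepper(I)
corrupted = (I_sp != I).mean()
print(corrupted)   # roughly 1 - l = 0.05
```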
Image Filtering
Modifying the pixels in an image based on some
function of a local neighbourhood of the pixels:
the output value f(p) at a pixel p is computed from
the values in its neighbourhood N(p).
Linear Filtering – convolution
The output is a linear combination of the neighbourhood pixels:

I_A(i, j) = (I * A)(i, j) = Σ_{h=−m/2}^{m/2} Σ_{k=−m/2}^{m/2} A(h, k) I(i − h, j − k)

The coefficients come from a constant m×m matrix A, called the kernel.
This process, denoted by '*', is called (discrete) convolution.
Example: image patch [1 3 0; 2 10 2; 4 1 1], kernel [1 0 −1; 1 0.1 −1; 1 0 −1].
Centring the kernel on the middle pixel and summing the elementwise products:

1·1 + 3·0 + 0·(−1) + 2·1 + 10·0.1 + 2·(−1) + 4·1 + 1·0 + 1·(−1) = 5

Filter output: 5
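The worked example can be reproduced in numpy. One caveat: summing elementwise products of the kernel and the patch, as the slide does, is cross-correlation; convolution as defined above would flip this (asymmetric) kernel first. For symmetric kernels such as the averaging and Gaussian kernels below, the two operations coincide.

```python
import numpy as np

patch = np.array([[1.0,  3.0, 0.0],
                  [2.0, 10.0, 2.0],
                  [4.0,  1.0, 1.0]])
kernel = np.array([[1.0, 0.0, -1.0],
                   [1.0, 0.1, -1.0],
                   [1.0, 0.0, -1.0]])

# Centre the kernel on the middle pixel and sum the elementwise products
out = float((patch * kernel).sum())
print(out)  # 5.0
```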
Smoothing by Averaging
Kernel: (1/9) [1 1 1; 1 1 1; 1 1 1]

Convolution can be understood as weighted averaging.
Gaussian Filter
G(x, y) = (1 / (2π σ^2)) exp( −(x^2 + y^2) / (2σ^2) )

Discrete Gaussian kernel:

G(h, k) = (1 / (2π σ^2)) e^{−(h^2 + k^2) / (2σ^2)}

where G(h, k) is an element of an m×m array.
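A sketch of building the discrete kernel. Normalising the samples so they sum to 1 is standard practice (it preserves the mean intensity), though it is not stated on the slide:

```python
import numpy as np

def gaussian_kernel(m, sigma):
    """m x m kernel sampled from G(h, k) = exp(-(h^2 + k^2) / (2 sigma^2)) / (2 pi sigma^2)."""
    r = np.arange(m) - m // 2                 # offsets -m/2 .. m/2
    h, k = np.meshgrid(r, r, indexing="ij")
    G = np.exp(-(h**2 + k**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    return G / G.sum()                        # normalise so the weights sum to 1

G = gaussian_kernel(5, sigma=1.0)
print(G.shape)              # (5, 5)
print(G.sum())              # 1.0
print(G[2, 2] == G.max())   # True -- the peak is at the centre
```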
Gaussian Filter
(Example: an image convolved with a Gaussian kernel.)
Gaussian Kernel is Separable
I_G = I * G = Σ_{h=−m/2}^{m/2} Σ_{k=−m/2}^{m/2} G(h, k) I(i − h, j − k)

    = Σ_{h=−m/2}^{m/2} Σ_{k=−m/2}^{m/2} (1/(2πσ^2)) e^{−(h^2 + k^2)/(2σ^2)} I(i − h, j − k)

    = (1/(2πσ^2)) Σ_{h=−m/2}^{m/2} e^{−h^2/(2σ^2)} [ Σ_{k=−m/2}^{m/2} e^{−k^2/(2σ^2)} I(i − h, j − k) ]

since e^{−(h^2 + k^2)/(2σ^2)} = e^{−h^2/(2σ^2)} e^{−k^2/(2σ^2)}.
Gaussian Kernel is Separable
Convolve the rows and then the columns with a 1-D Gaussian kernel,
e.g. (1/38) [1 9 18 9 1]:

I * (1/38)[1 9 18 9 1] = I_r
I_r * (1/38)[1 9 18 9 1]^T = result

The complexity increases linearly with m instead of with m^2.
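The separability claim is easy to check numerically, here with the 1-D kernel (1/38)[1 9 18 9 1] from the slide and a random test image:

```python
import numpy as np

# 1-D kernel from the slide, and the equivalent 2-D kernel (its outer product)
g = np.array([1.0, 9.0, 18.0, 9.0, 1.0]) / 38.0
G2 = np.outer(g, g)

rng = np.random.default_rng(2)
I = rng.uniform(size=(20, 20))

def conv2_valid(I, K):
    """Direct 2-D 'valid' convolution (no padding), for checking only."""
    m = K.shape[0]
    Kf = K[::-1, ::-1]                      # convolution flips the kernel
    H, W = I.shape[0] - m + 1, I.shape[1] - m + 1
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = (I[i:i+m, j:j+m] * Kf).sum()
    return out

# Rows first, then columns, each pass a cheap 1-D convolution
rows = np.apply_along_axis(lambda r: np.convolve(r, g, mode="valid"), 1, I)
sep = np.apply_along_axis(lambda c: np.convolve(c, g, mode="valid"), 0, rows)

assert np.allclose(sep, conv2_valid(I, G2))  # same result, ~m ops/pixel per pass
```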
Gaussian vs. Average
Gaussian Smoothing
Smoothing by Averaging
Nonlinear Filtering – median filter
Replace each pixel value I(i, j) with the median of the values
found in a local neighbourhood of (i, j).
Median Filter
Salt-and-pepper noise
After median filtering
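A direct (and deliberately slow) sketch of the median filter; for simplicity the border pixels are left untouched:

```python
import numpy as np

def median_filter(I, size=3):
    """Replace each interior pixel by the median of its size x size neighbourhood."""
    r = size // 2
    out = I.astype(float).copy()
    for i in range(r, I.shape[0] - r):
        for j in range(r, I.shape[1] - r):
            out[i, j] = np.median(I[i-r:i+r+1, j-r:j+r+1])
    return out

I = np.full((9, 9), 100.0)
I[4, 4] = 255.0                # one "salt" pixel
out = median_filter(I)
print(out[4, 4])               # 100.0 -- the impulse is removed entirely
```

This is why the median filter works so well on salt-and-pepper noise: a single outlier can never reach the middle of the sorted neighbourhood, whereas an averaging filter would smear it over the window.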
Edges in Images
Definition of edges:
• Edges are significant local changes of intensity in an image.
• Edges typically occur on the boundary between two different regions in an image.
Images as Functions
An image is a 2-D function I = f(x, y)
(e.g. the red channel intensity plotted as a surface).
Finite Difference – 2D
Continuous function:

∂f(x, y)/∂x = lim_{h→0} ( f(x+h, y) − f(x, y) ) / h
∂f(x, y)/∂y = lim_{h→0} ( f(x, y+h) − f(x, y) ) / h

Discrete approximation:

I_x ≈ f(i+1, j) − f(i, j)
I_y ≈ f(i, j+1) − f(i, j)

Convolution kernels: [−1 1] and [−1 1]^T
Image Derivatives
I_x = I * [−1 1]
I_y = I * [−1 1]^T
Edge Detection using Derivatives
1-D image: f(x)
1st derivative: f'(x)
Thresholding: |f'(x)| > threshold
Pixels that pass the threshold are edge pixels.
Image Gradient
gradient:  ∇f = [f_x, f_y]^T
magnitude: ||∇f|| = sqrt( f_x^2 + f_y^2 )
direction: θ = arctan( f_y / f_x )
Finite Difference for Gradient
Discrete approximation:

I_x(i, j) = ∂f/∂x ≈ f(i+1, j) − f(i, j)
I_y(i, j) = ∂f/∂y ≈ f(i, j+1) − f(i, j)

Convolution kernels: [−1 1] and [−1 1]^T

magnitude:         G(i, j) = sqrt( I_x(i, j)^2 + I_y(i, j)^2 )
approx. magnitude: G(i, j) = |I_x| + |I_y|
direction:         arctan( I_y / I_x )
Edge Detection Using the Gradient
Properties of the gradient:
• The magnitude of gradient
provides information about the
strength of the edge
• The direction of gradient is
always perpendicular to the
direction of the edge
Main idea:
• Compute derivatives in x and y directions
• Find gradient magnitude
• Threshold gradient magnitude
Edge Detection Algorithm
Image I  *  [−1 1]   → I_x
Image I  *  [−1 1]^T → I_y
sqrt( I_x^2 + I_y^2 )  → Threshold → edges
Edge Detection Example
(The image I and its derivative images I_x and I_y.)
Edge Detection Example
(The gradient magnitude G(i, j) = sqrt( I_x^2(i, j) + I_y^2(i, j) ) of the image I,
and the pixels where G(i, j) > Threshold.)
Finite differences responding to noise
Increasing noise →
(this is zero-mean additive Gaussian noise)
Solution: smooth first.
Where is the edge? Look for peaks in the derivative of the smoothed signal.
Sobel Edge Detector
Approximate the derivatives with central differences:

I_x(i, j) = ∂f/∂x ≈ f(i+1, j) − f(i−1, j)

and smooth by adding the three neighbouring differences,
giving more weight to the middle one.

Convolution kernel for I_x:       Convolution kernel for I_y:

[−1 0 1; −2 0 2; −1 0 1]          [−1 −2 −1; 0 0 0; 1 2 1]
Sobel Operator Example
Convolving the kernels with an image patch [a1 a2 a3; a4 a5 a6; a7 a8 a9]
gives the approximate gradient at a5:

I_x = (a1 − a3) + 2(a4 − a6) + (a7 − a9)
I_y = (a1 − a7) + 2(a2 − a8) + (a3 − a9)
Sobel Edge Detector
Image I  *  [−1 0 1; −2 0 2; −1 0 1]   → I_x
Image I  *  [−1 −2 −1; 0 0 0; 1 2 1]   → I_y
sqrt( I_x^2 + I_y^2 )  → Threshold → edges
Edge Detection Summary
Input: an image I and a threshold τ.
1. Noise smoothing: I_s = I * h  (e.g. h is a Gaussian kernel).
2. Compute two gradient images I_x and I_y by convolving I_s
   with gradient kernels (e.g. the Sobel operator).
3. Estimate the gradient magnitude at each pixel:
   G(i, j) = sqrt( I_x^2(i, j) + I_y^2(i, j) )
4. Mark as edges all pixels (i, j) such that G(i, j) > τ.
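The four-step recipe above can be sketched in numpy; the kernel size, σ and threshold here are illustrative choices, and the slow double loop stands in for an optimised convolution routine:

```python
import numpy as np

def conv2_same(I, K):
    """2-D convolution, zero-padded to keep the image size; flips K as in the definition."""
    m = K.shape[0] // 2
    Ip = np.pad(I, m)
    Kf = K[::-1, ::-1]
    out = np.empty_like(I, dtype=float)
    for i in range(I.shape[0]):
        for j in range(I.shape[1]):
            out[i, j] = (Ip[i:i+K.shape[0], j:j+K.shape[1]] * Kf).sum()
    return out

# Sobel kernels, as in the slides
KX = np.array([[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]])
KY = np.array([[-1.0, -2.0, -1.0], [0.0, 0.0, 0.0], [1.0, 2.0, 1.0]])

def detect_edges(I, tau, sigma=1.0):
    # 1. Noise smoothing with a (5x5) Gaussian kernel h
    r = np.arange(-2, 3)
    h, k = np.meshgrid(r, r, indexing="ij")
    G = np.exp(-(h**2 + k**2) / (2 * sigma**2))
    Is = conv2_same(I, G / G.sum())
    # 2. Gradient images
    Ix, Iy = conv2_same(Is, KX), conv2_same(Is, KY)
    # 3. Gradient magnitude
    Gmag = np.sqrt(Ix**2 + Iy**2)
    # 4. Threshold
    return Gmag > tau

# Vertical step edge: left half dark, right half bright
I = np.zeros((16, 16)); I[:, 8:] = 100.0
edges = detect_edges(I, tau=50.0)
print(edges[8, 8], edges[8, 0])   # True False -- the step is marked, the flat region is not
```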
Corner Feature
Corners are image locations that have large intensity changes
in more than one direction.
Shifting a window in any direction should give a large
change in intensity
Harris Detector: Basic Idea
“flat” region:
no change in
all directions
“edge”:
no change along
the edge direction
“corner”:
significant change
in all directions
C.Harris, M.Stephens. “A Combined Corner and Edge Detector”. 1988
Change of Intensity
The intensity change along some direction can be quantified
by the sum of squared differences (SSD):

D(u, v) = Σ_{i,j} [ I(i+u, j+v) − I(i, j) ]^2
Change Approximation
If u and v are small, by Taylor's theorem:

I(i+u, j+v) ≈ I(i, j) + I_x u + I_y v

where I_x = ∂I/∂x and I_y = ∂I/∂y. Therefore

[ I(i+u, j+v) − I(i, j) ]^2 ≈ [ I_x u + I_y v ]^2
                            = I_x^2 u^2 + 2 I_x I_y u v + I_y^2 v^2
                            = [u v] [I_x^2  I_x I_y; I_x I_y  I_y^2] [u; v]
Gradient Variation Matrix
D(u, v) = [u v] C [u; v],   where   C = Σ [I_x^2  I_x I_y; I_x I_y  I_y^2]

The level curves of this quadratic form are ellipses.
Matrix C characterizes how the intensity changes in a certain direction.
Eigenvalue Analysis
C = Σ [I_x^2  I_x I_y; I_x I_y  I_y^2] = Q^{−1} [λ1 0; 0 λ2] Q

Since C = Σ [I_x; I_y][I_x I_y] = A^T A:
• C is symmetric;
• C has two non-negative eigenvalues λ1, λ2.

The ellipse D(u, v) = const has axes of length proportional to
(λ_max)^{−1/2} and (λ_min)^{−1/2}.

If either λ is close to 0, then this is not a corner, so look for
locations where both are large.
Corner Detection Algorithm
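The eigenvalue analysis above can be sketched in numpy. This is a minimal illustration, not the full Harris recipe: it uses plain finite differences, an unweighted 3×3 window for C, and scores each pixel by the smaller eigenvalue of C (directly implementing the "both λ large" criterion); practical detectors add Gaussian windowing, the Harris response function, and non-maximum suppression.

```python
import numpy as np

def corner_response(I):
    """Smaller eigenvalue of the gradient-variation matrix C over a 3x3 window."""
    Ix = np.zeros_like(I); Iy = np.zeros_like(I)
    Ix[:, :-1] = I[:, 1:] - I[:, :-1]     # finite differences
    Iy[:-1, :] = I[1:, :] - I[:-1, :]
    H, W = I.shape
    R = np.zeros_like(I)
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            wx = Ix[i-1:i+2, j-1:j+2]; wy = Iy[i-1:i+2, j-1:j+2]
            C = np.array([[(wx**2).sum(),  (wx*wy).sum()],
                          [(wx*wy).sum(),  (wy**2).sum()]])
            R[i, j] = np.linalg.eigvalsh(C).min()   # large only when BOTH are large
    return R

# A white square on a black background: corners where two edges meet
I = np.zeros((12, 12)); I[4:8, 4:8] = 1.0
R = corner_response(I)
print(np.unravel_index(R.argmax(), R.shape))  # a corner of the square
```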
Line Detection
The problem:
•How many lines?
•Find the lines.
Equations for Lines
The slope-intercept equation of a line:

y = a x + b

What happens when the line is vertical? The slope a goes to infinity.
A better representation – the polar representation:

x cos θ + y sin θ = ρ
Hough Transform: line-parameter mapping
A line in the plane maps to a point (ρ, θ) in the ρ-θ space.
All lines passing through a point (x, y) map to a sinusoidal curve in the
ρ-θ (parameter) space:

ρ = x cos θ + y sin θ
Mapping of points on a line
Points on the same line define curves in the parameter space
that pass through a single point.
Main idea: transform edge points in the x-y plane to curves in the
parameter space, then find the points in the parameter space
that have many curves passing through them.
Quantize Parameter Space
Detect lines by finding maxima / clustering in the quantized (ρ, θ) parameter space.
Examples
input image
Hough space
lines detected
Image credit: NASA Dryden Research Aircraft Photo Archive
Algorithm
1. Quantize the parameter space:
   int P[0..ρmax][0..θmax];  // accumulators
2. For each edge point (x, y) {
     For (θ = 0; θ <= θmax; θ += Δθ) {
       ρ = x cos θ + y sin θ;  // round off to integer
       P[ρ][θ]++;
     }
   }
3. Find the peaks in P[ρ][θ].
Equations of Ellipse
x^2 / r1^2 + y^2 / r2^2 = 1

a x^2 + b x y + c y^2 + d x + e y + f = 0

Let x = [x^2, x y, y^2, x, y, 1]^T and a = [a, b, c, d, e, f]^T.
Then the ellipse equation becomes

x^T a = 0
Ellipse Fitting: Problem Statement
Given a set of N image points p_i = [x_i, y_i]^T,
find the parameter vector a ≠ 0 such that the ellipse

f(p, a) = x^T a = 0

fits the p_i best in the least squares sense:

min_a Σ_{i=1}^{N} D(p_i, a)^2

where D(p_i, a) is the distance from p_i to the ellipse.
Euclidean Distance Fit
D(p_i, a) = || p̂_i − p_i ||

where p̂_i is the point on the ellipse that is nearest to p_i:

f(p̂_i, a) = 0   and   p̂_i − p_i is normal to the ellipse at p̂_i.
Compute Distance Function
Computing the distance function is a constrained optimization problem:

min_{p̂_i} || p̂_i − p_i ||^2   subject to   f(p̂_i, a) = 0

Using a Lagrange multiplier, define:

L(x, y, λ) = || p̂_i − p_i ||^2 − 2λ f(p̂_i, a),   where p̂_i = [x, y]^T

Then the problem becomes min_{p̂_i} L(x, y, λ). Setting

∂L/∂x = ∂L/∂y = 0

we have

p̂_i − p_i = λ ∇f(p̂_i, a)
Ellipse Fitting with Euclidean Distance
Given a set of N image points p_i = [x_i, y_i]^T,
find the parameter vector a ≠ 0 such that

min_a Σ_{i=1}^{N} f(p_i, a)^2 / || ∇f(p_i, a) ||^2
This problem can be solved by using a numerical nonlinear
optimization system.
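The normalised-distance criterion above needs such a nonlinear optimiser, but the simpler purely algebraic criterion, min ||X a||^2 subject to ||a|| = 1, can be solved directly with the SVD from earlier in this review: the minimiser is the right singular vector with the smallest singular value. A sketch of that simpler surrogate (not the slides' normalised criterion):

```python
import numpy as np

def fit_conic(points):
    """Algebraic least-squares conic fit: minimise sum (x_i^T a)^2 with ||a|| = 1.
    The minimiser is the right singular vector of the design matrix that
    corresponds to the smallest singular value."""
    X = np.array([[x*x, x*y, y*y, x, y, 1.0] for x, y in points])
    _, _, Vt = np.linalg.svd(X)
    return Vt[-1]                      # a = [a, b, c, d, e, f]

# Sample points from the ellipse x^2/4 + y^2 = 1 (r1 = 2, r2 = 1)
t = np.linspace(0.0, 2 * np.pi, 40, endpoint=False)
pts = np.stack([2 * np.cos(t), np.sin(t)], axis=1)
a = fit_conic(pts)

# Every sample point should (nearly) satisfy the fitted conic equation x^T a = 0
residual = np.abs(np.array([[x*x, x*y, y*y, x, y, 1.0] for x, y in pts]) @ a).max()
print(residual)   # close to 0
```

Note the algebraic fit can return any conic, not necessarily an ellipse; for noisy data, constrained methods (or the nonlinear criterion on the slide) are needed to guarantee an ellipse.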