
Gebze Institute of Technology Department of Computer Engineering Computer Vision Stereo

© 2005 Yusuf Akgul

Stereo

The ability to infer information on the 3D structure and distance of a scene from two or more images taken from different viewpoints.


Two Problems Of Stereo

• We need to solve two problems for estimating 3D structure and distance:
  – Correspondence problem: determining which parts of the left and right (or other) images correspond to each other.
  – Reconstruction problem: using the correspondences and the camera geometry to recover the 3D structure and the distance.

Finding Correspondences

[Figure: scene points P and Q project to P'1 and Q'1 in the image of camera O1, and to P'2 and Q'2 in the image of camera O2.]

3D Reconstruction

[Figure: point P is recovered from its projections P'1 (camera O1) and P'2 (camera O2).]

We must solve the correspondence problem first!

Correspondence and 3D reconstruction

A simple stereo system

[Figure: two parallel cameras with centers O_l and O_r and focal length f, separated by T, view a point P at depth Z; P projects to p_l and p_r with image coordinates x_l and x_r.]

• T is the stereo baseline
• Disparity d measures the difference in retinal position between corresponding points
• Depth is inversely proportional to disparity
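The inverse relation between depth and disparity, Z = fT/d, can be sketched in a few lines. This is a minimal illustration, assuming f and d are expressed in pixels and T in meters; the function name and the numbers are made up for the example.

```python
def depth_from_disparity(f_pixels, baseline, disparity):
    """Depth Z = f * T / d for a parallel-axis stereo pair."""
    if disparity <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return f_pixels * baseline / disparity


# Hypothetical rig: f = 700 px, T = 0.12 m.
near = depth_from_disparity(700.0, 0.12, 42.0)   # large disparity, near point
far = depth_from_disparity(700.0, 0.12, 4.2)     # small disparity, far point
```

A tenfold drop in disparity corresponds to a tenfold increase in depth.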

Simple Stereo Model

$x_1 = \frac{fX}{Z}$

$x_2 = \frac{f(X - B)}{Z} = x_1 - \frac{fB}{Z}$

$Z = \frac{fB}{x_1 - x_2}$

Parameters of a stereo system

• Intrinsic:
  – f: focal length of cameras
  – c_l and c_r: principal points
• Extrinsic:
  – T: stereo baseline
  – Transformation between cameras for a more general configuration

The Correspondence Problem

• Basic assumptions:
  – Most scene points are visible in both images
  – Corresponding image regions are similar
• These assumptions hold if:
  – The distance of the fixation point from the cameras is much larger than the stereo baseline: Z >> T

The Correspondence Problem

• Is a “search” problem:
  – Given an element in the left image, search for the corresponding element in the right image.
• We must choose:
  – Elements to match
  – A similarity measure to compare elements

Correspondence Problem

• Two classes of algorithms:
  – Correlation-based algorithms
    • Produce a DENSE set of correspondences
  – Feature-based algorithms
    • Produce a SPARSE set of correspondences

Correlation-based Algorithms

• Elements to be matched by a similarity measure:
  – Image WINDOWS of fixed size.

What can we use for similarity?

Correlation Based Correspondence

Comparing Windows:

How similar is window f to window g? Some possible measures:

“SSD” or “block matching” (Sum of Squared Differences):

$\mathrm{SSD} = \sum_{[i,j] \in R} \left( f(i,j) - g(i,j) \right)^2$

It is the most popular.

Cross-Correlation:

$C_{fg} = \sum_{[i,j] \in R} f(i,j)\, g(i,j)$

It is also very popular and it is closely related to the SSD:

$\mathrm{SSD} = \sum f^2 + \sum g^2 - 2\, C_{fg}$
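The two window measures and the link between them can be checked numerically. This is a small sketch using the standard definitions of SSD and unnormalized cross-correlation; the function names and the two 3x3 windows are illustrative.

```python
import numpy as np


def ssd(f, g):
    """Sum of squared differences between two equally sized windows."""
    diff = f.astype(float) - g.astype(float)
    return float(np.sum(diff * diff))


def cross_correlation(f, g):
    """Unnormalized cross-correlation C_fg between two windows."""
    return float(np.sum(f.astype(float) * g.astype(float)))


# Two hypothetical 3x3 gray-level windows.
f = np.array([[10, 12, 11], [9, 14, 13], [10, 11, 12]])
g = np.array([[11, 12, 10], [9, 15, 13], [10, 10, 12]])

# SSD = sum(f^2) + sum(g^2) - 2 * C_fg
lhs = ssd(f, g)
rhs = cross_correlation(f, f) + cross_correlation(g, g) - 2.0 * cross_correlation(f, g)
```

Note the opposite senses: similar windows give a small SSD but a large cross-correlation.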

Finding the disparity map

• Inputs:
  – Left image I_l
  – Right image I_r
• Parameters that must be chosen:
  – Correlation window size 2W+1
  – Search window size w
  – Similarity measure Ψ

CORR_MATCHING Algorithm

• Let p_l and p_r be pixels in I_l and I_r
• Let R(p_l) be the w × w search window in I_r associated with p_l
• Let d be the displacement between p_l and a point in R(p_l) on I_r

[Figure: the (2W+1) × (2W+1) correlation window around p_l, and the w × w search window R(p_l) with displacement d.]

CORR_MATCHING Algorithm

• For each pixel p_l = [i,j] in I_l do:
  – For each displacement d = [d_1, d_2] in R(p_l) do:

$C(d) = \sum_{k=-W}^{W} \sum_{l=-W}^{W} \Psi\left( I_l(i+k,\, j+l),\; I_r(i+k-d_1,\, j+l-d_2) \right)$

  – The disparity of p_l is the vector d that maximizes C(d) over R(p_l)
• Output the disparity for each pixel p_l
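The loop structure of CORR_MATCHING can be sketched as below. This is a direct, unoptimized illustration under two assumptions that are not in the slides: Ψ(u, v) = −(u − v)², so that maximizing C(d) is the same as minimizing the SSD, and border pixels where the windows do not fit are left at d = [0, 0].

```python
import numpy as np


def corr_matching(left, right, W=1, w=5):
    """Dense disparity by exhaustive window search (CORR_MATCHING sketch).

    Returns an array of shape (rows, cols, 2) holding d = [d1, d2] per pixel.
    """
    left = left.astype(float)
    right = right.astype(float)
    rows, cols = left.shape
    half = w // 2                      # search window R(p_l) is w x w
    margin = W + half
    disparity = np.zeros((rows, cols, 2), dtype=int)
    for i in range(margin, rows - margin):
        for j in range(margin, cols - margin):
            win_l = left[i - W:i + W + 1, j - W:j + W + 1]
            best_c, best_d = -np.inf, (0, 0)
            for d1 in range(-half, half + 1):
                for d2 in range(-half, half + 1):
                    win_r = right[i - d1 - W:i - d1 + W + 1,
                                  j - d2 - W:j - d2 + W + 1]
                    c = -np.sum((win_l - win_r) ** 2)   # C(d) with Psi = -(u - v)^2
                    if c > best_c:
                        best_c, best_d = c, (d1, d2)
            disparity[i, j] = best_d
    return disparity


# Synthetic check: the right image is the left image shifted 2 columns to the
# left, so I_r(i, j - 2) = I_l(i, j) and the expected disparity is d = [0, 2].
rng = np.random.default_rng(0)
left_img = rng.integers(0, 256, size=(20, 20))
right_img = np.roll(left_img, -2, axis=1)
disp = corr_matching(left_img, right_img, W=1, w=5)
```

The four nested loops make the cost proportional to (image size) × w² × (2W+1)², which is why the search window is kept as small as the scene allows.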

How do we set W, R and w?

• W (width of the correlation window):
  – should be based on the “scale” of the scene.
• R(p_l) and w (search window):
  – Size should be estimated based on the range of scene distances and the baseline:
    • Z = fT/d or d = fT/Z
  – R(p_l) is centered on the pixel of I_r that has the same coordinates as p_l.

Feature-based Methods

• Conceptually very similar to correlation-based methods, but:
  – They only search for correspondences of a sparse set of image features.
  – Correspondences are given by the most similar feature pairs.
  – Similarity measure must be adapted to the type of feature used.

Feature-based Methods:

• Features most commonly used:
  – Corners
    • Similarity measured in terms of:
      – surrounding gray values (SSD, cross-correlation)
      – location
  – Edges, lines
    • Similarity measured in terms of:
      – orientation
      – contrast
      – coordinates of edge or line’s midpoint
      – length of line

Example: Comparing lines

• l_l and l_r: line lengths
• θ_l and θ_r: line orientations
• (x_l, y_l) and (x_r, y_r): midpoints
• c_l and c_r: average contrast along lines
• w_l, w_θ, w_m, w_c: weights controlling influence

The more similar the lines, the larger S is!
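The formula for S did not survive in this transcript. The sketch below assumes one common textbook form, the inverse of a weighted sum of squared differences of the four line attributes; that form, the `Line` type, and the weights are assumptions made for illustration, not the slide's definition.

```python
import math
from dataclasses import dataclass


@dataclass
class Line:
    length: float          # l
    orientation: float     # theta, in radians
    midpoint: tuple        # (x, y)
    contrast: float        # c, average contrast along the line


def line_similarity(a, b, w_l=1.0, w_theta=1.0, w_m=1.0, w_c=1.0):
    """Assumed form: S = 1 / (weighted sum of squared attribute differences)."""
    d_mid_sq = (a.midpoint[0] - b.midpoint[0]) ** 2 + (a.midpoint[1] - b.midpoint[1]) ** 2
    cost = (w_l * (a.length - b.length) ** 2
            + w_theta * (a.orientation - b.orientation) ** 2   # no angle wrap-around handling
            + w_m * d_mid_sq
            + w_c * (a.contrast - b.contrast) ** 2)
    return math.inf if cost == 0 else 1.0 / cost


left_line = Line(40.0, 0.50, (100.0, 80.0), 30.0)
close_match = Line(41.0, 0.52, (96.0, 80.0), 29.0)
poor_match = Line(25.0, 1.20, (60.0, 95.0), 10.0)

s_close = line_similarity(left_line, close_match)
s_poor = line_similarity(left_line, poor_match)
```

In practice the weights have to balance quantities with different units (pixels, radians, gray levels), so they are tuned per application.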

FEATURE_MATCHING Algorithm

• Inputs:
  – I_l and I_r
  – Set of features on the left and right
• Things that must be chosen:
  – Search window
  – Similarity measure

FEATURE_MATCHING Algorithm

• For each feature f_l in the left image:
  – Compute the similarity measure between f_l and every feature in the search window R(f_l)
  – Select the feature in R(f_l) that maximizes the similarity measure.
  – Save the correspondence and the disparity of f_l
• Output the list of correspondences and disparities.
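The FEATURE_MATCHING loop can be sketched as follows. The feature representation (dicts with a `pos` entry), the square search window, the left-minus-right disparity convention, and the toy gray-value similarity are all assumptions made for this example.

```python
def feature_matching(left_feats, right_feats, similarity, search_radius):
    """For each left feature, pick the most similar right feature in R(f_l).

    `similarity` is any function returning larger values for more similar
    features. Returns a list of (left_index, right_index, disparity) tuples.
    """
    matches = []
    for li, fl in enumerate(left_feats):
        best_s, best_ri = None, None
        for ri, fr in enumerate(right_feats):
            dx = fl["pos"][0] - fr["pos"][0]
            dy = fl["pos"][1] - fr["pos"][1]
            if abs(dx) > search_radius or abs(dy) > search_radius:
                continue                      # outside the search window
            s = similarity(fl, fr)
            if best_s is None or s > best_s:
                best_s, best_ri = s, ri
        if best_ri is not None:
            fr = right_feats[best_ri]
            disparity = (fl["pos"][0] - fr["pos"][0], fl["pos"][1] - fr["pos"][1])
            matches.append((li, best_ri, disparity))
    return matches


def gray_similarity(a, b):
    # Hypothetical descriptor: negative squared difference of one gray value.
    return -(a["gray"] - b["gray"]) ** 2


left = [{"pos": (50, 20), "gray": 120}, {"pos": (80, 40), "gray": 200}]
right = [{"pos": (44, 20), "gray": 198}, {"pos": (46, 21), "gray": 121},
         {"pos": (75, 40), "gray": 201}]
matches = feature_matching(left, right, gray_similarity, search_radius=10)
```

A left feature with no candidate inside its search window simply produces no correspondence, which is one reason the resulting map is sparse.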

Which method should we use?

• Correlation methods:
  – Dense maps, good for surface reconstruction
  – Require textured images
  – Sensitive to illumination variations
  – Inadequate for very different viewpoints
• Feature methods:
  – Sparse maps, good for navigation
  – Require prior knowledge of type of scene
  – Must find features first

Stereo with Parallel Cameras

• Stereo with Parallel Axes
  – Short baseline
    • large common FOV
    • large depth error
  – Long baseline
    • small depth error
    • small common FOV
    • more occlusion problems
• Depth Accuracy vs. Depth
  – Depth error is proportional to the square of the depth (Depth²)
  – The nearer the point, the better the depth estimation

[Figure: common field of view (FOV) of the left and right cameras.]
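Both the baseline trade-off and the quadratic growth of the depth error follow from differentiating Z = fT/d. A short derivation, assuming a small disparity error δd:

```latex
Z = \frac{fT}{d}
\quad\Rightarrow\quad
\frac{\partial Z}{\partial d} = -\frac{fT}{d^{2}} = -\frac{Z^{2}}{fT}
\quad\Rightarrow\quad
|\delta Z| \approx \frac{Z^{2}}{fT}\,|\delta d|
```

For a fixed disparity error, the depth error grows with Z² and shrinks as the baseline T grows.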

Stereo with Converging Cameras

• Two optical axes intersect at the Fixation Point
  – converging angle θ
  – The common FOV increases

[Figure: left and right cameras converging with angle θ on the fixation point, and their common FOV.]

Stereo with Converging Cameras

• Disparity properties
  – Disparity uses angle instead of distance
  – Zero disparity at the fixation point
    • and on the zero-disparity horopter (α_r = α_l, d_α = 0)
  – Disparity increases with the distance of objects from the fixation point
    • > 0: outside of the horopter (α_r > α_l, d_α > 0)
    • < 0: inside the horopter (α_r < α_l, d_α < 0)
• Depth Accuracy vs. Depth
  – Depth error is proportional to the square of the depth (Depth²)
  – The nearer the point, the better the depth estimation

[Figure: left and right cameras with viewing angles α_l and α_r, the fixation point, and the horopter through it.]

Constraining the Search Space

• We could have used additional constraints in both feature-based and correlation-based algorithms.

What constraints can we use?

• Uniqueness constraint
• Continuity constraint
• Geometric constraints (we will look at this next)

Epipolar Geometry

• Motivation: where to search correspondences?
  – Epipolar Plane
    • A plane going through point P and the centers of projection (COPs) of the two cameras
  – Conjugated Epipolar Lines
    • Lines where the epipolar plane intersects the image planes
  – Epipoles
    • The image of the COP of one camera in the other
• Epipolar Constraint
  – Corresponding points must lie on conjugated epipolar lines

[Figure: the epipolar plane through P, O_l and O_r; the epipolar lines through p_l and p_r; the epipoles e_l and e_r.]

Epipolar Geometry

• How do we find the epipolar lines?
• What do we need to find them?

Epipolar Geometry

• Notations
  – P_l = (X_l, Y_l, Z_l), P_r = (X_r, Y_r, Z_r)
    • Vectors of the same 3-D point P, in the left and right camera coordinate systems respectively
  – Extrinsic Parameters
    • Translation vector T = (O_r − O_l)
    • Rotation matrix R

$P_r = R(P_l - T)$

  – p_l = (x_l, y_l, z_l), p_r = (x_r, y_r, z_r)
    • Projections of P on the left and right image plane respectively
    • For all image points, we have z_l = f_l, z_r = f_r

$p_l = \frac{f_l}{Z_l} P_l, \qquad p_r = \frac{f_r}{Z_r} P_r$

Essential Matrix

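• Equation of the epipolar plane
  – Co-planarity condition of vectors P_l, T and P_l − T:

$(P_l - T)^T \, (T \times P_l) = 0$

  – Using $P_r = R(P_l - T)$, i.e. $P_l - T = R^T P_r$, and $(AB)^T = B^T A^T$:

$(R^T P_r)^T \, (T \times P_l) = 0 \;\Rightarrow\; P_r^T R \,(T \times P_l) = 0$

  – Writing the cross product as a matrix product, $T \times P_l = S P_l$:

$P_r^T R S P_l = 0 \;\Rightarrow\; P_r^T E P_l = 0$

• Essential Matrix E = RS
  – 3x3 matrix constructed from R and T (extrinsic only)
  – Rank(E) = 2, two equal nonzero singular values

$R = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix}$, Rank(R) = 3

$S = \begin{bmatrix} 0 & -T_z & T_y \\ T_z & 0 & -T_x \\ -T_y & T_x & 0 \end{bmatrix}$, Rank(S) = 2

• With $p_l = \frac{f_l}{Z_l} P_l$ and $p_r = \frac{f_r}{Z_r} P_r$, the same constraint holds for the image points:

$p_r^T E p_l = 0$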

Essential Matrix

• Essential Matrix E = RS: $p_r^T E p_l = 0$
  – A natural link between the stereo point pair and the extrinsic parameters of the stereo system
    • One correspondence -> a linear equation in the 9 entries of E
    • Given 8 pairs of (p_l, p_r) -> E
  – Mapping between points and epipolar lines we are looking for
    • Given p_l, E -> p_r on the projective line in the right plane
    • Equation represents the epipolar line of p_r (or p_l) in the right (or left) image
• Note:
  – p_l, p_r are in the camera coordinate system, not pixel coordinates that we can measure

Fundamental Matrix

• Mapping between points and epipolar lines in the pixel coordinate systems
  – With no prior knowledge on the stereo system
• From Camera to Pixels: Matrices of intrinsic parameters

$M_{int} = \begin{bmatrix} -f_x & 0 & o_x \\ 0 & -f_y & o_y \\ 0 & 0 & 1 \end{bmatrix}$, Rank(M_int) = 3

• Questions:
  – What are f_x, f_y, o_x, o_y?
  – How to measure p_l in images?

With $\bar{p}_l$ and $\bar{p}_r$ denoting the pixel coordinates of p_l and p_r:

$p_l = M_l^{-1} \bar{p}_l, \qquad p_r = M_r^{-1} \bar{p}_r$

$p_r^T E p_l = 0 \;\Rightarrow\; \bar{p}_r^T F \bar{p}_l = 0$

$F = M_r^{-T} E M_l^{-1}$

Fundamental Matrix

• Fundamental Matrix: $F = M_r^{-T} E M_l^{-1}$
  – Rank(F) = 2
  – Encodes info on both intrinsic and extrinsic parameters
  – Enables full reconstruction of the epipolar geometry
  – In pixel coordinate systems without any knowledge of the intrinsic and extrinsic parameters
  – Linear equation in the 9 entries of F: $\bar{p}_r^T F \bar{p}_l = 0$, where $\bar{p}_l$ and $\bar{p}_r$ are pixel coordinates

$\begin{bmatrix} x_{im}^{(r)} & y_{im}^{(r)} & 1 \end{bmatrix} \begin{bmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{bmatrix} \begin{bmatrix} x_{im}^{(l)} \\ y_{im}^{(l)} \\ 1 \end{bmatrix} = 0$
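The step from E to F can be checked with made-up calibration values. This sketch assumes the intrinsic matrix has −f_x and −f_y on its diagonal (the signs are partly lost in this transcript) and the convention P_r = R(P_l − T); all numbers are hypothetical.

```python
import numpy as np


def intrinsic_matrix(fx, fy, ox, oy):
    """M_int with the sign convention assumed here (-fx, -fy on the diagonal)."""
    return np.array([[-fx, 0.0, ox], [0.0, -fy, oy], [0.0, 0.0, 1.0]])


def skew(t):
    return np.array([[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]])


# Hypothetical calibration and rig geometry.
M_l = intrinsic_matrix(800.0, 800.0, 320.0, 240.0)
M_r = intrinsic_matrix(820.0, 820.0, 330.0, 235.0)
c, s = np.cos(0.05), np.sin(0.05)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
T = np.array([0.15, 0.01, 0.0])
E = R @ skew(T)

# F = M_r^{-T} E M_l^{-1}
F = np.linalg.inv(M_r).T @ E @ np.linalg.inv(M_l)

# Pixel coordinates of one point seen by both cameras.
P_l = np.array([0.4, 0.1, 5.0])
P_r = R @ (P_l - T)
pix_l = M_l @ (P_l / P_l[2])
pix_r = M_r @ (P_r / P_r[2])

residual = float(pix_r @ F @ pix_l)
singular_values = np.linalg.svd(F, compute_uv=False)
```

The constraint now holds directly on measurable pixel coordinates, and the smallest singular value of F is zero up to rounding, consistent with Rank(F) = 2.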

Essential and Fundamental Matrix Properties

$E = RS$

$R = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix}, \qquad S = \begin{bmatrix} 0 & -T_z & T_y \\ T_z & 0 & -T_x \\ -T_y & T_x & 0 \end{bmatrix}$

$F = M_r^{-T} E M_l^{-1}$

$M_{int} = \begin{bmatrix} -f_x & 0 & o_x \\ 0 & -f_y & o_y \\ 0 & 0 & 1 \end{bmatrix}$

How Do We Estimate E and F?

The idea is simple:
• Establish correspondences between two stereo images.
• Each correspondence gives us an equation.
• Solve the linear set of equations to get the elements of the matrix F.
• How do we solve a homogeneous system of linear equations?

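• Singular Value Decomposition:
  – Any m×n matrix can be written as the product of three matrices: $A = U D V^T$

$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1m} \\ u_{21} & u_{22} & \cdots & u_{2m} \\ \vdots & \vdots & & \vdots \\ u_{m1} & u_{m2} & \cdots & u_{mm} \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n \\ 0 & 0 & \cdots & 0 \end{bmatrix} \begin{bmatrix} v_{11} & v_{12} & \cdots & v_{1n} \\ v_{21} & v_{22} & \cdots & v_{2n} \\ \vdots & \vdots & & \vdots \\ v_{n1} & v_{n2} & \cdots & v_{nn} \end{bmatrix}^T$

  – Singular values σ_i are fully determined by A
    • D is diagonal: d_ij = 0 if i ≠ j; d_ii = σ_i (i = 1, 2, …, n)
    • σ_1 ≥ σ_2 ≥ … ≥ σ_n ≥ 0
  – U and V are not unique
    • Columns of each are mutually orthogonal vectors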

Singular Value Decomposition

$A = U D V^T$

• 1. Singularity and Condition Number
  – n×n A is nonsingular IFF all singular values are nonzero
  – Condition number: degree of singularity of A, $C = \sigma_1 / \sigma_n$
    • A is ill-conditioned (almost singular) if 1/C is comparable to the arithmetic precision of your machine
• 2. Rank of a square matrix A
  – Rank(A) = number of nonzero singular values
• 3. Inverse of a square matrix
  – If A is nonsingular: $A^{-1} = V D^{-1} U^T$
  – In general, the pseudo-inverse of A: $A^{+} = V D_0^{-1} U^T$, where $D_0^{-1}$ is $D^{-1}$ with $1/\sigma_i$ replaced by 0 for every zero singular value
• 4. Eigenvalues and Eigenvectors
  – Eigenvalues of both A^T A and A A^T are σ_i² (σ_i > 0)
  – The columns of U are the eigenvectors of A A^T (m×m)
  – The columns of V are the eigenvectors of A^T A (n×n)

$A A^T u_i = \sigma_i^2 u_i, \qquad A^T A v_i = \sigma_i^2 v_i$
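These SVD properties are easy to confirm with NumPy. A small sketch on made-up matrices; note that `np.linalg.svd` returns V^T (here `Vt`) and the singular values as a vector in decreasing order.

```python
import numpy as np

A = np.array([[4.0, 0.0, 2.0],
              [1.0, 3.0, 0.0],
              [0.0, 1.0, 5.0]])

U, sigma, Vt = np.linalg.svd(A)           # A = U @ diag(sigma) @ Vt
condition_number = sigma[0] / sigma[-1]   # C = sigma_1 / sigma_n

# Inverse of a nonsingular matrix: A^{-1} = V D^{-1} U^T
A_inv = Vt.T @ np.diag(1.0 / sigma) @ U.T

# Eigenvalues of A^T A are the squared singular values.
eigvals = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]

# Rank = number of nonzero singular values (counted with a small tolerance).
B = np.array([[1.0, 2.0], [2.0, 4.0]])    # second row = 2 * first row
rank_B = int(np.sum(np.linalg.svd(B, compute_uv=False) > 1e-10))
```

The tolerance in the rank count matters: with floating-point data a "zero" singular value is rarely exactly zero.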

Singular Value Decomposition

• Homogeneous System: $Ax = 0$
  – m equations for n unknowns x (m >= n−1)
  – Rank(A) = n−1 (by looking at the SVD of A)
  – A non-trivial solution (up to an arbitrary scale) by SVD:
    • Simply proportional to the eigenvector corresponding to the only zero eigenvalue of A^T A (n×n matrix)
• Note: $A^T A v_i = \sigma_i^2 v_i$
  – All the other eigenvalues are positive because Rank(A) = n−1
  – In practice, use the eigenvector (i.e. v_n) corresponding to the minimum eigenvalue of A^T A, i.e. σ_n²
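In NumPy the solution v_n is the last row of the returned V^T. A minimal sketch on a made-up rank-2 system in 3 unknowns; the function name is illustrative.

```python
import numpy as np


def solve_homogeneous(A):
    """Unit-norm x minimizing ||A x||: the last right singular vector of A."""
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1]


# Hypothetical system whose null space is spanned by (1, -2, 1).
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 3.0],
              [2.0, 3.0, 4.0],
              [0.0, 1.0, 2.0]])
x = solve_homogeneous(A)
```

The result is determined only up to scale and sign, so x may come out as either +(1, −2, 1)/√6 or its negative.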

Singular Value Decomposition

• Problem Statements
  – Numerical estimate of a matrix A whose entries are not independent
  – Errors introduced by noise alter the estimate to Â
• Enforcing Constraints by SVD
  – Take an orthogonal matrix A as an example
  – Find the closest matrix to Â which satisfies the constraints exactly
    • SVD of Â: $\hat{A} = U D V^T$
    • Observation: D = I (all the singular values are 1) if A is orthogonal
    • Solution: change the singular values to those expected, i.e. replace D by I: $A = U I V^T$
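For the orthogonal-matrix example, replacing D by I amounts to multiplying U by V^T. A short sketch with a rotation matrix perturbed by made-up noise:

```python
import numpy as np


def closest_orthogonal(A_hat):
    """Replace the singular values of A_hat by 1: A = U I V^T."""
    U, _, Vt = np.linalg.svd(A_hat)
    return U @ Vt


# A rotation about the Z axis, perturbed by hypothetical estimation noise.
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle), np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
rng = np.random.default_rng(1)
R_noisy = R_true + 0.01 * rng.standard_normal((3, 3))
R_fixed = closest_orthogonal(R_noisy)
```

The result is orthogonal by construction; for strongly corrupted input its determinant could be −1, which would need a separate check if a proper rotation is required.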

Computing F: The Eight-point Algorithm

• Input: n point correspondences (n >= 8)
  – Construct the homogeneous system Ax = 0 from $\bar{p}_r^T F \bar{p}_l = 0$ (pixel coordinates)
    • x = (f_11, f_12, f_13, f_21, f_22, f_23, f_31, f_32, f_33): entries in F
    • Each correspondence gives one equation
    • A is an n×9 matrix
  – Obtain the estimate F̂ by SVD of A: $A = U D V^T$
    • x (up to a scale) is the column of V corresponding to the least singular value
  – Enforce the singularity constraint, since Rank(F) = 2
    • Compute the SVD of F̂: $\hat{F} = U D V^T$
    • Set the smallest singular value to 0: D -> D'
    • Corrected estimate of F: $F' = U D' V^T$
• Output: the estimate of the fundamental matrix, F'
• Similarly we can compute E given intrinsic parameters
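The eight-point steps can be sketched as follows. This is a bare illustration without the coordinate normalization usually recommended for real pixel data; the synthetic rig uses identity intrinsic matrices (an assumption made to keep the example short), so pixel and normalized camera coordinates coincide.

```python
import numpy as np


def eight_point(pix_l, pix_r):
    """Estimate F from n >= 8 correspondences (rows are homogeneous points).

    Solves pix_r^T F pix_l = 0 in the least-squares sense, then enforces
    Rank(F) = 2.
    """
    # One row per correspondence: outer(pix_r, pix_l) flattened in the same
    # row-major order as x = (f11, f12, f13, f21, ..., f33).
    A = np.array([np.outer(pr, pl).ravel() for pl, pr in zip(pix_l, pix_r)])
    _, _, Vt = np.linalg.svd(A)
    F_hat = Vt[-1].reshape(3, 3)
    U, d, Vt_f = np.linalg.svd(F_hat)
    d[-1] = 0.0                          # D -> D'
    return U @ np.diag(d) @ Vt_f         # F' = U D' V^T


# Synthetic, noise-free correspondences from a hypothetical rig.
rng = np.random.default_rng(2)
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
T = np.array([0.3, 0.05, 0.02])
P_l = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(12, 3))
P_r = (P_l - T) @ R.T                    # P_r = R (P_l - T), row-wise
pix_l = P_l / P_l[:, 2:3]
pix_r = P_r / P_r[:, 2:3]

F = eight_point(pix_l, pix_r)
residuals = np.abs(np.einsum("ni,ij,nj->n", pix_r, F, pix_l))
```

With noisy pixel measurements the residuals will not vanish, and the rank-2 enforcement step is what keeps the epipolar lines meeting in a single epipole.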

Locating the Epipoles from F

$\bar{p}_r^T F \bar{p}_l = 0$

• e_l lies on all the epipolar lines of the left image, so $\bar{p}_r^T F e_l = 0$ for every $\bar{p}_r$
• F is not identically zero, therefore $F e_l = 0$

• Input: Fundamental Matrix F
  – Find the SVD of F: $F = U D V^T$
  – The epipole e_l is the column of V corresponding to the null singular value (as shown above)
  – The epipole e_r is the column of U corresponding to the null singular value
• Output: Epipoles e_l and e_r

[Figure: epipolar plane through P, O_l and O_r, with epipolar lines through p_l and p_r and epipoles e_l and e_r.]
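Reading the epipoles off the SVD takes two lines. The sketch below assumes a made-up rig with identity intrinsics, so that F = E = RS and the left epipole must be parallel to T (the image of O_r in the left camera).

```python
import numpy as np


def epipoles_from_F(F):
    """Return (e_l, e_r) as homogeneous 3-vectors: F e_l = 0 and F^T e_r = 0."""
    U, _, Vt = np.linalg.svd(F)
    e_l = Vt[-1]          # column of V for the null singular value
    e_r = U[:, -1]        # column of U for the null singular value
    return e_l, e_r


# Hypothetical rig with identity intrinsics, so F = E = R S.
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
T = np.array([0.3, 0.05, 0.02])
S = np.array([[0.0, -T[2], T[1]], [T[2], 0.0, -T[0]], [-T[1], T[0], 0.0]])
F = R @ S
e_l, e_r = epipoles_from_F(F)
```

Both epipoles are homogeneous vectors defined up to scale; divide by the third component to get image coordinates, unless it is (near) zero, in which case the epipole is at infinity.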

Stereo Rectification

• Stereo System with Parallel Optical Axes
  – Epipoles are at infinity
  – Horizontal epipolar lines
• Rectification
  – Given a stereo pair, the intrinsic and extrinsic parameters, find the image transformation to achieve a stereo system of horizontal epipolar lines
  – A simple algorithm: assuming calibrated stereo cameras

[Figure: rectified cameras with axes (X'_l, Y'_l, Z'_l) and (X'_r, Y'_r, Z'_r), centers O_l and O_r separated by T.]


Stereo Rectification

• Algorithm
  – Rotate both left and right cameras so that they share the same X axis: O_r − O_l = T
  – Define a rotation matrix R_rect for the left camera
  – Rotation matrix for the right camera is R_rect R^T
  – Rotation can be implemented by image transformation
• New left camera axes (each normalized to unit length): X'_l = T, Y'_l = Z_l × X'_l, Z'_l = X'_l × Y'_l
• After rectification: T' = (B, 0, 0), P'_r = P'_l − T'
• Read your book on how to obtain the rotation matrix for the rectification.
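One common textbook construction of the rectifying rotation builds its rows from the baseline direction. The sketch below follows that construction as an assumption, since the slides defer the details to the book; the sign of the second axis is chosen so that the new Z axis stays on the same side as the old optical axis.

```python
import numpy as np


def rectifying_rotation(T):
    """R_rect whose rows are the new left-camera axes in the old left frame.

    e1 is along the baseline, e2 is orthogonal to e1 and to the old optical
    axis, e3 completes the right-handed frame. Fails if T is parallel to Z.
    """
    e1 = T / np.linalg.norm(T)
    e2 = np.cross([0.0, 0.0, 1.0], e1)
    e2 = e2 / np.linalg.norm(e2)
    e3 = np.cross(e1, e2)
    return np.vstack([e1, e2, e3])


T = np.array([0.3, 0.05, 0.02])           # hypothetical baseline vector
R_rect = rectifying_rotation(T)
T_rect = R_rect @ T                        # expected: (B, 0, 0) with B = |T|
```

After this rotation the baseline lies along the new X axis, which is what makes the epipolar lines horizontal.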

Epipolar Geometry: Summary

• Purpose
  – where to search correspondences
• Epipolar plane, epipolar lines, and epipoles
  – known intrinsic (f) and extrinsic (R, T)
    • co-planarity equation: $P_r^T R \,(T \times P_l) = 0$
  – known intrinsic but unknown extrinsic
    • essential matrix: $p_r^T E p_l = 0$
  – unknown intrinsic and extrinsic
    • fundamental matrix: $\bar{p}_r^T F \bar{p}_l = 0$ (pixel coordinates)
• Rectification
  – Generate a stereo pair (by software) with parallel optical axes and thus horizontal epipolar lines

3D Reconstruction Problem

• What we have done
  – Correspondences using either correlation or feature based approaches
  – Epipolar geometry from at least 8 point correspondences
• Three cases of 3D reconstruction depending on the amount of a priori knowledge on the stereo system
  – Both intrinsic and extrinsic known -> can solve the reconstruction problem unambiguously by triangulation
  – Only intrinsic known -> recover structure and extrinsic up to an unknown scaling factor
  – Only correspondences -> reconstruction only up to an unknown, global projective transformation

Reconstruction by Triangulation

• Assumption and Problem
  – Under the assumption that both intrinsic and extrinsic parameters are known
  – Compute the 3-D location of a point from its projections, p_l and p_r
• Solution
  – Triangulation: two rays are known and the intersection can be computed
  – Problem: the two rays will not actually intersect in space due to errors in calibration and correspondences, and pixelization
  – Solution: find a point in space with minimum distance from both rays

[Figure: the rays from O_l through p_l and from O_r through p_r pass close to P.]
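The "point with minimum distance from both rays" can be taken as the midpoint of the shortest segment joining them. A sketch under the convention P_r = R(P_l − T), with unit focal lengths and a made-up rig:

```python
import numpy as np


def triangulate_midpoint(p_l, p_r, R, T):
    """Midpoint of the shortest segment joining the two viewing rays.

    Left ray:  a * p_l               (left camera frame)
    Right ray: T + b * R^T p_r       (right ray expressed in the left frame)
    """
    r = R.T @ p_r
    w = np.cross(p_l, r)                  # direction orthogonal to both rays
    # Solve a * p_l - b * r + c * w = T for (a, b, c).
    a, b, c = np.linalg.solve(np.column_stack([p_l, -r, w]), T)
    return a * p_l + 0.5 * c * w


# Hypothetical rig and point, with f = 1 in both cameras.
c_, s_ = np.cos(0.1), np.sin(0.1)
R = np.array([[c_, 0.0, s_], [0.0, 1.0, 0.0], [-s_, 0.0, c_]])
T = np.array([0.3, 0.0, 0.0])
P_true = np.array([0.2, -0.1, 5.0])
P_r = R @ (P_true - T)
p_l = P_true / P_true[2]
p_r = P_r / P_r[2]
P_est = triangulate_midpoint(p_l, p_r, R, T)
```

With exact data the two rays meet and c is zero; with noisy data c measures how far apart the rays pass, which can serve as a quality check for the correspondence.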

How to Get the Stereo Params from Camera Params

• If we know the camera parameters of both cameras (both cameras are calibrated), then we can easily calculate the stereo parameters. T and R are the parameters of the stereo system.
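The formulas for this step are not preserved in the transcript. As a sketch, assume each calibrated camera maps world points to its own frame as P_cam = R_cam P_w + T_cam, and that the stereo parameters follow the convention P_r = R(P_l − T); eliminating P_w then gives R = R_r R_l^T and T = T_l − R^T T_r. The function and variable names are illustrative.

```python
import numpy as np


def stereo_from_cameras(R_l, T_l, R_r, T_r):
    """Stereo extrinsics (R, T) such that P_r = R (P_l - T)."""
    R = R_r @ R_l.T
    T = T_l - R.T @ T_r
    return R, T


def rot_y(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


# Hypothetical calibration results for the two cameras.
R_l, T_l = rot_y(0.05), np.array([0.1, 0.0, 0.5])
R_r, T_r = rot_y(-0.08), np.array([-0.2, 0.02, 0.45])
R, T = stereo_from_cameras(R_l, T_l, R_r, T_r)

# Consistency check on one world point.
P_w = np.array([0.4, -0.3, 6.0])
P_l = R_l @ P_w + T_l
P_r = R_r @ P_w + T_r
```

A calibration package that uses a different extrinsic convention (for example camera-to-world transforms) would need the formulas adapted accordingly.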

Reconstruction up to a Scale Factor

• Assume that intrinsic parameters of both cameras are known
• Essential Matrix is known up to a scale factor (for example, estimated from the 8 point algorithm).

Reconstruction up to a Scale Factor

• Assumption and Problem Statement
  – Under the assumption that only intrinsic parameters and at least 8 point correspondences are given
  – Compute the 3-D location of each point from its projections, p_l and p_r, as well as the extrinsic parameters
• Solution
  – Compute the essential matrix E from at least 8 correspondences
  – Estimate T (up to a scale and a sign) from E (= RS) using the orthogonality constraint of R, and then R
    • End up with four different estimates of the pair (T, R)
  – Reconstruct the depth of each point, and pick the correct sign of R and T.
  – Results: reconstructed 3D points (up to a common scale)
  – The scale can be determined if the distance between two points (in space) is known

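Reconstruction up to a Scale Factor

The estimated essential matrix is known only up to a scale factor k: $E = kRS$. Since R is orthogonal:

$E^T E = k^2 S^T R^T R S = k^2 S^T S = k^2 \begin{bmatrix} T_Y^2 + T_Z^2 & -T_X T_Y & -T_X T_Z \\ -T_Y T_X & T_Z^2 + T_X^2 & -T_Y T_Z \\ -T_Z T_X & -T_Z T_Y & T_X^2 + T_Y^2 \end{bmatrix}$

$\mathrm{Trace}(E^T E) = 2k^2 (T_X^2 + T_Y^2 + T_Z^2) = 2k^2 \|T\|^2$

Normalizing E removes the scale, leaving only the sign of k:

$\hat{E} = \frac{E}{\sqrt{\mathrm{Trace}(E^T E)/2}} = \mathrm{sgn}(k)\, R \hat{S}, \qquad \hat{S} = \begin{bmatrix} 0 & -\hat{T}_Z & \hat{T}_Y \\ \hat{T}_Z & 0 & -\hat{T}_X \\ -\hat{T}_Y & \hat{T}_X & 0 \end{bmatrix}, \qquad \hat{T} = \frac{T}{\|T\|}$

$\hat{E}^T \hat{E} = \hat{S}^T \hat{S} = \begin{bmatrix} 1 - \hat{T}_X^2 & -\hat{T}_X \hat{T}_Y & -\hat{T}_X \hat{T}_Z \\ -\hat{T}_Y \hat{T}_X & 1 - \hat{T}_Y^2 & -\hat{T}_Y \hat{T}_Z \\ -\hat{T}_Z \hat{T}_X & -\hat{T}_Z \hat{T}_Y & 1 - \hat{T}_Z^2 \end{bmatrix}$

We can get the components of $\hat{T}$ from this matrix easily.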

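Reconstruction up to a Scale Factor

$\hat{T} = \begin{bmatrix} \hat{T}_1 \\ \hat{T}_2 \\ \hat{T}_3 \end{bmatrix}, \qquad R = \begin{bmatrix} R_1^T \\ R_2^T \\ R_3^T \end{bmatrix}$

Let $w_i = \hat{E}_i \times \hat{T}$, $i = 1, 2, 3$, where $\hat{E}_i$ is the i-th row of $\hat{E}$.

It can be proved that

$R_1 = w_1 + w_2 \times w_3$

$R_2 = w_2 + w_3 \times w_1$

$R_3 = w_3 + w_1 \times w_2$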

Reconstruction up to a Scale Factor

We have two choices of $\hat{T}$ ($\hat{T}^{+}$ and $\hat{T}^{-}$) because of the sign ambiguity, and two choices of $\hat{E}$ ($\hat{E}^{+}$ and $\hat{E}^{-}$).

This gives us four pairs of translation vectors and rotation matrices.

Reconstruction up to a Scale Factor

Given $\hat{E}$ and $\hat{T}$:

1. Construct the vectors w_i, and compute R
2. Reconstruct Z and Z' (the depths in the left and right camera) for each point
3. If the signs of Z and Z' of the reconstructed points are
   a) both negative for some point, change the sign of $\hat{T}$ and go to step 2.
   b) different for some point, change the sign of each entry of $\hat{E}$ and go to step 1.
   c) both positive for all points, exit.

With x_r the x-coordinate of the right image point p_r:

$Z = f \, \frac{(f R_1 - x_r R_3)^T \hat{T}}{(f R_1 - x_r R_3)^T p_l}$

$Z' = R_3^T (P_l - \hat{T}), \qquad P_l = \frac{Z}{f}\, p_l$

No Parameters Known

• Only correspondences -> reconstruction only up to an unknown, global projective transformation
• Read 7.4.3 in your book