
Real-Time Tracking
Axel Pinz
Image Based Measurement Group
EMT – Institute of Electrical Measurement
and Measurement Signal Processing
TU Graz – Graz University of Technology
http://www.emt.tugraz.at/~tracking
http://www.emt.tugraz.at/~pinz
Defining the Terms
• Real-Time
– Task dependent, “in-the-loop”
– Navigation: “on-time”
– Video rate: 30Hz
– High-speed tracking: several kHz
• Tracking
– DoF: Degrees of Freedom
– 2D: images, videos → 2 / 3 DoF
– 3D: scenes, object pose → 6 DoF
Example: High-speed, 2D
Applications
• Surveillance
• Augmented reality
• Surgical navigation
• Motion capture (MoCap)
• Autonomous navigation
• Telecommunication
• Many industrial applications
Example: Augmented Reality
[ARToolkit, Billinghurst, Kato, Demo at ISAR2000, Munich]
http://www.hitl.washington.edu/research/shared_space/download/
Agenda
Structure of the SSIP Lecture
• Intro, terminology, applications
• 2D motion analysis
• Geometry
• 3D motion analysis
• Practical considerations
• Existing systems
• Summary, conclusions
2D Motion Analysis
• Change detection
– Can be anything (not necessarily motion)
• Optical flow computation
– What is moving in which direction ?
– Hard in real time
• Data reduction required !
– Interest operators
– Points, lines, regions, contours
• Modeling required
– Motion models, object models
– Probabilistic modeling, prediction
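A minimal sketch of change detection by frame differencing, assuming grayscale frames stored as nested lists. The function name `changed_pixels` and the threshold value are illustrative choices, not taken from the lecture. Note that the mask flags any intensity change (lighting, noise), not only motion.

```python
def changed_pixels(prev, curr, threshold=20):
    """Return a binary mask: 1 where |curr - prev| exceeds the threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
        for prow, crow in zip(prev, curr)
    ]

# Two tiny 2x3 frames (illustrative values only).
prev = [[10, 10, 10],
        [10, 10, 10]]
curr = [[10, 90, 10],
        [10, 10, 40]]
mask = changed_pixels(prev, curr)
```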
Change Detection
[Pinz, Bildverstehen, 1994]
Optical Flow (1)
[Brox, Bruhn, Papenberg, Weickert]
ECCV04 best paper award
• Estimating the displacement field
• Assumptions:
– Gray value constancy
– Gradient constancy
– Smoothness
– ...
• Error minimization
Optical Flow (2)
[Brox, Bruhn, Papenberg, Weickert]
ECCV04 best paper award
!! Not in real-time !!
Interest Operators
• Reduce the amount of data
• Track only salient features
• Support region – ROI (region of interest)
Feature in ROI:
Edge / Line
Corner
Blob
Contour
2D Point Tracking
[Univ. Erlangen, VAMPIRE, EU-IST-2001-34401]
• Corner detection → Initialization
– Calculate “cornerness” c
– Threshold → sensitivity, # of corners
– E.g.: “Harris” / “Plessey” corners in ROI
\[ G = \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix} \]
\[ c = \det(G) - \kappa \, (\operatorname{trace}(G))^2 \]
• Cross-correlation in ROI
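The cornerness measure can be sketched in plain Python as follows. This is a sketch under assumptions: gradients come from central differences, the entries of G are summed over a square ROI, and the weight κ = 0.04 is a commonly used value rather than one stated on the slide. The helper names are hypothetical.

```python
def gradients(img):
    """Central-difference gradients I_x, I_y (left at zero on the image border)."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            ix[r][c] = (img[r][c + 1] - img[r][c - 1]) / 2.0
            iy[r][c] = (img[r + 1][c] - img[r - 1][c]) / 2.0
    return ix, iy

def cornerness(img, row, col, half=1, kappa=0.04):
    """c = det(G) - kappa * trace(G)^2, with G summed over a (2*half+1)^2 ROI."""
    ix, iy = gradients(img)
    gxx = gxy = gyy = 0.0
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            gxx += ix[r][c] ** 2
            gxy += ix[r][c] * iy[r][c]
            gyy += iy[r][c] ** 2
    det = gxx * gyy - gxy ** 2
    trace = gxx + gyy
    return det - kappa * trace ** 2

# 7x7 test image: bright lower-right quadrant, so (3, 3) is a corner.
img = [[100.0 if r >= 3 and c >= 3 else 0.0 for c in range(7)] for r in range(7)]
c_corner = cornerness(img, 3, 3)  # positive: gradients in both directions
c_edge = cornerness(img, 5, 3)    # negative: gradient in one direction only
c_flat = cornerness(img, 1, 1)    # zero: no gradient
```

Thresholding c then selects the corners; a higher threshold yields fewer, stronger corners.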
2D Point Tracking
[Univ. Erlangen, VAMPIRE, EU-IST-2001-34401]
Edge Tracking
[Rapid 95, Harris, RoRapid 95, Armstrong, Zisserman]
Blob Tracking
[Mean Shift 03, Comaniciu, Meer]
Contour Tracking
[CONDENSATION 98-02, Isard, Toyama, Blake]
CONDENSATION (2)
• CONditional DENSity propagATION
• Requires a good initialization
• Works with active contours
• Maintains / adapts a contour model
• Can keep more than one hypothesis
Agenda
Structure of the SSIP Lecture
• Intro, terminology, applications
• 2D motion analysis
• Geometry
• 3D motion analysis
• Practical considerations
• Existing systems
• Summary, conclusions
Geometry
• Having motion in images:
– What does it mean?
– What can be measured?
• Projective camera
• Algebraic projective geometry
• Camera calibration
• Computer Vision
– Reconstruction from uncalibrated views
• There are excellent textbooks
[Faugeras 1994, Hartley+Zisserman 2001, Ma et al. 2003]
Projective Camera (1)
• Pinhole camera model:
– $p = (x,y)^T$ is the image of $P = (X,Y,Z)^T$
– (x,y) ... image-, (X,Y,Z) ... scene-coordinates
– o ... center of projection
– (x,y,z) ... camera coordinate system
– (x,y,-f) ... image plane
[Figure: pinhole camera with center of projection o, camera axes x, y, z, focal length f, image point p(x,y) and scene point P(X,Y,Z)]
Projective Camera (2)
• Pinhole camera model:
– If scene- = camera-coordinate system
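[Figure: side view of the pinhole camera with center of projection o, focal length f, image coordinate x and scene coordinates X, Z]
\[ x = -f \frac{X}{Z}, \qquad y = -f \frac{Y}{Z} \]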
Projective Camera (3)
• Frontal pinhole camera model:
– (x,y,+f) ... image plane
– Normalized camera: f=+1
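[Figure: frontal pinhole camera with center of projection o, camera axes x, y, z, focal length f, image point p and scene point P(X,Y,Z)]
\[ x = f \frac{X}{Z}, \qquad y = f \frac{Y}{Z} \]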
Projective Camera (4)
• “real” camera:
– 5 Intrinsic parameters (K)
– Lens distortion
– 6 Extrinsic parameters (M: R, t)
\[ K = \begin{pmatrix} f s_x & f s_\theta & u_0 \\ 0 & f s_y & v_0 \\ 0 & 0 & 1 \end{pmatrix} \]
\[ \lambda \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = K \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} M \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} \]
– λ … arbitrary scale
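The projection of a scene point through extrinsics and intrinsics can be sketched as below. The numeric values of K, R, t and the function name `project` are assumptions for illustration, and lens distortion is ignored.

```python
def project(K, R, t, P):
    """Project scene point P to pixel coordinates (x, y) using K and M = (R, t)."""
    # Extrinsic parameters: scene coordinates -> camera coordinates.
    Pc = [sum(R[i][j] * P[j] for j in range(3)) + t[i] for i in range(3)]
    # Intrinsic parameters: camera coordinates -> homogeneous pixel coordinates.
    h = [sum(K[i][j] * Pc[j] for j in range(3)) for i in range(3)]
    lam = h[2]  # the arbitrary scale; equals the depth Z because K's last row is (0, 0, 1)
    return h[0] / lam, h[1] / lam

# Hypothetical camera: 800 px focal length, zero skew, principal point (320, 240).
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]  # scene frame coincides with the camera frame
t = [0.0, 0.0, 0.0]

x, y = project(K, R, t, [0.5, -0.25, 2.0])
```

With these values the point lands at pixel (520, 140): 800 · 0.5 / 2 + 320 and 800 · (−0.25) / 2 + 240.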
Algebraic Projective
Geometry [Semple&Kneebone 52]
• Homogeneous coordinates
• Duality points ↔ lines
• Homography H describes any transformation
– E.g.: image → image transform: x’ = Hx
– All transforms can be described by 3x3 matrices
– Combination of transformations: Matrix product
Translation:
\[ \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & a \\ 0 & 1 & b \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \]
Rotation:
\[ \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \]
Camera Calibration (1)
• Recover the 11 camera parameters:
– 5 Intrinsic parameters (K: $f s_x$, $f s_y$, $f s_\theta$, $u_0$, $v_0$)
– 6 Extrinsic parameters (M: R, t)
• Calibration target:
– At least 6 point correspondences 
– System of linear equations
• Direct (initial) solution for K and M
\[ \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \sim K \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} M \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} \]
Camera Calibration (2)
• Iterative optimization
– K, M, lens distortion
– E.g. Levenberg-Marquardt
• Practical solutions require more points
– Many algorithms [Tsai 87, Zhang 98, Heikkilä 00]
• Overdetermined systems
• Robustness against outliers
– E.g. RANSAC
• Refer to [Hartley, Zisserman, 2001]
What can be measured ...
• … with a calibrated camera
– Viewing directions
– Angles between viewing directions
– 3D reconstruction: more than 1 view required
• … with uncalibrated camera(s)
– Computer Vision research of the past decade
– Hierarchy of geometries:
Projective – oriented projective – affine –
similarity – Euclidean
Agenda
Structure of the SSIP Lecture
• Intro, terminology, applications
• 2D motion analysis
• Geometry
• 3D motion analysis
• Practical considerations
• Existing systems
• Summary, conclusions
3D Motion Analysis:
Location and Orientation
[Figure: head coordinate system and scene coordinate system, related by rotation R and translation t]
6 DoF pose in real-time
Extrinsic parameters in real-time
3D Motion Analysis:
• Tracking technologies, terminology
• Camera pose (PnP)
• Stereo, E, F, epipolar geometry
• Model-based tracking
– Confluence of 2D and 3D
• Fusion
• Kalman Filter
Tracking Technologies (1)
• Mechanical tracking
• “Magnetic tracking”
• Acoustic – time of flight
• “Optical” → vision-based
• Compass
• GPS, …
External effort required !
No “self-contained” system
[Allen, Bishop, Welch. Tracking: Beyond 15 minutes of thought. SIGGRAPH’01]
Tracking Technologies (2)
Examples
[Allen, Bishop, Welch. Tracking: Beyond 15 minutes of thought. SIGGRAPH’01]
Research at EMT:
Hybrid Tracking – HT
Combine 2 technologies:
• Vision-based
+ Good results for slow motion
– Motion blur, occlusion, wrong matches
• Inertial
+ Good results for fast motion
– Drift, noise, long term stability
• Fusion of complementary sensors !
Mimics human cognition !
Vision-Based Tracking
More Terminology
• Measure position and orientation in real-time
• Obtain trajectories of object(s)
• Moving observer, egomotion – “inside-out”
• Stationary observer – “outside-in Tracking”
• Combinations of the above
• Degrees of Freedom – DoF
– 3 DoF (mobile robot)
– 6 DoF (head tracking in AR)
Inside-out Tracking
• monocular
• exterior parameters
• 6 DoF from ≥ 4 points
• wearable, fully mobile
corners
blobs
natural landmarks
Outside-in Tracking
stereo-rig
IR-illumination
• no cables
• 1 marker/device: 3 DoF
• 2 markers: 5 DoF
• 3 markers: 6 DoF
devices
Camera Pose Estimation
• Pose estimation: Estimate extrinsic parameters from known / unknown scene → find R, t
• Linear algorithms [Quan, Lan, 1999]
• Iterative algorithms [Lu et al., 2000]
• Point-based methods
– No geometry, just 3D points
• Model-based methods
– Object-model, e.g. CAD
PnP (1)
Perspective n-Point Problem
• Calibrated camera K, $C = (K K^T)^{-1}$
• n point correspondences scene ↔ image
• Known scene coordinates of $p_i$, and known distances $d_{ij} = \| p_i - p_j \|$
• Each pair $(p_i, p_j)$ defines an angle $\theta$
• $\theta$ can be measured (2 lines of sight, calibrated camera)
→ constraint for the distance $\| c - p_i \|$
[Figure: camera center c, scene points p_i and p_j at distances x_i and x_j from c, separated by d_ij, with angle θ at c]
PnP (2)
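searching:
\[ x_i = \| p_i - c \|, \qquad x_j = \| p_j - c \| \]
constraint:
\[ d_{ij}^2 = x_i^2 + x_j^2 - 2 x_i x_j \cos\theta_{ij} \]
\[ f_{ij}(x_i, x_j) = x_i^2 + x_j^2 - 2 x_i x_j \cos\theta_{ij} - d_{ij}^2 = 0 \]
calibrated camera:
\[ \cos\theta_{ij} = \frac{u_i^T C u_j}{\sqrt{u_i^T C u_i} \, \sqrt{u_j^T C u_j}} \]
[Figure: camera center c, image points u_i and u_j, scene points p_i and p_j at distances x_i and x_j, separated by d_ij, with angle θ at c]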
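A short numerical check of the constraint, as a sketch under assumptions: the camera center c sits at the origin of the camera frame, K is a hypothetical upper-triangular calibration matrix, and the two scene points are made-up values. Because C = (K Kᵀ)⁻¹, the term uᵢᵀ C uⱼ equals the dot product of the back-projected rays K⁻¹uᵢ and K⁻¹uⱼ, which is what the code uses.

```python
import math

def back_project(K, u):
    """Ray K^{-1} (u, v, 1)^T for an upper-triangular K."""
    fx, s, u0 = K[0]
    _, fy, v0 = K[1]
    y = (u[1] - v0) / fy
    x = (u[0] - u0 - s * y) / fx
    return (x, y, 1.0)

def cos_theta(K, ui, uj):
    """cos(theta_ij) between the lines of sight through pixels ui and uj."""
    ri, rj = back_project(K, ui), back_project(K, uj)
    dot = sum(a * b for a, b in zip(ri, rj))
    norm_i = math.sqrt(sum(a * a for a in ri))
    norm_j = math.sqrt(sum(b * b for b in rj))
    return dot / (norm_i * norm_j)

def f_ij(xi, xj, cos_ij, dij):
    """Constraint f_ij(x_i, x_j); zero for the true distances."""
    return xi ** 2 + xj ** 2 - 2.0 * xi * xj * cos_ij - dij ** 2

def project(K, P):
    """Pixel coordinates of a point P given in camera coordinates."""
    h = [sum(K[r][c] * P[c] for c in range(3)) for r in range(3)]
    return (h[0] / h[2], h[1] / h[2])

K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
p_i, p_j = (1.0, 0.0, 4.0), (-1.0, 1.0, 5.0)   # illustrative scene points
u_i, u_j = project(K, p_i), project(K, p_j)

x_i = math.sqrt(sum(a * a for a in p_i))        # ||p_i - c|| with c = 0
x_j = math.sqrt(sum(a * a for a in p_j))
d_ij = math.sqrt(sum((a - b) ** 2 for a, b in zip(p_i, p_j)))
residual = f_ij(x_i, x_j, cos_theta(K, u_i, u_j), d_ij)
```

The residual vanishes up to rounding, confirming that the angle measured in the image and the known distance d_ij constrain the unknown depths x_i and x_j.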
PnP (3)
• P3P, 3 points:
underdetermined, up to 4 solutions
\[ f_{12}(x_1, x_2) = 0, \qquad f_{13}(x_1, x_3) = 0, \qquad f_{23}(x_2, x_3) = 0 \]
• P4P, 4 points:
overdetermined, 6 equations, 4 unknowns
4 x P3P on the point subsets {1,2,3}, {1,2,4}, {1,3,4}, {2,3,4}, then find a common solution
• General problem: PnP, n points
PnP (4)
Once the $x_i$ have been solved:
1) project image points → scene
$p'_i = x_i K^{-1} u_i$
2) find a common R, t for $p'_i$ ↔ $p_i$
(point-correspondences → solve a simple system of linear equations)
Stereo Reconstruction
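Elementary stereo geometry in “canonical configuration”
[Figure: cameras C_l and C_r, each at distance h from x = 0 along the x axis, focal length f; scene point P(x,y,z) at depth P_z projects to P_l and P_r; local image origins x_l = 0 and x_r = 0]
2h … “baseline” b
P_r - P_l … “disparity” d
There is just column disparity
Depth computation:
\[ P_z = \frac{b f}{d} \]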
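The depth computation can be sketched as a one-line function. The units are an assumption (baseline in metres, focal length and disparity in pixels), the disparity is taken as a positive magnitude, and the numbers are illustrative only.

```python
def depth_from_disparity(baseline, focal_length, disparity):
    """P_z = b * f / d for the canonical stereo configuration."""
    if disparity <= 0:
        raise ValueError("disparity must be positive (point at finite depth)")
    return baseline * focal_length / disparity

# 12 cm baseline, 700 px focal length, 21 px disparity: about 4 m depth.
z = depth_from_disparity(baseline=0.12, focal_length=700.0, disparity=21.0)
# Halving the disparity doubles the depth, so depth resolution degrades with distance.
z_far = depth_from_disparity(baseline=0.12, focal_length=700.0, disparity=10.5)
```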
Stereo (2)
• 2 cameras, general configuration:
Epipolar geometry
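[Figure: epipolar geometry of two cameras with centers C_l and C_r, image points u_l and u_r, epipoles e_l and e_r, epipolar lines l_l and l_r; further labels: X, Y, v]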
Stereo (3)
• Uncalibrated cameras: Fundamental matrix F
• Calibrated cameras: Essential matrix E
• 3x3, Rank 2
\[ F = (K_l^{-1})^T S R K_r^{-1} = (K_l^{-1})^T E K_r^{-1} \]
\[ u_l^T F u_r = 0 \]
• Many Algorithms
– Normalized 8-point [Hartley 97]
– 5-point (structure and motion) [Nister 03]
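The epipolar constraint can be checked numerically with a small sketch. The assumptions, none of which are fixed by the slides, are: S is the skew-symmetric matrix of the translation t, the two camera frames are related by X_l = R X_r + t, and image points are given in normalized coordinates (K_l = K_r = I, so F = E). The pose and the scene point are made-up values.

```python
import math

def skew(t):
    """Skew-symmetric matrix S such that S v = t x v."""
    return [[0.0, -t[2], t[1]],
            [t[2], 0.0, -t[0]],
            [-t[1], t[0], 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

phi = 0.1  # small rotation about the y axis
R = [[math.cos(phi), 0.0, math.sin(phi)],
     [0.0, 1.0, 0.0],
     [-math.sin(phi), 0.0, math.cos(phi)]]
t = [0.2, 0.0, 0.0]
E = matmul(skew(t), R)  # essential matrix E = S R

X_r = [0.3, -0.1, 2.0]                              # point in the right camera frame
X_l = [a + b for a, b in zip(matvec(R, X_r), t)]    # same point in the left frame
u_r = [c / X_r[2] for c in X_r]                     # normalized image points
u_l = [c / X_l[2] for c in X_l]

residual = dot(u_l, matvec(E, u_r))  # u_l^T E u_r
rank_check = det3(E)                 # zero determinant, consistent with rank 2
```

Both values are zero up to rounding. With real, noisy correspondences the residual is only approximately zero, which is why estimation algorithms such as the normalized 8-point method minimize it over many point pairs.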
Model-Based Tracking
Confluence of 2D and 3D [Deutscher, Davison, Reid 01]
3D Motion Analysis:
• Tracking technologies, terminology
• Camera pose (PnP)
• Stereo, E, F, epipolar geometry
• Model-based tracking
– Confluence of 2D and 3D
• Fusion
• Kalman Filter
1. General considerations
2. Kalman Filter
3. EMT HT project
General Considerations
• We have:
– Several sensors (vision, inertial, ...)
– Algorithms to deliver pose streams for each sensor
(at discrete times; rates may vary depending on
sensor, processor, load, ...)
• Thus, we need:
– Algorithms for sensor fusion (weighting the
confidence in a sensor, sensor accuracy, ...)
– Pose estimation including a temporal model
[Allen, Bishop, Welch. Tracking: Beyond 15 minutes of thought. SIGGRAPH’01]
Sensor Fusion
• Dealing with ignorance:
– Imprecision, ambiguity, contradiction, ...
• Mathematical models
– Fuzzy sets
– Dempster-Shafer evidence theory
– Probability theory
• Probabilistic modeling in Computer Vision
– The topic of this decade !
– Examples: CONDENSATION, mean shift
Kalman Filter (1)
[Welch, Bishop. An Introduction to the Kalman Filter. SIGGRAPH’01]
http://www.cs.unc.edu/~welch/kalman
Kalman Filter (2)
• Estimate a process
\[ x_k = A x_{k-1} + B u_k + w_{k-1} \]
• with a measurement
\[ z_k = H x_k + v_k \]
$x \in \mathbb{R}^n$ ... State of the process
$z \in \mathbb{R}^m$ ... Measurement
w, v ... Process and measurement noise (zero mean)
A ... n x n Matrix relates the previous with the current time step
B ... n x l Matrix relates optional control input u to x
H ... m x n Matrix relates state x to measurement z
Kalman Filter (3)
• Definitions
$\hat{x}_k^- \in \mathbb{R}^n$ ... a priori state estimate at step k
$\hat{x}_k \in \mathbb{R}^n$ ... a posteriori state estimate at step k given $z_k$
$e_k^- = x_k - \hat{x}_k^-$ ... a priori estimate error
$e_k = x_k - \hat{x}_k$ ... a posteriori estimate error
• Then:
$P_k^- = E[e_k^- {e_k^-}^T]$ ... a priori estimate error covariance
$P_k = E[e_k e_k^T]$ ... a posteriori estimate error covariance
Kalman Filter (4)
• Compute
\[ \hat{x}_k = \hat{x}_k^- + K (z_k - H \hat{x}_k^-) \]
$(z_k - H \hat{x}_k^-)$ ... “measurement innovation”
K ... n × m “Kalman gain” that minimizes $P_k$
• with
\[ K_k = P_k^- H^T (H P_k^- H^T + R)^{-1} \]
R ... Measurement error covariance
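A scalar sketch of one filter cycle, assuming n = m = 1, A = H = 1 and no control input (estimating a constant from noisy readings). The time update P⁻ = A P Aᵀ + Q with process noise covariance Q and the covariance update P = (1 − K H) P⁻ follow the standard formulation in the cited Welch and Bishop introduction; they are not spelled out in the text above. The measurement values, Q and R are illustrative.

```python
def kalman_step(x_prev, P_prev, z, A=1.0, H=1.0, Q=1e-5, R=0.01):
    """One predict/correct cycle of a scalar Kalman filter."""
    # Time update: a priori state estimate and error covariance.
    x_prior = A * x_prev
    P_prior = A * P_prev * A + Q
    # Measurement update: gain, a posteriori estimate and covariance.
    K = P_prior * H / (H * P_prior * H + R)
    x_post = x_prior + K * (z - H * x_prior)   # innovation weighted by the gain
    P_post = (1.0 - K * H) * P_prior
    return x_post, P_post

# Illustrative noisy measurements of a constant.
measurements = [0.39, 0.50, 0.48, 0.29, 0.25, 0.32, 0.34, 0.48, 0.41, 0.45]

x, P = 0.0, 1.0   # poor initial guess with large uncertainty
for z in measurements:
    x, P = kalman_step(x, P, z)
```

A large R (low confidence in the sensor) shrinks the gain K, so the filter trusts its prediction more; a small R does the opposite. This weighting is what makes the filter usable for fusing sensors of different accuracy.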
Kalman Filter (5)
http://www.cs.unc.edu/~welch/kalman
Kalman Filter (6)
http://www.cs.unc.edu/~welch/kalman
EMT Hybrid Tracker
HT Project
Ingredients of hybrid tracking:
• Camera(s)
• Inertial sensors
• Feature extraction
• Pose estimation
• Structure estimation
• Real-time
• Synchronisation
• Kalman filter
• Sensor Fusion
Hybrid Tracking
• 6 DoF vision-based tracker
• 6 DoF inertial tracker
• Fusion by a Kalman filter
“Structure and Motion”
“Tracking” + “Structure from Motion”
Research Prototype
• Tracking subsystem
• Visualization subsystem
• Sensors + HMD
HT Application Example
Agenda
Structure of the SSIP Lecture
• Intro, terminology, applications
• 2D motion analysis
• Geometry
• 3D motion analysis
• Practical considerations
• Existing systems
• Summary, conclusions
Practical Considerations
• There are critical configurations !
• Projective geometry vs. discrete pixels
– Rays do not intersect !
– Error minimization algorithms required
• Robustness (many points) vs. real-time
– Outlier detection can become difficult !
• Precision (iterative) vs. real-time (linear)
• Combination of diverse features
– Points, lines, curves
• Jitter, lag
• Debugging of a real-time system !
Existing Systems (1)
• VR/AR
– Intersense, Polhemus, A.R.T.
• MoCap
– Vicon, A.R.T.
• Medical tracking
– MedTronic, A.R.T.
• Fiducial tracker (Intersense)
• Research systems
– KLT (Kanade, Lucas, Tomasi)
– ARToolkit (Billinghurst)
– XVision (Hager)
Existing Systems (2)
Open Issues
• Tracking of natural landmarks
• First success in online structure and motion
– [Nister CVPR03, ICCV03, ECCV04]
• (Re-)Initialisation in highly complex scenes
• Usability !
Future Applications
• Can pose (position and orientation) be exploited ?
– What is the user looking at?
– Architecture, city guide, museum, emergency, …
• From bulky gear and HMD → PDA
– Wireless communication
– Camera(s)
– Inertial sensors (+ compass, + GPS, …)
• Automotive !
– Driver assistance
– Autonomous vehicles, mobile robot navigation, …
• Medicine !
– Surgical navigation
– Online fusion (temporal genesis, sensory modes, …)
Summary, Conclusions
• Real-time pose (6 DoF)
• 2D and 3D motion analysis
• Geometry
• Probabilistic modeling
• High potential for future developments
Acknowledgements
• EU-IST-2001-34401 Vampire Visual Active Memory Processes and Interactive Retrieval
• FWF P15748 Smart Tracking
• FWF P14470 Mobile Collaborative AR
• Christian Doppler Laboratory for Automotive Measurement Research
• Markus Brandner
• Harald Ganster
• Bettina Halder
• Jochen Lackner
• Peter Lang
• Ulrich Mühlmann
• Miguel Ribo
• Hannes Siegl
• Christoph Stock
• Georg Teichtmeister
• Jürgen Wolf
Further Reading
R. Hartley, A. Zisserman. Multiple View Geometry in Computer Vision.
Cambridge University Press, 2nd ed., 2003.
Y. Ma, S. Soatto, J. Kosecka, S. Shankar Sastry. An Invitation to 3D Vision.
Springer, 2004.
B.D. Allen, G. Bishop, G. Welch. Tracking: Beyond 15 Minutes of Thought.
SIGGRAPH 2001, Course 11. See http://www.cs.unc.edu/~welch
G. Welch, G. Bishop. An Introduction to the Kalman Filter.
SIGGRAPH 2001, Course 8. See http://www.cs.unc.edu/~welch