Introduction to Image Processing and Computer Vision Rahul Sukthankar

Introduction to Image Processing and Computer Vision
Rahul Sukthankar
Intel Research Laboratory at Pittsburgh and The Robotics Institute, Carnegie Mellon
[email protected]
Image Processing vs. Computer Vision
• Image processing:
  ▪ Image → image
  ▪ e.g., de-noising, compression, edge detection
• Computer vision:
  ▪ Image → symbols
  ▪ e.g., face recognition, object tracking
• Most real-world applications combine techniques from both categories
Rahul Sukthankar
15-829 Lecture 4
Outline
• Operations on a single image
• Operations on an image sequence
• Multiple cameras
• Extracting semantics from images
• Applications
What is an Image?
• 2D array of pixels
• Binary image (bitmap)
  ▪ Pixels are bits
• Grayscale image
  ▪ Pixels are scalars
  ▪ Typically 8 bits (0..255)
• Color images
  ▪ Pixels are vectors
  ▪ Order can vary: RGB, BGR
  ▪ Sometimes includes Alpha
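To make the pixel representations above concrete, here is a minimal Python sketch. It assumes images are stored as nested lists (rows of pixels), which is an illustration choice and not how OpenCV stores them; all function names are hypothetical. It reorders BGR pixels to RGB, converts them to 8-bit grayscale with the common BT.601 luma weights, and thresholds the result into a binary image.

```python
def bgr_to_rgb(image):
    """Swap the channel order of every pixel from (B, G, R) to (R, G, B)."""
    return [[(r, g, b) for (b, g, r) in row] for row in image]


def rgb_to_gray(image):
    """Convert RGB pixels to 8-bit grayscale using the BT.601 luma weights."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
            for row in image]


def to_binary(gray, threshold=128):
    """Threshold a grayscale image: 1 where pixel >= threshold, else 0."""
    return [[1 if p >= threshold else 0 for p in row] for row in gray]


# A 2x2 color image stored in BGR order: blue, red, black, white.
bgr = [[(255, 0, 0), (0, 0, 255)],
       [(0, 0, 0), (255, 255, 255)]]
rgb = bgr_to_rgb(bgr)
gray = rgb_to_gray(rgb)
binary = to_binary(gray)
```

Mixing up RGB and BGR is easy to miss because the image still displays; only the colors are wrong, so it pays to check the channel order once at load time.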
Canny Edge Detector
cvCanny(…)
Images courtesy of OpenCV tutorial at CVPR-2001
Morphological Operations
• Simple morphological operations on binary images:
  ▪ erosion: any pixel with a 0-valued neighbor becomes 0
  ▪ dilation: any pixel with a 1-valued neighbor becomes 1
• Compound morphological operations (composed of sequences of simple morphological ops):
  ▪ opening
  ▪ closing
  ▪ morphological gradient
  ▪ top hat
  ▪ black hat
• Aside: what is the “right” definition of “neighbor”?
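One way to explore the "neighbor" question is to run the same opening with two neighborhood definitions. The sketch below is an illustration in plain Python, not the OpenCV implementation: the names `N4`, `N8`, `erode`, `dilate`, and `opening` are hypothetical, images are nested lists of 0/1, and neighbors that fall outside the image are simply ignored (an assumption; other border rules are possible).

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]            # edge-adjacent neighbors
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]     # plus the diagonals


def _morph(image, neighbors, target):
    """Return a copy where any pixel with an in-bounds neighbor equal to
    `target` is itself set to `target`."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            for dy, dx in neighbors:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and image[ny][nx] == target:
                    out[y][x] = target
                    break
    return out


def erode(image, neighbors=N4):
    """Erosion: any pixel with a 0-valued neighbor becomes 0."""
    return _morph(image, neighbors, 0)


def dilate(image, neighbors=N4):
    """Dilation: any pixel with a 1-valued neighbor becomes 1."""
    return _morph(image, neighbors, 1)


def opening(image, neighbors=N4):
    """Opening: erosion followed by dilation."""
    return dilate(erode(image, neighbors), neighbors)


# A 3x3 block of ones plus one noise pixel in the top-left corner.
noisy = [[1, 0, 0, 0, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 0, 0, 0]]
opened_4 = opening(noisy, N4)   # noise removed, block shrinks to a plus shape
opened_8 = opening(noisy, N8)   # noise removed, block restored in full
```

Both neighborhoods remove the noise pixel, but they reshape the surviving block differently, which is why the choice of neighborhood has no single "right" answer.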
Morphological Operations
Image $I$, structuring element $B$
Erosion: $I \ominus B$
Dilation: $I \oplus B$
Opening: $I \circ B = (I \ominus B) \oplus B$
Closing: $I \bullet B = (I \oplus B) \ominus B$
Gradient: $\mathrm{Grad}(I) = (I \oplus B) - (I \ominus B)$
Top hat: $\mathrm{TopHat}(I) = I - (I \circ B)$
Black hat: $\mathrm{BlackHat}(I) = (I \bullet B) - I$
Hough Transform
Goal: Finding straight lines in an edge image
Figure panels: original image; Canny edge + Hough transform
cvHoughLines(…)
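The voting idea behind the Hough transform can be sketched in a few lines of Python. This is a simplified illustration, not the cvHoughLines implementation: the function name `hough_lines`, the one-degree angle step, and rounding rho to whole pixels are all assumptions. Each edge point votes for every line (theta, rho) passing through it, using rho = x·cos(theta) + y·sin(theta); straight lines show up as peaks in the vote counts.

```python
import math
from collections import Counter


def hough_lines(edge_points, theta_steps=180):
    """Accumulate votes in (theta index, rho) space for a list of (x, y) points.

    With theta_steps=180 the theta index equals the angle in degrees."""
    votes = Counter()
    for x, y in edge_points:
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            votes[(t, rho)] += 1
    return votes


# Edge points on the vertical line x = 5.
points = [(5, y) for y in range(50)]
votes = hough_lines(points)
(best_theta, best_rho), count = votes.most_common(1)[0]
```

Here the strongest cell is theta = 0 degrees and rho = 5, the normal form of the line x = 5, and it collects one vote per edge point.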
Distance Transform
• For every non-feature point, computes the distance to the closest feature point
cvDistTransform(…)
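A small sketch of a distance transform, assuming the city-block (L1) metric and a binary feature map stored as nested lists; cvDistTransform supports other metrics, and this breadth-first search is only one simple way to get the L1 result. The function name is hypothetical.

```python
from collections import deque


def distance_transform(features):
    """City-block distance from every pixel to the nearest feature pixel.

    `features` holds 1 at feature pixels and 0 elsewhere. Pixels stay None
    if the image contains no feature at all."""
    h, w = len(features), len(features[0])
    dist = [[None] * w for _ in range(h)]
    queue = deque()
    # Seed the search with every feature pixel at distance 0.
    for y in range(h):
        for x in range(w):
            if features[y][x]:
                dist[y][x] = 0
                queue.append((y, x))
    # Breadth-first expansion reaches each pixel first along a shortest path.
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist


features = [[0, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 0, 0]]
dist = distance_transform(features)
```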
Flood Filling
cvFloodFill(…) grows from given seed point
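Growing a region from a seed point can be sketched as a breadth-first search. The version below is a minimal illustration in plain Python, assuming 4-connectivity and exact value matching; cvFloodFill offers tolerance ranges and other options that are not modeled here, and the function name is hypothetical.

```python
from collections import deque


def flood_fill(image, seed, new_value):
    """Fill the 4-connected region of equal-valued pixels containing `seed`.

    `seed` is (row, col). The image is modified in place and the number of
    pixels changed is returned."""
    h, w = len(image), len(image[0])
    sy, sx = seed
    old_value = image[sy][sx]
    if old_value == new_value:
        return 0
    image[sy][sx] = new_value
    queue = deque([(sy, sx)])
    filled = 1
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and image[ny][nx] == old_value:
                image[ny][nx] = new_value
                filled += 1
                queue.append((ny, nx))
    return filled


image = [[0, 0, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 1, 0]]
count = flood_fill(image, (0, 0), 7)
```

The zeros in the right-hand column are left alone because the column of ones separates them from the seed.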
Image Statistics
• Statistics are used to summarize the pixel values in a region, typically before making a decision
• Some statistics are computed over a single image:
  ▪ Mean and standard deviation: cvAvg(…), cvAvgSdv(…)
  ▪ Smallest and largest intensities: cvMinMaxLoc(…)
  ▪ Moments: cvGetSpatialMoment(…), cvGetCentralMoment(…)
• Others are computed over pairs/differences of images:
  ▪ Distances/norms C (maximum absolute difference), L1, L2: cvNorm(…), cvNormMask(…)
• Histograms:
  ▪ Multidimensional histograms: (many functions to create/manipulate)
  ▪ Earth mover's distance (compares histograms): cvCalcEMD(…)
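The single-image and image-pair statistics listed above can be written out directly. This sketch uses nested lists and hypothetical function names, and it assumes the population form of the standard deviation (dividing by the pixel count); it illustrates the definitions rather than the OpenCV calls.

```python
import math


def mean_std(image):
    """Mean and population standard deviation of all pixel values."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return mean, math.sqrt(variance)


def min_max_loc(image):
    """Return (min value, min position, max value, max position).

    Positions are (row, col)."""
    cells = [(p, (y, x)) for y, row in enumerate(image) for x, p in enumerate(row)]
    lo = min(cells, key=lambda c: c[0])
    hi = max(cells, key=lambda c: c[0])
    return lo[0], lo[1], hi[0], hi[1]


def diff_norms(a, b):
    """C (max absolute), L1, and L2 norms of the difference of two images."""
    d = [abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return max(d), sum(d), math.sqrt(sum(v * v for v in d))


a = [[10, 20], [30, 40]]
b = [[10, 23], [26, 40]]
mean, std = mean_std(a)
extremes = min_max_loc(a)
c_norm, l1_norm, l2_norm = diff_norms(a, b)
```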
Image Pyramids: Coarse to Fine Processing
• Gaussian and Laplacian pyramids
• Image segmentation by pyramids
Figure panels: original image; Gaussian pyramid; Laplacian pyramid
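The relationship between the two pyramids can be sketched as follows. To stay short, this illustration makes two loud simplifications: a 2x2 box average stands in for the Gaussian smoothing kernel, and upsampling is plain pixel replication. The function names are hypothetical. Each Laplacian level stores what is lost when going one level coarser, so the original can be rebuilt from the coarsest image plus the detail levels.

```python
def reduce_level(image):
    """Halve each dimension by averaging 2x2 blocks (box filter, not Gaussian)."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * y][2 * x] + image[2 * y][2 * x + 1] +
              image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def expand_level(image):
    """Double each dimension by pixel replication."""
    out = []
    for row in image:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(wide[:])
    return out


def laplacian_pyramid(image, levels):
    """Return (detail images from fine to coarse, coarsest image).

    Image dimensions must be divisible by 2**levels."""
    details = []
    current = image
    for _ in range(levels):
        smaller = reduce_level(current)
        predicted = expand_level(smaller)
        details.append([[c - p for c, p in zip(cr, pr)]
                        for cr, pr in zip(current, predicted)])
        current = smaller
    return details, current


def reconstruct(details, coarsest):
    """Invert laplacian_pyramid: expand and add back detail, coarse to fine."""
    current = coarsest
    for detail in reversed(details):
        predicted = expand_level(current)
        current = [[d + p for d, p in zip(dr, pr)]
                   for dr, pr in zip(detail, predicted)]
    return current


image = [[10, 20, 30, 40],
         [50, 60, 70, 80],
         [90, 100, 110, 120],
         [130, 140, 150, 160]]
details, coarsest = laplacian_pyramid(image, levels=2)
restored = reconstruct(details, coarsest)
```

Coarse-to-fine processing works on `coarsest` first, where there are few pixels, and then refines the answer at each finer level.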
Pyramid-based Color Segmentation
Background Subtraction
• Useful when camera is still and background is static or slowly-changing (e.g., many surveillance tasks)
• Basic idea: subtract current image from reference image. Regions with large differences correspond to changes.
• OpenCV supports several variants of image differencing:
  ▪ Average
  ▪ Standard deviation
  ▪ Running average: cvRunningAvg(…)
• Can follow up with connected components (segmentation):
  ▪ could use “union find” or floodfill: cvFloodFill(…)
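The running-average variant can be sketched as below. The update rule, background ← (1 − alpha)·background + alpha·frame, is the usual form of a running average; the function names, the value of `alpha`, and the threshold of 30 gray levels are illustrative assumptions that would need tuning on real data.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the reference image with weight alpha."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(br, fr)]
            for br, fr in zip(background, frame)]


def foreground_mask(background, frame, threshold=30):
    """Mark pixels whose absolute difference from the reference is large."""
    return [[1 if abs(f - b) > threshold else 0 for b, f in zip(br, fr)]
            for br, fr in zip(background, frame)]


# Start from a flat reference image and absorb a slow brightness drift.
background = [[100.0] * 4 for _ in range(3)]
for _ in range(10):
    drifting = [[102] * 4 for _ in range(3)]
    background = update_background(background, drifting)

# A frame in which one pixel changes sharply.
frame = [[102] * 4 for _ in range(3)]
frame[1][2] = 200
mask = foreground_mask(background, frame)
```

The slow drift is absorbed into the reference, while the abrupt change is flagged; the flagged pixels can then be grouped with connected components or a flood fill.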
Optical Flow
• Goal: recover apparent motion vectors between a pair of images, usually in a video stream
• Several optical flow algorithms are available:
  ▪ Block matching technique: cvCalcOpticalFlowBM(…)
  ▪ Horn & Schunck technique: cvCalcOpticalFlowHS(…)
  ▪ Lucas & Kanade technique: cvCalcOpticalFlowLK(…)
  ▪ Pyramidal LK algorithm: cvCalcOpticalFlowPyrLK(…)
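Of the techniques listed, block matching is the simplest to sketch. The Python below is an illustration only, not cvCalcOpticalFlowBM: it estimates the motion of a single block by exhaustive search, scoring candidates with the sum of absolute differences (SAD). The function names, block size, and search range are assumptions, and `make_frame` exists only to build test data.

```python
def block_sad(prev, curr, y0, x0, dy, dx, size):
    """SAD between a block in `prev` and the same block shifted by (dy, dx) in `curr`."""
    return sum(abs(prev[y0 + i][x0 + j] - curr[y0 + dy + i][x0 + dx + j])
               for i in range(size) for j in range(size))


def match_block(prev, curr, y0, x0, size=2, search=2):
    """Return the displacement (dy, dx) with the lowest SAD within +/- search pixels."""
    h, w = len(curr), len(curr[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidate blocks that would fall outside the frame.
            if not (0 <= y0 + dy and y0 + dy + size <= h and
                    0 <= x0 + dx and x0 + dx + size <= w):
                continue
            cost = block_sad(prev, curr, y0, x0, dy, dx, size)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    return best[1]


def make_frame(top, left, size=6):
    """Build a black frame containing a textured 2x2 patch at (top, left)."""
    frame = [[0] * size for _ in range(size)]
    patch = [[50, 60], [70, 80]]
    for i in range(2):
        for j in range(2):
            frame[top + i][left + j] = patch[i][j]
    return frame


prev_frame = make_frame(1, 1)
curr_frame = make_frame(2, 3)   # patch moved down 1 row and right 2 columns
flow = match_block(prev_frame, curr_frame, 1, 1)
```

Repeating this for every block in the image gives a coarse motion field; the textured patch matters, because a uniform block would match equally well at many displacements.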
Active Contours: Tracking by Energy Minimization
• Snake energy: $E = E_{int} + E_{ext}$
• Internal energy: $E_{int} = E_{cont} + E_{curv}$
• External energy: $E_{ext} = E_{img} + E_{con}$
$E_{img} = -I$, or
$E_{img} = -\|\mathrm{grad}(I)\|$
$E = \alpha \cdot E_{cont} + \beta \cdot E_{curv} + \gamma \cdot E_{img} \to \min$
cvSnakeImage(…)
Camera Calibration
• Real cameras exhibit radial & tangential distortion: causes problems for some algorithms.
• First, calibrate by showing a checkerboard at various orientations: cvFindChessBoardCornerGuesses()
• Then apply an undistorting warp to each image (don’t use a warped checkerboard!): cvUndistort(…)
• If the calibration is poor, the “undistorted” image may be worse than the original.
Stereo Vision
• Extract 3D geometry from multiple views
• Points to consider:
  ▪ feature- vs area-based
  ▪ strong/weak calibration
  ▪ processing constraints
• No direct support in OpenCV, but building blocks for stereo are there.
View Morphing
Face Detection
Images courtesy of Mike Jones & Paul Viola
Classical Face Detection
Figure panels: large scale; small scale
Painful!
Viola/Jones Face Detector
• Technical advantages:
  ▪ Uses lots of very simple box features, enabling an efficient image representation
  ▪ Scales features rather than source image
  ▪ Cascaded classifier is very fast on non-faces
• Practical benefits:
  ▪ Very fast, compact footprint
  ▪ You don’t have to implement it! (should be in latest version of OpenCV)
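The "efficient image representation" that makes box features cheap is commonly an integral image (summed-area table). The sketch below is an illustration with hypothetical function names, not the detector itself: once the table is built, the sum over any box takes four lookups regardless of the box size, which is why it is practical to scale the features instead of rescaling the source image.

```python
def integral_image(image):
    """Build a summed-area table with one extra leading row and column of zeros.

    ii[y][x] holds the sum of all pixels above and to the left of (y, x)."""
    h, w = len(image), len(image[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += image[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of the pixels in a box, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half of a box: a simple two-rectangle feature."""
    half = width // 2
    return (box_sum(ii, top, left, height, half) -
            box_sum(ii, top, left + half, height, half))


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]]
ii = integral_image(image)
total = box_sum(ii, 0, 0, 3, 4)
inner = box_sum(ii, 1, 1, 2, 2)
feature = two_rect_feature(ii, 0, 0, 3, 4)
```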
Principal Components Analysis
High-dimensional data → lower-dimensional subspace
cvCalcEigenObjects(…)
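A minimal sketch of the idea, assuming power iteration on the covariance matrix to find only the first principal component; this is a swapped-in teaching method, not what cvCalcEigenObjects does internally, and the function names are hypothetical. It also assumes the starting vector is not orthogonal to the dominant direction.

```python
import math


def first_principal_component(data, iterations=100):
    """Return (mean, unit vector) for the direction of greatest variance."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    # Covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalize.
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def project(point, mean, v):
    """Coordinate of a point along the principal direction."""
    return sum((p - m) * c for p, m, c in zip(point, mean, v))


# 2D points lying exactly on the line y = 2x.
data = [(0, 0), (1, 2), (2, 4), (3, 6)]
mean, direction = first_principal_component(data)
coordinate = project((3, 6), mean, direction)
```

Each 2D point is now described by a single coordinate along `direction`; for images, the same projection reduces thousands of pixel values to a handful of coefficients.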
PCA for Object Recognition
Examples of Simple Vision Systems
Shadow Elimination
• Idea: remove shadows from projected displays using multiple projectors
• OpenCV Techniques:
  ▪ Image differencing
  ▪ Image warping
  ▪ Convolution filters
  ▪ Matrix manipulation
PosterCam
• Idea: put cameras in posters and identify who reads which poster
• OpenCV Techniques:
  ▪ Face detection
  ▪ Face recognition
  ▪ Unsupervised clustering
Single Projector: Severe Shadows
Diagram labels: display screen, projector P
Two Projectors: Shadows Muted
Diagram labels: display screen, projectors P-1 and P-2
Dynamic Shadow Elimination
Diagram labels: display screen, camera, projectors P-1 and P-2
Shadow Elimination: Challenges
• Occlusion detection: what does a shadow look like?
• Geometric issues: which projectors are occluded?
• Photometric issues: how much light removes a shadow?
• Performance: how can we do this in near real-time?
Shadow Elimination: Solutions
• Occlusion detection: difference image analysis
• Geometric issues: single shadow-mask for all projectors!
• Photometric issues: uncalibrated, so use a feedback system
• Performance: only modify texture map alpha values
Shadow Removal with a Single Mask
Shadow Elimination Algorithm
Figure labels: camera images; projected
PosterCam Overview
• PosterCam Hardware:
  ▪ Camera in each poster
  ▪ Embedded computer in each poster (~ iPAQ)
  ▪ Network connection to other posters
PosterCam Details
• Face detection: Viola/Jones (no float ops)
• Lighting compensation: histogram equalization
• Pose variation: additional synthetic faces
• Unsupervised clustering: k-means and nearest neighbor with nonstandard distance metric
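The lighting-compensation step can be sketched with the textbook form of histogram equalization. This is a generic illustration on an 8-bit grayscale image stored as nested lists, with a hypothetical function name; it is not the PosterCam code. Each gray level is remapped through the normalized cumulative histogram so that the output spreads over the full 0..255 range.

```python
def equalize_histogram(gray, levels=256):
    """Remap gray levels through the normalized cumulative histogram."""
    pixels = [p for row in gray for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution: number of pixels at or below each level.
    cdf = []
    total = 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:
        # Constant image: nothing to stretch.
        return [row[:] for row in gray]
    lut = [max(0, int(round((c - cdf_min) / (n - cdf_min) * (levels - 1))))
           for c in cdf]
    return [[lut[p] for p in row] for row in gray]


# A dim, low-contrast image occupying only levels 100..130.
gray = [[100, 110],
        [120, 130]]
equalized = equalize_histogram(gray)
```

After equalization the four levels are spread evenly across the full range, which reduces the effect of overall brightness differences between captures.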
Tips on Image Processing and Coding with OpenCV
• Use the OpenCV documentation only as a guide (it is inconsistent with the code)
• Read cv.h before writing any code
• OpenCV matrix functions work on images: e.g., cvSub(…)
• Beware camera distortion: cvUnDistort(…) may help
• Beware illumination changes:
  ▪ disable auto gain control (AGC) in your camera if you are doing background subtraction
  ▪ histogram equalization (often good for object recognition)
• Image processing algorithms may require parameter tuning: collect data and tweak until you get good results
Reference Reading
• Digital Image Processing, Gonzalez & Woods, Addison-Wesley 2002
• Computer Vision, Shapiro & Stockman, Prentice-Hall 2001
• Computer Vision: A Modern Approach, Forsyth & Ponce, Prentice-Hall 2002
• Introductory Techniques for 3D Computer Vision, Trucco & Verri, Prentice-Hall 1998
The End
Acknowledgments
• Significant portions of this lecture were derived from the Intel OpenCV tutorial by Gary Bradski et al. at CVPR-2001
• Thanks to my former colleagues at Compaq/HP CRL for additional slides and suggestions: Tat-Jen Cham, Mike Jones, Vladimir Pavlovic, Jim Rehg, Gita Sukthankar, Nuno Vasconcelos, Paul Viola
Contact [email protected] if you need more information