Detecting and Segmenting Objects for Mobile Manipulation

OpenCV Tutorial
Omri Perez
Adapted from:
Gary Bradski
Senior Scientist, Willow Garage
Consulting Professor: Stanford CS Dept.
http://opencv.willowgarage.com
www.willowgarage.com
• Vision is Hard
• Camera Model, Lens, Problems and Corrections
• OpenCV
• OpenCV Tour
CS324
Vision is Hard
• What is it?
– Turning sensor readings into perception.
• Why is it hard?
– It’s just numbers.
Maybe try gradients to find edges?
Use Edges? … It’s not so simple
• Depth discontinuity
• Surface orientation discontinuity
• Reflectance discontinuity (i.e., change in surface material properties)
• Illumination discontinuity (e.g., shadow)
Slide credit: Christopher Rasmussen
Must deal with Lighting Changes …
Lighting is also a Strong Cue
Gary Bradski (c) 2008
The Brain Assumes 3D Geometry
Perception is ambiguous … depending on your point of view!
Geometrical aberrations
• spherical distortion
• astigmatism
• tangential distortion
• coma
Non-geometrical aberrations
• chromatic
• vignetting
Aberrations are reduced by combining lenses.
These are typically what are corrected for in camera calibration.
Slide credit: Marc Pollefeys
Distortion Correction so that Lens can Approximate a Pinhole Camera
• Distortions are corrected mathematically
– We use a calibration pattern
• We find where the points ended up
• We know where the points should be
• OpenCV 2.2 Function:
double calibrateCamera(
const vector<vector<Point3f> >& objectPoints,
const vector<vector<Point2f> >& imagePoints,
Size imageSize,
Mat& cameraMatrix,
Mat& distCoeffs,
vector<Mat>& rvecs,
vector<Mat>& tvecs,
int flags=0);
• Vision is Hard
• Camera Model, Lens, Problems and Corrections
• OpenCV
• OpenCV Tour
OpenCV Overview: opencv.willowgarage.com
• Robot support
• > 2000 algorithms
• General Image Processing Functions
• Image Pyramids
• Geometric descriptors
• Camera calibration, Stereo, 3D
• Segmentation
• Features
• Utilities and Data Structures
• Transforms
• Tracking
• Machine Learning: Detection, Recognition
• Fitting
• Matrix Math
OpenCV Tends Towards Real Time
http://opencv.willowgarage.com
Where is OpenCV Used?
• Google Maps, Google street view, Google Earth, Books
• Academic and Industry Research
• Safety monitoring (Dam sites, mines, swimming pools)
• Security systems
• Image retrieval
• Video search
• Structure from motion in movies
• Machine vision factory production inspection systems
• Robotics
• Well over 2M downloads
Screen shots by Gary Bradski, 2005
OpenCV Modules
• Calib3d
– Calibration, stereo, homography, rectify, projection, solvePnP
• Contrib
– Octree, self-similar feature, sparse L-M, bundle adj, chamfer match
• Core
– Data structures, access, matrix ops, basic image operations
• features2D
– Feature detectors, descriptors and matchers in one architecture
• Flann (Fast library for approximate nearest neighbors)
• Gpu – CUDA speedups
• Highgui
– Gui to read, write, draw, print and interact with images
• Imgproc – image processing functions
• Ml – statistical machine learning, boosting, clustering
• Objdetect – PASCAL VOC latent SVM and data reading
• Traincascade – boosted rejection cascade
Software Engineering
• Works on:
– Linux, Windows, Mac OS (+ Android since OpenCV 2.2)
• Languages:
– C++, Python, C
• Online documentation:
– Online reference manuals: C++, C and Python.
• Vision is Hard
• Camera Model, Lens, Problems and Corrections
• OpenCV
• OpenCV Tour
Gradients: Scharr instead of Sobel
• Sobel has been the traditional 3x3 gradient finder.
• Use the 3x3 Scharr operator instead since it is just as fast but has more accurate response on diagonals.
void Scharr(const Mat& src, Mat& dst, int ddepth, int xorder, int yorder,
            double scale=1, double delta=0, int borderType=BORDER_DEFAULT)
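To make the operator concrete, here is a minimal standard-library sketch of what a 3x3 Scharr response computes at one interior pixel. It is not OpenCV's implementation: the `Image` alias and the `scharrAt` helper are hypothetical names introduced for illustration, and border handling is left to the caller. The kernel weights (3, 10, 3) are the usual Scharr coefficients.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical image type for this sketch: rows of integer intensities.
using Image = std::vector<std::vector<int>>;

// 3x3 Scharr kernel for the x-derivative; the y-derivative is its transpose.
static const int kScharrX[3][3] = {{-3, 0, 3}, {-10, 0, 10}, {-3, 0, 3}};

// Correlate the 3x3 neighborhood centered at (row, col) with the kernel.
// dx = true gives the x-derivative, dx = false the y-derivative.
// The pixel must be at least one pixel away from every border.
int scharrAt(const Image& img, int row, int col, bool dx) {
    assert(row >= 1 && static_cast<std::size_t>(row) + 1 < img.size());
    assert(col >= 1 && static_cast<std::size_t>(col) + 1 < img[row].size());
    int sum = 0;
    for (int i = -1; i <= 1; ++i) {
        for (int j = -1; j <= 1; ++j) {
            int k = dx ? kScharrX[i + 1][j + 1] : kScharrX[j + 1][i + 1];
            sum += k * img[row + i][col + j];
        }
    }
    return sum;
}
```

On a vertical step edge (columns of 0 followed by columns of 1) the x-derivative at the edge is 3 + 10 + 3 = 16, while the y-derivative is 0.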
Canny Edge Detector
Canny()
OpenCV team, Gary Bradski
Hough Transform
HoughCircles(), HoughLines(), HoughLinesP() (probabilistic Hough)
Scale Space
void cvPyrDown(
IplImage* src,
IplImage* dst,
IplFilter filter = IPL_GAUSSIAN_5x5);
void cvPyrUp(
IplImage* src,
IplImage* dst,
IplFilter filter = IPL_GAUSSIAN_5x5);
Space Variant vision: Log-Polar Transform
cvLogPolar(src,dst,center,size, CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS)
Delaunay Triangulation, Voronoi Tessellation
CvSubdiv2D* cvCreateSubdivDelaunay2D(CvRect rect, CvMemStorage* storage)
Contours
void findContours()
Histogram Equalization
void equalizeHist(const Mat& src, Mat& dst)
Image textures
• Inpainting:
• Removes damage to images, in this case, it removes the text.
void inpaint(const Mat& src, const Mat& inpaintMask, Mat& dst, double
inpaintRadius, int flags);
Morphological Operations Examples
• Morphology: applying min-max filters and their combinations
void morphologyEx()
createMorphologyFilter()
erode()
dilate()
Image $I$, structuring element $B$
Erosion: $I \ominus B$
Dilation: $I \oplus B$
Opening: $I \circ B = (I \ominus B) \oplus B$
Closing: $I \bullet B = (I \oplus B) \ominus B$
Grad(I) $= (I \oplus B) - (I \ominus B)$
TopHat(I) $= I - (I \circ B)$
BlackHat(I) $= (I \bullet B) - I$
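As a rough illustration of morphology as min-max filtering, the sketch below applies a 3-wide flat structuring element to a 1-D signal. This is a simplified stand-in, not the OpenCV routines: the names `Signal`, `erode1d`, `dilate1d`, `open1d` and `close1d` are hypothetical, and clipping the window at the borders is just one possible border policy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Signal = std::vector<int>;  // hypothetical 1-D "image" for this sketch

// 3-wide min filter (erosion) or max filter (dilation).
// The window is clipped at the two ends of the signal.
static Signal minMaxFilter(const Signal& in, bool takeMin) {
    assert(!in.empty());
    Signal out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::size_t lo = (i == 0) ? 0 : i - 1;
        std::size_t hi = std::min(i + 1, in.size() - 1);
        int v = in[lo];
        for (std::size_t j = lo + 1; j <= hi; ++j)
            v = takeMin ? std::min(v, in[j]) : std::max(v, in[j]);
        out[i] = v;
    }
    return out;
}

Signal erode1d(const Signal& s)  { return minMaxFilter(s, true); }
Signal dilate1d(const Signal& s) { return minMaxFilter(s, false); }
// Opening: erosion followed by dilation. Closing: dilation followed by erosion.
Signal open1d(const Signal& s)   { return dilate1d(erode1d(s)); }
Signal close1d(const Signal& s)  { return erode1d(dilate1d(s)); }
```

Opening the signal {0,0,5,0,0,5,5,5,0} removes the isolated one-sample spike but keeps the three-sample plateau, which is the typical use of opening for noise removal.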
Distance Transform
• Distance field from edges of objects
void distanceTransform(const Mat& src, Mat& dst, int distanceType, int maskSize)
Flood Filling
int floodFill(Mat& image, Point seed, Scalar newVal, Rect* rect=0,
              Scalar loDiff=Scalar(), Scalar upDiff=Scalar(), int flags=4)
Thresholds
void adaptiveThreshold()
double threshold()
Segmentation
• Pyramid, mean-shift, graph-cut
• Here: Watershed
void watershed(const Mat& image, Mat& markers)
Background Subtraction
BackgroundSubtractorMOG2(), see samples/cpp/bgfg_segm.cpp
Image Segmentation & Minimum Cut
[Figure labels: Image, Pixels, Pixel Neighborhood, Similarity Measure (w), Minimum Cut]
* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003
GrabCut
void grabCut(const Mat& image, Mat& mask, Rect rect, Mat& bgdModel, Mat& fgdModel, int iterCount, int mode)
• Graph Cut based segmentation
Motion Templates (My work with James Davis)
• Object silhouette
• Motion history images (MHI)
• Motion history gradients (MHG)
• Motion segmentation algorithm
Segmentation, Motion Tracking
void updateMotionHistory();
void calcMotionGradient();
double calcGlobalOrientation();
• Motion Segmentation
• Pose Recognition
• Gesture Recognition
James Davis, Gary Bradski
Tracking with CAMSHIFT
• Control game with head
RotatedRect CamShift(const Mat& probImage, Rect& window, TermCriteria criteria)
3D tracking
• Camera Calibration
• View Morphing
• POSIT
void POSIT()
A more general technique for solving pose is solving the Perspective-N-Point problem:
void solvePnP(…)
Mean-Shift for Tracking
CamShift();
MeanShift();
Optical Flow
// opencv/samples/c/lkdemo.c
int main(…){
  …
  CvCapture* capture = <…> ?
      cvCaptureFromCAM(camera_id) :
      cvCaptureFromFile(path);
  if( !capture ) return -1;
  for(;;) {
    IplImage* frame = cvQueryFrame(capture);
    if( !frame ) break;
    // … copy and process image
    cvCalcOpticalFlowPyrLK( … );
    cvShowImage( "LkDemo", result );
    c = cvWaitKey(30); // run at ~20-30fps speed
    if( c >= 0 ) {
      // process key
    }
  }
  cvReleaseCapture(&capture);
}
calcOpticalFlowPyrLK()
Also see dense optical flow: calcOpticalFlowFarneback()
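$$I(x + dx,\; y + dy,\; t + dt) = I(x, y, t)$$
$$-\frac{\partial I}{\partial t} = \frac{\partial I}{\partial x}\cdot\frac{dx}{dt} + \frac{\partial I}{\partial y}\cdot\frac{dy}{dt}$$
$$G\,\partial X = b, \qquad \partial X = (\partial x, \partial y)^T, \qquad G = \sum \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}, \qquad b = -\sum I_t \begin{bmatrix} I_x \\ I_y \end{bmatrix}$$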
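The 2x2 system on this slide can be solved in closed form. The sketch below assumes the sign convention in which the brightness-constancy constraint reads I_x u + I_y v = -I_t, so that b = -Σ I_t (I_x, I_y); it accumulates G and b over a window and solves for the flow (u, v). It is a bare illustration of the math, not the pyramidal routine in OpenCV; `Flow` and `solveLucasKanade` are hypothetical names, and the singularity threshold is an arbitrary choice.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Flow { double u, v; bool ok; };  // ok == false when G is singular

// Ix, Iy, It hold the spatial and temporal derivatives at each pixel of a window.
Flow solveLucasKanade(const std::vector<double>& Ix,
                      const std::vector<double>& Iy,
                      const std::vector<double>& It) {
    assert(Ix.size() == Iy.size() && Ix.size() == It.size());
    double gxx = 0, gxy = 0, gyy = 0, bx = 0, by = 0;
    for (std::size_t i = 0; i < Ix.size(); ++i) {
        gxx += Ix[i] * Ix[i];
        gxy += Ix[i] * Iy[i];
        gyy += Iy[i] * Iy[i];
        bx -= It[i] * Ix[i];   // b = -sum(It * [Ix, Iy])
        by -= It[i] * Iy[i];
    }
    double det = gxx * gyy - gxy * gxy;
    // A near-zero determinant means the window has gradients in only one
    // direction (the aperture problem), so the flow is not determined.
    if (std::fabs(det) < 1e-12) return {0.0, 0.0, false};
    return {(gyy * bx - gxy * by) / det, (gxx * by - gxy * bx) / det, true};
}
```

A window whose gradients all point the same way yields `ok == false`, which is why trackers prefer corner-like features where G is well conditioned.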
Features 2D
Read two input images:
Mat img1 = imread(argv[1], CV_LOAD_IMAGE_GRAYSCALE);
Mat img2 = imread(argv[2], CV_LOAD_IMAGE_GRAYSCALE);
Detect keypoints in both images:
// detecting keypoints
FastFeatureDetector detector(15);
vector<KeyPoint> keypoints1, keypoints2;
detector.detect(img1, keypoints1);
detector.detect(img2, keypoints2);
Compute descriptors for each of the keypoints:
// computing descriptors
SurfDescriptorExtractor extractor;
Mat descriptors1, descriptors2;
extractor.compute(img1, keypoints1, descriptors1);
extractor.compute(img2, keypoints2, descriptors2);
Now, find the closest matches between descriptors from the first image to the second:
// matching descriptors
BruteForceMatcher<L2<float> > matcher;
vector<DMatch> matches;
matcher.match(descriptors1, descriptors2, matches);
Features 2D continued …
Visualize the results:
namedWindow("matches", 1);
Mat img_matches;
drawMatches(img1, keypoints1, img2, keypoints2,
matches, img_matches);
imshow("matches", img_matches);
waitKey(0);
Find the homography transformation between two sets of points:
vector<Point2f> points1, points2;
// fill the arrays with the points
....
Mat H = findHomography(Mat(points1), Mat(points2), CV_RANSAC, ransacReprojThreshold);
Create a set of inlier matches and draw them.
Use perspectiveTransform function to map points with homography:
Mat points1Projected;
perspectiveTransform(Mat(points1), points1Projected, H);
Use drawMatches() again for drawing inliers.
Features2d contents
Detection: Detectors available
• SIFT
• SURF
• FAST
• STAR
• MSER
• GFTT (Good Features To Track)
Description:
Descriptors available
• SIFT
• SURF
• One way
• Calonder (under construction)
• FERNS
Kalman Filter, Particle Filter for Tracking
Kalman: cv::KalmanFilter class
Condensation or Particle Filter: ConDensation
Projections
Find:
Mat getAffineTransform()
Mat getPerspectiveTransform()
Warp:
void warpAffine()
void warpPerspective()
Homography
• Maps one plane to another
– In our case: A plane in the world to the camera plane
– Great notes on this: Robert Collins CSE486
• http://www.cse.psu.edu/~rcollins/CSE486/lecture16.pdf
– Derivation details: Learning OpenCV 384-387
Perspective Matrix Equation
(camera coords Pt in world to pt on image)
$$\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}, \qquad p = M_{int} P_C$$
$$x = f\,\frac{X}{Z}, \qquad y = f\,\frac{Y}{Z}$$
Gary Bradski and Adrian Kaehler: Learning OpenCV
Gary Bradski, CS223A, Intro to Robotics
Homography
• We often use the chessboard detector to find 4 non-collinear points
– (X, Y) × 4 = 8 constraints
– To solve for the 8 homography parameters.
• Code: Once again, OpenCV makes this easy
– findHomography(…) or:
– getPerspectiveTransform(…)
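To show where the "8 constraints, 8 parameters" count comes from, here is a minimal sketch that builds the 8x8 linear system from four point correspondences and solves it by Gauss-Jordan elimination, fixing the last homography entry to 1. It illustrates the idea only and is not how OpenCV implements findHomography or getPerspectiveTransform; `Pt`, `Homography`, `homographyFrom4` and `applyHomography` are hypothetical names, and the pivot threshold is an arbitrary choice.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <utility>

struct Pt { double x, y; };
using Homography = std::array<double, 9>;  // row-major 3x3 with H[8] == 1

// Each correspondence (x, y) -> (u, v) contributes two linear equations:
//   h0*x + h1*y + h2 - u*(h6*x + h7*y) = u
//   h3*x + h4*y + h5 - v*(h6*x + h7*y) = v
// Returns false when the system is singular (e.g. collinear points).
bool homographyFrom4(const std::array<Pt, 4>& src,
                     const std::array<Pt, 4>& dst, Homography& H) {
    double A[8][9] = {};
    for (int i = 0; i < 4; ++i) {
        double x = src[i].x, y = src[i].y, u = dst[i].x, v = dst[i].y;
        double r0[9] = {x, y, 1, 0, 0, 0, -u * x, -u * y, u};
        double r1[9] = {0, 0, 0, x, y, 1, -v * x, -v * y, v};
        for (int j = 0; j < 9; ++j) { A[2 * i][j] = r0[j]; A[2 * i + 1][j] = r1[j]; }
    }
    // Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for (int c = 0; c < 8; ++c) {
        int p = c;
        for (int r = c + 1; r < 8; ++r)
            if (std::fabs(A[r][c]) > std::fabs(A[p][c])) p = r;
        if (std::fabs(A[p][c]) < 1e-12) return false;
        for (int j = 0; j < 9; ++j) std::swap(A[c][j], A[p][j]);
        for (int r = 0; r < 8; ++r) {
            if (r == c) continue;
            double f = A[r][c] / A[c][c];
            for (int j = c; j < 9; ++j) A[r][j] -= f * A[c][j];
        }
    }
    for (int i = 0; i < 8; ++i) H[i] = A[i][8] / A[i][i];
    H[8] = 1.0;
    return true;
}

// Map a point through H, including the perspective divide.
Pt applyHomography(const Homography& H, Pt p) {
    double w = H[6] * p.x + H[7] * p.y + H[8];
    assert(w != 0.0);
    return {(H[0] * p.x + H[1] * p.y + H[2]) / w,
            (H[3] * p.x + H[4] * p.y + H[5]) / w};
}
```

With more than four correspondences or noisy points, a robust least-squares approach (such as the RANSAC option shown earlier with findHomography) is the more usual choice.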
Single Camera Calibration
See samples/cpp/calibration.cpp
Now, camera calibration can be done by holding a checkerboard in front of the camera for a few seconds.
And after that you’ll get:
• 3D view of checkerboard
• Un-distorted image
Stereo … Depth from Triangulation
• Involved topic, here we will just skim the basic geometry.
• Imagine two perfectly aligned image planes:
Depth “Z” and disparity “d” are inversely related:
Stereo
• In aligned stereo, depth is from similar triangles:
$$\frac{T - (x^l - x^r)}{Z - f} = \frac{T}{Z} \quad\Rightarrow\quad Z = \frac{fT}{x^l - x^r}$$
• Problem: Cameras are almost impossible to align
• Solution: Mathematically align them:
All: Gary Bradski and Adrian Kaehler: Learning OpenCV
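The inverse relation between depth and disparity can be checked with a few lines of code. This sketch evaluates Z = f T / (x^l - x^r) for an ideal, perfectly aligned pair, taking f as the focal length and T as the baseline between the two camera centers; `depthFromDisparity` is a hypothetical helper name, and units are whatever the caller uses consistently (for instance f in pixels and T in meters gives Z in meters).

```cpp
#include <cassert>

// Depth of a point seen at horizontal image coordinates xl (left camera)
// and xr (right camera) in a rectified, aligned stereo pair.
double depthFromDisparity(double f, double T, double xl, double xr) {
    double d = xl - xr;  // disparity
    assert(d > 0.0);     // zero disparity corresponds to a point at infinity
    return f * T / d;
}
```

Halving the disparity doubles the estimated depth, so depth resolution degrades quickly for distant objects.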
Stereo Rectification
• Algorithm steps are shown at right:
• Goal:
– Each row of the image contains the same world points
– “Epipolar constraint”
Result: Epipolar alignment of features:
samples/c
In ...\opencv_incomp\samples\c
bgfg_codebook.cpp - Use of an image value codebook for background detection for collecting objects
bgfg_segm.cpp - Use of a background learning engine
blobtrack.cpp - Engine for blob tracking in images
calibration.cpp - Camera Calibration
camshiftdemo.c - Use of meanshift in simple color tracking
contours.c - Demonstrates how to compute and use object contours
convert_cascade.c - Change the window size in a recognition cascade
convexhull.c - Find the convex hull of an object
delaunay.c - Triangulate a 2D point cloud
demhist.c - Show how to use histograms for recognition
dft.c - Discrete Fourier transform
distrans.c - Distance map from edges in an image
drawing.c - Various drawing functions
edge.c - Edge detection
facedetect.c - Face detection by classifier cascade
ffilldemo.c - Flood filling demo
find_obj.cpp - Demo use of SURF features
fitellipse.c - Robust ellipse fitting
houghlines.c - Line detection
image.cpp - Shows use of new image class, CvImage();
inpaint.cpp - Texture infill to repair imagery
kalman.c - Kalman filter for tracking
kmeans.c - K-Means
laplace.c - Convolve image with Laplacian.
letter_recog.cpp - Example of using machine learning: Boosting, Backpropagation (MLP) and Random forests
lkdemo.c - Lucas-Kanade optical flow
minarea.c - For a cloud of points in 2D, find min bounding box and circle. Shows use of CvSeq
morphology.c - Demonstrates Erode, Dilate, Open, Close
motempl.c - Demonstrates motion templates (orthogonal optical flow given silhouettes)
mushroom.cpp - Demonstrates use of decision trees (CART) for recognition
pyramid_segmentation.c - Color segmentation in pyramid
squares.c - Uses contour processing to find squares in an image
stereo_calib.cpp - Stereo calibration, recognition and disparity map computation
watershed.cpp - Watershed transform demo.
samples/cpp Code of Possible use for Projects
• Brief_match_test
– Use of fast det., brief descrp. ORB will replace. See video_homography.cpp
• Calibration (single camera)
• Chamfer (2D edge matching)
• Connected_components
– Using contours to clean up regions in images.
• Contours2 (finding and drawing)
• Convexhull (finding in 2D)
• Cout_mat – (print out Mat)
• Demhist using calcHist()
– histograms and histogram normalization
• Descriptor_extractor_matcher
– Use of features 2D detector descriptor
– Also see matcher_simple.cpp
• Distrans
– Use of the distanceTransform on edge images and Voronoi tessel.
• Edge (Canny edge detection)
• Ffilldemo (flood fill methods)
• Filestorage (I/O of data structs)
• Fitellipse (find contours, fit ellipse)
• Grabcut (energy based segmentation)
• Imagelist_creator (yaml or xml lists)
– Read using: starter_imagelist.cpp
• Kalman (Using the kalman filter)
• Kinect_maps (using kinect in OpenCV)
• Kmeans (using kmeans clustering)
• Laplace (finding points/edges)
• Letter_recog (machine learning)
– Use of Random trees, boosting, MLP
• Lkdemo (Lucas-Kanade optical flow)
• Morphology2 (erosion, dilation etc)
• Multicascadeclassifier (rejection cascade)
• Peopledetect (use of HOG)
• Select3dobj (calc R and t from calib)
• Stereo_* (stereo calib. and matching)
• Watershed (segmentation algorithm)
ML for Recognition
Machine Learning Library (MLL)
CLASSIFICATION / REGRESSION
(new) Fast Approximate NN (FLANN)
(new) Extremely Random Trees
(coming) LSH
CART
Naïve Bayes
MLP (Back propagation)
Statistical Boosting, 4 flavors
Random Forests
SVM
Face Detector
(Histogram matching)
(Correlation)
CLUSTERING
K-Means
EM
(Mahalanobis distance)
TUNING/VALIDATION
Cross validation
Bootstrapping
Variable importance
Sampling methods
K-Means, Mahalanobis
K-Means:
• Choose K data points as cluster centers
• While cluster centers change:
  • Assign each data point to the closest center
  • If a cluster has no points, choose a random point from points far away from other cluster centers
  • Move the centers to the mean position of points in their cluster
double kmeans() double Mahalanobis()
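The loop above can be sketched for 1-D data in a few lines. This is an illustrative simplification, not the OpenCV kmeans() routine: the function name `kmeans1d` is hypothetical, the initial centers are simply the first K points, and an empty cluster is re-seeded deterministically with the point farthest from its assigned center instead of a random far-away point. An iteration cap is added as a safety measure.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the final cluster centers for 1-D data.
std::vector<double> kmeans1d(const std::vector<double>& pts, std::size_t k) {
    assert(k >= 1 && pts.size() >= k);
    // Choose K data points as initial centers (here: the first K points).
    std::vector<double> centers(pts.begin(), pts.begin() + k);
    bool changed = true;
    for (int iter = 0; iter < 100 && changed; ++iter) {
        changed = false;
        std::vector<double> sum(k, 0.0);
        std::vector<std::size_t> count(k, 0);
        double worstDist = -1.0;
        std::size_t worstIdx = 0;
        // Assign each point to its closest center.
        for (std::size_t i = 0; i < pts.size(); ++i) {
            std::size_t best = 0;
            for (std::size_t c = 1; c < k; ++c)
                if (std::fabs(pts[i] - centers[c]) < std::fabs(pts[i] - centers[best]))
                    best = c;
            sum[best] += pts[i];
            ++count[best];
            double d = std::fabs(pts[i] - centers[best]);
            if (d > worstDist) { worstDist = d; worstIdx = i; }
        }
        // Move each center to the mean of its points; re-seed empty clusters.
        for (std::size_t c = 0; c < k; ++c) {
            double updated = count[c] ? sum[c] / count[c] : pts[worstIdx];
            if (updated != centers[c]) { centers[c] = updated; changed = true; }
        }
    }
    return centers;
}
```

For the points {1, 9, 2, 10, 3, 11} with K = 2, the centers settle at 2 and 10 after one update. Results in general depend on the initial centers, which is why practical implementations usually run several restarts.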
Patch Matching
void matchTemplate()
Gesture Recognition
double compareHist()
Gestures: Up, R, L, Stop, OK
Gesture via: Gradient histogram* based gesture recognition with Tracking.
Meanshift Algorithm used to track, histogram intersection with gradient used to recognize.
*Bill Freeman
Boosting: Face Detection with Viola-Jones Rejection Cascade
In samples/cpp, see:
Multicascadeclassifier.cpp
Machine learning
• Good features often beat good algorithms
• Choose an operating point that trades off accuracy vs. cost
[Figure labels: TP, FN, FP, TN; axes run to 100%]
Some project ideas: (feel free to steal, modify or ignore)
1. Identify faces in (cellphone) pictures using Facebook as database.
2. Use the (cellphone) camera to detect dangerous road events and/or detect when someone is awake or sleeping (even with sunglasses on?), also in low light conditions.
3. Use webcam/cellphone to take pictures or videos of a room and then generate the floor plan.
4. Photograph or video a Jenga tower, and advise the player which is the safest block to remove.
5. Make a multiplayer game (if possible with more than one computer/camera) based on CV.
6. Make an intuitive two-handed UI for the OS (extra points for adding the use of facial gestures).
7. Do something with Kinect (e.g. a golf game).
8. For engineers: make a paintball turret (e.g. http://www.paintballsentry.com/Videos.htm).
9. Make a security system with multiple cameras that records high quality portrait images and low quality video and alerts to the presence of suspicious people in real time (e.g. covered faces).
10. Use the camera to cheat/gain an advantage in real life interactions (sports, gambling).
11. Make a system (on the cellphone) that identifies/classifies photographed objects (for example mushrooms).
Questions?
Useful OpenCV Links
OpenCV Wiki:
http://opencv.willowgarage.com/wiki
User Group (44700 members 4/2011):
http://tech.groups.yahoo.com/group/OpenCV/join
OpenCV Code Repository:
svn co https://code.ros.org/svn/opencv/trunk/opencv
New Book on OpenCV:
http://oreilly.com/catalog/9780596516130/
Or, direct from Amazon:
http://www.amazon.com/Learning-OpenCV-Computer-Vision-Library/dp/0596516134
Code examples from the book:
http://examples.oreilly.com/9780596516130/
Documentation
http://opencv.willowgarage.com/documentation/index.html
