HDICAD Hand Drawn Input for CAD Systems

Introduction to IPL and OpenCV libraries
Bogdan Raducanu
Centre de Visió per Computador
Cover Story
OpenCV played a key role in the vision system of "Stanley", the autonomous vehicle that won the 2005 DARPA Grand Challenge
What is IPL?
- The IPL (Image Processing Library) is a collection of functions
implementing several image processing algorithms. It was developed by
Intel.
- It is optimized for MMX and for different processor types (there is a DLL for
each type of Intel processor).
- Images are stored in a specific data structure. To work with an image,
we need to know the information contained in the structure header
(which holds the image-specific characteristics).
Image structure
[Figure: an IplImage * points to an IplImage header (width, height, bits per pixel, channel sequence, …, pointer to image data); the header in turn points to the pixel data.]
The “IplImage” structure includes a header containing image information and attributes:
- nChannels: number of channels
(1 for grayscale, 3 for RGB, 4 for CMYK, ...)
- depth: number of bits per channel value and data type
- IPL_DEPTH_1U (1-bit)
- IPL_DEPTH_8U (8-bit unsigned)
- IPL_DEPTH_8S (8-bit signed)
- IPL_DEPTH_16U (16-bit unsigned)
- IPL_DEPTH_16S (16-bit signed)
- IPL_DEPTH_32S (32-bit signed)
- IPL_DEPTH_32F (32-bit float)
- colorModel: “GRAY”, “RGB”, “CMYK”, etc.
- channelSeq: “GRAY”, “BGR”, “BGRA”, “RGB”, “RGBA”, “HSV”, “YUV”, etc.
- dataOrder: pixel-interleaved (RGBRGBRGB...) or planar (RRR...GGG...BBB...)
- origin: top-left or bottom-left (IPL_ORIGIN_TL or IPL_ORIGIN_BL)
- scanline alignment: DWORD or QWORD
- width (in pixels)
- height (in pixels)
- ROI: (could be NULL)
- maskROI: (could be NULL)
- imageSize: image size (in bytes)
- imageData: pointer to pixel data
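The way width, nChannels, depth, and the scanline alignment determine imageSize can be sketched in plain C. This is only an illustration: the struct and function names (SketchHeader, sketch_row_stride, sketch_image_size) are hypothetical and not part of IPL, and the sketch assumes that every scanline is padded up to the next alignment boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a few IplImage header fields. */
typedef struct {
    int n_channels;   /* e.g. 3 for RGB */
    int depth_bits;   /* bits per channel value, e.g. 8 */
    int width;        /* in pixels */
    int height;       /* in pixels */
    int align_bytes;  /* 4 for DWORD, 8 for QWORD */
} SketchHeader;

/* Bytes per scanline, rounded up to the alignment boundary (assumption). */
size_t sketch_row_stride(const SketchHeader *h)
{
    assert(h != NULL && h->align_bytes > 0);
    size_t raw = ((size_t)h->width * (size_t)h->n_channels
                  * (size_t)h->depth_bits + 7) / 8;
    size_t a = (size_t)h->align_bytes;
    return (raw + a - 1) / a * a;
}

/* Total size of the pixel buffer in bytes, padding included. */
size_t sketch_image_size(const SketchHeader *h)
{
    return sketch_row_stride(h) * (size_t)h->height;
}
```

Under this assumption, a 150x100 image with 3 channels of 8 bits and QWORD alignment has 450 bytes of pixel data per row, padded to 456, for a total of 45600 bytes.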
Function categories in IPL:
- create/destroy an image and access its content
- arithmetical/logical operations
- filtering
- morphological operations
- color space conversion
- histogram
- linear/geometrical transformations
- image statistics
Create/destroy an image and access its content:
- iplCreateImageHeader
- iplAllocateImage
- iplCreateROI
- iplSetROI
- iplCopy
- iplClone
- iplDeallocateImage
- iplPutPixel
- iplGetPixel
Example (create/destroy an image):
#include "ipl.h"
…
IplImage *img = iplCreateImageHeader(
    3, 0, IPL_DEPTH_8U, "RGB", "BGR", IPL_DATA_ORDER_PIXEL,
    IPL_ORIGIN_TL, IPL_ALIGN_QWORD, 150, 100, NULL, NULL, NULL, NULL );
iplAllocateImage(img, 0, 0);
/////// Use the image ////////
iplDeallocate(img, IPL_IMAGE_ALL);
• We created an image of 150x100 pixels with 3 channels. The color model is
RGB and the data type is 8 bits per channel, unsigned. The channel order is
BGRBGRBGR…, starting from the top row. Each scanline is aligned in memory to a
QWORD (64-bit) boundary. No ROI is defined.
• We allocated memory for the data, but without initializing it.
• With the IPL_IMAGE_ALL parameter, we freed the header of the structure, the
data and the existing ROIs (if any).
- The content of an IplImage can be accessed in several ways:
- using the functions iplGetPixel and iplPutPixel:
- Disadvantage: slow access
- going directly to the memory address of each pixel:
- Disadvantages:
• this operation is more complex because we have to compute the
memory address beforehand
• we must be careful with the data types
- Advantage:
• it is faster when accessing a large chunk of data
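- Example: Let’s assume we have an RGB image with 8 bits per channel, unsigned:
#include "ipl.h"
…
IplImage *img = iplCreateImageHeader(
    3, 0, IPL_DEPTH_8U, "RGB", "BGR", IPL_DATA_ORDER_PIXEL,
    IPL_ORIGIN_TL, IPL_ALIGN_QWORD, 150, 100, NULL, NULL, NULL,
    NULL );
iplAllocateImage(img, 0, 0);
/* Each row is padded to the QWORD alignment, so consecutive rows are
   img->widthStep bytes apart, which may be more than 150*3 bytes. */
for (int y = 0; y < img->height; y++) {
    unsigned char *B = (unsigned char *) img->imageData + y * img->widthStep;
    unsigned char *G = B + 1;
    unsigned char *R = B + 2;
    for (int x = 0; x < img->width; x++, R += 3, G += 3, B += 3) {
        ///// here we can use/modify the pixel data
    }
}
iplDeallocate(img, IPL_IMAGE_ALL);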
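The address computation behind direct access can be isolated in a small helper. The following is a minimal sketch in plain C, independent of IPL: the names sketch_pixel_ptr and sketch_fill_bgr are hypothetical, the image is assumed to be interleaved with 8 bits per channel, and row_stride stands for the padded size of one scanline in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Address of channel 0 of pixel (x, y) in an interleaved 8-bit image.
   row_stride is the padded size of one scanline in bytes. */
unsigned char *sketch_pixel_ptr(unsigned char *data, size_t row_stride,
                                int n_channels, int x, int y)
{
    assert(data != NULL && x >= 0 && y >= 0 && n_channels > 0);
    return data + (size_t)y * row_stride + (size_t)x * (size_t)n_channels;
}

/* Set every pixel of a BGR image to one color, leaving row padding alone. */
void sketch_fill_bgr(unsigned char *data, size_t row_stride,
                     int width, int height,
                     unsigned char b, unsigned char g, unsigned char r)
{
    for (int y = 0; y < height; y++) {
        unsigned char *p = sketch_pixel_ptr(data, row_stride, 3, 0, y);
        for (int x = 0; x < width; x++, p += 3) {
            p[0] = b;  /* blue comes first in BGR order */
            p[1] = g;
            p[2] = r;
        }
    }
}
```

Stepping row by row through the stride, rather than treating the buffer as one contiguous run of pixels, keeps the access correct when the scanlines carry alignment padding.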
Remarks:
- in order to use the IPL functions, you must include the header file “ipl.h” in your source code
- add “ipl.lib” to the project settings
- the structure names use the “Ipl” prefix (uppercase “I”), whereas the function names use the “ipl” prefix
(lowercase “i”)
- the remaining function categories will be presented in the section dedicated to the OpenCV library
- for more information, consult the IPL manual:
C:\Archivos de Programa\Intel\plsuite\doc\iplman.pdf
OpenCV library
Why OpenCV?
- IPL is a “low-level” library (it provides basic operations)
- OpenCV is a library that contains more complex data structures and “high-level” functions: optical flow, pattern
recognition, 2D-3D real-time tracking, camera calibration, etc.
- it comes with some extensions that allow:
- accessing a camera or working with AVI files
- a graphical user interface (“HighGUI”) that offers a faster and easier way
to interact with and visualize images
OpenCV is in general compatible with the IPL library and is also based on the “IplImage” structure. However, it must be
used taking into account the following restrictions:
- the image statistics functions require that “IplImage” be defined either with a single channel or three channels of
the following data types: IPL_DEPTH_8U, IPL_DEPTH_8S or IPL_DEPTH_32F.
- OpenCV supports only interleaved images
- the following attributes are ignored: colorModel, channelSeq, BorderMode, and BorderConst
- the attributes maskROI and tileInfo must be set to 0.
- the ROIs of the input and output images must be the same.
Remark: the structure names use the “Cv” prefix (uppercase “C”), whereas the function names use the “cv” prefix
(lowercase “c”).
Create/Destroy an image and access its content
- cvCreateImage
- cvCreateImageHeader
- cvReleaseImageHeader
- cvReleaseImage
- cvCreateImageData
- cvReleaseImageData
- cvSetImageROI
- cvCopyImage
- cvCloneImage
- cvGetImageRawData
#include "ipl.h"
#include "cv.h"
#include "cxcore.h"
…
IplImage *img = cvCreateImage(cvSize(150, 100), IPL_DEPTH_8U, 3);
/////// Use the image ////////
cvReleaseImage(&img);
Remark: In some cases (for example, when working with images captured from the
camera), it may be more convenient to initialize the IplImage structure using
the IPL functions.
Arithmetical/Logical operations
- Arithmetical operations:
- unary: cvAddS, cvSubS, ...
- binary: cvAdd, cvMul, cvSub, cvMatMulAdd, cvInvert, ...
- Logical operations
- unary: cvAndS, cvOrS, cvXorS, ...
- binary: cvAnd, cvOr, cvXor, ...
Remark:
Most of the OpenCV functions are defined to support both the IplImage and
CvMat data types. That is possible because of the definition of the CvArr data
type:
typedef void CvArr;
The OpenCV functions inspect the first integer of the structure being
passed in order to distinguish between the two data types. In the case
of IplImage, this integer is equal to the size of the IplImage structure,
whereas in the case of CvMat it is equal to 0x4242xxxx.
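The dispatch on the first integer can be illustrated with a self-contained sketch. The types SketchImage, SketchMat, and SketchArr and the function sketch_kind_of are hypothetical stand-ins, not the OpenCV definitions, and the magic constant is an assumed illustrative value placed in the upper 16 bits of the first field.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAT_MAGIC  0x42420000u  /* assumed signature, upper 16 bits */
#define SKETCH_MAGIC_MASK 0xFFFF0000u

/* Hypothetical stand-ins: only the first integer field matters here. */
typedef struct { int n_size; int width; int height; } SketchImage;
typedef struct { int type;   int rows;  int cols;   } SketchMat;

typedef void SketchArr;  /* mirrors the idea of "typedef void CvArr;" */

typedef enum { SKETCH_UNKNOWN, SKETCH_IS_IMAGE, SKETCH_IS_MAT } SketchKind;

/* Decide which structure an untyped pointer refers to by reading
   the first integer stored in it. */
SketchKind sketch_kind_of(const SketchArr *arr)
{
    assert(arr != NULL);
    int first = *(const int *)arr;
    if (((unsigned int)first & SKETCH_MAGIC_MASK) == SKETCH_MAT_MAGIC)
        return SKETCH_IS_MAT;
    if (first == (int)sizeof(SketchImage))
        return SKETCH_IS_IMAGE;
    return SKETCH_UNKNOWN;
}
```

A function taking a SketchArr * can call sketch_kind_of first and then cast the pointer to the matching structure, which is why one entry point can serve both data types.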
Image filtering
- Based on convolution with fixed kernel:
- cvLaplace, cvSobel, cvSmooth
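Filtering with a fixed kernel can be sketched in plain C without OpenCV. The names sketch_filter3x3 and SKETCH_LAPLACE are hypothetical, the image is assumed to be single-channel 8-bit without row padding, and border pixels are simply set to 0, which is a simplification and not necessarily what cvLaplace does at the borders.

```c
#include <assert.h>
#include <stddef.h>

/* A common 3x3 Laplacian kernel (illustrative choice). */
const int SKETCH_LAPLACE[3][3] = {
    { 0,  1, 0 },
    { 1, -4, 1 },
    { 0,  1, 0 }
};

/* Apply a fixed 3x3 kernel to the interior of a w x h single-channel
   image. The kernel is applied without flipping, which is identical to
   convolution for symmetric kernels such as the one above.
   Border pixels of dst are set to 0. */
void sketch_filter3x3(const unsigned char *src, int *dst, int w, int h,
                      const int k[3][3])
{
    assert(src != NULL && dst != NULL && w >= 3 && h >= 3);
    for (int i = 0; i < w * h; i++)
        dst[i] = 0;
    for (int y = 1; y < h - 1; y++) {
        for (int x = 1; x < w - 1; x++) {
            int sum = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    sum += k[dy + 1][dx + 1] * src[(y + dy) * w + (x + dx)];
            dst[y * w + x] = sum;
        }
    }
}
```

On a uniform region the Laplacian response is 0, while an isolated bright pixel of value 5 yields -20 at its own position, so the output needs a signed type wider than 8 bits.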
Morphological operations
- cvErode, cvDilate
- cvMorphologyEx (advanced operations: opening, closing, Top-Hat, etc.)
- user-defined structuring elements:
- cvCreateStructuringElementEx
- cvReleaseStructuringElement
Example: application of the erosion function
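Grayscale erosion with a 3x3 rectangular structuring element amounts to replacing each pixel by the minimum of its neighbourhood. The sketch below is plain C and does not call OpenCV; sketch_erode3x3 is a hypothetical name, the image is assumed to be single-channel 8-bit without row padding, and neighbours that fall outside the image are ignored, which is an assumption about border handling.

```c
#include <assert.h>
#include <stddef.h>

/* Erode a w x h single-channel image with a 3x3 rectangular element:
   each output pixel is the minimum over its 3x3 neighbourhood.
   Neighbours outside the image are ignored (assumption). */
void sketch_erode3x3(const unsigned char *src, unsigned char *dst,
                     int w, int h)
{
    assert(src != NULL && dst != NULL && src != dst && w > 0 && h > 0);
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            unsigned char m = 255;
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    int yy = y + dy, xx = x + dx;
                    if (yy < 0 || yy >= h || xx < 0 || xx >= w)
                        continue;
                    if (src[yy * w + xx] < m)
                        m = src[yy * w + xx];
                }
            }
            dst[y * w + x] = m;
        }
    }
}
```

Dilation follows the same pattern with the maximum instead of the minimum, and opening and closing are compositions of the two.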
Color space conversion
- cvCvtColor allows the following color-space conversions:
- CV_RGB2GRAY
- CV_RGB2HSV
- CV_RGB2YCrCb
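As an illustration of what a color-to-gray conversion computes, the sketch below applies the widely used luma weights 0.299, 0.587, and 0.114 to an interleaved BGR buffer. It is plain C with a hypothetical function name (sketch_bgr_to_gray); that cvCvtColor uses exactly these weights and this rounding is an assumption here.

```c
#include <assert.h>
#include <stddef.h>

/* Convert n_pixels interleaved BGR pixels (8 bits per channel) to gray
   using Y = 0.299 R + 0.587 G + 0.114 B, in integer arithmetic with
   rounding to the nearest value. */
void sketch_bgr_to_gray(const unsigned char *bgr, unsigned char *gray,
                        size_t n_pixels)
{
    assert(bgr != NULL && gray != NULL);
    for (size_t i = 0; i < n_pixels; i++) {
        unsigned int b = bgr[3 * i];
        unsigned int g = bgr[3 * i + 1];
        unsigned int r = bgr[3 * i + 2];
        gray[i] = (unsigned char)((299u * r + 587u * g + 114u * b + 500u)
                                  / 1000u);
    }
}
```

Because the weights sum to 1, white stays at 255 and black at 0; green contributes most to the result and blue least.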
Histogram
- cvCreateHist, cvReleaseHist
- cvCalcHist, cvCopyHist
- cvCompareHist, cvThreshHist
- cvGetMinMaxHistValue, cvNormalizeHist
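The computation behind histogram calculation and normalization can be sketched for the simplest case, a 256-bin histogram of an 8-bit single-channel image. The function names and SKETCH_BINS are hypothetical, and normalizing so that the bins sum to 1 is one possible convention assumed for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_BINS 256

/* Count how many pixels take each of the 256 possible values. */
void sketch_calc_hist(const unsigned char *px, size_t n,
                      unsigned int hist[SKETCH_BINS])
{
    assert(px != NULL && hist != NULL);
    for (int b = 0; b < SKETCH_BINS; b++)
        hist[b] = 0;
    for (size_t i = 0; i < n; i++)
        hist[px[i]]++;
}

/* Scale the bins so that they sum to 1 (all zeros for an empty image). */
void sketch_normalize_hist(const unsigned int hist[SKETCH_BINS],
                           double out[SKETCH_BINS])
{
    assert(hist != NULL && out != NULL);
    unsigned long total = 0;
    for (int b = 0; b < SKETCH_BINS; b++)
        total += hist[b];
    for (int b = 0; b < SKETCH_BINS; b++)
        out[b] = total ? (double)hist[b] / (double)total : 0.0;
}
```

A normalized histogram can be compared across images of different sizes, which is the usual reason to normalize before comparing.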
Linear/geometrical transformations
- cvFFT (Fast Fourier Transform)
- cvDCT (Discrete Cosine Transform)
- cvResize, cvMirror, cvConvertScale
Feature extraction
- cvCanny (edge detection)
- cvHoughLines (line detection)
- cvFindCornerSubPix (sub-pixel refinement of corner locations)
Example of edge detection
Image statistics
- cvNorm (C- , L1- and L2-norm)
- cvMoments (spatial and central moments)
- cvMinMaxLoc (find the min/max values)
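What these statistics compute can be shown on a flat array of 8-bit values. The sketch is plain C with hypothetical names (SketchStats, sketch_stats); it reports the sum of squares rather than the L2-norm itself, so that no math library is needed, and it returns positions as linear indices instead of (x, y) points.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int c_norm;        /* C-norm: largest absolute value */
    long l1_norm;      /* L1-norm: sum of absolute values */
    long sum_squares;  /* the L2-norm is the square root of this */
    size_t min_index;  /* position of the first minimum */
    size_t max_index;  /* position of the first maximum */
} SketchStats;

/* Compute norms and min/max positions of n unsigned 8-bit values. */
SketchStats sketch_stats(const unsigned char *px, size_t n)
{
    assert(px != NULL && n > 0);
    SketchStats s = { px[0], 0, 0, 0, 0 };
    for (size_t i = 0; i < n; i++) {
        int v = px[i];
        if (v > s.c_norm)
            s.c_norm = v;
        s.l1_norm += v;
        s.sum_squares += (long)v * v;
        if (px[i] < px[s.min_index])
            s.min_index = i;
        if (px[i] > px[s.max_index])
            s.max_index = i;
    }
    return s;
}
```

For the values 3, 4, 0, 1 the C-norm is 4, the L1-norm is 8, and the sum of squares is 26; the minimum sits at index 2 and the maximum at index 1.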
Drawing functions
- cvLine, cvRectangle
- cvCircle, cvEllipse
- cvPolyLine, cvFillPoly
- cvInitFont, cvPutText
Motion analysis
- estimators: Kalman, Condensation
- cvKalmanXX, cvCondensXX
- movement patterns
- cvCalcMotionGradient, cvMotionHistoryUpdate
- optical flow
- cvCalcOpticalFlowPyrLK (implements the Lucas-Kanade method based on pyramidal decomposition)
- tracking
- cvMeanShift, cvCamShift, cvSnakeImage
- background subtraction
3D reconstruction
- camera calibration
- cvCalibrateCamera, cvFindExtrinsicCameraParams, cvUnDistort
- hand detection
- cvFindHandRegion
- pose estimation
- cvPOSIT
- finding pixel correspondence in a pair of stereo images
- cvFindStereoCorrespondence
Example of distortion correction
Object detection (faces)
The object detection algorithm implemented in OpenCV is based on the following papers:
- Paul Viola and Michael J. Jones. Rapid Object Detection using a Boosted Cascade of Simple Features. IEEE CVPR,
2001
and
- Rainer Lienhart and Jochen Maydt. An Extended Set of Haar-like Features for Rapid Object Detection. IEEE ICIP
2002, Vol. 1, pp. 900-903, Sep. 2002.
Idea
A classifier (namely a cascade of boosted classifiers working with Haar-like features)
is trained with a few hundred sample views of a particular object (e.g., a face or a car), called positive examples,
that are scaled to the same size (say, 20x20), and with negative examples: arbitrary images of the same size.
- The trained classifier outputs a "1" if the region is likely to show the object of interest (e.g., a face or a car), and "0"
otherwise
- The classifier is designed so that it can be easily "resized" in order to be able to find the objects of interest at
different sizes, which is more efficient than resizing the image itself.
- The word "cascade" in the classifier name means that the resultant classifier consists of several simpler classifiers
(stages) that are applied subsequently to a region of interest until at some stage the candidate is rejected or all
the stages are passed.
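The early-rejection behaviour of a cascade can be sketched independently of OpenCV. Everything named below (SketchStage, sketch_cascade_detect, the 4-pixel window, and the two toy stage scores) is hypothetical: the stages are simple stand-ins and not boosted classifiers over Haar-like features; only the control flow reflects the idea described above.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_WIN 4  /* hypothetical window size in pixels */

typedef struct {
    int (*score)(const unsigned char *window);
    int threshold;  /* the stage passes when score >= threshold */
} SketchStage;

/* Toy stage score: mean brightness of the window. */
int sketch_mean(const unsigned char *w)
{
    int sum = 0;
    for (int i = 0; i < SKETCH_WIN; i++)
        sum += w[i];
    return sum / SKETCH_WIN;
}

/* Toy stage score: difference between brightest and darkest pixel. */
int sketch_contrast(const unsigned char *w)
{
    int lo = w[0], hi = w[0];
    for (int i = 1; i < SKETCH_WIN; i++) {
        if (w[i] < lo) lo = w[i];
        if (w[i] > hi) hi = w[i];
    }
    return hi - lo;
}

const SketchStage SKETCH_DEMO_CASCADE[2] = {
    { sketch_mean, 100 },
    { sketch_contrast, 50 }
};

/* Returns 1 if the window passes every stage, 0 as soon as one stage
   rejects it. *stages_run reports how many stages were evaluated. */
int sketch_cascade_detect(const SketchStage *stages, int n_stages,
                          const unsigned char *window, int *stages_run)
{
    assert(stages != NULL && window != NULL && stages_run != NULL);
    *stages_run = 0;
    for (int i = 0; i < n_stages; i++) {
        (*stages_run)++;
        if (stages[i].score(window) < stages[i].threshold)
            return 0;  /* rejected: later stages are never evaluated */
    }
    return 1;
}
```

Most candidate windows in a typical image are rejected by the first, cheapest stages, so the later stages are evaluated only for a small fraction of the windows.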
Data structures and functions implemented in OpenCV for face detection
- CvHaarClassifierCascade (structure representing a cascade of classifiers)
- cvLoadHaarClassifierCascade (reads a cascade of classifiers from a file and stores it in the CvHaarClassifierCascade
structure). The cascades are stored as XML files in:
C:\Archivos de programa\Intel\OpenCV\data\haarcascades\...
- cvHaarDetectObjects (detects the objects in the image)
- cvReleaseHaarClassifierCascade (frees the memory occupied by the CvHaarClassifierCascade structure)
HighGUI library
Allows fast and easy interaction with, and visualization of, images
- Reading an image from a file
- cvLoadImage (const char* filename, int iscolor CV_DEFAULT(1));
- Writing an image to a file
- cvSaveImage (const char* filename, const CvArr* image);
It supports several formats: BMP, JPEG, PNG, TIFF, etc.
- Open a window for visualization
- cvNamedWindow (const char* name, int flags);
- Visualize the image
- cvShowImage (const char* name, const CvArr* image);
The structure of OpenCV
The ‘cv.h’ contains:
- general image processing functions: filter, color conversion, morphological operators, structural analysis, motion
analysis, pattern recognition (object detection), camera calibration and 3D reconstruction
The ‘cxcore.h’ contains:
- basic structures, arithmetical/logical operators (copy, transformation), dynamic structures (sets, graphs, trees), drawing
functions, error handling and system functions
The ‘cvaux.h’ contains:
- stereo correspondence, texture descriptors, 2D-3D trackers, background segmentation, morphing, etc.
The ‘highgui.h’ contains:
- user interface
Installation of IPL/OpenCV libraries and environment settings for MSVC++
For the following settings to apply as written, the OpenCV folder and the IPL folder (named ‘plsuite’) must be
installed in C:\Archivos de Programa\Intel\
- in the ‘c’ source file add the header files needed: ‘ipl.h’, ‘cv.h’, ‘cxcore.h’, ‘highgui.h’, ‘cvaux.h’, ‘cvhaartraining.h’
- from the ‘Project’ menu, choose ‘Settings’ and click on the ‘Link’ tab. In the field ‘Object/library modules’ add
the following:
ipl.lib cv.lib cxcore.lib highgui.lib cvaux.lib cvhaartraining.lib (the last one only if you use the face detector functions)
- from the menu ‘Tools’ choose ‘Options’ and then click on the ‘Directories’ tab. In the field ‘Show directories for’ choose
‘Include files’, then edit the following paths:
- C:\ARCHIVOS DE PROGRAMA\INTEL\PLSUITE\INCLUDE
- C:\ARCHIVOS DE PROGRAMA\INTEL\OPENCV\CV\INCLUDE
- C:\ARCHIVOS DE PROGRAMA\INTEL\OPENCV\OTHERLIBS\HIGHGUI
- C:\ARCHIVOS DE PROGRAMA\INTEL\OPENCV\CXCORE\INCLUDE
- C:\ARCHIVOS DE PROGRAMA\INTEL\OPENCV\apps\HaarTraining\include
(if you are using the face detector)
- from the menu ‘Tools’ choose ‘Options’ and then click on the ‘Directories’ tab. In the
field ‘Show directories for’ choose ‘Library files’, then edit the following paths:
- C:\ARCHIVOS DE PROGRAMA\INTEL\PLSUITE\LIB\MSVC
- C:\ARCHIVOS DE PROGRAMA\INTEL\OPENCV\LIB
- the last step consists in adding the paths of the DLLs to the ‘Environment Variables’
- from ‘Control Panel’, choose ‘System’, then ‘Advanced’ and finally click on ‘Environment Variables’. In the dialog
box that appears, in the ‘System Variables’ section, select and edit the ‘PATH’ item
- you have to add the following paths:
C:\Archivos de programa\Intel\OpenCV\bin
C:\Archivos de programa\Intel\plsuite\bin
Examples
Three examples are included (provided in separate files):
- the first gets a live stream from a webcam (make sure you have the DirectX library installed)
- the second detects faces in an image using the detector that comes with the OpenCV library
- the third performs image convolution
More information...
- http://www.cvc.uab.es/~bogdan/CV/cv.html
- IPL/OpenCV online documentation (html, pdf files)
- C:\Archivos de programa\Intel\OpenCV\samples
- http://www.site.uottawa.ca/~laganier/tutorial/opencv+directshow/cvision.htm
- http://groups.yahoo.com/group/OpenCV/
(you have to register)