Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, http://www.cs.cmu.edu/~hws) © Copyright 2002 Michael G.


Image Processing
and Computer Vision
Lecture 4, Multimedia E-Commerce Course
November 5, 2002
Mike Christel
(significant input by Henry Schneiderman,
http://www.cs.cmu.edu/~hws)
© Copyright 2002 Michael G. Christel and Alexander G. Hauptmann
Carnegie Mellon
Outline
• Defining Image Processing and Computer Vision
• Emerging Technology
   • Digitization of documents
   • Digitization of images/photographs
   • Biometrics
   • Management of images on computers
   • Other: manufacturing, military, games, …
• Research in Image Processing and Computer Vision
• Automatically Finding Faces and Cars
• Content-based Image Retrieval
Image Processing vs. Computer Vision
• Image Processing
   • Research area within electrical engineering/signal processing
   • Focus on syntax, low level features
   • [Diagram: image → image]
• Computer Vision
   • Research area within computer science/artificial intelligence
   • Focus on semantics, symbolic or geometric descriptions
   • [Diagram: image → Faces, People, Chairs, etc.]
Optical Character Recognition (OCR)
• First patent in OCR in 19th century
• First applications in post-office and banks
• Documents easier to distribute, search, organize, and
edit in digital form
• Typewriter has been replaced by word processor
• Lots of legacy materials (the world’s libraries of books)
available only in print
• State of the art not perfect, but 99% accurate on cleanly
printed pages
• Examples of errors. . .
Heavy Print
Output from 3 commercial OCR systems
Light Print
Stray Marks
Typography
Processing Overlaid Text in Video
The Video OCR (VOCR) process used by the Informedia research group at Carnegie Mellon:
Video → Text Area Detection → Text Area Preprocessing → Commercial OCR → ASCII Text
Text Area Detection
Video Frames
Filtered Frames
AND-ed Frames
(1/2 s intervals)
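The panel labels above suggest that each frame is filtered and the filtered frames are then AND-ed across half-second intervals, so that only regions stable over time (such as overlaid captions) survive. The sketch below illustrates that idea only. It assumes NumPy, and the horizontal-gradient threshold stands in for the filter, which the slide does not specify; the function names `edge_mask` and `stable_text_mask` are hypothetical.

```python
import numpy as np

def edge_mask(frame, threshold=50):
    """Binary mask of strong horizontal intensity changes (assumed filter)."""
    grad = np.abs(np.diff(frame.astype(np.int32), axis=1))
    # Pad one column on the right so the mask has the same shape as the frame.
    grad = np.pad(grad, ((0, 0), (0, 1)))
    return grad > threshold

def stable_text_mask(frames, threshold=50):
    """AND the filtered frames: keep pixels that are edges in every frame."""
    masks = [edge_mask(f, threshold) for f in frames]
    return np.logical_and.reduce(masks)

# Toy data: static high-contrast stripes (the "caption") over a changing,
# low-contrast background.
rng = np.random.default_rng(0)
frames = []
for _ in range(3):
    frame = rng.integers(0, 40, size=(8, 16)).astype(np.uint8)
    frame[2:4, 4:12:2] = 255
    frames.append(frame)

mask = stable_text_mask(frames)
```

Only the rows containing the static stripes remain set in `mask`; the background never produces a gradient above the threshold.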
VOCR Preprocessing Problems
Augmenting VOCR with Dictionary Look-up
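The slide gives no detail on how the dictionary look-up works. One plausible reading is that each recognized word is replaced by the closest entry in a word list when a close entry exists. The sketch below shows that reading using Python's standard `difflib`; the word list, the cutoff, and the function name `correct_word` are assumptions for illustration, not the Informedia method.

```python
import difflib

def correct_word(word, vocabulary, cutoff=0.7):
    """Return the closest vocabulary word, or the input if nothing is close enough."""
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word

vocabulary = ["weather", "tonight", "report", "forecast"]  # hypothetical word list
ocr_output = ["WEATHFR", "T0NIGHT", "xqzv"]                # typical OCR confusions
corrected = [correct_word(w, vocabulary) for w in ocr_output]
```

Words with no close match are passed through unchanged, which keeps names and other out-of-vocabulary text intact.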
Handwriting Recognition
• Natural progression from OCR work for print
• Works if the writer is constrained, e.g. Palm Pilot, where the
user is asked to conform to a specific style or convention
Other Document Processing
• Not just for text. . .
• Examples:
• Engineering document to CAD file
• Maps to GIS format
• Music score to MIDI representation
Digital Cameras = Convenience
• Easy to capture photos
• Easy to store and organize photos
• Easy to duplicate photos
• Easy to edit photos
• Rough Multimedia eCommerce class survey:
   • 1999: 10% own digital cameras
   • 2000: 25%
   • 2001: 50%
   • 2002: ??
Digital Camera Cautions
Via “Photo Industry Reporter” e-Magazine at:
http://www.photoreporter.com/2002/1021/photokina_report_look_at_35mm.html
• Film cameras still outsell digital cameras by almost
three to one
• The household penetration of digital is at about 15%
• “But let’s face it: film’s days are numbered. Anyone
staying solely with film these days will have a glorious
buggy whip in a market that will be clamoring for cars.”
Digital Camera Growth
• Photo Marketing Association on US digital camera sales:
   • 4.5 million in 2000
   • 6.9 million in 2001
   • Projected 9.3 million for 2002
   • http://www.visioneer.com/About/press/june2402.html
• InfoTrends Research Group estimates that the U.S.
photo-enabled TV set-top installed base will grow from
less than 1 million units in 2002, to over 114 million
units in 2006. Household penetration will climb from
under 1% to around 85%.
• InfoTrends projects digital camera sales to grow at a
rate of 38% through 2003
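As a quick check on the sales figures quoted on this slide, the year-over-year growth can be computed directly. The snippet uses only the three Photo Marketing Association numbers; the helper name `yearly_growth` is hypothetical.

```python
# US digital camera unit sales in millions, as quoted on the slide.
# The 2002 value is a projection.
sales = {2000: 4.5, 2001: 6.9, 2002: 9.3}

def yearly_growth(series):
    """Return {year: fractional growth over the previous year}."""
    years = sorted(series)
    return {y: series[y] / series[p] - 1 for p, y in zip(years, years[1:])}

growth = yearly_growth(sales)
for year, g in growth.items():
    print(f"{year}: {g:.0%}")
```

The projected 2001 to 2002 growth works out to roughly 35%, in the same range as the 38% rate attributed to InfoTrends.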
State of the Art: Digital Cameras
• Film is currently better in resolution and color
• Professional photographers
• Digital for low quality newspaper advertisements
• Film for portrait photos
• Computer storage limitations: 1 high resolution digital image = 20-25 Megabytes
• http://pic.templetons.com/brad/photo/pixels.html
• 3500 line pairs/35 mm or about 5000 dots/inch, but grainy
• At 3:2 frame size, ~20 million pixels
• Conclusion: “a 5300 x 4000 digital camera would produce a
shot equivalent to a scan from a quality 35mm camera - provided
you can get more than 8 bits per pixel. …A 3000 x
2000 digital camera would match the 35mm for a good
percentage of shots.”
• Printing: home printers not comparable to commercial printers
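To put the quoted resolutions in storage terms, the arithmetic below computes pixel counts and uncompressed sizes. It assumes 8-bit RGB (3 bytes per pixel) and 1 MB = 10^6 bytes; both are assumptions, and the result depends on bit depth and compression, so it need not match the storage figure quoted on the slide.

```python
def raw_size_megabytes(width, height, bytes_per_pixel=3):
    """Uncompressed size in megabytes (10**6 bytes); 3 bytes/pixel assumes 8-bit RGB."""
    return width * height * bytes_per_pixel / 1e6

for w, h in [(5300, 4000), (3000, 2000)]:
    mpixels = w * h / 1e6
    print(f"{w} x {h}: {mpixels:.1f} Mpixels, {raw_size_megabytes(w, h):.1f} MB raw")
```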
Future of Digital Cameras
• Improved resolution and color
• “Smart” cameras
• More programmable features
• Auto-focus on object of interest
• “Everything in focus” photo
• Capture photo when event X occurs
Biometrics
• Technology for
identification
• Finger/palm print
• Iris
• Face
Fingerprints
• Minutiae – splits and merges of ridges
Face Identification
• Not quite reliable yet.
• Performance degrades rapidly with uncontrolled
lighting, facial expression, and size of database
• Several companies exist:
   • Visionics (Rockefeller University spin-off)
   • Viisage (MIT spin-off)
   • EyeMatic (USC spin-off)
   • Miros (MIT spin-off)
   • Banque-Tec Intl (Australia)
   • C-VIS Computer Vision (Germany)
   • LAU Technologies
• Commercial systems installed in London and Brazil to
catch criminals
Automatic Age Progression
[Figure panels: Original Image (1962), Computer-Aged (1997), Actual Photo (1997)]
Management of images on computers
• Compression – reducing
storage size needed for images
• Watermarking – Protecting
copyright
• Microsoft, Bell Labs, NEC, etc.
Visible watermark
Photo Manipulation
• Adobe Photoshop, Corel
PhotoPaint, Pixami, PhotoIQ,
etc.
• Image editing: crop an image,
adjust the color, paint over part
of any image, airbrush part of
an image, combine images,
etc.
• Future: Applications of
computer vision, e.g.,
discriminating foreground from
background.
Online Digital Image Collections
• Stock photos of use to graphic designers, artists, etc.
• Large collections of images exist
• Corbis 67 million images
• Getty 70 million stock photography images
• AP collects 1000s of digitized images per day
Inspection for Manufacturing
• Occum – inspection of printed circuit boards ($100M /
year)
• Cognex – Do-it-yourself toolkits for inspection (400
employees)
Automatic Target Recognition (ATR)
• Finding mines, tanks, etc.
• Billion dollar a year industry
• Lockheed Martin, TSR, Northrop Grumman, other
aerospace contractors.
• Various types of imagery:
• Synthetic Aperture Radar (SAR), Sonar, hyper-spectral
imagery (more than 3 colors)
Aerial Photo Interpretation
• Also referred to as “automated cartography”
• Classification of land-use: forest, vegetation, water
• Identification of man-made objects: buildings, roads,
etc.
Better Security Cameras
• Cameras that are responsive to the environment
• Track and zoom on moving objects
• Automatic adjustment of contrast
Medical imagery
• Medical image libraries for study and diagnosis
• Image overlay to guide surgeons
History
• 1980’s ~100 companies – manufacturing applications
mostly
• Early 1990’s less than 10 companies
• Late 1990’s ~100 companies – face recognition,
intelligent teleconferencing, inspection, digital libraries,
medical imaging
Image Processing: Filtering
Enhancing an image’s quality for human viewing, e.g., in
medical imaging or in telescopic views of space
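As a concrete instance of filtering, the sketch below applies a 3×3 mean (smoothing) kernel to a small grayscale image with NumPy. The mean kernel is just one common choice picked for illustration; the slide does not name a specific filter, and `filter2d` is a hypothetical helper.

```python
import numpy as np

def filter2d(image, kernel):
    """Apply a small odd-sized kernel to a 2-D grayscale image with edge replication.

    The kernel is applied without flipping (correlation); for the symmetric
    kernel used below this is identical to convolution.
    """
    kh, kw = kernel.shape
    rows, cols = image.shape
    padded = np.pad(image.astype(float), ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros((rows, cols), dtype=float)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + rows, dx:dx + cols]
    return out

mean_kernel = np.full((3, 3), 1.0 / 9.0)  # simple smoothing kernel

image = np.zeros((5, 5))
image[2, 2] = 9.0  # a single bright "noise" pixel
smoothed = filter2d(image, mean_kernel)
```

The bright pixel is spread evenly over its 3×3 neighborhood, which is the basic effect a smoothing filter has on isolated noise.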
Image Processing: Compression
• Lossless – No loss in quality: gif, tiff
• Lossy – Original image cannot be reconstructed: jpeg
• New work on advancing lossy compression strategies
with fewer visual artifacts: JPEG 2000 and wavelet
transformations
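The difference between the two categories can be shown on a toy image. In this sketch `zlib` (DEFLATE) stands in for a lossless coder and bit truncation stands in for a lossy one; neither is the algorithm used by GIF, TIFF, or JPEG, so treat it as an illustration of the definitions only.

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)

# Lossless: compress and decompress the raw bytes; every pixel is recovered.
packed = zlib.compress(image.tobytes())
restored = np.frombuffer(zlib.decompress(packed), dtype=np.uint8).reshape(image.shape)

# Lossy (toy): keep only the top 4 bits of each pixel; the discarded detail is gone.
quantized = (image >> 4) << 4
max_error = int(np.abs(image.astype(int) - quantized.astype(int)).max())
```

`restored` is identical to `image`, while `quantized` differs from it by up to 15 gray levels per pixel and cannot be converted back.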
Image Processing: Watermarking
• Information hiding
• Protecting copyright
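A classic textbook example of information hiding is to store a message in the least significant bit of each pixel. The sketch below shows that technique with NumPy; it is an assumption chosen for illustration, not a method named on the slide, and it is far more fragile than watermarks intended to survive compression or editing. The function names are hypothetical.

```python
import numpy as np

def embed_bits(image, bits):
    """Hide a sequence of 0/1 values in the least significant bits of the first pixels."""
    flat = image.flatten()  # flatten() copies, so the input image is left unchanged
    if len(bits) > flat.size:
        raise ValueError("message does not fit in image")
    payload = np.asarray(bits, dtype=np.uint8)
    flat[:len(bits)] = (flat[:len(bits)] & 0xFE) | payload
    return flat.reshape(image.shape)

def extract_bits(image, count):
    """Read back the first `count` hidden bits."""
    return [int(b) for b in image.flatten()[:count] & 1]

image = np.full((4, 4), 200, dtype=np.uint8)
mark = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_bits(image, mark)
```

Each pixel changes by at most one gray level, so the mark is invisible to a viewer but can be read back with `extract_bits(marked, len(mark))`.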
Image Processing: Transformation
• Transforming image can make it easier to analyze
Wavelet transform of image
Wavelet Coefficients
[Figure: the four subbands of the transformed image]
Horizontal LP, Vertical LP | Horizontal HP, Vertical LP
Horizontal LP, Vertical HP | Horizontal HP, Vertical HP
5/3 Linear Phase Wavelets
Linear phase 5/3: c[n] = {-1, 2, 6, 2, -1}, d[n] = {1, -2, 1}
g[n] = {1, 2, -6, 2, 1}, f[n] = {1, 2, 1}
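A one-level decomposition into the four LP/HP subbands can be sketched with the analysis filters c[n] (low-pass) and d[n] (high-pass) listed above. This is a simplified illustration assuming NumPy: the filters are used unnormalized as printed, both branches are downsampled at the same phase, and the synthesis filters g[n] and f[n] are not used, so it is not a perfect-reconstruction implementation. The function names are hypothetical.

```python
import numpy as np

# Analysis filters as listed on the slide (unnormalized).
C = np.array([-1, 2, 6, 2, -1], dtype=float)  # low-pass
D = np.array([1, -2, 1], dtype=float)         # high-pass

def filter_and_downsample(signal, taps):
    """Filter a 1-D signal with symmetric extension, then keep every second sample."""
    half = len(taps) // 2
    padded = np.pad(signal, half, mode="reflect")
    filtered = np.convolve(padded, taps, mode="valid")
    return filtered[::2]

def analyze_2d(image):
    """One decomposition level; subbands are keyed by (horizontal, vertical) filter."""
    def along_rows(img, taps):
        return np.apply_along_axis(filter_and_downsample, 1, img, taps)

    def along_cols(img, taps):
        return np.apply_along_axis(filter_and_downsample, 0, img, taps)

    low_h, high_h = along_rows(image, C), along_rows(image, D)
    return {
        ("LP", "LP"): along_cols(low_h, C),
        ("HP", "LP"): along_cols(high_h, C),
        ("LP", "HP"): along_cols(low_h, D),
        ("HP", "HP"): along_cols(high_h, D),
    }

bands = analyze_2d(np.full((8, 8), 10.0))
```

For a perfectly flat image every high-pass subband is zero, because d[n] sums to zero; all of the signal ends up in the LP/LP subband.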
Computer Vision: 3D Shape Reconstruction
• Use images to build 3D model of object or site
3D site model built from
laser range scans
collected by CMU
autonomous helicopter
Computer Vision: Guiding Motion
• Visually guided
manipulation
• Hand-eye
coordination
• Visually guided
locomotion
• robotic vehicles
CMU NavLab II
Computer Vision: Recognition & Classification
Challenges in Object Recognition
245 267 234 142 22 28 38
121 156 187 98 73 32 12
123 21 21 38 209 237 121
99 87 59 197 216 244
Object Recognition Research
[Diagram labels]
• Quality/Quantity Issues: Large Quantity of Data; Low Image Quality
• Object Detection Issues: Intraclass Object Variation; Large number of Object Classes
• Robust Algorithms: Segmentation and Hierarchical Analysis; Automated Advanced Learning; Image Enhancement
• Object Detection: Face, Lips, Hand Gesture, Text, Clock, License Plate, Building, Vehicle
Intra-Class Variation
Lighting Variation
Geometric Variation
Simpler Problem: Classification
• Fixed size input
• Fixed object size, orientation, and alignment
Decision:
• “Object is present” (at fixed size and alignment)
• “Object is NOT present” (at fixed size and alignment)
Detection: Apply Classifier Exhaustively
Search in position
Search in scale
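Exhaustive search in position and scale can be sketched as a pair of nested loops around a fixed-size classifier. The code below assumes NumPy, uses simple pixel subsampling to change scale, and uses a brightness test as a stand-in classifier; all names and parameter values are hypothetical and chosen only to make the search pattern visible.

```python
import numpy as np

def sliding_windows(image, window=4, step=2):
    """Yield (row, col, patch) for every window position in the image."""
    rows, cols = image.shape
    for r in range(0, rows - window + 1, step):
        for c in range(0, cols - window + 1, step):
            yield r, c, image[r:r + window, c:c + window]

def detect(image, classifier, window=4, step=2, scales=(1, 2)):
    """Run a fixed-size classifier over all positions at several scales.

    Each scale s keeps every s-th pixel, so a window of `window` pixels
    covers window * s pixels of the original image.
    Returns (row, col, size) boxes in original-image coordinates.
    """
    hits = []
    for s in scales:
        reduced = image[::s, ::s]
        for r, c, patch in sliding_windows(reduced, window, step):
            if classifier(patch):
                hits.append((r * s, c * s, window * s))
    return hits

# Hypothetical stand-in classifier: "object" means a uniformly bright patch.
def bright_patch(patch):
    return patch.min() > 200

image = np.zeros((16, 16), dtype=np.uint8)
image[4:12, 4:12] = 255  # one 8x8 bright square
hits = detect(image, bright_patch)
```

The 8×8 square is found once at scale 2 as a single box of size 8, and several times at scale 1 as smaller boxes inside it, which is why real detectors merge overlapping detections afterwards.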
View-based Classifiers
Face Classifier #1
Face Classifier #2
Face Classifier #3
1) Apply Local Operators
f1(0, 0) = #5710
f1(0, 1) = #3214
fk(n, m) = #723
2) Look Up Probabilities
f1(0, 0) = #5710
   P1( #5710, 0, 0 | obj) = 0.53
   P1( #5710, 0, 0 | non-obj) = 0.56
f1(0, 1) = #3214
   P1( #3214, 0, 1 | obj) = 0.57
   P1( #3214, 0, 1 | non-obj) = 0.48
fk(n, m) = #723
   Pk( #723, n, m | obj) = 0.83
   Pk( #723, n, m | non-obj) = 0.19
3) Make Decision
P1( #5710, 0, 0 | obj) = 0.53
P1( #5710, 0, 0 | non-obj) = 0.56
P1( #3214, 0, 1 | obj) = 0.57
P1( #3214, 0, 1 | non-obj) = 0.48
Pk( #723, n, m | obj) = 0.83
Pk( #723, n, m | non-obj) = 0.19
Decision rule:
$$\frac{0.53 \times 0.57 \times \cdots \times 0.83}{0.56 \times 0.48 \times \cdots \times 0.19} > \lambda$$
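The decision is a likelihood-ratio test: multiply the object probabilities, divide by the product of the non-object probabilities, and compare with a threshold. The sketch below evaluates it with only the three probability pairs printed on the slide (the elided terms are left out), and the threshold value is an assumption. Summing logarithms instead of multiplying is a common safeguard against underflow when there are many terms.

```python
import math

def likelihood_ratio(p_obj, p_non_obj):
    """Product of P(feature | obj) divided by the product of P(feature | non-obj).

    Computed as a sum of log differences so many small factors do not underflow.
    """
    log_ratio = sum(math.log(a) - math.log(b) for a, b in zip(p_obj, p_non_obj))
    return math.exp(log_ratio)

# Only the three pairs shown on the slide; the "..." terms are omitted.
p_obj = [0.53, 0.57, 0.83]
p_non_obj = [0.56, 0.48, 0.19]

threshold = 1.0  # hypothetical value for the decision threshold
ratio = likelihood_ratio(p_obj, p_non_obj)
object_present = ratio > threshold
```

With these three pairs the ratio is well above 1, driven mostly by the last feature, whose response is far more likely under the object model than under the non-object model.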
Two Classifiers Trained for Faces
Eight Classifiers Trained for Cars
Probabilities Estimated Off-Line
f1(0, 0) = #567
$$H_1(\#567, 0, 0) \leftarrow H_1(\#567, 0, 0) + 1$$
$$P_1(\#567, 0, 0) = \frac{H_1(\#567, 0, 0)}{\sum_i H_1(\#i, 0, 0)}$$
fk(n, m) = #350
$$H_k(\#350, n, m) \leftarrow H_k(\#350, n, m) + 1$$
$$P_k(\#350, n, m) = \frac{H_k(\#350, n, m)}{\sum_i H_k(\#i, n, m)}$$
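The off-line estimation amounts to counting how often each operator output occurs at each position over the training images, then normalizing the counts. The sketch below does this for a single operator on toy data; the data layout (a dict from position to observed code per training image) and the function name are assumptions. In practice one such table would be built from object images and another from non-object images.

```python
from collections import Counter, defaultdict

def estimate_probabilities(training_examples):
    """Estimate P(code, position) from counts.

    `training_examples` is an iterable of dicts mapping a position (n, m)
    to the quantized operator output observed there in one training image.
    """
    histograms = defaultdict(Counter)  # position -> Counter of codes
    for example in training_examples:
        for position, code in example.items():
            histograms[position][code] += 1  # H(code, n, m) += 1
    probabilities = {}
    for position, counts in histograms.items():
        total = sum(counts.values())  # sum over all codes i of H(i, n, m)
        probabilities[position] = {code: n / total for code, n in counts.items()}
    return probabilities

# Toy data: four training images, one operator, two positions.
examples = [
    {(0, 0): 567, (0, 1): 350},
    {(0, 0): 567, (0, 1): 12},
    {(0, 0): 567, (0, 1): 350},
    {(0, 0): 98, (0, 1): 350},
]
probs = estimate_probabilities(examples)
```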
Training Classifiers
• Cars: 300-500 images per viewpoint
• Faces: 2,000 images per viewpoint
• ~1,000 synthetic variations of each original image
• background scenery, orientation, position, frequency
• 2000 non-object images
• Samples selected by bootstrapping
• Minimization of classification error on training set
• AdaBoost algorithm (Freund & Schapire ‘97, Schapire & Singer
‘99)
• Iterative method
• Determines weights for samples
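The iterative re-weighting of samples can be seen in a textbook version of discrete AdaBoost. The sketch below uses single-threshold "stumps" on a one-dimensional toy feature and assumes NumPy; it is a generic illustration of the algorithm, not the training pipeline used for the face and car detectors, and the function names are hypothetical.

```python
import numpy as np

def train_adaboost(x, y, rounds=3):
    """Discrete AdaBoost with threshold stumps on a single feature.

    x: 1-D feature values; y: labels in {-1, +1}.
    Returns a list of (threshold, polarity, alpha) weak classifiers.
    """
    weights = np.full(len(x), 1.0 / len(x))  # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        best = None
        for threshold in np.unique(x):
            for polarity in (1, -1):
                pred = np.where(polarity * (x - threshold) >= 0, 1, -1)
                error = weights[pred != y].sum()
                if best is None or error < best[0]:
                    best = (error, threshold, polarity, pred)
        error, threshold, polarity, pred = best
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * np.log((1 - error) / error)
        # Misclassified samples gain weight, correctly classified ones lose it.
        weights *= np.exp(-alpha * y * pred)
        weights /= weights.sum()
        ensemble.append((threshold, polarity, alpha))
    return ensemble

def predict(ensemble, x):
    """Weighted vote of the weak classifiers."""
    score = sum(a * np.where(p * (x - t) >= 0, 1, -1) for t, p, a in ensemble)
    return np.where(score >= 0, 1, -1)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([1, 1, -1, -1, 1, 1, -1, -1])  # not separable by one threshold
ensemble = train_adaboost(x, y, rounds=3)
accuracy = float((predict(ensemble, x) == y).mean())
```

No single threshold separates this toy data, but after each round the samples the current stump gets wrong carry more weight, so the next stump concentrates on them and the weighted vote fits the pattern.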
Web-based Demo of Face Detector
http://www.vasc.ri.cmu.edu/cgi-bin/demos/findface.cgi
CMU Face Detector in Commercial Product
CMU Face Detector
Applications of Face Detection
• Automatic red-eye removal from photographs
• Automatic color balancing in photo-finishing
• Intelligent teleconferencing
• Component in face identification system
Difficulty Increases with Complexity of Object
• 2D vs. 3D
• Specific objects – e.g. my coffee mug
• A category of objects – e.g. all coffee mugs
• Amount of intra-category variation
• Rigid or semi-rigid structure, e.g. face
• Articulated objects, e.g. human body
• Functionally defined objects, e.g. chairs
Find Images With Similar Colors
Find Images with Similar Shape
Goal: Find Images with Similar Content
Spectrum of Content-Based Image Retrieval
Degree of difficulty (increasing from top to bottom):
• Similar color distribution: Histogram matching
• Similar texture pattern: Texture analysis
• Similar shape/pattern: Image segmentation, pattern recognition
• Similar real content: Life-time goal :-)
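The easiest end of this spectrum, histogram matching, can be sketched in a few lines. The code below assumes NumPy, a coarse joint RGB histogram, and histogram intersection as the similarity measure; the bin count and the function names are assumptions, and real systems may use other color spaces and distance measures.

```python
import numpy as np

def color_histogram(image, bins=4):
    """Normalized joint RGB histogram of an H x W x 3 uint8 image."""
    pixels = image.reshape(-1, 3).astype(float)
    hist, _ = np.histogramdd(pixels, bins=(bins, bins, bins), range=((0, 256),) * 3)
    return hist.ravel() / len(pixels)

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical color distributions."""
    return float(np.minimum(h1, h2).sum())

def rank_by_color(query, collection):
    """Return indices of `collection` sorted from most to least similar to `query`."""
    q = color_histogram(query)
    scores = [histogram_intersection(q, color_histogram(img)) for img in collection]
    return sorted(range(len(collection)), key=lambda i: scores[i], reverse=True)

# Toy images: pure red (the query), pure blue, and a slightly darker red.
red = np.zeros((4, 4, 3), dtype=np.uint8)
red[..., 0] = 255
blue = np.zeros((4, 4, 3), dtype=np.uint8)
blue[..., 2] = 255
dark_red = np.zeros((4, 4, 3), dtype=np.uint8)
dark_red[..., 0] = 200

ranking = rank_by_color(red, [blue, dark_red])
```

The darker red image ranks ahead of the blue one. Note that the histogram ignores where colors appear in the image, which is why color matching alone sits at the low end of the difficulty scale.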
Status of Image Search
• Typical Search Features
   • Color
   • Texture
   • Shape
   • Spatial attributes (local color regions, less common than
     global color, texture, shape metrics)
• Commercial Activity
• eVision (notes that “visual search engine market segment
is projected to reach $1.4 billion by 2005 according to the
McKenna Group”, http://www.evisionglobal.com/about/index.html)
• Virage (www.virage.com)
• IBM (QBIC part of database toolset)
Reference: “A Review of CBIR”
Recommended reading:
A Review of Content-Based Image Retrieval Systems
Colin C. Venters and Dr. Matthew Cooper, University of
Manchester
Available at http://www.jisc.ac.uk/jtap/htm/jtap-054.html
This review lists features from a number of image
retrieval systems, along with heuristic evaluations on
the interfaces for a subset of these systems.
Search Engines Used by 2001 Multimedia Class
• Search Engines used for 2001 multimedia retrieval
homework (15 others answered a single query each):
[Bar chart: Queries Answered (axis 0 to 60) per search engine. Engines shown: Google, AltaVista, Lycos, Yahoo, Alltheweb, CNN, Corbis, Findsounds, 3dcafe, Excite, VastVideo, Vivisimo, Mamma]
Search Engines Used in This 2002 Class
[Bar chart: Queries Answered (axis 0 to 50) per search engine. Engines shown: Google, AltaVista, alltheweb.com, Lycos+, corbis.com, Singingfish.com+, Getty image+, Yahoo, CNN, Webshots.com+]
Also answering 1 query each were: Excite+, Rexfeature, Webseek+,
search.netscape.com+, animalplanet.com+, ask.com, naver.com+
For Further Reading on Texture Search
• Texture Search: “Texture features for browsing and
retrieval of image data”, B.S. Manjunath and W.Y. Ma,
IEEE Trans. on Pattern Analysis and Machine
Intelligence 18(8), Aug. 1996, pp. 837-842.
• Texture search via
http://www.engin.umd.umich.edu/ceep/tech_day/2000/reports/ECEreport2/ECEreport2.htm
(texture features include coarseness, average gray scale value, and
number of horizontal and vertical extrema of a specific
image region)
• For QBIC, texture search works on global coarseness,
contrast and directionality features
For Further Exploration of Image Segmentation
• BlobWorld work at UC Berkeley
• Papers, description, sample system available at
http://elib.cs.berkeley.edu/photos/blobworld/
Further Reading on Wavelet Compression
and JPEG 2000
• http://www.gvsu.edu/math/wavelets/student_work/EF/howworks.html
• http://www-ise.stanford.edu/class/psych221/00/shuoyen/
• Henry Schneiderman Ph.D. Thesis “A Statistical Approach
to 3D Object Detection Applied to Faces and Cars”,
http://www.ri.cmu.edu/pub_files/pub2/schneiderman_henry_2000_2/schneiderman_henry_2000_2.pdf
• http://www.jpeg.org/JPEG2000.html
Summary: Image Processing & Computer Vision
• Not as mature as speech recognition
• Technology not as reliable
• Fewer companies, fewer products
• Success on limited problems, e.g., documents
• More applicable to fault tolerant problems
• Technology will grow
• Emergence of digital camera
• Improved methods
Decomposition in Resolution/Frequency
coarse
intermediate
fine
intermediate
fine
Wavelet Decomposition
Vertical subbands (LH)
Wavelet Decomposition
Horizontal subbands (HL)