
Digital Video Library
Informedia Interface Evaluation and
Information Visualization
December 10, 2002
Mike Christel
Outline
• Surrogates for Informedia Digital Video Library
• Abstractions for single video document
• Empirical studies on thumbnail images, skims
• Quick overview of early HCI investigations
• Summaries across video documents (collages)
• Demonstration of information visualization
• Required advances in automated content extraction
• TREC Video Retrieval Track 2002
• Overview of Carnegie Mellon participation and results
• Multiple storyboard interface emphasizing imagery
Informedia Digital Video Library Project
• Initiated by the National Science Foundation, DARPA, and NASA under the Digital Libraries Initiative, 1994-98
• Continued funding via Digital Libraries Initiative Phase 2 (NSF, DARPA, National Library of Medicine, Library of Congress, NASA, National Endowment for the Humanities)
• New work and directions via NSF, NSDL, ARDA VACE, the “Capturing, Coordinating, and Remembering Human Experience” (CCRHE) project, etc.
• Details at http://www.informedia.cs.cmu.edu/
Techniques Underlying Video Metadata
• Image processing
• Detection of text overlaid on video
• Detection of faces
• Identification of camera and object motion
• Breaking video into component shots
• Detecting corpus-specific categories, e.g., anchorperson shots and weather map shots
• Speech recognition
• Text extraction and alignment
• Natural language processing
• Determining best text matches for a given query
• Identifying places, organizations, people
• Producing phrase summaries
Text and Face Detection
Text Extraction and Alignment
[Figure: raw video and raw audio feed the text extraction step; recognized words (“electric cars are … they are the jury every toy owner hopes to please”) are aligned against audio regions such as SILENCE and MUSIC.]
Deriving “Matching Shots”
[Figure: shot detection and shot frame extraction run alongside speech recognition and alignment (e.g., 0 ms “These strange markings, preserved in the clay of a Texas riverbed, are footsteps…”, 3500 ms “dinosaur graveyards…”, 4600 ms “group of scientists…”, 5930 ms “nature’s special effects…”); words are then aligned to shots to yield the matching shot for “dinosaur footprint”. A sketch of the alignment step follows.]
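The word-to-shot alignment is essentially an interval lookup: each timestamped word falls into the shot whose boundaries contain it. A minimal Python sketch (data shapes and names are illustrative, not the Informedia implementation):

    from bisect import bisect_right

    def align_words_to_shots(words, shot_starts):
        """Assign each timestamped word to the shot containing it.

        words:       (time_ms, word) pairs from the aligned transcript
        shot_starts: sorted shot start times in milliseconds
        """
        by_shot = {}
        for time_ms, word in words:
            # Last shot whose start time is at or before the word.
            idx = bisect_right(shot_starts, time_ms) - 1
            by_shot.setdefault(idx, []).append(word)
        return by_shot

    words = [(0, "these"), (3500, "dinosaur"), (4600, "graveyards")]
    print(align_words_to_shots(words, [0, 3000, 4500, 6000]))
    # {0: ['these'], 1: ['dinosaur'], 2: ['graveyards']}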
Initial User Testing of Video Library, ca. 1996
• 104-hour library consisting of 3481 clips
• Average clip length of 1.8 minutes, consuming 15.7 megabytes of storage
• Automatic logs generated for usage of the Informedia Library by high school science teachers and students
• 243 hours logged (2473 queries, 2910 video clips played)
Early Lessons Learned
• Titles frequently used; should include length and production date
• Results and title placement affect usage
• Greater quantity of video was desired
• Storyboards (filmstrips) used infrequently
Empirical Study Into Thumbnail Images
Text-based Result List
“Naïve” Thumbnail List (Uses First Shot Image)
Query-based Thumbnail Result List
Query-based Thumbnail Selection Process
1. Decompose video segment into shots.
2. Compute representative frame for each shot.
3. Locate query scoring words (shown by arrows).
4. Use frame from highest scoring shot.
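As a sketch, the four steps collapse into a scoring pass over the shots and their aligned words; the hit-count scoring and data shapes here are simplifications of the retrieval engine's term weighting:

    def query_based_thumbnail(shots, query_terms):
        """Return the representative frame of the highest-scoring shot.

        shots: list of dicts like {"frame": "shot2.jpg", "words": [...]}
        """
        def score(shot):
            # Simplified: count query-term hits among the shot's aligned words.
            return sum(w.lower() in query_terms for w in shot["words"])

        return max(shots, key=score)["frame"]

    shots = [
        {"frame": "shot1.jpg", "words": ["strange", "markings", "preserved"]},
        {"frame": "shot2.jpg", "words": ["dinosaur", "footprints", "texas"]},
    ]
    print(query_based_thumbnail(shots, {"dinosaur", "footprint"}))  # shot2.jpg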
Thumbnail Study Results
[Charts comparing the Text, First, and Query treatments on task time in seconds, titles browsed, task score (max = 400), and subjective rating from 1 (terrible) to 9 (wonderful).]
Empirical Study Summary*
• Significant performance improvements for the query-based thumbnail treatment over the other two treatments
• Subjective satisfaction significantly greater for the query-based thumbnail treatment
• Subjects could not identify differences between thumbnail treatments, but their performance definitely showed differences!
_____
*Christel, M., Winkler, D., and Taylor, C.R. Improving Access to a Digital Video Library. In Human-Computer Interaction: INTERACT '97, Chapman & Hall, London, 1997, 524-531
Thumbnail View with Query Relevance Bar
Close-up of Thumbnail with Relevance Bar
[Annotated screenshot: relevance score on a [0, 100] scale; this document scores 30. Color-coded scoring words show each term's contribution: “asylum” contributes some, “rights” a bit, and “refugee” contributes 50%. The query-based thumbnail doubles as a shortcut to the storyboard.]
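The bar itself is simple arithmetic: each matching term's weight is a share of the document's total score, mapped onto the [0, 100] scale. A sketch with made-up term weights chosen to reproduce the slide's example:

    def relevance_bar(term_scores, scale=100):
        """Overall score plus each term's fractional share, for color-coding."""
        total = sum(term_scores.values())
        shares = {term: s / total for term, s in term_scores.items()}
        return min(total, scale), shares

    score, shares = relevance_bar({"asylum": 9, "rights": 6, "refugee": 15})
    print(score)   # 30: this document's relevance score
    print(shares)  # refugee 0.5, asylum 0.3, rights 0.2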
“Skim Video”: Extracting Significant Content
[Figure: original video (1100 frames) condensed to a skim video (78 frames).]
Skims: Preliminary Findings
• Real benefit for skims appears to be for comprehension rather than navigation
• For PBS documentaries, information in the audio track is very important
• Empirical study conducted in September 1997 to determine advantages of skims over subsampled video, and synchronization requirements for audio and visuals
Empirical Study: Skims
[Diagram: composition of the skim audio and skim image tracks for each treatment.]
• DFL - “default” long skim
• DFS - default short skim
• NEW - selective skim
• RND - same audio as NEW but with unsynchronized video
Skim Study Results
[Charts: subjects were asked whether an image appeared in the video just seen (images correct, out of 10) and whether a text phrase summarizes information that would be in the full source video (phrases correct, out of 15), compared across the RND, DFS, DFL, NEW, and FULL treatments.]
Skim Study QUIS Results
[Chart: QUIS ratings on 1 (terrible, frustrating, dull) to 9 (wonderful, satisfying, stimulating) scales for the RND, DFS, DFL, NEW, and FULL treatments.]
Skim Study Results*
1996 “selective” skims performed no better than subsampled skims, but results from the 1997 study show significant differences, with “selective” skims more satisfactory to users:
• audio is less choppy than in the earlier 1996 skim work
• synchronization with video is better preserved
• grain size has increased
_____
*Christel, M., Smith, M., Taylor, C.R., and Winkler, D. Evolving Video Skims into Useful Multimedia Abstractions. In Proc. ACM CHI '98 (Los Angeles, CA, April 1998), ACM Press, 171-178
Match Information
Using Match Information For Browsing
Using Match Info to Reduce Storyboard Size
Adding Value to Video Surrogates via Text
• Captions AND pictures better than either modality alone
• Large, A., et al. Multimedia and Comprehension: The Relationship among Text, Animation, and Captions. J. American Society for Information Science 46(5) (June 1995), 340-347
• Nugent, G.C. Deaf Students' Learning from Captioned Instruction: The Relationship between the Visual and Caption Display. J. Special Education 17(2) (1983), 227-234
• Video surrogates better with BOTH images and text
• Ding, W., et al. Multimodal Surrogates for Video Browsing. In Proc. ACM Conf. on Digital Libraries (Berkeley, CA, Aug. 1999), 85-93
• Christel, M. and Warmack, A. The Effect of Text in Storyboards for Video Navigation. In Proc. IEEE ICASSP (Salt Lake City, UT, May 2001), Vol. III, 1409-1412
• For news/documentaries, audio narrative is important, but other video genres may be different
• Li, F., Gupta, A., et al. Browsing Digital Video. In Proc. ACM CHI '00 (The Hague, Netherlands, April 2000), 169-176
How Much Text, and Does Layout Matter?
[Screenshots of the five storyboard treatments: NoText, AllByRow, BriefByRow, All, Brief.]
Results from Christel/Warmack Study
Mean completion times, in seconds (graph shown with 95% confidence intervals):

Treatment      Mean time (s)
NoText         192
AllByRow       160
All            137
BriefByRow     117
Brief          162
More Results from Storyboard/Text Study
Mean ranking for treatments (1 = favorite, 5 = least favorite):

Treatment      Mean rank
NoText         4.6
AllByRow       1.12
All            3.04
BriefByRow     2.52
Brief          3.72

AllByRow was favored, but had relatively poor performance (160 seconds for tasks). BriefByRow ranked 2nd by preference, and had the best performance (117 seconds for tasks).
Conclusions from Storyboard/Text Study
• Storyboard surrogates clearly improved with text
• Participants favored interleaved presentation
• Navigation efficiency is best served with reduced interleaved text (BriefByRow)
• BriefByRow and All had the best task performance, but BriefByRow requires less display space
• If interleaving is combined with text reduction, so as to better preserve and represent the time association between lines of text, imagery, and their affiliated video sequence, then a storyboard with great utility for information assessment and navigation can be constructed
Discussed Multimedia Surrogates, i.e., Abstractions based on Library Metadata
[Chart plotting surrogates along a static-to-temporal axis against content detail and object size: text title, thumbnail image, and storyboard on the static side; match bars and skim video on the temporal side.]
Range of Multimedia Surrogates
[The same chart extended with additional surrogates: storyboard with audio, audio data, and full text transcript.]
Evaluating Multimedia Surrogates
• Techniques discussed here:
• transaction logs
• formal empirical studies
• Other techniques used in interface refinement:
• contextual inquiry
• heuristic evaluation
• cognitive walkthroughs
• “think aloud” protocols
Extending to Surrogates ACROSS Video
• As digital video assets grow, so do result sets
• As automated processing techniques improve, e.g., speech and image processing, more metadata is generated with which to build interfaces into video
• Need overview capability to deal with greater volume
• Prior work offered many solutions:
• Visualization By Example (VIBE) for matching entity relationships
• Scatter plots for low dimensionality relationships, e.g., timelines
• Dynamic query sliders for direct manipulation of plots
• Colored maps for geographic relationships
Enhancing Library Utility via Better Metadata
[Diagram: a metadata extractor produces people, event, affiliation, location, topic, and time metadata, which flows through perspective templates and a summarizer to the user interface (the final representation).]
Displaying Metadata in Effective “Collages”
[Map collage of the North and South Pacific emphasizing distribution by nation of “El Niño effects,” with overlaid thumbnails.]
Zooming into “Collage” to Reveal Details
Example of “Chrono-Collage”
[Timeline collage over March-May 1998 emphasizing “key player faces” and short event descriptors: Suharto economic reform meetings, U.S. policy on Indonesia, Habibie new president, El Niño wildfires, student protests against Suharto. It represents the same data shown in the Indonesia map perspective.]
Named Entity Extraction
F. Kubala, R. Schwartz, R. Stone, and R. Weischedel, “Named Entity Extraction from Speech,” Proc. DARPA Workshop on Broadcast News Understanding Systems, Lansdowne, VA, February 1998.

CNN national correspondent John Holliman is at Hartsfield International Airport in Atlanta. Good morning, John. …But there was one situation here at Hartsfield where one airplane flying from Atlanta to Newark, New Jersey yesterday had a mechanical problem and it caused a backup that spread throughout the whole system because even though there were a lot of planes flying to the New York area from the Atlanta area yesterday, ….

Key: Place, Time, Organization/Person
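A toy gazetteer lookup conveys what the tagger produces, though the cited BBN system is statistical rather than dictionary-based; the markup format and entity lists below are illustrative only:

    import re

    # Tiny hand-built gazetteer; a real system learns these classes from data.
    GAZETTEER = {
        "Hartsfield International Airport": "Place",
        "Atlanta": "Place",
        "New Jersey": "Place",
        "yesterday": "Time",
        "CNN": "Organization",
        "John Holliman": "Person",
    }

    def tag_entities(text):
        # Replace longer names first so "New Jersey" is not split.
        for name in sorted(GAZETTEER, key=len, reverse=True):
            label = GAZETTEER[name]
            text = re.sub(re.escape(name), f"<{label}>{name}</{label}>", text)
        return text

    print(tag_entities("CNN national correspondent John Holliman is at "
                       "Hartsfield International Airport in Atlanta."))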
Challenge: Integrating Imagery into Collages
Great Volume of Imagery Requires Filtering
• Video can be decomposed into shots
• Consider 2050 hours of CNN videos from 1997-2002:
• 1,688,000 shots
• 67,700 segments/stories
• 1 minute 53 seconds average story duration
• 4.5 seconds average shot duration
• 23 shots per segment on average
• Result sets for queries number in the hundreds or thousands
• Against the 2001 CNN collection, the top 1000 stories for queries on “terrorism” and “bomb threat” produced 17,545 and 18,804 shots respectively
• User needs a way to filter down tens of thousands of images (a quick arithmetic check follows)
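The quoted averages can be sanity-checked against the corpus totals; they agree to within a few percent, the gap presumably reflecting rounding in the slide's figures:

    hours, shots, stories = 2050, 1_688_000, 67_700

    seconds = hours * 3600
    print(round(seconds / shots, 1))   # ~4.4 s average shot duration
    print(round(seconds / stories))    # ~109 s average story duration
    print(round(shots / stories))      # ~25 shots per segment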
Adding Imagery to Visualizations
• Query-based thumbnail images added to VIBE, timeline, and map summaries
• Layout differs: overlap in VIBE/timeline; tile in map
• Extend concept of “highest scoring” to represent a country, a point in time, or a point on the VIBE plot (see the sketch below)
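Extending “highest scoring” is a group-by-and-max: partition shots by whatever attribute positions them on the plot (country, time bin, VIBE cell) and keep the best-scoring thumbnail per group. A sketch with assumed field names:

    def thumbnails_by_bucket(shots, bucket_of):
        """Keep the highest query-scoring shot's frame per plot bucket."""
        best = {}
        for shot in shots:
            key = bucket_of(shot)
            if key not in best or shot["score"] > best[key]["score"]:
                best[key] = shot
        return {key: s["frame"] for key, s in best.items()}

    shots = [
        {"frame": "a.jpg", "score": 0.4, "country": "Indonesia"},
        {"frame": "b.jpg", "score": 0.9, "country": "Indonesia"},
        {"frame": "c.jpg", "score": 0.7, "country": "Peru"},
    ]
    print(thumbnails_by_bucket(shots, lambda s: s["country"]))
    # {'Indonesia': 'b.jpg', 'Peru': 'c.jpg'}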
Leveraging from Our Prior Video Summarization Work
• Context, e.g., matching terms, and synchronization between imagery and narrative can reduce summary complexity
• Text with imagery more useful in video summaries than either text alone or imagery alone
• “Overview first, zoom and filter, then details on demand”
• Visual Information-Seeking Mantra of Ben Shneiderman
• Direct manipulation interfaces leave the user in control
• Iterative prototyping reveals areas needing further work
Adding Text Overviews to Collages
• Transcript and other derived text, such as scene text and characters overlaid on broadcast video, provide input for further processing
• Named entity tagging and common phrase extraction provide a filtering mechanism to reduce text into defined subsets
• Visualization interface allows subsets, e.g., people, organizations, locations, and common phrases, to be displayed for the set of documents plotted in the visualization view
Example of Text-Augmented Timeline
[Screenshot: most frequent common phrases and people from a query on “anthrax” against 2001 news, listed beneath the timeline plot.]
Example of Text-Augmented VIBE Plot
[Screenshot: the left pane shows videos from 1/01 – 3/01 focusing on refugees and asylum; the right pane shows videos from 5/01 – 8/01 focusing on human rights and stem cell research.]
Refinement of Collages*
• Image addition to summaries improved over time
• Anchorperson removal for more representative visuals
• Consume more space in timeline with images via better layout
• Image resizing under user control to see detail on demand
• Text addition found to require new interface controls
• Selection controls, e.g., list people, organizations, locations, and/or common phrases
• Stopping rules, e.g., list at most X terms, or list terms only if they are covered by Y documents or Z% of the document set (see the sketch below)
• Show some text where the user's attention is focused, by the mouse pointer, i.e., pop-up tooltip text
_____
*Christel, M., et al. Collages as Dynamic Summaries for News Video. In Proc. ACM Multimedia '02 (Juan-les-Pins, France, December 2002)
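The stopping rules reduce to a coverage test plus a cap. A sketch with the slide's X, Y, Z as parameters (threshold values and term data below are illustrative):

    def select_overlay_terms(term_docs, n_docs, max_terms=10,
                             min_docs=3, min_frac=0.05):
        """List at most max_terms terms, keeping a term only if it is
        covered by min_docs documents or min_frac of the document set."""
        kept = [(term, len(docs)) for term, docs in term_docs.items()
                if len(docs) >= min_docs or len(docs) / n_docs >= min_frac]
        kept.sort(key=lambda pair: -pair[1])          # most-covered first
        return [term for term, _ in kept[:max_terms]]

    term_docs = {"anthrax": {1, 2, 3, 4}, "letters": {2, 3}, "spores": {4}}
    print(select_overlay_terms(term_docs, n_docs=40, max_terms=2))
    # ['anthrax', 'letters']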
NIST TREC Video Retrieval Track
• Definitive information at the NIST TREC Video Track web site: http://www-nlpir.nist.gov/projects/trecvid/
• TREC series sponsored by the National Institute of Standards and Technology (NIST) with additional support from other U.S. government agencies
• Goal is to encourage research in information retrieval from large amounts of text by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results
• Video Retrieval Track started in 2001
• Goal is investigation of content-based retrieval from digital video
• Focus on the shot as the unit of information retrieval rather than the scene or story/segment/clip
TREC-Video 2001 and TREC-Video 2002
• 2001 collection had ~11 hours of MPEG-1 video: 260 segments, 8000 shots, 80,000 I-frames
• 2002 search test collection had ~40 hours of MPEG-1 video: 1160 segments, 14,524 shots (given by TREC-V), 292,000 I-frames
• 2001 results
• http://trec.nist.gov/pubs/trec10/t10_proceedings.html
• Definite need to define the unit of information retrieval
• Automatic search (no human in loop) difficult: about 1/3 of queries were unanswered by any of the automatic systems
• Research groups submitting search runs were Carnegie Mellon, Dublin City Univ., Fudan Univ. China, IBM, Johns Hopkins Univ., Lowlands Group Netherlands, Univ. Maryland, Univ. North Texas
• 2002 results to be published after the TREC Conference in 11/02
TREC-Video 2001 Queries
• Specific item or person
• the planet Jupiter, corn on the cob, Ron Vaughn, Harry Hertz, Lou Gossett Jr., R. Lynn Bonderant
• Specific fact
• number of spikes on the Statue of Liberty's crown
• Specific event or activity
• liftoff of the Space Shuttle, Ronald Reagan reading a speech about the Space Shuttle
• Instances of a category
• mountains as prominent scenery, scenes with a yellow boat, pink flowers
• Instances of events/activities
• vehicle traveling on the moon, water skiing, speaker talking in front of the US flag, chopper landing
Carnegie Mellon TREC-Video 2001 Results*

Retrieval using:                                    ARR       Recall
Speech Recognition Transcripts only                 1.84 %    13.2 %
Raw Video OCR only                                  5.21 %    6.10 %
Raw Video OCR + Speech Transcripts                  6.36 %    19.30 %
Enhanced VOCR with dictionary post-processing       5.93 %    7.52 %
Speech Transcripts + Enhanced Video OCR             7.07 %    20.74 %
Image Retrieval only using a probabilistic model    14.99 %   24.45 %
Image Retrieval + Speech Transcripts                14.99 %   24.45 %
Image Retrieval + Face Detection                    15.04 %   25.08 %
Image Retrieval + Raw VOCR                          17.34 %
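ARR in these runs plausibly denotes average reciprocal rank; taking that reading as an assumption rather than a documented definition, the metric is the mean over queries of 1/rank of the first relevant shot:

    def average_reciprocal_rank(runs):
        """runs: one ranked list of relevance booleans per query."""
        total = 0.0
        for ranked in runs:
            rank = next((i + 1 for i, rel in enumerate(ranked) if rel), None)
            total += 1.0 / rank if rank else 0.0
        return total / len(runs)

    # Query 1: first relevant result at rank 2; query 2: none found.
    print(average_reciprocal_rank([[False, True, False], [False, False]]))  # 0.25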
TREC-Video 2002 Queries
• Specific item or person
• Eddie Rickenbacker, James Chandler, George Washington, Golden Gate Bridge, Price Tower in Bartlesville, OK
• Specific fact
• Arch in Washington Square Park in NYC, map of continental US
• Instances of a category
• football players, overhead views of cities, one or more women standing in long dresses
• Instances of events/activities
• people spending leisure time at the beach, one or more musicians with audible music, crowd walking in an urban environment, locomotive approaching the viewer
TREC-Video 2002 Features for Auto-Detection
• Outdoors: recognizably outdoor location
• Indoors: recognizably indoor location
• Face: at least one human face with nose, mouth, and both eyes
• People: group of two or more humans
• Cityscape: recognizably city/urban/suburban setting
• Landscape: a predominantly natural inland setting, i.e., one with little or no evidence of development by humans
• Text Overlay: superimposed text large enough to be read
• Speech: human voice uttering recognizable words
• Instrumental Sound: sound produced by one or more musical instruments, including percussion instruments
• Monologue: an event in which a single person is at least partially visible and speaks for a long time without interruption by another speaker
New Interface Development for TREC-V 2002
• Multiple document storyboards
• Resolution and layout under user control
• Query context plays a key role in filtering image sets to manageable sizes
• TREC 2002 image feature set offers additional filtering capabilities for indoor, outdoor, faces, people, etc.
• Displaying filter count and distribution guides their use in manipulating the storyboard views (see the sketch below)
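Computing the filter counts and applying the filters are both one-pass operations over boolean shot attributes. A sketch (feature names from the TREC 2002 list; the data shape is assumed):

    from collections import Counter

    def feature_counts(shots):
        """Per-feature shot counts, so the UI can show the distribution."""
        counts = Counter()
        for shot in shots:
            counts.update(f for f, on in shot["features"].items() if on)
        return counts

    def apply_filters(shots, required=(), excluded=()):
        return [s for s in shots
                if all(s["features"].get(f) for f in required)
                and not any(s["features"].get(f) for f in excluded)]

    shots = [{"id": 1, "features": {"outdoors": True, "people": True}},
             {"id": 2, "features": {"outdoors": True, "people": False}}]
    print(feature_counts(shots))                      # outdoors: 2, people: 1
    print(apply_filters(shots, required=["people"]))  # shot 1 only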
Multiple Document Storyboards
Resolution and Layout under User Control
Leveraging From Query Context
• User has already expressed information need via query
• Query-based thumbnail representation has proven summarization effectiveness*: decompose video into shots, align query matches to shots, use the highest-scoring shot to represent the video segment
• Therefore, use query-based scoring for shot selection to reduce thousands of shots to tens or hundreds of shots (see the sketch below)
_____
*See the INTERACT '97 conference paper by Christel et al. for more details.
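Across a whole result set the same scoring gives a global cut: keep the N best-scoring shots, then restore segment and time order for display (the Conclusions slide notes that temporal ordering matters). A sketch:

    import heapq

    def top_shots(shots, n=100):
        """Reduce thousands of candidate shots to the n highest-scoring,
        re-sorted by segment and time for the storyboard."""
        best = heapq.nlargest(n, shots, key=lambda s: s["score"])
        return sorted(best, key=lambda s: (s["segment"], s["time"]))

    shots = [{"segment": 3, "time": 12.0, "score": 0.8},
             {"segment": 1, "time": 4.5, "score": 0.2},
             {"segment": 1, "time": 9.0, "score": 0.9}]
    print(top_shots(shots, n=2))   # the two best shots, in segment/time order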
TREC 2002 Image Feature Set
Filter Interface for using Image Features
Example: Looking for Beach Shots, 863 shots
Ex.: “Outdoor” Beach Shots Set at 469 Shots
Ex.: Beach Shot Set at a Manageable Size of 56 after Filtering Out Shots with No People
Conclusions
• Multi-document storyboard view facilitates quick inspection of a large set of images
• First-order filtering by query very useful in providing the user with an initial set of images for investigation
• Shots temporally near relevant shots often were relevant as well, so image ordering by video segment and time is useful
• Image features useful to filter, specific to certain queries
• Drill-down to details, from images to video, necessary to eliminate ambiguity
• These strategies hold promise for finding visual information from video corpora beyond the TREC 2002 collection
Credits
Many Informedia Project and CMU research community members contributed to this work; a partial list appears here:
Project Director: Howard Wactlar
User Interface: Mike Christel, Chang Huang, Adrienne Warmack, Dave Winkler
Image Processing: Takeo Kanade, Norm Papernick, Toshio Sato, Henry Schneiderman, Michael Smith
Speech and Language Processing: Alex Hauptmann, Ricky Houghton, Rong Jin, Raj Reddy, Michael Witbrock
Informedia Library Essentials: Bob Baron, Bruce Cardwell, Colleen Everett, Mark Hoy, Melissa Keaton, Bryan Maher, Craig Marcus
© Copyright 2002 Michael G. Christel and Alexander G. Hauptmann