Image Subject Searching: An Analysis of Current Challenges


Image Subject Searching:
What We Know and Where We Need to Go
Rachael Bradley
April 2007
Acknowledgement: This presentation builds upon research conducted
with CLiMB (Computational Linguistics for Metadata Building),
supported by the Mellon Foundation
1
Purpose
•Images are used in design, journalism, education,
medicine, entertainment and many other areas.
•Increasing numbers of images are available in
digital format and can be searched online.
•This presentation focuses on image subject
retrieval in order to generate evaluation criteria and
future research needs in image retrieval.
2
Outline
1. Introduction
2. Content and Meaning
3. User Studies
4. Key Characteristics
5. Current Technology
6. Evaluation
7. Future Research
8. So What?
3
Introduction
1. Introduction
– Context
– Example (1)
2. Content and Meaning
3. User Studies
4. Key Characteristics
5. Current Technology
6. Evaluation
7. Future Research
8. So What?
4
Context
Four types of image attributes*
• Biographical
– Birth (Examples: Creator, Date, Title)
– Travels (Examples: Who has owned it, Cost)
• Exemplified (Examples: Painting, .jpg, Sculpture)
• Relationship (Examples: sketches-final painting, image-critiques)
• Subject
– Of/About (Examples: Of a lion/About pride)
This presentation focuses on subject attributes
• Additional attributes may be as important or more
important to the end user when searching for an
image
5
Example (1)
Artemisia Gentileschi
Judith Slaying Holofernes, 1612-13
Naples, Museo di Capodimonte
Artemisia Gentileschi
Judith Slaying Holofernes, c1620
Florence, Uffizi
6
Content and Meaning
1. Introduction
2. Content and Meaning
– Levels of Content
– Image Analysis
– Establishing Meaning
– Types of Meaning
– Example (2)
– Data Sources for Establishing Creator’s Intended Meaning
– Language of Images
– Symbols
– Example (3)
– Data Sources for Establishing Audience Interpretation
– Evolution of Audience Interpretation
3. User Studies
4. Key Characteristics
5. Current Technology
6. Evaluation
7. Future Research
8. So What?
7
Levels of Content
• Pre-iconography
– Generic description of objects and
events in an image
– Knowledge gained by everyday
experience is all that is needed
• Iconography
– Specific information, conventional
matter
– Requires familiarity with a specific
culture
• Iconology
– Intrinsic meaning or content
– Requires synthesis of information
(Pre-iconography + Iconography +
Knowledge of culture, artist, etc)
• Each level includes time, space,
activities/events, and/or objects
8
Panofsky-Shatford Matrix1
       | Pre-iconography (Generics) | Iconography (Specifics) | Iconology (Abstracts)
Who?   | Kind of person or thing | Individually named person, group, thing | Mythical or fictitious being
What?  | Kind of event, action, condition | Individually named event, action | Emotion or abstraction
Where? | Kind of place: geographical, architectural | Individually named geographic location | Place symbolized
When?  | Cyclical time: season, time of day | Linear time: date/period | Emotion, abstraction symbolized by time
•Each level can be divided into factual or expressional
•Simplified into Specific Of, Generic Of, and About
1. Shatford, 1986; Armitage and Enser, 1997
9
Image Analysis
• Descriptive Analysis
– Recognition and description of visual elements in a work of art
– Shapes, forms, lines and colors
• Formal Analysis
– Recognizing visual relationships between shapes, forms, lines
and colors
– Images have coherent structure held together and ordered by the
use of similar shapes, forms and colors
• Internal Analysis
– Focus on the work's inherent aspects (iconographic, narrative,
symbolic)
• External Analysis
– Analysis of work within a larger context (historical, ideological,
political, psychological, etc)
10
Establishing Meaning
1. Traditions of Representation
– Known to the artist and to the actual or intended beholders
– Recorded in symbolic dictionaries or recognized through repeated use in art
2. Pictorial Context and Location
– Visual design in context of the rest of the picture
– Location of the artwork in relation to other art or the building itself
3. Social and political background
– Historical knowledge of events contemporary to the painting
4. Situation of the artist
– Training, interests, emotional conflicts, attitudes, beliefs, economic and
psychological relations to the patron and to the beholders
5. Intentions
– Intentions of the particular artist
– Intentions of most artists in a particular period
6. Responses of the beholders
– Response of particular persons in particular situations
– Response of normal people in normal situations
– Conscious vs unconscious response
11
Types of Meaning
1. Creator’s intended meaning
– Traditions of representation
– Pictorial context and location
– Social and political background
– Situation of the artist
– Intentions
2. Audience Interpretation
– Responses of the beholders
– Pictorial context and location
– Social and political background
12
Example (2)
Caravaggio, 1599
Donatello, 1460
For Discussion:
1. Traditions of Representation
2. Pictorial Context and Location
3. Social and political background
4. Situation of the artist
5. Intentions
6. Response of the beholder
13
Data Sources for Establishing
Creator’s Intended Meaning
Text sources
• Associated metadata
• Primary sources
– Diaries
– Announcements
– News articles
– Contracts
• Religious works/fictional texts
• Symbolic dictionaries
• Histories
Image Sources
• Original Image
• Related Images
– Preliminary drawings
– Other works by creator
– Images the creator was aware of
– Architectural drawings
14
Language of Images
Images can never mimic reality
• Limited physical media do not allow for exact representation of reality
• Images are information encoded by the creator and decoded by the
viewer
• “To say a drawing is a correct view...means that those who understand
the notation will derive no false information from the drawing (90).”
• “...the correct portrait, like the useful map, is an end product on a long
road through schema and correction. It is not a faithful record of a
visual experience but the faithful construction of a relational model
(181).”
Style defines the visual possibilities
• “Styles, like languages, differ in the sequence of articulation and in the
number of questions they allow the artist to ask (90).”
Image Internal Meaning
(Style - Artist Variations) + Symbolism + Relationships
15
Symbols
• Symbols
– Visual Elements
– Contents: Time, Space, Activities/Events, and/or Objects
– The symbol makes an informed viewer think of what it symbolizes
– The viewer can specify what it symbolizes
– The symbol does not depict what it symbolizes
• Natural Symbols
– A natural connection exists between the symbol and what it symbolizes
• Conventional Symbols
– A tradition exists connecting the symbol and what it symbolizes
• Identifying Symbols
– Care in representation
– Central/conspicuous position
– Someone points to the motif
– Presence is out of place
16
Example (3)
Judith I
Klimt, 1901
17
Data Sources for Establishing
Audience Interpretation
Text sources
• Primary sources
– Critiques
– Diaries
– Announcements
– News articles
• Histories
• Accession records
Image Sources
• Original Image
– Personal response
• Related Images
– Derived images
– Later works by artist
– Changes in use
– Price
18
Evolution of Audience Interpretation
Interpretation changes over time*
• Creation
– Artist Birth to Death
• Quotation
– Subsequent artists emulate images, style and technique
• Interpretation
– Frame: Classify, organize and interpret life experiences (Artist Anecdotes)
– Artist Anecdote: Story of artist’s life and work that
• Recontextualization
– Work enters broader cultural/commercial context
– Appropriation, Commercialization, Commodification
• Consumption
– Currency exchanged for some form of artist experience
Interpretation is based on individual and cultural factors
19
User Studies
1. Introduction
2. Content and Meaning
3. User Studies
• Image Study Methodology
• Visual Elements
• Pre-Iconographic and Iconographic Terms
• Variation in Search Terms
• Image Constructs
• Iconological Terms
• Image Selection
• User Confidence
• Query Modification
• Browsing
• Additional Findings
4. Key Characteristics
5. Current Technology
6. Evaluation
7. Future Research
8. So What?
20
Image Study Methodology
1. Analyzing email requests to a reference service
– Requests created independently from a retrieval system
– Provides some contextual information
2. Analyzing query logs from image search engines
– Interface dependent
– Large samples
– No contextual information available
– Possible bias because only queries with pre-identified image terms are selected
3. Self-administered questionnaires describing searches
– Contextual information available
– Testing risk
4. User studies involving questionnaires, interviews and/or observations
– Provide rich information on entire search process
– Focuses on specific groups, possibly transferable but not generalizable
– Keister found that requests varied by user types
21
Visual Elements
Descriptive Analysis
• Color, Line, Shape, Style, Focal Point
# | Study | Visual Elements
1 | 100 requests to a picture archive | 0%
2 | 29 college art history students searching for images for a paper | 1.6%
3 | 187 queries from 2 image archives | 7%
4 | 404 queries to Google Answers’ Visual Arts | 11%
5 | 108 journalism related requests | “a third*”
*Used to distinguish between color and black and white photos
In user studies, use of visual elements for search has been limited.
22
Pre-Iconographic and Iconographic Terms
• Pre-Iconographic: NonUnique, Noun, Generic
• Iconographic: Unique, Proper Noun, Specific
• Refiners, used in many studies, confuse these analyses
# | Study | Iconographic
1 | 1,749 requests from 7 different image archives | 1.7% - 86%
2 | 1 Month of search logs from a commercial image provider | 7.1%
3 | 590 digital reference requests to ASKEric | 21%
4 | 29 college art history students | 35%
5 | 187 queries from 2 image archives | 42%
6 | 1852 journalism related image queries | 56%
7 | 108 journalism related requests | 62%
8 | 64 University Students | 70%
The level of content description in search terms is highly variable,
likely due to task and collection differences
23
Variation in Search Terms
Observations
• “It is not so much that a picture is worth a thousand words, for many fewer
words can describe a still picture for most retrieval purposes. The issue has
more to do with the fact that those words vary from one person to another
(p.17).1”
• “No attempts to technically reduce such a notion to thesaurus or subject
headings could ever encompass the richness of human induction when
exposed to an image. If a picture is worth a thousand words to one viewer, it
is worth a million words to 1,000 viewers. No individual or small group of
individuals, no matter how professional or rule intensive the approach, could
ever capture a full panoply of impressions invoked by an image (p.7).2”
Results
• In a study of 33,149 queries on the Excite search engine from 9,855 users, most
terms occurred only once. The most frequently used terms occurred less
than 10% of the time.3
• In a study of image professionals’ use of a commercial image provider over
one month, the top term (woman and women) occurred 7% of the time.4
In all levels of content, vocabulary varies greatly
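The kind of term-frequency measure reported above can be sketched in a few lines. This is only an illustration, not the procedure used in the cited studies: the function name `term_stats` and the sample queries are hypothetical, and terms are split naively on whitespace.

```python
from collections import Counter

def term_stats(queries):
    """Return (share of distinct terms that occur once,
    share of all term occurrences taken by the most frequent term)."""
    # Naive tokenization (an assumption): lowercase and split on whitespace.
    terms = [t for q in queries for t in q.lower().split()]
    counts = Counter(terms)
    if not counts:
        return 0.0, 0.0
    singletons = sum(1 for c in counts.values() if c == 1)
    top_count = counts.most_common(1)[0][1]
    return singletons / len(counts), top_count / len(terms)

# Hypothetical sample log, not data from any study.
sample = ["woman reading", "red barn", "woman portrait", "lion"]
once_share, top_share = term_stats(sample)
```

On a real query log, a high first value and a low second value would correspond to the pattern described on this slide: most terms are rare, and even the top term covers a small share of use.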
24
Image Constructs
• Similar to Risatti’s Formal Analysis
• Introduced by Keister from analysis of reference requests
at NLM
– In an “Image Construct Query” the terms are used as a visual
construction rather than simply isolated terms.
– Examples: Man sitting in the chair with a box on his head; People racing in wheelchairs; Surgeons standing
Results
– Image constructs make up 1/3-1/2 of image requests1
– Visual constructs made up 83% of requests2
Many searchers describe the object relationships within an image
25
Iconological Terms
•Only one study has specifically examined use
of Iconological Terms
•1,749 requests from 7 different image archives1
•Who: Mean 1.9 (Standard Deviation 3.8)
•What: Mean 1.2 (Standard Deviation 2.3)
•Where: Mean 0 (Standard Deviation 0)
•When: Mean 0 (Standard Deviation 0)
In the only study to examine iconology, use of iconological terms was limited.
26
Emotional Response
Five studies reference search by emotional response
# | Study | Emotional Resp.
1 | 29 college art history students searching for images for a paper | 0%
2 | 187 queries from 2 image archives | .03%
3 | 404 queries to Google Answers’ Visual Arts | 1%
4 | 1852 journalism related queries | 4%
5 | 64 University Students | 13%
Use of emotional response terms in search has been limited but has
been used more than other iconological terms
27
Image Selection
Study of 38 faculty and graduate students of American History1
• Topicality most important factor in making relevance judgments.
• Most users did not feel comfortable making a relevance judgment based on image alone.
• To make a final judgment users used both image and text.
User study of journalism related requests2
• Selection based on (in order):
1) Topicality, often based on caption
2) Technical and biographical criteria
3) Impression to be conveyed (Difficult to convey in words)
User study of journalism related image queries3
• “Topicality was a necessary but insufficient criterion for relevance...Final selection criteria
could also be preferential or reactive; selections were based on personal impressions of
images being ‘more interesting’, ‘funny’, ‘different’, ‘most dramatic’. (p. 107)”
• “Searchers tended to alternate between viewing the textual description and the actual
image during the selection process (p. 106).”
Iconological factors become increasingly important during selection.
Associated text is necessary during selection.
28
User Confidence
• “Image needs were often fuzzy and could not be fully explicated. Most often it
was however possible to name a critical object that should appear in the
image. The search was then based on querying for this object (105).1”
• No consistent rational manner for asking for pictures (9).2
• “The difficulty users often have in translating their image needs into verbal or
written expressions is exemplified by the patron who states, ‘I can’t tell you
what I want, but I’ll know it when I see it! (46).3’”
• “The selection of search keys for general search topics was considered
difficult. Journalists presumed that the archive contained photos relating to
topics of interest, but they just had not discovered the right way to retrieve
them (275)4.”
Users find it difficult to express image information needs in words
29
Query Modification
Study of 33,149 queries on the Excite search engine from 9,855 users1
• 40% of queries were first time queries and 60% were modified
Study of image professionals’ use of a commercial image provider
• From 420 image search sessions, the mean number of queries per
search was 2.1
• 48% of queries were modified
– 14% added one or more terms
– 5.6% eliminated one or more terms
– 28% changed one or more terms
Approximately half of all queries are modified.
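The three modification types listed above (added, eliminated, or changed terms) can be told apart by comparing the term sets of consecutive queries. The sketch below is an assumption about how such coding could be automated, not the coding scheme of the cited studies; `classify_modification` is a hypothetical name, and word order and repeated words are ignored.

```python
def classify_modification(previous, current):
    """Label how `current` differs from `previous` as a set of terms."""
    prev = set(previous.lower().split())
    curr = set(current.lower().split())
    if prev == curr:
        return "unchanged"
    if prev < curr:      # every old term kept, at least one new term
        return "added"
    if curr < prev:      # only old terms remain, at least one dropped
        return "eliminated"
    return "changed"     # some terms replaced by others

# Hypothetical session, not data from any study.
session = ["lion", "lion pride", "lion savanna", "lion"]
labels = [classify_modification(a, b) for a, b in zip(session, session[1:])]
```

Tallying these labels over all sessions in a log would give percentages comparable in kind to those on this slide.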
30
Browsing
Study of 64 university students’ online image queries1
• Browsing was the primary strategy in satisfying 20% of information needs (199)
User study of journalism related requests2
• Browsing was the main search strategy
• “General search topics easily led to multiple queries and heavy browsing. Specific
needs led more likely to just one or two queries and browsing sessions (274).”
• Trial and error method rather than carefully constructed queries
Study of 1852 journalism related image queries3
• Browsing was the main search strategy after the initial query and especially important
in abstract image needs and collaborative retrieval (105).
Study of image professionals’ use of a commercial image provider4
• Browsing took place in 90% of sessions in Sample 1 and 81% of sessions in Sample 2.
• An average of 93 thumbnails were browsed per session in Sample 1 and 129 in
Sample 2 (1354).
Browsing is important during search and selection.
31
Additional Findings
Query by Example
Study of 404 queries to Google Answers’ Visual Arts1
• 10% provided examples (Cunningham, Bainbridge and Masoodian, 48)
Study of 1 Month of search logs from a commercial image provider2
• “Other changes to queries included using terms that appear in image captions as
additional terms ...these represent a change in search strategy to a “query by example”
(QBE) form of search, but using text associated with the image rather than the image
itself.” (Jorgensen and Jorgensen, 1355)
Query for All Existing Material
User study of journalism related image queries3
• Requests for all existing material on a certain topic accounted for nearly a tenth of all
image requests. This type of request has yet to receive any attention, even though it
might affect retrieval measures such as recall (109)
Query by example and specifying all existing material may be
important to some users.
32
Key Characteristics
1. Introduction
2. Content and Meaning
3. User Studies
4. Key Characteristics
• Access Characteristics
• Search and Selection Characteristics
5. Current Technology
6. Evaluation
7. Future Research
8. So What?
33
Access Characteristics
Increasingly complex and variable access points
• Visual Elements
– Rarely used
• Pre-iconography and Iconography
– Often used
– Use likely varies based on collection and tasks
– Level of description can vary by individual, collection and task
– Terminology can vary by individual and collection
– Relationships between items are often important
• Iconology
– Rarely used
– Level of description can vary by individual, collection and task
– Interpretations vary widely by individual
– Terminology also varies
Although use of Visual Elements and Iconology has been rarely
observed to date, this may be a result of testing limitations.
34
Search and Selection Characteristics
• Users lack confidence in expressing their image needs
• Users often modify queries based on results
• Browsing is a key strategy in image search and selection
• Iconology becomes increasingly important during
selection
• Both the image and associated text are important during
selection
35
Current Technology
1. Introduction
2. Content and Meaning
3. User Studies
4. Key Characteristics
5. Current Technology
• Concept-Based Retrieval
• Content-Based Retrieval
• Social Tagging
6. Evaluation
7. Future Research
8. So What?
36
Concept-Based Retrieval
Text to Text retrieval
Text Associated with Images
– Metadata
– Ontologies and Classification Schemes
– Keyword search on associated texts
Challenges
– Term agreement
– Subjectivity
– Level of agreement
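Keyword search on associated text can be illustrated with a small inverted index. This is a minimal sketch under stated assumptions: the functions `build_index` and `search` are hypothetical, the records reuse titles from the example slides as stand-in captions, and matching is exact (no stemming, thesaurus, or query expansion).

```python
from collections import defaultdict

def build_index(records):
    """Map each lowercase word to the set of image ids whose text contains it."""
    index = defaultdict(set)
    for image_id, text in records.items():
        for word in text.lower().split():
            index[word.strip(".,;:!?")].add(image_id)
    return index

def search(index, query):
    """Return ids of images whose associated text contains every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

# Stand-in captions built from the example slides (an assumption).
records = {
    "img1": "Judith Slaying Holofernes, Artemisia Gentileschi, Naples",
    "img2": "Judith Slaying Holofernes, Artemisia Gentileschi, Florence",
    "img3": "Judith I, Klimt",
}
index = build_index(records)
florence_hits = search(index, "judith florence")
beheading_hits = search(index, "beheading")
```

The second query finds nothing even though two of the images depict a beheading, which is the term agreement challenge in miniature: the searcher's word does not appear in the associated text.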
37
Content-Based Retrieval
Image to image retrieval
• Color
– Possible users: medical diagnosis, fashion and interior design, art history,
journalism and advertising
– Overall color or color by location
• Texture
– Coarseness, contrast and directionality
• Shape
– Boundaries or regions
– Face Recognition
– Difficulties disambiguating foreground and background?
Query by Example
• Input an example image or, better yet, a set of images (typically selected)
• Model the desired color, texture or shape (selected or created)
Challenges
• 3-Dimensions
• Boundary delineation (foreground and background)
• Variations in angles
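Query by example on overall color can be sketched with color histograms compared by histogram intersection. This is one common approach chosen for illustration, not a method named in the presentation; images are represented as plain lists of (r, g, b) tuples, and the function names and sample pixel data are hypothetical.

```python
from collections import Counter

def color_histogram(pixels, bins=4):
    """Normalized histogram of (r, g, b) pixels, each channel (0-255)
    quantized into `bins` levels. `bins` is assumed to divide 256."""
    if not pixels:
        return {}
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bucket: n / total for bucket, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: the color mass the two histograms share."""
    return sum(min(share, h2.get(bucket, 0.0)) for bucket, share in h1.items())

# Tiny made-up "images": four pixels each.
red_image = [(250, 10, 10)] * 4
mostly_red = [(240, 20, 20)] * 3 + [(10, 10, 250)]
blue_image = [(10, 10, 250)] * 4

query = color_histogram(red_image)
score_mostly_red = histogram_intersection(query, color_histogram(mostly_red))
score_blue = histogram_intersection(query, color_histogram(blue_image))
```

Ranking a collection by this score against the example image gives overall-color retrieval; it says nothing about where colors sit in the frame, so color by location, texture, and shape would each need their own features.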
38
Social Tagging
• Allows the general public as well as professional community to apply
text descriptions to images
• Steve.museum
– “At The Metropolitan Museum of Art, early studies indicate a significant
variation between the existing collections documentation – recording
artist, date, medium, dimensions, and iconography – and the words that
are supplied by naïve viewers, describing the visual elements of an image
and what it ‘literally’ depicts.1”
Challenges
• Vocabulary quality
• Interface design
39
Evaluation
1. Introduction
2. Content and Meaning
3. User Studies
4. Key Characteristics
5. Current Technology
6. Evaluation
• Evaluation Criteria
7. Future Research
8. So What?
40
Evaluation Criteria
Does the image retrieval system support:
• searching by visual elements (likely using content
based retrieval methods)?
• query expansion methods for pre-iconographical and
iconographical terms?
• keyword searching of associated text for pre-iconographical,
iconographical and iconological terms?
• social tagging, so that a large number of users can apply
iconological and other terms to images?
• browsing both images and associated text?
• query modification?
• query by example?
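The criteria above can be applied as a simple checklist. The sketch below is only one possible way to record an evaluation; the short criterion labels, the `evaluate` function, and the example system are all hypothetical.

```python
# Short labels for the seven criteria on this slide (labels are assumptions).
CRITERIA = [
    "visual element search",
    "query expansion",
    "keyword search of associated text",
    "social tagging",
    "browsing images and text",
    "query modification",
    "query by example",
]

def evaluate(system_features):
    """Split the criteria into those a system supports and those it lacks."""
    supported = [c for c in CRITERIA if c in system_features]
    missing = [c for c in CRITERIA if c not in system_features]
    return supported, missing

# Hypothetical text-only image archive.
supported, missing = evaluate({
    "keyword search of associated text",
    "browsing images and text",
    "query modification",
})
```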
41
Future Research
1. Introduction
2. Content and Meaning
3. User Studies
4. Key Characteristics
5. Current Technology
6. Evaluation
7. Future Research
• Task Types
• Type of Image Need
• Additional Technology
8. So What?
42
Task Types
• Attentional: Maintain or draw attention
• Retentional: Assist with recall
• Explicative: Explain visually what would be cumbersome
to explain verbally
– Descriptive: Show what an object looks like
– Expressive: Make an impact on a reader
– Constructional: Explain how various components fit together
– Functional: Enable the viewer to follow a process or organization
– Logico-mathematical: Diagram mathematical concepts
– Algorithmic: To show possibilities
– Data-display: Allow quick comparison and easy access to data
• The majority of studies have focused on descriptive tasks.
Research Question 1:
How do search and selection strategies change across task types?
43
Type of Image Need
• Image Needs
Specificity of Image Need | Number of Images Needed: Single | Number of Images Needed: Multiple
Specific | Specific/Single | Specific/Multiple
Generic | Generic/Single | Generic/Multiple
• A preliminary experimental study indicated that keyword
searching increased and browsing decreased with the
specificity of the image need1.
• Studies of journalism image search indicate that selection
strategies differ for single and multiple images2.
Research Question 2:
How do search and selection strategies change with image needs?
44
Additional Technology
• Style recognition has not been addressed
• Social tagging has not been fully exploited as a
mechanism for broadening iconological
terminology
• Available technologies have not been combined
to create an overall image search experience
Research Question 3:
Is style recognition technologically feasible?
Research Question 4:
How can social tagging be used to improve the search experience?
Research Question 5:
How can concept-based retrieval, content-based retrieval, and social
tagging retrieval be combined successfully?
45
So What?
1. Introduction
2. Topicality vs Contents
3. Image Subject Search
4. Image Relevance
5. Key Characteristics
6. Current Technology
7. Evaluation Methods
8. Future Research Studies
9. So What?
46
Take Home Message
• Research is currently being conducted in both
content-based and concept-based image retrieval
but the two efforts are not coordinated
• Variations in terminology and categorization
across theory, user studies, and technology
studies make it difficult to build on previous
knowledge
• Combining theory from various disciplines and
empirical knowledge in image retrieval will
provide the best chance of creating a successful
search and selection experience.
47