Presentation of search results - Rutgers University School
Relevance
in information science
Tefko Saracevic, PhD
[email protected]
http://www.scils.rutgers.edu/~tefko/
© 2008 Tefko Saracevic
Preface
1975 & 1976
Relevance: A Review of the Literature
and a Framework for Thinking on the
Notion in Information Science
2007
Relevance: A Review of the Literature
and a Framework for Thinking on the
Notion in Information Science. Part II &
Part III
Purpose
trace the evolution of thinking on
relevance in information science for
the past three decades
provide an updated framework within
which the still widely dissonant ideas
on relevance might be interpreted
and related to one another
… in information science
Information retrieval (IR) systems
offer their version of what may be
relevant
People go their way & assess relevance
The two worlds interact
Concern here:
human world of relevance
how IR deals with relevance NOT covered
Historical note
Relevance came into IR unannounced
From the very start after WWII, IR was
about retrieval of relevant information
First concerns: about “false drops” –
non-relevant retrievals
First direct recognition in 1955 with
proposal for precision & recall as
evaluation measures based on relevance
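As a minimal sketch of the two measures named above, assuming the standard set-based definitions (precision = relevant retrieved / all retrieved; recall = relevant retrieved / all relevant). The function name and document IDs are hypothetical and only for illustration:

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query.

    retrieved: IDs the system returned (hypothetical data).
    relevant:  IDs a human judged relevant (hypothetical data).
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Four documents retrieved, three judged relevant, two in common.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
print(p, r)
```

Both measures depend entirely on the human relevance judgments supplied as `relevant`, which is why relevance sits underneath IR evaluation. The two non-relevant retrievals (d1, d3) are the “false drops”.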
Organization of
presentation
1. What is the NATURE of relevance?
1.1 Meaning?
1.2 Theories?
1.3 Models?
2. What are MANIFESTATIONS of
relevance?
3. What is the BEHAVIOR of
relevance?
4. What are the EFFECTS of
relevance?
1.1 Meaning of relevance
Intuitively well understood
same perception globally – “y’know”
a “to” and context always present
Relevance:
a relation between objects P & Q along
property R
may also include a measure S of the strength
of connection
… in information science
Relation between inf. or inf. objects
(Ps) & contexts (Qs) based on some
property R & measure of intensity S
But: relevance is not given – it is
established
Big questions & challenges
How does relevance happen?
Who does it, under what circumstances,
and how?
Summary:
Meaning of relevance
Relevance attributes involve:
relation
intention
context
internal; external
inference
selection
interaction
measurement
1.2 Theories of relevance
Theories suggested in several fields
logic – in deduction of inferences, to reject
fallacies – relevance logics
philosophy – in phenomenology structure &
functioning of the “life-world” (Alfred Schutz)
it is stratified & relevance is the principle for
stratification
no single relevance but an interdependent
system of relevances (plural)
thematic (topical), interpretational, &
motivational relevance
… theories in
communication
Sperber & Wilson: Relevance Theory
based on inferential model of communication
what must be relevant and why to an
individual with a single cognitive intention?
posited a cognitive & a communicative
principle of relevance
assessed in terms of cognitive effects &
processing effort
individuals pick up the most relevant stimuli &
process them to maximize their relevance
Summary:
Theories of relevance
IS did not develop indigenous theory
but a few “theories-on-loan” attempts
Logic theories not applied, yet
Schutz’s life-world theory used to some
extent in specifying manifestations &
models
Sperber & Wilson’s theory used a few
times as explanation & guide
not tested
1.3 Models of relevance
Reviews – also produced models
Syracuse school of relevance
dynamic & situational model
(Schamber, Eisenberg & Nilan, 1990)
connection with human information behavior
“Whole history of relevance”
(Mizzaro, 1997)
duality in modeling & studying
documents & queries, 1959-1976
dynamics & multidimensionality, 1977-present
Split between system &
user models
Opposing views of IR: systems & users
IR traditional model does not deal with users
Battle royal started by Dervin & Nilan
(1986)
criticism of system viewpoint
call for user orientation
Several user & interaction models
proposed – to reconcile, bridge
still relevance has two basic models &
cultures & they map like Australia
User vs system models
“Informing systems design”
became mantra of all relevance studies
“Tell us what to do and we will do it.”
response from systems side
But “telling” is not that simple
Issue is not a conflict but:
how can we make user & system side work
together for benefit of both?
Summary:
relevance models
All IR & inf. seeking models have
relevance at their base
Traditional IR model has most
simplified – “weak” – version of relevance
but with the weak model IR is successful
Variety of integrative models have
been proposed
more complex models = increased challenge
to incorporate in practice
2. Manifestations of
relevance
“How many relevances in IR?” (Mizzaro, 1998)
Several manifestations recognized since 1950s
Issue: What given objects (Ps & Qs) are related
by what given property (Rs) as relation?
[adjective] relevance or different name
Duality strikes again
subject (topic, system) relevance
vs user (psychological, cognitive) relevance
objective vs subjective relevance
Issue of primacy:
weak and strong relevance
Does topical relevance underlie all
others?
predictably two answers: yes and no
in a strict correspondence between
query & answer topical is basic – weak
relevance
if derivation is involved topical may not
be basic – strong relevance
weak relevance is more associated with
systems
strong more with people
Beyond duality
Numerous other kinds of relevance
were identified
for user relevances:
psychological, cognitive, affective, situational,
socio-cognitive, pertinence, utility …
for topical relevances:
logical, systems, algorithmic, documentary,
bibliographic …
each indicates different relations
Summary:
Manifestations of relevance
There are a limited number of
manifestations – could be grouped:
System or algorithmic relevance
Topical or subject relevance
Cognitive relevance or pertinence
Situational relevance or utility
Affective relevance
These are interdependent – they
feed on each other interactively
3. Behavior of relevance
Relevance does not behave, people do
how do humans determine relevance of
information or information objects?
reviewed only studies that have data
some 30 studies – started in 1991
related to information seeking & use studies &
implicit relevance studies – not reviewed
Pattern for review:
[author] used [subjects] to do [tasks] in order to
study [object of research]
Relevance clues
What makes information or information
objects relevant? What do people look
for in order to infer relevance?
two approaches: topic & clues analysis
Clues research:
uncover & classify attributes or criteria that
users concentrate on while making relevance
inferences
usually on documents but also other objects
Relevance dynamics
Do relevance inferences and criteria
change over time for the same user
and task, and if so, how?
As a user progresses through various
stages of a task
the user’s cognitive state changes
the task changes as well
thus, something about relevance also is
changing.
Relevance feedback
What factors affect the process of
relevance feedback?
What types of feedback? How much is
feedback used?
Dealing here with manual not
automatic feedback
behavior of people when involved in
manual feedback
Summary on behavior
Caveats abound – nothing standardized
still refreshing to see data
Some generalizations on clues:
criteria finite in number, similar, but
different weights assigned
Different users, tasks, progress in tasks, classes
of users = similar criteria = different weights
Different ratings of relevance = similar criteria
= different weights.
…summary clues
Clues criteria:
content
object
validity
use or situational match
cognitive match
affective match
belief match
… summary clues
Criteria are not independent; people
apply multiple criteria; they interact
content (topic) criteria very important
but not sole – interact with others
for search outputs value of results as a
whole critical
Visual information = faster inference
than textual information
… summary dynamics
Inferences dependent on task stage
criteria stable, selection changes
Different stages = differing selections
but different stages = similar criteria =
different weights
Increased focus = increased
discrimination = more stringent
relevance inferences
What is topical changes with progress in
time and task
… summary feedback
Several kinds
search term, content, magnitude, tactics
Use of relevance feedback = increase
in performance
however, used rarely in practice
Searching behavior different when
using feedback
4. Effects of relevance
Works both ways: relevance affected by
and affects host of factors
Relevance judges
What factors inherent in relevance judges make
a difference in relevance inferences?
How large are and what affects individual
differences in relevance inferences?
similar question asked for a number of information
activities – indexing, searching …
most often studied: domain knowledge
Relevance judgments
What factors affect relevance
judgments?
short answer: a lot of them
approach: classify into tables e.g.
Schamber (1994) 80 factors, 6 categories
Harter (1996) 24 factors, 4 categories
different approach here:
classify studies along basic assumptions in IR
evaluations
Central assumptions
Relevance is:
topical
binary
independent
stable
consistent
if pooling: complete
Not to prove or disprove these assumptions
but to organize studies along questions
Beyond topicality
Do people infer relevance based on
topicality only?
Other factors enter & interact
Only a few studies directly addressed
Wang & Soergel (1998) 11 criteria for
selection, with topicality being top
Xu & Chen (2006) in web searching: topicality &
novelty most significant, then reliability &
understandability
Beyond binary
Are relevance inferences binary i.e.
relevant – not relevant? If not, what
gradation do people use in inferences
about relevance of information or
information objects?
Number of studies addressed this
studied distributions of relevance
inferences
regions of relevance
Beyond independence
Are information objects assessed
independently of each other? Does the
order or size of the presentation affect
relevance judgments?
Only a few studies on the questions
Includes presentation of different
representations
Beyond stability
Are relevance judgments stable as
tasks and other aspects change? Do
relevance inferences and criteria
change over time for the same user and
task, and if so how?
mentioned already under dynamics
judgments not completely stable, criteria are
Plato: “Everything is flux.”
Beyond consistency
Are relevance judgments consistent
among judges or groups of judges?
human judgments about anything informational
are not consistent, relevance included
Gull (1956) opened Pandora’s box
classic example of law of unintended
consequences
Some 6 studies addressed consistency
subjects: experts, students
But does it matter?
How does inconsistency in human
relevance judgments affect results of
IR evaluation?
main contention by critics
Five studies 1968 – 2000 addressed
the question
four also showed magnitude of agreement
Summary: judges
Subject expertise accounts strongly
higher expertise = higher agreement, fewer
differences
lesser expertise = more leniency in judgment
large variability in relevance inferences
by individuals
same range as in other cognitive processes
Summary: judgments
Relevance is measurable!
None of the 5 postulates hold
but by simplifying relevance for labs IR
made significant advances
Relevance judgments are not binary
but are bimodal
regions of low, middle and high relevance
high peaks at both ends
Order affects relevance judgments
… summary: judgments
Consistency:
Higher expertise = higher consistency =
more stringent. Lower expertise = lower
consistency = more encompassing
overlap using different populations hovers
around 30%
higher expertise up to 80%
when 3rd, 4th … judge added overlap falls
Higher expertise = larger overlap. Lower
expertise = smaller overlap. More judges =
less overlap.
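A rough sketch of why overlap falls as judges are added. The slides do not define the overlap measure, so this assumes the common intersection-over-union form (documents all judges marked relevant, divided by documents any judge marked relevant). Judge data and names are hypothetical:

```python
def overlap(*judged_relevant):
    """Intersection-over-union of the relevant sets from any number of judges.

    Assumed definition, not taken from the slides.
    """
    sets = [set(s) for s in judged_relevant]
    if not sets:
        return 0.0
    union = set().union(*sets)
    if not union:
        return 0.0
    common = set.intersection(*sets)
    return len(common) / len(union)


# Hypothetical relevant sets from three judges for one query.
judge_a = {"d1", "d2", "d3", "d4"}
judge_b = {"d2", "d3", "d5"}
judge_c = {"d3", "d6"}

two_judges = overlap(judge_a, judge_b)             # 2 shared out of 5 total
three_judges = overlap(judge_a, judge_b, judge_c)  # 1 shared out of 6 total
print(two_judges, three_judges)
```

Under this definition each extra judge can only shrink the intersection and grow the union, so the measure cannot rise when a judge is added.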
Summary: does it matter?
In lab conditions disagreement among
judges does not affect evaluation
rank order of different IR systems changes
minimally
Different judges = same relative performance (on
the average)
swaps in ranking do occur = low probability
but performance for individual topics differs
significantly
law of averages kicks in
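The averaging effect can be illustrated with a small sketch. All numbers are made up for illustration: three hypothetical systems are scored per topic under two judges' relevance judgments, then ranked by their mean score:

```python
from statistics import mean


def rank_systems(per_topic_scores):
    """Order system names by mean score across topics, best first."""
    return sorted(
        per_topic_scores,
        key=lambda name: mean(per_topic_scores[name]),
        reverse=True,
    )


# Hypothetical per-topic scores (e.g. precision) under two different judges.
judge_a = {"sysA": [0.8, 0.4, 0.6], "sysB": [0.5, 0.5, 0.5], "sysC": [0.2, 0.3, 0.1]}
judge_b = {"sysA": [0.6, 0.7, 0.5], "sysB": [0.6, 0.3, 0.6], "sysC": [0.3, 0.1, 0.2]}

ranking_a = rank_systems(judge_a)
ranking_b = rank_systems(judge_b)
print(ranking_a, ranking_b)
```

In this toy data every per-topic score differs between the two judges, yet the averaged rank order of the systems comes out the same. The data were chosen to show the pattern; they are not results from the studies reviewed.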
Summary: measures
Users can use a variety of scales
there is no “best” scale
magnitude scales very appropriate but
hard to explain & analyze
Epilogue
Many things changed in IR &
information science but goals the same
As to nature of relevance:
marked progress in understanding
little in theory
diversification in models
As to manifestations:
consensus there are several kinds of
relevance, grouped in a half dozen or so well
distinguished classes - interdependent
… epilogue
As to behavior & effects:
seen a number of experimental &
observational studies
lifted the discourse beyond debate,
anecdotes to data interpretation
but generalizations difficult – findings
should be treated as hypotheses
… epilogue: reflections
Relevance is poor
no funding for relevance research
of studies with data less than 17% mentioned outside
funding, half from outside the US
scholarship progressed sporadically & all over the
place
Globalization of IR – globalization of
relevance
as relevance went global & to the masses many
& different research questions emerged
… epilogue: reflections…
Proprietary IR – proprietary relevance
major search engines proprietary =
relevance proprietary
for innovation must study users & use, but
findings kept private
relevance research split into public & private branches
paradox of the internet
… epilogue: research
agenda - beyonds
Beyond behaviorism & black box
many studies stimulus-response & no
diagnostics
need to use/adapt other approaches
… epilogue: research
agenda - beyonds
Beyond mantra
“implication for system design”
incorporating user concerns &
characteristics not that simple
integration between user/cognitive &
systems approaches needed
relevance research and IR research should
at least get engaged, if not married
interactive research on the right track
… epilogue: research
agenda - beyonds
Beyond students
~70% of behavior & effect studies used
students as population
not surprising – they are affordable
we know a lot about student relevance
does it generalize to other populations?
In conclusion
Information technology & systems
will change dramatically
even in the short run
and in unforeseeable directions
But relevance is here to stay!