Presentation of search results - Rutgers University School


Relevance
in information science
Tefko Saracevic, PhD
http://www.scils.rutgers.edu/~tefko/
© 2008 Tefko Saracevic
1
Preface
1975 & 1976

Relevance: A Review of the Literature
and a Framework for Thinking on the
Notion in Information Science
2007

Relevance: A Review of the Literature
and a Framework for Thinking on the
Notion in Information Science. Part II &
Part III
Purpose
trace the evolution of thinking on
relevance in information science for
the past three decades
provide an updated framework within
which the still widely dissonant ideas
on relevance might be interpreted
and related to one another
… in information science
Information retrieval (IR) systems
offer their version of what may be
relevant
People go their way & assess relevance
The two worlds interact
Concern here:


human world of relevance
how IR deals with relevance NOT covered
Historical note
Relevance came into IR unannounced
From the very start after WWII, IR was
about retrieval of relevant information
First concerns: about “false drops” –
non-relevant retrievals
First direct recognition in 1955 with
proposal for precision & recall as
evaluation measures based on relevance
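A minimal sketch of the two measures, assuming the usual set-based definitions: precision is the share of retrieved items that are relevant, recall is the share of relevant items that were retrieved. The function name and document ids are hypothetical and only illustrate the idea.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: iterable of document ids the system returned
    relevant:  iterable of document ids judged relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 documents retrieved, 2 of them "false drops" (d3, d4),
# and 5 documents judged relevant in the whole collection.
p, r = precision_recall(["d1", "d2", "d3", "d4"],
                        ["d1", "d2", "d7", "d8", "d9"])
print(p, r)
```

Both measures depend entirely on the relevance judgments supplied, which is why the rest of the presentation concentrates on how those judgments are made.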
Organization of
presentation
1. What is the NATURE of relevance?
 1.1 Meaning?
 1.2 Theories?
 1.3 Models?
2. What are MANIFESTATIONS of
relevance?
3. What is the BEHAVIOR of
relevance?
4. What are the EFFECTS of
relevance?
1.1 Meaning of relevance
Intuitively well understood


same perception globally – “y’know”
a “to” and context always present
Relevance:


a relation between objects P & Q along
property R
may also include a measure S of the strength
of connection
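The relation described here can be written down as a small record. This is only an illustrative sketch: the class name, field names, and example values are assumptions, not part of the original framework.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RelevanceRelation:
    p: str                     # object P
    q: str                     # object Q that P is relevant "to"
    r: str                     # property R along which P and Q are related
    s: Optional[float] = None  # optional measure S of strength of connection


# Hypothetical instance: a document related to a task by topic, strength 0.8.
example = RelevanceRelation(
    p="article on recall",
    q="task: evaluate a search engine",
    r="topic",
    s=0.8,
)
print(example)
```

The strength S is optional in the record because the definition says the relation "may also include" it.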
… in information science
Relation between information or information
objects (Ps) & contexts (Qs) based on some
property R & measure of intensity S
But: relevance is not given – it is
established
Big questions & challenges


How does relevance happen?
Who does it, under what circumstances,
and how?
Summary:
Meaning of relevance
Relevance attributes involve:
relation
intention
context
 internal; external
inference
selection
interaction
measurement
1.2 Theories of relevance
Theories suggested in several fields


logic – in deduction of inferences, to reject
fallacies – relevance logics
philosophy – in phenomenology structure &
functioning of the “life-world” (Alfred Schutz)
 it is stratified & relevance is the principle for
stratification
 no single relevance but an interdependent
system of relevances (plural)
 thematic (topical), interpretational, &
motivational relevance
… theories in
communication
Sperber & Wilson: Relevance Theory




based on inferential model of communication
what must be relevant and why to an
individual with a single cognitive intention?
posited a cognitive & a communicative
principle of relevance
assessed in terms of cognitive effects &
processing effort
 individuals pick up the most relevant stimuli &
process them to maximize their relevance
Summary:
Theories of relevance
IS did not develop indigenous theory
but a few “theories-on-loan” attempts



Logic theories not applied, yet
Schutz’s life-world theory used to some
extent in specifying manifestations &
models
Sperber & Wilson’s theory used a few
times as explanation & guide
 not tested
1.3 Models of relevance
Reviews – also produced models
Syracuse school of relevance

dynamic & situational model
(Schamber, Eisenberg & Nilan, 1990)
 connection with human information behavior
“Whole history of relevance”

(Mizzaro, 1997)
duality in modeling & studying
 documents & queries, 1959-1976
 dynamics & multidimensionality, 1977-present
Split between system &
user models
Opposing views of IR: systems & users
 IR traditional model does not deal with users
Battle royal started by Dervin & Nilan
(1986)
 criticism of system viewpoint
 call for user orientation
Several user & interaction models
proposed – to reconcile, bridge

still relevance has two basic models &
cultures & they map like Australia
User vs system models
“Informing systems design”

became mantra of all relevance studies
“Tell us what to do and we will do it.”

response from systems side
But “telling” is not that simple
Issue is not a conflict but:

how can we make user & system side work
together for benefit of both?
Summary:
relevance models
All IR & information seeking models have
relevance at their base
Traditional IR model has the most
simplified (“weak”) version of relevance

but with the weak model IR is successful
Variety of integrative models have
been proposed

more complex models = increased challenge
to incorporate in practice
2. Manifestations of
relevance
“How many relevances in IR?” (Mizzaro, 1998)
 Several manifestations recognized since 1950s
 Issue: What given objects (Ps & Qs) are related
by what given property (Rs) as relation?
 [adjective] relevance or different name
Duality strikes again


subject (topic, system) relevance
vs user (psychological, cognitive) relevance
objective vs subjective relevance
Issue of primacy:
weak and strong relevance
Does topical relevance underlie all
others?
 predictably two answers: yes and no


in a strict correspondence between
query & answer topical is basic – weak
relevance
if derivation is involved topical may not
be basic – strong relevance
 weak relevance is more associated with
systems
 strong more with people
Beyond duality
Numerous other kinds of relevance
were identified

for user relevances:
 psychological, cognitive, affective, situational,
socio-cognitive, pertinence, utility …

for topical relevances:
 logical, systems, algorithmic, documentary,
bibliographic …

each indicates different relations
Summary:
Manifestations of relevance
There are a limited number of
manifestations – could be grouped:





System or algorithmic relevance
Topical or subject relevance
Cognitive relevance or pertinence
Situational relevance or utility
Affective relevance
These are interdependent – they
feed on each other interactively
3. Behavior of relevance
Relevance does not behave, people do


how humans determine relevance of
information or information objects?
reviewed only studies that have data
 some 30 studies – started in 1991
 related to information seeking & use studies &
implicit relevance studies – not reviewed
Pattern for review:
 [author] used [subjects] to do [tasks] in order to
study [object of research]
Relevance clues
What makes information or information
objects relevant? What do people look
for in order to infer relevance?
 two approaches: topic & clues analysis
Clues research:
 uncover & classify attributes or criteria that
users concentrate on while making relevance
inferences
 usually on documents but also other objects
Relevance dynamics
Do relevance inferences and criteria
change over time for the same user
and task, and if so, how?
As a user progresses through various
stages of a task



the user’s cognitive state changes
the task changes as well
thus, something about relevance also is
changing.
Relevance feedback
What factors affect the process of
relevance feedback?

What types of feedback? How much is
feedback used?
Dealing here with manual not
automatic feedback

behavior of people when involved in
manual feedback
Summary on behavior
Caveats abound – nothing standardized

still refreshing to see data
Some generalizations on clues:

criteria finite in number, similar, but
different weights assigned
 Different users, tasks, progress in tasks, classes
of users = similar criteria = different weights
 Different ratings of relevance = similar criteria
= different weights.
…summary clues
Clues criteria:
content
object
validity
use or situational match
cognitive match
affective match
belief match
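One way to picture "similar criteria, different weights" is a weighted average over these seven criteria. This is a sketch under a strong assumption: the studies reviewed do not claim that people combine criteria linearly, and the ratings, weights, and user names below are invented for illustration.

```python
# The seven clue criteria from the list above (short keys are our own labels).
CRITERIA = ("content", "object", "validity", "situational",
            "cognitive", "affective", "belief")


def weighted_relevance(ratings, weights):
    """Weighted average of per-criterion ratings (dicts keyed by criterion)."""
    total = sum(weights[c] for c in CRITERIA)
    return sum(weights[c] * ratings[c] for c in CRITERIA) / total


# Hypothetical ratings of one document on each criterion, on a 0..1 scale.
ratings = {"content": 1.0, "object": 0.5, "validity": 0.5, "situational": 0.0,
           "cognitive": 0.5, "affective": 0.0, "belief": 0.5}

# Two hypothetical users: same criteria, different weights.
user_a = {c: 1 for c in CRITERIA}
user_b = dict(user_a, content=4, validity=2)

score_a = weighted_relevance(ratings, user_a)
score_b = weighted_relevance(ratings, user_b)
print(round(score_a, 3), round(score_b, 3))
```

The same document receives different scores from the two users even though both consult the same criteria, which is the pattern the generalizations describe.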
… summary clues
Criteria are not independent; people
apply multiple criteria; they interact


content (topic) criteria very important
but not sole – interact with others
for search outputs value of results as a
whole critical
Visual information = faster inference
than textual information
… summary dynamics
Inferences dependent on task stage




criteria stable, selection changes
Different stages = differing selections
but different stages = similar criteria =
different weights
Increased focus = increased
discrimination = more stringent
relevance inferences
What is topical changes with progress in
time and task
… summary feedback
Several kinds

search term, content, magnitude, tactics
Use of relevance feedback = increase
in performance

however, used rarely in practice
Searching behavior different when
using feedback
4. Effects of relevance
Works both ways: relevance affected by
and affects host of factors
Relevance judges


What factors inherent in relevance judges make
a difference in relevance inferences?
How large are and what affects individual
differences in relevance inferences?
 similar question asked for a number of information
activities – indexing, searching …
 most often studied: domain knowledge
Relevance judgments
What factors affect relevance
judgments?


short answer: a lot of them
approach: classify into tables e.g.
 Schamber (1994) 80 factors, 6 categories
 Harter (1996) 24 factors, 4 categories

different approach here:
 classify studies along basic assumptions in IR
evaluations
Central assumptions
Relevance is:
topical
binary
independent
stable
consistent
if pooling: complete
Not to prove or disprove these assumptions
but to organize studies along questions
Beyond topicality
Do people infer relevance based on
topicality only?
Other factors enter & interact
Only a few studies directly addressed
 Wang & Soergel (1998) 11 criteria for
selection, with topicality being top
 Xu & Chen (2006) in web searching: topicality &
novelty most significant, then reliability &
understandability
Beyond binary
Are relevance inferences binary i.e.
relevant – not relevant? If not, what
gradation do people use in inferences
about relevance of information or
information objects?
Number of studies addressed this


studied distributions of relevance
inferences
regions of relevance
Beyond independence
Are information objects assessed
independently of each other? Does the
order or size of the presentation affect
relevance judgments?


Only a few studies on the questions
Includes presentation of different
representations
Beyond stability
Are relevance judgments stable as
tasks and other aspects change? Do
relevance inferences and criteria
change over time for the same user and
task, and if so how?

mentioned already under dynamics
 judgments not completely stable, criteria are
 Heraclitus (as reported by Plato): “Everything is flux.”
Beyond consistency
Are relevance judgments consistent
among judges or group of judges?
 human judgments about anything informational
are not consistent, relevance included
Gull (1956) opened Pandora's box
 classic example of law of unintended
consequences
Some 6 studies addressed consistency

subjects: experts, students
But does it matter?
How does inconsistency in human
relevance judgments affect results of
IR evaluation?

main contention by critics
Five studies 1968 – 2000 addressed
the question

four also showed magnitude of agreement
Summary: judges
Subject expertise accounts strongly
 higher expertise = higher agreement, fewer
differences
 lesser expertise = more leniency in judgment

large variability in relevance inferences
by individuals
 same range as in other cognitive processes
Summary: judgments
Relevance is measurable!
None of the 5 postulates hold

but by simplifying relevance for labs IR
made significant advances
Relevance judgments are not binary
but are bimodal

regions of low, middle and high relevance
 high peaks at both ends
Order affects relevance judgments
… summary: judgments
Consistency:
 Higher expertise = higher consistency =
more stringent. Lower expertise = lower
consistency = more encompassing
 overlap using different populations hovers
around 30%
 higher expertise up to 80%
 when 3rd, 4th … judge added overlap falls
 Higher expertise = larger overlap. Lower
expertise = smaller overlap. More judges =
less overlap.
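The effect of adding judges can be seen in a small sketch. It assumes overlap is measured as intersection over union of the sets each judge marked relevant; that definition, the function name, and the judge data are assumptions made for illustration, and individual studies may have measured agreement differently.

```python
def overlap(*judgments):
    """Documents judged relevant by every judge, as a share of documents
    judged relevant by at least one judge (intersection over union)."""
    sets = [set(j) for j in judgments]
    union = set().union(*sets)
    if not union:
        return 0.0
    common = set.intersection(*sets)
    return len(common) / len(union)


# Hypothetical relevance judgments for the same query.
judge_a = {"d1", "d2", "d3", "d4"}
judge_b = {"d2", "d3", "d4", "d5"}
judge_c = {"d3", "d4", "d6"}

two_judges = overlap(judge_a, judge_b)             # 3 shared of 5
three_judges = overlap(judge_a, judge_b, judge_c)  # 2 shared of 6
print(two_judges, round(three_judges, 3))
```

Under this definition an extra judge can only shrink the intersection and grow the union, so overlap never rises as judges are added.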
Summary: does it matter?
In lab conditions disagreement among
judges does not affect evaluation

rank order of different IR systems changes
minimally
 Different judges = same relative performance (on
the average)


swaps in ranking do occur = low probability
but performance for individual topics differs
significantly
 law of averages kicks in
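Stability of system rankings under different judges is often checked with a rank correlation. The sketch below uses Kendall's tau (tau-a, without tie correction) as one such measure; whether the studies summarized here used this statistic is an assumption, and the system names and scores are invented.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two dicts of scores keyed by system name.

    Both dicts must cover the same systems (at least two).
    """
    systems = sorted(scores_a)
    concordant = discordant = 0
    for x, y in combinations(systems, 2):
        sign = (scores_a[x] - scores_a[y]) * (scores_b[x] - scores_b[y])
        if sign > 0:
            concordant += 1   # both judge sets order this pair the same way
        elif sign < 0:
            discordant += 1   # the pair is swapped between judge sets
    pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / pairs


# Hypothetical average performance of four systems under two sets of judges.
judge_set_1 = {"A": 0.42, "B": 0.35, "C": 0.31, "D": 0.20}
judge_set_2 = {"A": 0.30, "B": 0.27, "C": 0.28, "D": 0.12}

tau = kendall_tau(judge_set_1, judge_set_2)
print(round(tau, 3))
```

In this invented example the absolute scores differ between judge sets and one pair (B, C) swaps places, yet five of six pairs keep their order, which mirrors the "swaps occur, but with low probability" observation.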
Summary: measures
Users can use a variety of scales


there is no “best” scale
magnitude scales very appropriate but
hard to explain & analyze
Epilogue
Many things changed in IR &
information science but goals the same
As to nature of relevance:
 marked progress in understanding
 little in theory
 diversification in models
As to manifestations:
 consensus there are several kinds of
relevance, grouped in a half dozen or so well
distinguished classes - interdependent
… epilogue
As to behavior & effects:

seen a number of experimental &
observational studies
 lifted the discourse beyond debate,
anecdotes to data interpretation
 but generalizations difficult – findings
should be treated as hypotheses
… epilogue: reflections
Relevance is poor

no funding for relevance research
 of studies with data less than 17% mentioned outside
funding, half from outside the US
 scholarship progressed sporadically & all over the
place
Globalization of IR – globalization of
relevance

as relevance went global & to the masses many
& different research questions emerged
… epilogue: reflections…
Proprietary IR – proprietary relevance

major search engines proprietary =
relevance proprietary
 for innovation must study users & use, but
findings kept private
 relevance research split into public & private branches
 paradox of the internet
… epilogue: research
agenda - beyonds
Beyond behaviorism & black box


many studies stimulus-response & no
diagnostics
need to use/adapt other approaches
… epilogue: research
agenda - beyonds
Beyond mantra

“implication for system design”
 incorporating user concerns &
characteristics not that simple
 integration between user/cognitive &
systems approaches needed
 relevance research and IR research should
at least get engaged, if not married
 interactive research on the right track
… epilogue: research
agenda - beyonds
Beyond students

~70% of behavior & effect studies used
students as population
 not surprising – they are affordable
 we know a lot about student relevance
 does it generalize to other populations?
In conclusion
Information technology & systems
will change dramatically


even in the short run
and in unforeseeable directions
But relevance is here to stay!