4slides/CS466-Lecture-XXV.ppt

Future Direction #3: Collaborative Filtering
Motivating Observations:
1) Relevance Feedback is useful, but expensive
a) Humans don’t have time to give positive/negative judgements on a long list of returned web pages to improve search
b) Effort is used once, then wasted
→ want pooling of efforts among individuals, and reuse
Collaborative Filtering
Motivating Observations (continued)
2) Relevance Quality
Query:
bootleg CD’s
Medical School Admissions
REM
NAFTA
Simulated Annealing
Alzheimer’s
Many web pages can be “about” a topic (specialized unit)
But there are great differences in quality of presentation, detail, professionalism,
substance, etc.
Possible Solution: build a supervised learner for quality, NOT topic matter
Train on examples of each, learn distinguishing properties
Supervised Learner for “Quality” of a Page
P(Quality | Features)   (independent of topic similarity)
salient features may include:
•# of links
•Size
•How often cited
•Variety of content
•“Top 5th of Web”, etc.
•assessment of usage counter (hit count)
•Complexity of graphics → quality??
•Prior quality rating of server
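A minimal sketch of such a quality learner, assuming a logistic model trained by gradient descent. The lecture does not name a specific learner, so this choice is an assumption; the three features, their scaling, and all training values are hypothetical and serve only as illustration.

```python
import math

# Hypothetical feature vector per page (names and scaling are assumptions):
# [how often cited (log-scaled), size (log-scaled), prior server rating in 0..1]
# Label 1 = judged high quality, 0 = judged low quality. Values are made up.
TRAIN = [
    ([2.3, 1.1, 0.9], 1),
    ([2.0, 1.4, 0.8], 1),
    ([1.8, 0.9, 0.7], 1),
    ([0.2, 1.2, 0.2], 0),
    ([0.5, 0.8, 0.1], 0),
    ([0.1, 1.5, 0.3], 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features):
    """Estimate P(Quality = high | Features) with a logistic model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def train(examples, epochs=2000, lr=0.1):
    """Fit weights by plain stochastic gradient descent on the log-loss."""
    n = len(examples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = predict(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias

weights, bias = train(TRAIN)
well_cited = predict(weights, bias, [2.1, 1.0, 0.8])    # unseen, heavily cited page
rarely_cited = predict(weights, bias, [0.3, 1.0, 0.2])  # unseen, rarely cited page
print(round(well_cited, 3), round(rarely_cited, 3))
```

Note that none of the features mention the page's topic, which matches the intent that the quality estimate be independent of topic similarity.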
Collaborative Filtering
Problem: Different humans have different profiles of relevance/quality
[Figure: for the query “Alzheimer’s disease”, documents grouped as Relevant (High Quality) for 6th Grader, Appropriate for Care Giver, and Medical Researcher; each mark = a document or web page]
One Solution:
Pool Collective Wisdom and compute weighted average of:
ranking(page_j, Query_i)
across multiple users (taking into account relevance, quality, and other intangibles)
However: humans have a better idea than machines of what
other humans will find interesting
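The pooled weighted average can be sketched as follows. The user names, the weights, and the rule of averaging only over users who judged the page are assumptions made for illustration; the slides do not say how the weights are chosen.

```python
def pooled_ranking(judgments, user_weights):
    """Weighted average of ranking(page_j, Query_i) over the users who judged it.

    judgments:    {user: ranking that user gave page_j for Query_i}
    user_weights: {user: weight given to that user's opinion} (hypothetical)
    """
    total = sum(user_weights[user] * rank for user, rank in judgments.items())
    weight = sum(user_weights[user] for user in judgments)
    return total / weight if weight else None

# Made-up rankings of one page for one query.
judgments = {"ann": 5, "bob": 3, "cy": 1}
user_weights = {"ann": 2.0, "bob": 1.0, "cy": 1.0}
pooled = pooled_ranking(judgments, user_weights)
print(pooled)  # (2*5 + 1*3 + 1*1) / 4 = 3.5
```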
Collaborative Filtering
Idea: instead of trying to model (often intangible) quality
judgments, keep a record of previous human relevance and
quality judgments
[Table: user rankings of web pages for the query “Alzheimer’s”; one axis lists Users, the other Web pages, and each cell holds one user’s ranking of one page]
Solution 1:
Identify individuals with similar tastes (high Pearson’s correlation coefficient between ranking judgments)
instead of:
P(relevant to me | Page_i content)
compute:
P(relevant to me | relevant to you)   (my similarity to you)
* P(relevant to you | Page_i content)   (your judgments)
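A rough sketch of Solution 1, assuming similarity is Pearson's correlation over the pages both users ranked, and that a prediction is the similarity-weighted average of the judgments of positively correlated users. That weighting scheme is a common neighbor-based choice, not necessarily the lecture's exact formulation, and all names and rankings are made up.

```python
import math

def pearson(a, b):
    """Pearson correlation over the pages both users have ranked."""
    shared = [page for page in a if page in b]
    if len(shared) < 2:
        return 0.0
    xs = [a[page] for page in shared]
    ys = [b[page] for page in shared]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def predict(me, others, page):
    """Similarity-weighted average of other users' rankings of `page`."""
    num = den = 0.0
    for ranks in others.values():
        if page not in ranks:
            continue
        sim = pearson(me, ranks)   # stands in for "my similarity to you"
        if sim > 0:                # ignore users whose tastes run opposite
            num += sim * ranks[page]   # ranks[page] is "your judgment"
            den += sim
    return num / den if den else None

# Hypothetical rankings: {page: ranking}; I have not yet seen page p4.
me = {"p1": 5, "p2": 1, "p3": 4}
others = {
    "you":  {"p1": 4, "p2": 2, "p3": 5, "p4": 5},
    "them": {"p1": 1, "p2": 5, "p3": 2, "p4": 1},
}
guess = predict(me, others, "p4")
print(guess)
```

Here "you" correlates positively with "me" and "them" negatively, so only your ranking of p4 contributes to the estimate.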
Solution 2:
Model Group Profiles for relevance judgments (e.g. Junior High School vs. Medical Researchers)
compute:
P(relevant to me | relevant to group_g)   (my similarity to the group)
* P(relevant to group_g | Page_i content)   (group’s collective (avg) relevance judgments)
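A minimal sketch of Solution 2, under two simplifying assumptions: a group profile is the per-page average of its members' rankings, and the prediction comes from the single best-matching group rather than from a similarity-weighted product over all groups. Group names and rankings are hypothetical.

```python
import math

def group_profile(members):
    """Average each page's ranking over the group members who ranked it."""
    pages = {page for ranks in members for page in ranks}
    profile = {}
    for page in pages:
        votes = [ranks[page] for ranks in members if page in ranks]
        profile[page] = sum(votes) / len(votes)
    return profile

def similarity(a, b):
    """Pearson correlation on shared pages (an assumed choice of measure)."""
    shared = [page for page in a if page in b]
    if len(shared) < 2:
        return 0.0
    xs = [a[page] for page in shared]
    ys = [b[page] for page in shared]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    norm = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return cov / norm if norm else 0.0

def predict(me, groups, page):
    """Return the best-matching group and its average ranking of `page`."""
    profiles = {name: group_profile(members) for name, members in groups.items()}
    best = max(profiles, key=lambda name: similarity(me, profiles[name]))
    return best, profiles[best].get(page)

# Made-up rankings for two groups; I have not yet seen page p4.
groups = {
    "junior_high": [
        {"p1": 5, "p2": 1, "p3": 4, "p4": 5},
        {"p1": 4, "p2": 2, "p3": 5, "p4": 4},
    ],
    "medical_researchers": [
        {"p1": 1, "p2": 5, "p3": 2, "p4": 1},
        {"p1": 2, "p2": 4, "p3": 1, "p4": 2},
    ],
}
me = {"p1": 5, "p2": 2, "p3": 4}
result = predict(me, groups, "p4")
print(result)
```

Compared with matching individual users, a group profile needs far fewer stored judgments per prediction, at the cost of blurring differences within the group.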
Supervised Learning