DivRank: Interplay of Prestige and
Diversity in Information Networks
Qiaozhu Mei1,2, Jian Guo3, Dragomir Radev1,2
1.School of Information
2.Computer Science and Engineering
3. Department of Statistics
University of Michigan
2010 © University of Michigan
Diversity in Ranking
Ranking papers, people, web
pages, movies, restaurants…
Web search; ads;
recommender systems …
Network based ranking – centrality/prestige
Ranking by Random Walks
[Figure: example graph on nodes a, b, c, d]
Ranking using the stationary distribution
E.g., PageRank
$p_{T+1}(v) = \sum_{(u,v) \in E} p(u,v)\, p_T(u)$
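The update rule $p_{T+1}(v) = \sum_{(u,v) \in E} p(u,v)\, p_T(u)$ can be iterated until the distribution stops changing. The following is a minimal sketch of that iteration, not the authors' code. The function name `stationary_distribution`, the matrix `P`, and its values are hypothetical and chosen only for illustration. `P[u][v]` holds the transition probability from node u to node v, and every row sums to 1.

```python
def stationary_distribution(p, num_iters=100):
    """Iterate p_{T+1}(v) = sum_u p[u][v] * p_T(u), starting from a uniform distribution."""
    n = len(p)
    dist = [1.0 / n] * n
    for _ in range(num_iters):
        dist = [sum(p[u][v] * dist[u] for u in range(n)) for v in range(n)]
    return dist


# Hypothetical transition matrix for four nodes a, b, c, d (illustrative values only).
P = [
    [0.10, 0.30, 0.30, 0.30],  # from a
    [0.40, 0.10, 0.10, 0.40],  # from b
    [0.40, 0.10, 0.10, 0.40],  # from c
    [0.25, 0.25, 0.25, 0.25],  # from d
]

pi = stationary_distribution(P)
# Rank nodes by their stationary probability, highest first.
ranking = sorted(range(len(pi)), key=lambda v: -pi[v])
```

Because every entry of this example matrix is positive, the iteration converges to a unique distribution regardless of the starting point. A real graph would typically need a random-jump term to guarantee that.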
Reinforcements in Random Walks
• Random walks are not random - rich gets richer;
– e.g., civilization/immigration – big cities attract larger population;
– Tourism – busy restaurants attract more visitors;
Conformity!
Source - http://www.resettlementagency.co.uk/modern-world-migration/
Vertex-Reinforced Random Walk
(Pemantle 92)
[Figure: example graph on nodes a, b, c, d; transition probabilities change over time]
$p_{T+1}(v) = \sum_{(u,v) \in E} p_T(u,v)\, p_T(u)$
Reinforced random walk: the transition probability is reinforced by the weight (number of visits) of the target state:
$p_T(u,v) \propto N_T(v)$
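A vertex-reinforced walk can be simulated directly: at each step the walker picks a neighbor with probability proportional to how often that neighbor has already been visited. This is a minimal sketch under stated assumptions, not the authors' code. It assumes every node starts with a visit count of 1 so that unvisited neighbors remain reachable. The function name `vertex_reinforced_walk` and the graph `GRAPH` are hypothetical.

```python
import random


def vertex_reinforced_walk(neighbors, start, num_steps, seed=0):
    """Simulate a walk whose transition probability to v is proportional to N_T(v)."""
    rng = random.Random(seed)
    # Assumption: every node starts with weight 1 so unvisited nodes stay reachable.
    visits = {v: 1 for v in neighbors}
    current = start
    for _ in range(num_steps):
        candidates = neighbors[current]
        weights = [visits[v] for v in candidates]
        current = rng.choices(candidates, weights=weights, k=1)[0]
        visits[current] += 1  # reinforcement: a visited node becomes more attractive
    return visits


# Hypothetical undirected graph on nodes a, b, c, d, given as adjacency lists.
GRAPH = {
    "a": ["b", "c", "d"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["a", "b", "c"],
}

NUM_STEPS = 1000
counts = vertex_reinforced_walk(GRAPH, start="a", num_steps=NUM_STEPS)
```

Running the simulation with different seeds tends to concentrate visits on different nodes, which illustrates why the transition probabilities change over time rather than staying fixed as in PageRank.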
DivRank
• A smoothed version of Vertex-reinforced Random Walk
$p_T(u,v) = (1-\lambda)\, p^*(v) + \lambda \cdot \frac{p_0(u,v)\, N_T(v)}{D_T(u)}$
– $p^*(v)$: random jump, could be personalized
– $p_0(u,v)$: "organic" transition probability
[Figure: example graph on nodes a, b, c]
• Adding self-links;
• Efficient approximations: use E[ NT (v)] to approximate NT (v)
– Cumulative DivRank: $E[N_T(v)] \propto \sum_{t=0}^{T} p_t(v)$
– Pointwise DivRank: $E[N_T(v)] \propto p_T(v)$
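The pointwise approximation can be sketched as a fixed-point iteration in which the current score $p_T(v)$ stands in for the visit count $N_T(v)$. This is an illustrative sketch, not the authors' implementation. It assumes that $D_T(u)$ is the normalizer $\sum_v p_0(u,v) N_T(v)$, which makes each row of transition probabilities sum to 1. It also assumes a uniform random jump, a damping value of 0.9, and a self-link weight of 0.25. The names `add_self_links`, `divrank_pointwise`, and `ADJ` are hypothetical.

```python
def add_self_links(adj, self_weight=0.25):
    """Build p0: keep mass `self_weight` on the self-link and spread the rest over out-links.

    Assumes every node has at least one out-link.
    """
    n = len(adj)
    p0 = []
    for u in range(n):
        degree = sum(adj[u])
        row = [(1 - self_weight) * adj[u][v] / degree for v in range(n)]
        row[u] += self_weight
        p0.append(row)
    return p0


def divrank_pointwise(p0, lam=0.9, num_iters=200):
    """Pointwise DivRank sketch: approximate N_T(v) by the current score p_T(v)."""
    n = len(p0)
    p_star = [1.0 / n] * n  # assumption: uniform (non-personalized) random jump
    dist = [1.0 / n] * n
    for _ in range(num_iters):
        # Assumed normalizer D_T(u) = sum_v p0(u, v) * N_T(v); positive thanks to self-links.
        norm = [sum(p0[u][v] * dist[v] for v in range(n)) for u in range(n)]
        new_dist = [0.0] * n
        for u in range(n):
            for v in range(n):
                trans = (1 - lam) * p_star[v] + lam * p0[u][v] * dist[v] / norm[u]
                new_dist[v] += trans * dist[u]
        dist = new_dist
    return dist


# Hypothetical undirected graph: two triangles (0, 1, 2) and (3, 4, 5) joined by edge 2-3.
ADJ = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

scores = divrank_pointwise(add_self_links(ADJ))
```

The cumulative variant would instead keep a running sum of the scores from every iteration and use that sum in place of `dist` when computing `norm` and `trans`.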
Experiments
• Three applications
– Ranking movie actors (in co-star network)
– Ranking authors/papers (in author/paper-citation network)
– Text summarization (ranking sentences)
• Evaluation metrics:
– diversity: density of subgraph; country coverage (actors)
– quality: h-index (authors); # citations (papers);
– quality + diversity: movie coverage (actors); impact
coverage (papers); ROUGE (text summarization)
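The density metric can be illustrated with a short sketch. It assumes that density means the fraction of possible undirected edges that are present among the top-ranked nodes. The slides do not define the metric, so this definition is an assumption, and the names `subgraph_density` and `EDGES` are hypothetical. Under this definition a lower density indicates a more diverse top-k set, because the selected nodes are less connected to each other.

```python
from itertools import combinations


def subgraph_density(edges, nodes):
    """Fraction of node pairs in `nodes` that are joined by an undirected edge."""
    nodes = list(dict.fromkeys(nodes))  # drop duplicates, keep order
    k = len(nodes)
    if k < 2:
        return 0.0
    edge_set = {frozenset(edge) for edge in edges}
    present = sum(1 for pair in combinations(nodes, 2) if frozenset(pair) in edge_set)
    return present / (k * (k - 1) / 2)


# Hypothetical graph: a triangle a-b-c plus a pendant node d attached to c.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]

dense_top3 = subgraph_density(EDGES, ["a", "b", "c"])    # all three pairs are linked
diverse_top3 = subgraph_density(EDGES, ["a", "b", "d"])  # only a-b is linked
```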
Results
• DivRank >> Grasshopper/MMR >> PageRank
Paper citation: [Figure: density and impact coverage for PageRank, Grasshopper, and DivRank]
Text summarization: [Figure]
Why Does it Work?
• Rich gets richer
– Related to Polya's urn and preferential attachment
• Compete for resources in the neighborhood
– Stay here or go to neighbors?
– Prestigious node absorbs weights of its neighbors
• An optimization explanation
Summary
• DivRank – Prestige/Centrality + Diversity
• Mathematical foundation: vertex-reinforced random walk
• Connections:
– Polya's urn
– Preferential attachment
– Word burstiness
• Why does it work?
– Rich gets richer
– Local resource competition
• Future work: query-dependent DivRank
Thanks!