Transcript PPT

Hidden Concept Detection in Graph-Based Ranking
Algorithm for Personalized Recommendation
Nan Li, Computer Science Department, Carnegie Mellon University
Introduction
Hidden Concept Detector (HCD)
Previous work:
 Represents past user behavior through a relational graph.
Fails to represent individual differences among items of the same type.
Our work:
 Detect hidden concepts embedded in the original graph
 Build a two-level type hierarchy for explicit representation of item characteristics.
Two-Layer PRA
Background
Relational Retrieval
1.Entity-Relation Graph G=(E, T, R):
• Entity set E = {e}, entity type set T = {T}, relation set R = {R}
• Each entity e in E has a type e.T. Each relation R has two entity types R.T1 and R.T2. If two entities have relation R, then R(e1, e2) = 1, and 0 otherwise.
2.Relational Retrieval Task: Query q = (Eq, Tq)
• Given the query entity set Eq = {e’}, predict the relevance of each entity e of the target type Tq.
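For concreteness, a minimal sketch of how such a typed entity-relation graph could be represented in Python; the class and method names (Graph, add_entity, add_edge, neighbors) are illustrative, not from the paper.

from collections import defaultdict

class Graph:
    """Typed entity-relation graph G = (E, T, R)."""
    def __init__(self):
        self.entity_type = {}                 # e -> e.T
        self.edges = defaultdict(set)         # (R, e1) -> {e2 : R(e1, e2) = 1}

    def add_entity(self, e, etype):
        self.entity_type[e] = etype

    def add_edge(self, relation, e1, e2):
        # R(e1, e2) = 1; relation R connects entity types R.T1 and R.T2
        self.edges[(relation, e1)].add(e2)

    def neighbors(self, relation, e):
        return self.edges[(relation, e)]

# Example: an author writes a paper, which appears in a journal
g = Graph()
g.add_entity("author_1", "author"); g.add_entity("paper_1", "paper"); g.add_entity("journal_1", "journal")
g.add_edge("AuthorOf", "author_1", "paper_1")
g.add_edge("PublishedIn", "paper_1", "journal_1")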
Path Ranking Algorithm
1.Relational Path: P = (R1, R2, …, Rn), where R1.T1 = T0 and Ri.T2 = Ri+1.T1 for i = 1, …, n−1.
2.Relational Path Probability Distribution:
• hP(e): the probability that a random walker following path P from the query entities reaches entity e.
1.PRA Model: (G, l, θ)
• The feature matrix A has one column per relational path P, equal to the distribution hP(e).
• The scoring function: score(e; q) = ΣP θP hP(e), i.e. s = Aθ, summing over all relational paths P of length at most l.
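A hedged sketch of how the path feature hP(e) and the PRA score can be computed by propagating a random-walk distribution along each relational path. This follows the standard PRA formulation and assumes the graph interface sketched above; the function names are illustrative.

def path_distribution(graph, query_entities, path):
    # h_P(e): start uniformly over the query entities Eq, then take a uniform
    # random-walk step along each relation in P = (R1, ..., Rn).
    dist = {e: 1.0 / len(query_entities) for e in query_entities}
    for relation in path:
        nxt = {}
        for e, p in dist.items():
            successors = graph.neighbors(relation, e)
            if not successors:
                continue
            step = p / len(successors)
            for e2 in successors:
                nxt[e2] = nxt.get(e2, 0.0) + step
        dist = nxt
    return dist                                # dist[e] = h_P(e)

def pra_score(graph, query_entities, paths, theta):
    # score(e; q) = sum_P theta_P * h_P(e), i.e. s = A * theta
    scores = {}
    for path, weight in zip(paths, theta):
        for e, h in path_distribution(graph, query_entities, path).items():
            scores[e] = scores.get(e, 0.0) + weight * h
    return scores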
Find hidden subtypes of relations
[Diagram: entity-relation graph over the publication data, with node types author, paper, title, gene, journal, and year.]
Experiment Results
Data Set: Saccharomyces Genome Database, a publication data set about the yeast organism Saccharomyces cerevisiae
Three measurements:
• Mean Reciprocal Rank (MRR): inverse of the rank of the first correct answer
• Mean Average Precision (MAP): the area under the Precision-Recall curve
• p@K: precision at K, where K is the actual number of relevant entities.
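A minimal sketch of the three measurements for a single ranked list; averaging the first two over all test queries gives MRR and MAP. The function names are illustrative.

def reciprocal_rank(ranked, relevant):
    # Inverse of the rank of the first correct answer
    for i, e in enumerate(ranked, start=1):
        if e in relevant:
            return 1.0 / i
    return 0.0

def average_precision(ranked, relevant):
    # Average of the precision values at each relevant position
    hits, total = 0, 0.0
    for i, e in enumerate(ranked, start=1):
        if e in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0

def precision_at_k(ranked, relevant):
    # K = actual number of relevant entities for this query
    k = len(relevant)
    return sum(1 for e in ranked[:k] if e in relevant) / k if k else 0.0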
Experiment Using Normalized Cut
• Training data: number of clusters ↑ ⇒ recommendation quality ↑
• Test data: NCut outperforms random
Experiment Using HCD
• Training data: HCD outperforms PRA in all three measurements
• Test data: the two systems perform equally well
Bottom-Up HCD
Bottom-Up merging algorithm:
For each relation type Ri
 Step 1: Split relation Ri so that every starting node forms its own subrelation Rij.
 Step 2: HAC: at each step, merge the two subrelations Rim and Rin that maximize the gain of the objective function; stop when no merge gives a positive gain.
Approximate the gain of the objective function:
 Calculate the maximum gain of the two subrelations, gm and gn
 Use a Taylor series to approximate the gain of the merged subrelation.
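A hedged sketch of the bottom-up merging loop: greedy hierarchical agglomerative clustering over subrelations that stops when no merge has positive gain. The gain function is passed in as a placeholder; the paper approximates it with a Taylor expansion of the training objective, whose exact form is not reproduced here.

def bottom_up_merge(subrelations, approximate_gain):
    # `subrelations` is a list of sets of starting nodes (Step 1: one
    # subrelation Rij per starting node of Ri).
    # `approximate_gain(sm, sn)` should return the approximated gain in the
    # objective obtained by merging subrelations sm and sn.
    subrelations = [set(s) for s in subrelations]
    while len(subrelations) > 1:
        best_gain, best_pair = 0.0, None
        for i in range(len(subrelations)):
            for j in range(i + 1, len(subrelations)):
                g = approximate_gain(subrelations[i], subrelations[j])
                if g > best_gain:              # keep the merge with the largest positive gain
                    best_gain, best_pair = g, (i, j)
        if best_pair is None:                  # no merge with positive gain: stop
            break
        i, j = best_pair
        merged = subrelations[i] | subrelations[j]
        subrelations = [s for k, s in enumerate(subrelations) if k not in (i, j)]
        subrelations.append(merged)
    return subrelations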
2.Training PRA Model: (G, l, θ)
• Training data: D = {(q(m), y(m))}, where ye(m) = 1 if e is relevant to the query q(m), and 0 otherwise
• Parameter: the path weight vector θ
• Objective function:
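The objective function itself is not legible in the transcript. For reference, the standard PRA training objective is a regularized per-query log-likelihood of the relevance labels; the regularization term below is an assumption about this particular paper.

\[
O(\theta) = \sum_{m}\sum_{e}\Big[\,y_e^{(m)}\log p_e^{(m)} + \big(1-y_e^{(m)}\big)\log\big(1-p_e^{(m)}\big)\Big] - \lambda\,\lVert\theta\rVert^{2},
\qquad p_e^{(m)} = \sigma\!\big(\theta^{\top} A_e^{(m)}\big)
\]

where A_e^{(m)} is the row of the feature matrix A for entity e under query q(m), and σ is the logistic function.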