ML Group Director Review


TransRank: A Novel Algorithm for
Transfer of Rank Learning
Depin Chen, Jun Yan, Gang Wang et al.
University of Science and Technology of China, USTC
Machine Learning Group, MSRA
[email protected]
2008-12-15
Content
• Ranking for IR
• Paper motivation
• The algorithm: TransRank
• Results & future work
Ranking in IR
• Ranking is crucial in information retrieval: it aims to move good results up and bad results down.
• A well known example: web search engine
Learning to rank
• Ranking + Machine learning = Learning to rank
• An early work:
Ranking SVM, "Support Vector Learning for Ordinal Regression", Herbrich et al. [ICANN 99]
Learning to rank for IR
Existing approaches
• Early ones: Ranking SVM, RankBoost, …
• Recently: IRSVM, AdaRank, ListNet, …
• Tie-Yan Liu’s team at MSRA
Content
• Learning to rank in IR
• Paper motivation
• The algorithm: TransRank
• Results & future work
Training data shortage
• Learning to rank relies on a full supply of labeled training data.
• In real-world practice, labeling data is expensive → lack of training data.
(diagram: label data → learn the model → operation)
• Consequences: bad generalization ability, poor performance
Transfer learning
• Transfer learning definition:
Transfer knowledge learned from different but related problems to solve the current problem effectively, with less training data and less time [Yang, 2008].
– Learning to walk can help learn to run
– Learning to program in C++ can help learn to program in Java
– …
• We follow the spirit of transfer learning in this paper.
Content
• Learning to rank in IR
• Paper motivation
• The algorithm: TransRank
• Results & future work
Problem formulation
• St: training data in target domain
Ss: auxiliary training data from a source domain
• Note that St is much smaller than Ss (target-domain labels are scarce).
• What do we want?
A ranking function for the target domain
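As a concrete picture of the setup (the names and structure below are illustrative, not from the paper), St and Ss can be represented as lists of per-query feature matrices with graded relevance labels:

```python
from dataclasses import dataclass
from typing import List

import numpy as np

@dataclass
class Query:
    """One query: a block of candidate documents with graded relevance labels."""
    qid: str
    features: np.ndarray  # shape (n_docs, n_features), one row per document
    labels: np.ndarray    # shape (n_docs,), graded relevance judgments

# S_t: small labeled training set from the target domain (e.g. OHSUMED)
# S_s: larger auxiliary labeled set from a source domain (e.g. WSJ or AP)
S_t: List[Query] = []
S_s: List[Query] = []
```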
TransRank
• Three steps of TransRank:
Step 1: K-best query selection
Step 2: Feature augmentation
Step 3: Ranking SVM
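A bird's-eye sketch of how the three steps could fit together in code; select_k_best_queries, augment, and train_ranking_svm are placeholder names filled in by the sketches under Steps 1–3 below, not functions from the paper:

```python
def transrank(S_t, S_s, k):
    """Hypothetical sketch of the TransRank pipeline (placeholder helpers)."""
    # Step 1: keep only the k source-domain queries most useful for the target domain
    best_source = select_k_best_queries(S_s, S_t, k)
    # Step 2: Daumé-style feature augmentation of both domains
    train_set = ([augment(q, "source") for q in best_source] +
                 [augment(q, "target") for q in S_t])
    # Step 3: train a Ranking SVM on the combined, augmented data
    return train_ranking_svm(train_set)
```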
Step 1: K-best query selection
• Query’s ranking direction
(figure: ranking directions of query 11 and query 41 in OHSUMED)
• The goal of Step 1:
We want to select the source-domain queries whose ranking directions are most similar to those of the target-domain data.
• These selected queries are treated as being most like the target-domain training data.
Utility function (1)
• Preprocess Ss:
select the k best queries and discard the rest.
• A “best” query is one whose ranking direction is confidently similar to those of the queries in St.
• The utility function combines two parts: confidence and similarity.
Utility function (2)
• Confidence is measured by a separation value: the better the different relevance classes of instances are separated, the more confident the ranking direction.
Utility function (3)
• Similarity between ranking directions is measured by cosine similarity.
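A minimal sketch of Step 1 under simplifying assumptions (the paper's exact definitions may differ): a query's ranking direction is approximated by the normalized difference of the mean feature vectors of its relevant and non-relevant documents, confidence is the gap between the projected class means, and a source query's utility is that confidence times its best cosine similarity to a target-query direction. The Query structure from the earlier sketch is assumed.

```python
import numpy as np

def ranking_direction(features, labels):
    """Illustrative ranking direction: mean(relevant) - mean(non-relevant), normalized."""
    pos = features[labels > 0].mean(axis=0)
    neg = features[labels == 0].mean(axis=0)
    d = pos - neg
    return d / (np.linalg.norm(d) + 1e-12)

def separation(features, labels, direction):
    """Illustrative separation value: gap between projected class means along the direction."""
    proj = features @ direction
    return float(proj[labels > 0].mean() - proj[labels == 0].mean())

def utility(src_query, tgt_queries):
    """Confidence of the source query's direction times its best cosine similarity
    to a target-query direction (directions are unit vectors, so the dot product
    equals the cosine similarity)."""
    d_s = ranking_direction(src_query.features, src_query.labels)
    conf = separation(src_query.features, src_query.labels, d_s)
    sims = [float(d_s @ ranking_direction(q.features, q.labels)) for q in tgt_queries]
    return conf * max(sims)

def select_k_best_queries(S_s, S_t, k):
    """Step 1: keep the k source-domain queries with the highest utility."""
    return sorted(S_s, key=lambda q: utility(q, S_t), reverse=True)[:k]
```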
Step 2: Feature augmentation
• Daumé implemented cross-domain classification in NLP through a method called “feature augmentation” [ACL 07].
• For a source-domain document vector (1, 2, 3):
(1, 2, 3) → (1, 2, 3, 1, 2, 3, 0, 0, 0)
• For a target-domain document vector (1, 2, 3):
(1, 2, 3) → (1, 2, 3, 0, 0, 0, 1, 2, 3)
Step 3: Ranking SVM
• Ranking SVM is a widely used learning-to-rank algorithm, proposed by Herbrich et al. [ICANN 99].
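The experiments use Joachims' SVMlight (see the settings slide below); purely to illustrate the pairwise idea behind Ranking SVM, here is a sketch that trains a linear SVM on within-query pairwise difference vectors with scikit-learn, as a stand-in rather than the paper's implementation:

```python
import numpy as np
from sklearn.svm import LinearSVC

def pairwise_transform(queries):
    """For each within-query pair of documents with different labels, emit the
    feature difference (x_i - x_j), labeled +1 if doc i should rank above doc j."""
    X, y = [], []
    for q in queries:
        n = len(q.labels)
        for i in range(n):
            for j in range(i + 1, n):
                if q.labels[i] == q.labels[j]:
                    continue
                sign = 1 if q.labels[i] > q.labels[j] else -1
                X.append(q.features[i] - q.features[j])
                y.append(sign)
    return np.array(X), np.array(y)

def train_ranking_svm(queries, C=1.0):
    """Step 3: fit a linear SVM on pairwise-transformed data; the learned weight
    vector w scores a document x as w . x."""
    X, y = pairwise_transform(queries)
    svm = LinearSVC(C=C, fit_intercept=False)
    svm.fit(X, y)
    return svm.coef_.ravel()
```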
Content
• Learning to rank in IR
• Paper motivation
• The heuristic algorithm: TransRank
• Results & future work
Experimental settings
• Datasets: OHSUMED (the LETOR version), WSJ, AP
• Features: the feature set defined in OHSUMED; the same features are extracted on WSJ and AP
• Evaluation measures: NDCG@n, MAP (NDCG sketched after this slide)
• For Ranking SVM, we use SVMlight by Joachims.
• Two groups of experiments: WSJ → OHSUMED and AP → OHSUMED
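For reference, a sketch of NDCG@n as commonly defined for LETOR-style graded relevance (gain 2^rel - 1, log2 discount); the exact conventions used in the paper may differ slightly:

```python
import numpy as np

def dcg_at_n(labels_in_rank_order, n):
    """DCG@n with gain 2^rel - 1 and discount log2(rank + 1)."""
    rel = np.asarray(labels_in_rank_order, dtype=float)[:n]
    ranks = np.arange(1, len(rel) + 1)
    return float(np.sum((2.0 ** rel - 1.0) / np.log2(ranks + 1)))

def ndcg_at_n(scores, labels, n=10):
    """NDCG@n: DCG of the predicted ordering divided by DCG of the ideal ordering."""
    order = np.argsort(scores)[::-1]            # rank documents by predicted score
    ideal = np.sort(np.asarray(labels))[::-1]   # best possible ordering of labels
    best = dcg_at_n(ideal, n)
    return dcg_at_n(np.asarray(labels)[order], n) / best if best > 0 else 0.0
```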
Compared algorithms
• Baseline: run Ranking SVM on St
• TransRank
• Directly Mix: Step 1 + Step 3 (selected source queries mixed directly with St, without feature augmentation)
Performance comparison
40% of target labeled data, k=10
(figure: two bar charts, one per source domain (WSJ, AP), comparing Baseline, TransRank, and Directly Mix on MAP and NDCG@1/3/5/10; y-axis values roughly 0.37–0.45)
Impact of target labeled data
• From 5% to 100%, k=10
(figure: two panels, source domain WSJ and source domain AP)
Impact of k
40% of target labeled data
Future work
• Web-scale experiments, i.e. data from search engines
• A more integrated algorithm using machine learning techniques
• Theoretical study of transfer of rank learning
Q&A
Thanks!