introducing serendipity into music recommendation

Auralist: Introducing Serendipity
into Music Recommendation
WSDM ’12
(ACM international conference on
Web Search and Data Mining)
1. INTRODUCTION
• The majority of research focuses on improving
the accuracy of recommendation
• The dangers
– 1. produces
boring and ineffective recommendations
– 2. harms
a user's personal growth and experience
1. INTRODUCTION
• Contributions:
• 1. Balance the conflicting goals
– accuracy
– diversity
– novelty
– serendipity
• 2. Use metrics to measure all three
non-accuracy factors simultaneously
1. INTRODUCTION
• Next…
• 2. Why accuracy is not enough (and the other
three properties)
• 3. The Auralist framework (and algorithm)
• 4. Evaluation (quantitative evaluation)
• 5. User study (qualitative evaluation)
2. WHY ACCURACY IS NOT ENOUGH
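• First, the boring symbols
the set of users
the rating matrix
the Top-N recommendations for user u
the set of items
the validation set of user u
the training set of user u
the popularity of item i
the set of LDA topics
the LDA item-topic matrix
the number of listeners of artist (item) i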
2.1 ACCURACY
• average Top-N Recall
Intuitively: how many of the recommended items appear in the validation set (higher is better)
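The recall formula itself did not survive on this slide, so the following is only a minimal sketch of the usual definition: the fraction of a user's held-out items that show up in the Top-N list, averaged over users. The function names and the dictionary layout are assumptions made for illustration.

```python
def top_n_recall(recommended, held_out):
    """Fraction of one user's held-out items found in the Top-N list."""
    if not held_out:
        return 0.0
    hits = len(set(recommended) & set(held_out))
    return hits / len(held_out)


def average_recall(recs_by_user, test_by_user):
    """Mean Top-N recall over users that have a non-empty validation set."""
    users = [u for u, items in test_by_user.items() if items]
    if not users:
        return 0.0
    total = sum(top_n_recall(recs_by_user.get(u, []), test_by_user[u])
                for u in users)
    return total / len(users)
```

Under this definition a larger value means more of the validation items were recovered.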
2.1 ACCURACY
• average Rank score
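Intuitively: how well liked the recommended items are within the validation set
Liking is expressed as a rank position, so lower is better
• where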
2.1 ACCURACY
• Produce recommendations that appear
superficially “good”
• But are in fact inferior in terms of actual user
satisfaction
2. DIVERSITY
Intuitively: the pairwise cosine similarity between all items in the recommendation list (lower is better)
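A minimal sketch of this idea, assuming each recommended item is represented by a plain numeric vector (for example its topic vector). The name `intra_list_similarity` is chosen here for illustration and is not taken from the slides.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def intra_list_similarity(item_vectors):
    """Average cosine similarity over all unordered pairs in one Top-N list."""
    pairs = list(combinations(item_vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)
```

A list of near-identical items scores close to 1, while a list of unrelated items scores close to 0, which is why a lower value indicates a more diverse list.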
2. NOVELTY
Intuitively: higher values mean that more
globally “unexplored” items are being recommended
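The exact novelty formula is not preserved on this slide. One common way to get a score that grows as items become less popular is the average self-information of the recommended items; the sketch below assumes that form, and the names `listeners` and `total_users` are hypothetical.

```python
import math


def novelty(recommended, listeners, total_users):
    """Mean of -log2(share of users who listen to each recommended item).

    Assumed form for illustration: rarer items contribute larger values.
    """
    if not recommended:
        return 0.0
    total = sum(-math.log2(listeners[item] / total_users)
                for item in recommended)
    return total / len(recommended)
```

With this form, an item everyone listens to contributes 0 and an item heard by a quarter of the users contributes 2 bits.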
2. SERENDIPITY
Intuitively: the pairwise similarity between the items in the training set and the items in the recommendation list
(lower is better)
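As a rough sketch of this measure, assuming items are represented by numeric vectors and that cosine similarity is the similarity in question (the slide does not preserve the formula), the score for one user could be computed as follows. The function name `history_similarity` is an assumption.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def history_similarity(history_vectors, recommended_vectors):
    """Average similarity between every (history item, recommended item) pair."""
    if not history_vectors or not recommended_vectors:
        return 0.0
    total = sum(cosine(h, r)
                for h in history_vectors
                for r in recommended_vectors)
    return total / (len(history_vectors) * len(recommended_vectors))
```

A low value means the recommendations sit far from what the user already listens to, which is the sense in which lower is better here.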
3. THE AURALIST FRAMEWORK
• Basic
– LDA model
• Hybrid
– Listener Diversity + Basic ---> Community-Aware
– Declustering + Basic ---> Bubble-Aware
• Full
3.1 BASIC AURALIST
• Using LDA (Latent Dirichlet Allocation)
• That is: the word dimensions of a document's vector space
are converted into “topic” dimensions
3.1 BASIC AURALIST
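• An example
• A document A contains the two words “computer” and “microcomputer”.
• After vectorizing document A, “computer” might be the 2nd dimension of the
whole vocabulary and “microcomputer” the 3rd.
• The projection onto each dimension can simply be taken as the TF (the number of occurrences in the document).
• A={x,1,1,x,...,x}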
3.1 BASIC AURALIST
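• Word vector space: A={x,1,1,x,...,x}
• In the vector space, the “computer” and “microcomputer” dimensions
are treated as orthogonal, i.e. the two words are taken to have completely different meanings.
• “Squeeze” the two word dimensions into a single topic dimension; each word
is represented within the topic by a weight.
• Topic vector space: A={y,(p1+p2),y,...,y}
• The dimensionality is reduced (which looks pretty good)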
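The word-to-topic squeeze can be illustrated with a deliberately simplified sketch. It is not LDA inference; it only shows how two word dimensions with weights p1 and p2 collapse into one topic dimension holding p1 + p2. The weight table and the topic name `computing` are made up for the example.

```python
def to_topic_space(term_counts, word_topic_weights):
    """Project a term-frequency vector onto topic dimensions.

    Each word adds count * weight to every topic it belongs to.
    """
    topic_vector = {}
    for word, count in term_counts.items():
        for topic, weight in word_topic_weights.get(word, {}).items():
            topic_vector[topic] = topic_vector.get(topic, 0.0) + count * weight
    return topic_vector


# Document A contains "computer" and "microcomputer" once each.
doc_a = {"computer": 1, "microcomputer": 1}
# Hypothetical weights p1 = 0.5 and p2 = 0.25 for a single shared topic.
weights = {"computer": {"computing": 0.5},
           "microcomputer": {"computing": 0.25}}
topic_a = to_topic_space(doc_a, weights)  # one dimension holding p1 + p2
```

Two documents that use different but related words now overlap in the same topic dimension, even though their word vectors were orthogonal.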
3.1 BASIC AURALIST
• Document:
The William Randolph Hearst Foundation will give $1.25 million to Lincoln Center,
Metropolitan Opera Co., New York Philharmonic and Juilliard School. “Our board
felt that we had a real opportunity to make a mark on the future of the performing
arts with these grants an act every bit as important as our traditional areas of
support in health, medical research, education and the social services,” Hearst
Foundation President Randolph A. Hearst said Monday in announcing the grants.
Lincoln Center’s share will be $200,000 for its new building, which will house young
artists and provide new public facilities. The Metropolitan Opera Co. and New York
Philharmonic will receive $400,000 each. The Juilliard School, where music and
the performing arts are taught, will get $250,000. The Hearst Foundation, a leading
supporter of the Lincoln Center Consolidated Corporate Fund, will make its usual
annual $100,000 donation, too.
3.1 BASIC AURALIST
• Words ---> Topics
3.1 BASIC AURALIST
• Artist-based LDA model
word ---> user
topic ---> user community
document ---> artist
3.1 BASIC AURALIST
• similarity between artist topic vectors
• the score that user u associates to item i
– The LDA similarity used directly for item-based recommendation
• Sort all items by their Basic score to obtain the recommendation list
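The scoring formula is missing from this slide, so the sketch below assumes a standard item-based form: the score of a candidate artist is the sum of its topic-vector similarity to every artist in the user's history, and the recommendation list is the candidates sorted by that score. Cosine similarity and all names are assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def basic_scores(history, topic_vectors):
    """Score each unseen artist by summed similarity to the user's history."""
    return {
        artist: sum(cosine(vec, topic_vectors[h]) for h in history)
        for artist, vec in topic_vectors.items()
        if artist not in history
    }


def recommend(history, topic_vectors, n):
    """Top-N unseen artists, highest Basic score first."""
    scores = basic_scores(history, topic_vectors)
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

Artists already in the history are excluded so that the list only contains candidates the user has not listened to.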
3.2 Two hybrid versions of Auralist
• “A” that includes
– Artist-based LDA
– Listener Diversity
– Declustering
3.2.1 Community-Aware Auralist
• Listener Diversity (the entropy over its topic distribution)
• The Rank
• Give it some offset
• The offset
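The formulas for this slide are not preserved, so what follows is a sketch under two assumptions: Listener Diversity is the Shannon entropy of an artist's topic distribution, and the hybrid ranking is a linear interpolation of the Basic rank and the Listener Diversity rank with weight λ1 (`lam` below). The offset step is not modelled, and all function names are hypothetical.

```python
import math


def listener_diversity(topic_distribution):
    """Shannon entropy (in bits) of an artist's topic distribution."""
    return sum(-p * math.log2(p) for p in topic_distribution if p > 0)


def ranks(scores):
    """Map each item to its rank position, 0 being the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {item: position for position, item in enumerate(ordered)}


def community_aware(basic, diversity, lam):
    """Order items by (1 - lam) * Basic rank + lam * diversity rank."""
    basic_rank, diversity_rank = ranks(basic), ranks(diversity)
    combined = {item: (1 - lam) * basic_rank[item] + lam * diversity_rank[item]
                for item in basic}
    return sorted(combined, key=combined.get)
```

With `lam` at 0 the list is the plain Basic ordering; raising it gradually pulls artists with broader audiences toward the top.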
3.2.2 Bubble-Aware Auralist
• The rank
3.2.2 Bubble-Aware Auralist
• algorithm
4. EVALUATION
• 1. Basic Auralist
• 2. The state of the art:
– Implicit SVD (singular value decomposition) method
• 3. Community-aware Auralist (λ1=0.05)
• 4. Bubble-aware Auralist (λ2=0.2)
• 5. Full Auralist
4.1 DATASET
• user.getTopArtists() from the Last.fm API
• Quantity: 360k users
4.2 Basic Auralist Recommendation
4.2 Hybrid versions of Auralist
• Accuracy performance
4.2 Hybrid versions of Auralist
• Diversity, Novelty, Serendipity performance
5. USER STUDY
• Full Auralist
• λ1=0.03
• λ2=0.20
5.1 Experimental Method
• involved 21 participants
• included a mix of
– undergraduates/postgraduates
– men/women
– between the ages of 18 and 27
– varying nationalities
5.2 User Ratings
5.2 User Satisfaction
Comments from participants, largely in praise of Full Auralist:
• “[Full Auralist] was more satisfying because it introduced me to new artists.
[Basic] was filled entirely with new artists which, while very good, were things that I
listened to all the time on a regular basis. [Full Auralist] had artists that were of the
same quality of those I listen to but which I'd never heard of.”
• “I found [the Full Auralist list] more surprising than [Basic]. Most artists I had not
heard of (which is what I prefer). Listening to them gave me at least five new artists
I could look into and use in the future.”
• “While I enjoyed the songs on the [Full Auralist] list less, I liked that there was
more new music on it than the first list. So I'm going to say that I preferred the [Full
Auralist] list.”
• “[The Basic list was better], more familiar music & more my taste, although [Full
Auralist] introduced me to a few good bands.”
• “[The Full Auralist list] was way too jazzy, and had very few artists I connected with
immediately. While [the Basic list] had a vast majority of artists I knew well and
have opinions of, the few unknowns were really very congenial.”