LARS: A Location-Aware Recommender System
ICDE '12
1. Introduction
• Traditional recommender systems
  – triple (user, rating, item)
  – (user id U) + (limit K)
    • return K recommended items to U
• Locations
  – destination check-ins (Facebook, Foursquare)
  – user zip codes (MovieLens)
1.1 Motivation: A Study of Location-Based Ratings
• Preference locality
• Travel locality
1.2 LARS: A Location-Aware Recommender
• (user, ulocation, rating, item)
• (user, rating, item, ilocation)
• (user, ulocation, rating, item, ilocation)
Next:
• 2. An overview of LARS
• 3. Spatial user ratings for non-spatial items
• 4. Non-spatial user ratings for spatial items
• 5. Spatial user ratings for spatial items
• 6. Experimental analysis
2.1 LARS Query Model
• (user id U) + (limit K) + (location L)
  → return K recommended items to U
• query types:
  – snapshot (one-time) queries
  – continuous queries
2.2 Item-Based Collaborative Filtering
• Phase I: Model Building
  – compute the similarity sim between pairs of items
  – for each item
    • the model stores only the n items with the highest sim values
    • n is the number of users
• Phase II: Recommendation Generation
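The two phases above can be sketched in a few lines. This is a minimal illustration, not the paper's exact implementation: the toy ratings matrix and the use of cosine similarity are assumptions made here for concreteness.

```python
# Minimal sketch of item-based CF: Phase I builds a per-item list of the
# n most similar items; Phase II scores unrated items for a user.
# The ratings data and cosine similarity are illustrative assumptions.
from math import sqrt
from collections import defaultdict

ratings = {  # user -> {item: rating}
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 4, "b": 2, "c": 5},
    "u3": {"b": 4, "c": 4},
}

def item_vectors(ratings):
    vecs = defaultdict(dict)  # item -> {user: rating}
    for user, row in ratings.items():
        for item, r in row.items():
            vecs[item][user] = r
    return vecs

def cosine(v, w):
    common = set(v) & set(w)
    if not common:
        return 0.0
    num = sum(v[u] * w[u] for u in common)
    den = sqrt(sum(x * x for x in v.values())) * sqrt(sum(x * x for x in w.values()))
    return num / den

def build_model(ratings, n):
    """Phase I: for every item, keep its n most similar items."""
    vecs = item_vectors(ratings)
    model = {}
    for i in vecs:
        sims = [(cosine(vecs[i], vecs[j]), j) for j in vecs if j != i]
        sims.sort(reverse=True)
        model[i] = sims[:n]
    return model

def recommend(model, user_row, k):
    """Phase II: score unrated items by similarity-weighted ratings."""
    scores = defaultdict(float)
    for item, r in user_row.items():
        for sim, other in model.get(item, []):
            if other not in user_row:
                scores[other] += sim * r
    return sorted(scores, key=scores.get, reverse=True)[:k]

model = build_model(ratings, n=len(ratings))  # slide: n = number of users
print(recommend(model, ratings["u1"], k=1))   # -> ['c']
```

In LARS, one such model is built per pyramid cell from the ratings of the users located in that cell.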
3 Spatial User Ratings For Non-spatial Items
• (user, ulocation, rating, item)
• requirements
  – Locality: recommendations are aware of the user's location
  – Scalability: can handle computation over a large number of users
  – Influence: users can adjust the size of the spatial region the system considers
3.1 Data Structure
• partial pyramid structure
[Figure: an example partial pyramid over a map of China. Level 0 covers the whole country; Level 1 splits it into South, North, Central, and West; Level 2 into provinces such as Guangdong, Guangxi, Fujian, Hainan, Sichuan, Yunnan, Tibet, and Qinghai; Level 3 subdivides Guangdong into the Pearl River Delta and Eastern, Western, and Northern Guangdong.]
3.2 Query Processing
• query processing steps
  – 1. start the search at the lowest pyramid level
  – 2. if no maintained cell is found there
    • move up one level and search again
  – 3. repeat until a cell is found
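The bottom-up lookup can be sketched as follows; the fixed-grid geometry over the unit square and the cell-id scheme are illustrative assumptions, since only the climb-until-found loop comes from the slides.

```python
# Sketch of pyramid lookup: each level is a dict holding only the cells
# whose CF model is maintained; climb from the finest level upward.

def cell_id(level, x, y):
    """Cell containing (x, y) at a given level of a 2^level x 2^level
    grid over the unit square."""
    n = 2 ** level
    return (min(int(x * n), n - 1), min(int(y * n), n - 1))

def locate_model(pyramid, x, y):
    """Start at the lowest (finest) level; move up until a maintained
    cell covers the query location."""
    for level in range(len(pyramid) - 1, -1, -1):
        cid = cell_id(level, x, y)
        if cid in pyramid[level]:
            return level, pyramid[level][cid]
    raise KeyError("no cell covers the query location")

pyramid = [
    {(0, 0): "root model"},   # level 0: whole space, always maintained
    {(1, 1): "NE model"},     # level 1: only the NE quadrant kept
    {},                       # level 2: merged away entirely
]
print(locate_model(pyramid, 0.9, 0.9))  # -> (1, 'NE model')
print(locate_model(pyramid, 0.1, 0.1))  # -> (0, 'root model')
```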
• Continuous queries
  – (the user keeps querying while moving)
  – 1. if the user has not left the grid cell used for the previous query
    • the previous answer still applies
  – 2. otherwise
    • search upward level by level until a cell is found
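The continuous-query shortcut boils down to recomputing only on a cell change. A small sketch, where the caching wrapper and grid geometry are assumptions made here:

```python
# Sketch of the continuous-query shortcut: reuse the previous answer
# until the user crosses into a different grid cell.

def make_continuous_query(answer_for_cell, cell_of):
    """answer_for_cell: cell -> recommendations; cell_of: (x, y) -> cell id."""
    state = {"cell": None, "answer": None}

    def query(x, y):
        cell = cell_of(x, y)
        if cell != state["cell"]:            # user left the previous cell
            state["cell"] = cell
            state["answer"] = answer_for_cell(cell)
        return state["answer"]               # otherwise reuse the old answer

    return query

calls = []
def answer_for_cell(cell):
    calls.append(cell)                       # track how often we recompute
    return f"recs for {cell}"

cell_of = lambda x, y: (int(x), int(y))      # unit-sized grid cells
query = make_continuous_query(answer_for_cell, cell_of)
query(0.2, 0.2); query(0.8, 0.9)             # same cell: one computation
query(1.5, 0.5)                              # new cell: recompute
print(calls)                                 # -> [(0, 0), (1, 0)]
```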
3.3 Data Structure Maintenance
• triggered by new users, ratings, and items
• Trigger: maintenance starts only once N% new data has accumulated
• the maintenance cost is thereby amortized
• Step I: Model Rebuild
• Step II: Merging/Splitting Maintenance
3.3.1 Cell Merging
• Improves scalability
  – storage
    • smaller total CF model size (models are stored only at the higher level, not in the child cells)
    • the primary criterion
  – computational overhead
    • less maintenance computation
    • less continuous query processing computation
    • the secondary criterion
• Hurts locality
• Two percentage values
  – locality_loss
  – scalability_gain
• A system parameter M ∈ [0, 1]
• Merges if: (1 − M) · scalability_gain > M · locality_loss
  – the smaller M is, the more the system tends to merge
• Calculating locality_loss
  – sample: draw a set of users from the child cells
  – compare: measure how far the child-cell recommendations differ from the parent-cell recommendations
• Calculating scalability_gain
  – (storage of child cells) / (storage of child cells + parent cell)
  – continuing the earlier example:
    • 4 child cells == 2 GB
    • parent cell == 2 GB
    • scalability_gain = 2 / (2 + 2) = 50%
• locality_loss = 25%
• scalability_gain = 50%
• Assuming M = 0.7
• compare (1 − M) · scalability_gain with M · locality_loss
• but (0.3 × 50%) < (0.7 × 25%), i.e. 15% < 17.5%
• will not merge
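The worked example above fits in a few lines of code. The helper names are made up here; only the weighted gain-versus-loss comparison comes from the slides.

```python
# Sketch of the merge decision: merge when the weighted scalability gain
# outweighs the weighted locality loss.  Smaller M favors merging.

def scalability_gain(child_storage_gb, parent_storage_gb):
    """Fraction of storage freed by dropping the child-cell models."""
    return child_storage_gb / (child_storage_gb + parent_storage_gb)

def should_merge(m, locality_loss, scal_gain):
    """Merge iff (1 - M) * scalability_gain > M * locality_loss."""
    return (1 - m) * scal_gain > m * locality_loss

gain = scalability_gain(child_storage_gb=2, parent_storage_gb=2)
print(gain)                           # 0.5, i.e. 50%
print(should_merge(0.7, 0.25, gain))  # 0.15 < 0.175 -> False (no merge)
print(should_merge(0.3, 0.25, gain))  # 0.35 > 0.075 -> True  (merge)
```

With the default M = 0.3 used in the experiments, the same cells would merge, which matches the slide's point that small M favors merging.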
3.3.2 Cell Splitting
• has the opposite effect of Cell Merging
  – Improves locality
  – Hurts scalability
• calculated essentially the same way as Cell Merging
  – Merging uses locality_loss and scalability_gain
  – Splitting uses locality_gain and scalability_loss
4 Non-spatial User Ratings For Spatial Items
• (user, rating, item, ilocation)
• travel locality
• travel penalty
  – expensive computational overhead
  – so LARS employs "early termination"
4.1 Query Processing
• Algorithm
  – 1. From all items, find the k items with the smallest TravelPenalty, sort them by RecScore in descending order, and store them in list R
  – 2. Let LowestRecScore be the smallest (i.e. the k-th) RecScore in R
  – 3. From the remaining items, take the one with the smallest TravelPenalty
  – 4. Let MaxPossibleScore = MAX_RATING − TravelPenalty
  – 5. IF MaxPossibleScore <= LowestRecScore
    • 6. stop searching and return R directly
  – 7. otherwise, compute this item's RecScore
  – 8. IF RecScore > LowestRecScore
    • the item replaces the one with LowestRecScore in R
    • find the new LowestRecScore
    • go back to step 3
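The steps above can be sketched as follows. The item data is invented for illustration, and RecScore is modeled here simply as the CF score minus the travel penalty; the early-termination test in step 5 is the part taken from the slides.

```python
# Sketch of the early-termination algorithm: visit items in order of
# increasing travel penalty; stop once even a perfect rating could not
# beat the current k-th best score.
MAX_RATING = 5.0

def rec_score(item):
    """RecScore modeled as CF score minus travel penalty (assumption)."""
    return item[0] - item[1]

def recommend_with_penalty(items, k):
    """items: list of (cf_score, travel_penalty) pairs."""
    by_penalty = sorted(items, key=lambda it: it[1])      # steps 1 and 3
    # step 1: seed R with the k smallest-penalty items, best score first
    r = sorted(by_penalty[:k], key=rec_score, reverse=True)
    for item in by_penalty[k:]:
        lowest = rec_score(r[-1])                # step 2: the k-th score
        max_possible = MAX_RATING - item[1]      # step 4
        if max_possible <= lowest:               # step 5
            break                                # step 6: early termination
        if rec_score(item) > lowest:             # steps 7-8
            r[-1] = item                         # replace the lowest item
            r.sort(key=rec_score, reverse=True)  # restore descending order
    return r

items = [(4.0, 0.0), (3.0, 0.5), (5.0, 1.0), (2.0, 3.0), (4.9, 4.0)]
print(recommend_with_penalty(items, k=2))  # the last two items are pruned
```

The point of the ordering by TravelPenalty is that MaxPossibleScore only decreases as the scan proceeds, so once step 5 fires no later item can enter R.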
4.2 Incremental Travel Penalty Computation
• Incremental KNN
  – online
  – exact
  – expensive
• Penalty Grid
  – offline
  – less exact
  – efficient
5 Spatial User Ratings For Spatial Items
• (user, ulocation, rating, item, ilocation)
• user partitioning & travel penalty
  – can be used together
  – with very little change
6 Experiment
• test recommendation quality
  – Foursquare: real dataset
  – MovieLens: real dataset
• test scalability and query efficiency
  – Synthetic: synthetically generated dataset
• compared approaches:
  – CF: item-based collaborative filtering
  – LARS-T: LARS with only travel penalty
  – LARS-U: LARS with only user partitioning
  – LARS: LARS with both techniques
• default parameters
  – M == 0.3
  – k == 10
  – the number of pyramid levels (h) == 8
6.1 Recommendation Quality for Varying Pyramid Levels
• 80% of the ratings for training, 20% for validation
• quality measure
  – count how often a predicted recommendation falls within the user's actual top-k (default k = 10) rated items
• if the pyramid is split too finely, each grid cell holds too few ratings
6.2 Recommendation Quality for Varying Values of k
6.3 Storage Vs. Locality
• Note: the smaller M is, the more the system tends to merge; the larger M is, the more it tends to split
6.4 Scalability
[Figures: storage size and average maintenance time; default M = 0.3]
• LARS is acceptable.
6.5 Query Processing Performance
[Figure: response time for snapshot queries]
• snapshot queries: comparing LARS with LARS-U and with LARS-T shows the response-time advantage each of the two techniques contributes
[Figure: average response time for continuous queries]
• continuous queries: CF is the fastest (unsurprisingly, since it ignores location); among the rest, LARS is the fastest