Improvements in
Recommendation Systems
Shafi Rahman
Director, Analytic Science
FICO
© 2014 Fair Isaac Corporation. Confidential.
This presentation is provided for the recipient only and cannot be reproduced or shared without Fair Isaac Corporation’s express consent.
Patent Pending
Forward-Looking Statements
Product roadmaps and similar marketing materials
should be considered forward looking and
subject to future change at FICO’s discretion.
Future functionality, features or enhancements as shown are
FICO’s current projections of the product direction,
but are not specific commitments or obligations.
Increase customer engagement and value
by improving estimation of a user’s
preference for an item
Three Years after My Child Was Born, I Continued to Get the
Following Recommendations from an Online Retailer!
Irrelevant recommendations led to a disconnect with the retailer
On the Other Hand, the Following Recommendation Was Quite
Relevant at the Time I Got It from Another Retailer
And I actually ended up ordering it online!
Relevant recommendation
=
Improved engagement
=
Increased customer value
Relevant Recommendations Impact Many Businesses
Credit Card Issuers
Online Retailers
Recommendation Services
Movie/Music Streaming
…
We Will Focus on
Act 1: Using ratings data for predictions
Act 2: Improvements in working with ratings data
Act 3: Recommending without ratings data
Working with Ratings Data
Collaborative Filtering
[Figure: ratings matrix indexed by users (1…u) and items (1…i), with ratings from 1 to 5 and some entries missing]
Estimate missing ratings to give top N recommendations
► User Based Collaborative Filtering
► Item Based Collaborative Filtering
User Based Collaborative Filtering
[Figure: diagram labeled Similar, Likes, Likes, Recommend]
Mimics word of mouth approach
Item Based Collaborative Filtering
[Figure: diagram labeled Similar, Likes, Likes, Recommend]
Defining “Similarity”
Use Ratings Data
[Figure: ratings matrix indexed by users (1…u) and items (1…i), next to a plot of users u1, u2 and u3 as vectors on axes Item 1 and Item 2]
Commonly used similarity measures
► Angular similarity
► Magnitude similarity
How User Based Collaborative Filtering Works?
Users with Similar Preferences Will Rate Items Similarly
Load the ratings matrix (R)
Compute user similarity matrix (U) based on similarity of ratings by users
Aggregate item ratings weighted by user similarity
Rpredicted = U x R
Rating for user i, for item k:
$$r'_{ik} = \frac{\sum_j s_{ij}\, r_{jk}}{\sum_j s_{ij}}$$
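The user based prediction rule can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the presenter's implementation: the toy matrix `R`, the use of cosine similarity over co-rated items, and the choice to skip the target user and any user who has not rated item k are all assumptions made for the example.

```python
import math

# Toy ratings matrix (rows = users, columns = items); None marks a missing rating.
# The values are illustrative only.
R = [
    [5, 3, None],
    [4, 3, 4],
    [1, 5, 2],
]

def cosine(a, b):
    """Cosine similarity over the items that both users have rated."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    dot = sum(x * y for x, y in pairs)
    na = math.sqrt(sum(x * x for x, _ in pairs))
    nb = math.sqrt(sum(y * y for _, y in pairs))
    return dot / (na * nb) if na and nb else 0.0

def predict_user_based(R, i, k):
    """r'_ik = sum_j s_ij * r_jk / sum_j s_ij over users j != i who rated item k."""
    num = den = 0.0
    for j, row in enumerate(R):
        if j == i or row[k] is None:
            continue  # assumption: ignore the target user and users without a rating
        s = cosine(R[i], row)
        num += s * row[k]
        den += s
    return num / den if den else None

print(predict_user_based(R, 0, 2))
```

The estimate is a similarity-weighted average, so it always lies between the lowest and highest rating that the contributing users gave to item k.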
How Item Based Collaborative Filtering Works?
Users Prefer Items that Are Similar to Items They Already Like
Load the ratings matrix (R)
Compute item similarity matrix (I) based on similarity of ratings of items
Aggregate item ratings weighted by item similarity
Rpredicted = R x I
Rating for user i, for item k:
$$r'_{ik} = \frac{\sum_j r_{ij}\, s_{jk}}{\sum_j s_{jk}}$$
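The item based rule is the mirror image: similarities are computed between item columns, and the user's own ratings are averaged with those weights. As before, this is only a sketch; the toy matrix, cosine similarity over co-rating users, and skipping items the user has not rated are assumptions for illustration.

```python
import math

# Toy ratings matrix (rows = users, columns = items); None marks a missing rating.
R = [
    [5, 3, None],
    [4, 3, 4],
    [1, 5, 2],
]

def item_cosine(R, a, b):
    """Cosine similarity between item columns a and b over users who rated both."""
    pairs = [(row[a], row[b]) for row in R
             if row[a] is not None and row[b] is not None]
    dot = sum(x * y for x, y in pairs)
    na = math.sqrt(sum(x * x for x, _ in pairs))
    nb = math.sqrt(sum(y * y for _, y in pairs))
    return dot / (na * nb) if na and nb else 0.0

def predict_item_based(R, i, k):
    """r'_ik = sum_j r_ij * s_jk / sum_j s_jk over items j != k rated by user i."""
    num = den = 0.0
    for j, r_ij in enumerate(R[i]):
        if j == k or r_ij is None:
            continue  # assumption: only items the user has actually rated contribute
        s = item_cosine(R, j, k)
        num += r_ij * s
        den += s
    return num / den if den else None

print(predict_item_based(R, 0, 2))
```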
Strengths and Weaknesses
User Based Collaborative Filtering
Recommendations are usually better than Item Based Approach
Large memory: stores user similarity matrix
Expensive computation: large number of user-pairs
Strengths and Weaknesses
Item Based Collaborative Filtering
Recommendations are usually weaker than User Based Approach
Smaller memory: stores item similarity matrix
Cheaper computation: smaller number of item-pairs
Innovation #1: User Item Based Collaborative Filtering
Merging the Benefits and Nullifying the Weaknesses
Apply User Based Collaborative Filtering
► R’ = U x R
Replace estimated ratings with known ratings
► Update R’
Apply Item Based Collaborative Filtering
► R’’ = R’ x I
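The three steps above can be sketched as follows. This is an illustrative reading of the slide rather than the patented method: the helper names `user_based_fill` and `item_based_predict`, the cosine similarity, and the toy matrix are all hypothetical choices made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity over positions where both vectors have a value."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    dot = sum(x * y for x, y in pairs)
    na = math.sqrt(sum(x * x for x, _ in pairs))
    nb = math.sqrt(sum(y * y for _, y in pairs))
    return dot / (na * nb) if na and nb else 0.0

def user_based_fill(R):
    """Steps 1 and 2: estimate missing cells with user based CF, keep known ratings."""
    R1 = [row[:] for row in R]
    for i in range(len(R)):
        for k in range(len(R[0])):
            if R[i][k] is not None:
                continue  # a known rating always overrides an estimate
            num = den = 0.0
            for j in range(len(R)):
                if j != i and R[j][k] is not None:
                    s = cosine(R[i], R[j])
                    num += s * R[j][k]
                    den += s
            R1[i][k] = num / den if den else None
    return R1

def item_based_predict(R1, i, k):
    """Step 3: item based estimate r''_ik computed on the filled matrix R'."""
    cols = list(zip(*R1))
    num = den = 0.0
    for j in range(len(cols)):
        if j == k or R1[i][j] is None:
            continue
        s = cosine(cols[j], cols[k])
        num += R1[i][j] * s
        den += s
    return num / den if den else None

# Toy ratings matrix (rows = users, columns = items); None marks a missing rating.
R = [
    [5, 3, None],
    [4, 3, 4],
    [1, 5, 2],
]
R1 = user_based_fill(R)
print(R1[0][2], item_based_predict(R1, 0, 2))
```

Because the item based pass runs on a matrix that is already filled, item similarities can use every user rather than only the users who co-rated both items.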
Using User Item Based Collaborative Filtering in Production
Computing R’ is expensive: compute R’ once a month
Updating R’ is trivial: update R’ with new incoming rating
Computing R’’ is cheaper: compute R’’ in real time
Impact on Recommendations
Actual numbers vary by domain and dataset
Improvement Over    Reduction in RMSE
User based CF       5%
Item based CF       7%
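For readers unfamiliar with the metric, RMSE and a relative reduction in RMSE can be computed as below. The ratings, predictions and the two RMSE values passed to `relative_reduction` are hypothetical and are not the figures behind the table above.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between paired actual and predicted ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def relative_reduction(baseline_rmse, new_rmse):
    """Fractional reduction in RMSE relative to a baseline model."""
    return (baseline_rmse - new_rmse) / baseline_rmse

# Hypothetical held-out ratings and predictions, for illustration only.
actual = [4, 3, 5]
predicted = [3.5, 3.0, 4.0]
print(rmse(actual, predicted))
print(relative_reduction(1.00, 0.95))
```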
Innovation #2: Improvements in Similarity Metrics
Commonly used similarity measures
► Angular similarity
► Magnitude similarity
[Figure: users U1, U2 and U3 plotted as vectors on axes I1 Rating and I2 Rating]
      I1   I2
U1    2    2
U2    1    3
U3    4    4
Measuring Angular Similarity
Angular similarity
s(u1, u3) = s(u3, u1) = cosine(angle between u1 and u3) = 1
s(u1, u2) = s(u2, u1) = cosine(angle between u1 and u2) ~ 0.5
[Figure: U1 = (2, 2), U2 = (1, 3) and U3 = (4, 4) plotted as vectors on axes I1 Rating and I2 Rating]
User gives highest ratings to every item
User gives lowest ratings to every item
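As a worked check, angular similarity can be computed as plain cosine similarity on the ratings table from the slide. Treating "angular similarity" as plain cosine is an assumption. With that reading, U1 and U3 come out at 1, while U1 and U2 come out at about 0.89 rather than the ~0.5 quoted on the slide, so the slide's value is presumably indicative or based on a differently scaled measure.

```python
import math

# Ratings (I1, I2) for each user, taken from the table on the slide.
ratings = {"U1": (2, 2), "U2": (1, 3), "U3": (4, 4)}

def cosine(a, b):
    """Cosine of the angle between two rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

print(cosine(ratings["U1"], ratings["U3"]))
print(cosine(ratings["U1"], ratings["U2"]))
```

U1 and U3 point in the same direction, so cosine alone cannot tell a user who rates everything low from one who rates everything high.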
Measuring Magnitude Similarity
Magnitude Similarity
m(u1, u3) = ||u1||/||u3|| = 0.5
m(u3, u1) = ||u3||/||u1|| = 2
m(u1, u2) = m(u2, u1) ~ 1
[Figure: U1 = (2, 2), U2 = (1, 3) and U3 = (4, 4) plotted as vectors on axes I1 Rating and I2 Rating]
The distinction between exactly opposite ratings and ratings that differ sharply on a few items but match on the rest is lost
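The magnitude ratios can be checked the same way on the slide's ratings table. This sketch assumes the Euclidean norm for ||u||. Under that assumption, m(u1, u2) is about 0.89 and m(u2, u1) about 1.12, which the slide rounds to 1.

```python
import math

# Ratings (I1, I2) for each user, taken from the table on the slide.
ratings = {"U1": (2, 2), "U2": (1, 3), "U3": (4, 4)}

def norm(v):
    """Euclidean length of a rating vector (assumed definition of ||u||)."""
    return math.sqrt(sum(x * x for x in v))

def magnitude_similarity(a, b):
    """m(a, b) = ||a|| / ||b||; note that it is not symmetric."""
    return norm(a) / norm(b)

print(magnitude_similarity(ratings["U1"], ratings["U3"]))
print(magnitude_similarity(ratings["U3"], ratings["U1"]))
print(magnitude_similarity(ratings["U1"], ratings["U2"]))
```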
New Approach: A Combination of Angular and
Magnitude Similarities
► Predicted rating for user i, for item k, while j varies over nearest neighbors:
$$r'_{ik} = \frac{\sum_j s_{ij}\, m_{ij}\, r_{jk}}{\sum_j s_{ij}\, m_{ij}}$$
m = magnitude similarity
s = angular similarity
r = rating
Identify nearest neighbors using only s
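A minimal sketch of the combined rule follows, assuming cosine for s and a ratio of Euclidean norms for m. The neighbors' ratings for item k (5 and 3) and the neighborhood size are hypothetical values chosen for illustration. The vectors are the U1, U2, U3 ratings from the earlier slides.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def angular(a, b):
    """s: cosine of the angle between two rating vectors."""
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

def magnitude(a, b):
    """m: ratio of vector lengths, ||a|| / ||b||."""
    return norm(a) / norm(b)

def predict_combined(target, neighbors, n_neighbors=2):
    """r'_ik = sum_j s_ij * m_ij * r_jk / sum_j s_ij * m_ij.

    neighbors is a list of (rating_vector, rating_for_item_k) pairs.
    Nearest neighbors are selected using s only, as stated on the slide.
    """
    ranked = sorted(neighbors, key=lambda nb: angular(target, nb[0]), reverse=True)
    num = den = 0.0
    for vec, r_jk in ranked[:n_neighbors]:
        w = angular(target, vec) * magnitude(target, vec)
        num += w * r_jk
        den += w
    return num / den if den else None

u1 = (2, 2)
# (rating vector, hypothetical rating for the item being predicted)
neighbors = [((4, 4), 5), ((1, 3), 3)]
print(predict_combined(u1, neighbors))
```

In this toy case U3 has the higher angular similarity to U1 but a magnitude ratio of 0.5, so its rating of 5 carries less weight than U2's rating of 3.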
Impact on Recommendations
Combination of Innovation #1 and #2
Actual numbers vary by domain and dataset
[Figure: bar chart of RMSE on Test Data (axis from 0.90 to 1.05) for User Based, Item Based, User Item Based, and User Item Based + new similarity]
Combined Improvement Over    Reduction in RMSE
User based CF                6%
Item based CF                8%
Reality Check
► Most clients don’t have ratings data
► But significant interaction history is usually available
Purchase history
► Books
► Household items
Clickstream
► Page views
Streaming services
► Movies watched
► Music listened
Inspiration
Recommend a Book Using Themes
[Figure: books Divergent, Hunger Games, Nineteen Eighty-four, A Clockwork Orange and The Giver; theme summary “Mostly Dystopian, some catering to Young Adult”; label Recommend]
More Information Improves Recommendation
[Figure: books read: Divergent, Hunger Games, Nineteen Eighty-four, A Clockwork Orange, Fahrenheit 451; theme summary “Dystopian, Mostly Political, catering to Young Adult”; Recommend: Brave New World]
Challenges with This Approach
Is this expert list of themes complete/comprehensive?
Adventure
Autobiography
Children's literature
Coming of age
Drama
Dystopian
Fantasy
Fiction
Political fiction
Satire
Social science fiction
Southern Gothic
War
Is the expert assignment of themes correct/complete?
To Kill a Mockingbird: Southern Gothic? Coming of age? Fiction? ?
What We Need?
A data driven way to identify underlying themes
Assign each book to the identified themes
Compute reader’s themes based on books read
Recommend books closest to the user’s themes
One Minute Guide to Topic Models (1/2)
Topic Model discovers latent "topics" that occur in a collection of documents
Documents:
“I love my cat. It loves to drink milk. Milk keeps her healthy.”
“My dog is the best dog in the world. I feed him the best dog food.”
“Fruits and vegetables are necessary for getting sufficient mineral and vitamin.”
Topics: Topic 1 (Pets), Topic 2 (Food), Topic 3 (Health)
Topic words shown: Cat, Milk, Dog, Pet, Fruit, Healthy food, Mineral and vitamin, Nutritional supplement, Vegetable
No human intuition is required to determine these topics
One Minute Guide to Topic Models (2/2)
Topics: Topic 1 (Pets), Topic 2 (Food), Topic 3 (Health)
Topic words shown: Cat, Milk, Dog, Pet, Fruit, Healthy food, Mineral and vitamin, Nutritional supplement, Vegetable
New document: “I love my dog. He hates drinking milk.”
Assigned topics: Topic 1 (Pets), Topic 2 (Food)
No human intuition is used to assign the topics
Step 1: Identify Data Driven Themes
Use Topic Modeling to discover Themes
Read each book
To Kill a Mockingbird
Nineteen Eighty-four
The Lord of the Rings
The Catcher in the Rye
The Great Gatsby
Harry Potter & Sorcerer's Stone
The Diary of a Young Girl
Animal Farm
[Figure: books → words (word1, word2, …, wordN) → Theme 1 to Theme 7]
Theme = Topic
No human intuition is required to determine these themes
Step 2: Assign Data Driven Themes to Books
Assign books to each Theme
Read each book
[Figure: each book below mapped to Theme 1 through Theme 7]
To Kill a Mockingbird
Nineteen Eighty-four
The Lord of the Rings
The Catcher in the Rye
The Great Gatsby
Harry Potter & Sorcerer's Stone
The Diary of a Young Girl
Animal Farm
No human intuition is used to assign the themes
Step 3: Derive Themes of a User from Books Read
[Figure: theme profiles over Theme1 to Theme7 for each book read (Divergent, Hunger Games, Nineteen Eighty-four, A Clockwork Orange, Fahrenheit 451) and for the resulting User]
Both books and users are described using same Themes
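One simple way to express a user in the same theme space as the books is to average the theme vectors of the books read. The slide does not say how the user profile is aggregated, so the averaging, the three-theme vectors and their weights are all assumptions for illustration.

```python
# Hypothetical theme weights per book (three themes instead of seven, for brevity).
book_themes = {
    "Divergent": [0.7, 0.3, 0.0],
    "Hunger Games": [0.6, 0.4, 0.0],
    "Nineteen Eighty-four": [0.5, 0.0, 0.5],
}

def user_theme_profile(books_read, book_themes):
    """Average the theme vectors of the books a user has read (assumed aggregation)."""
    vectors = [book_themes[title] for title in books_read]
    if not vectors:
        raise ValueError("user has no books with known themes")
    return [sum(column) / len(vectors) for column in zip(*vectors)]

profile = user_theme_profile(
    ["Divergent", "Hunger Games", "Nineteen Eighty-four"], book_themes
)
print(profile)
```

If every book's theme weights sum to 1, the averaged user profile also sums to 1, which keeps users and books directly comparable.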
Step 4: Assign A Book to User by Comparing Themes
Books most similar to user themes
Recommend: Mockingjay, Unwind, Insurgent
[Figure: theme profiles over Theme 1 to Theme 7 for the User and for the recommended books]
Similarity is based on distance between user and books in terms of their themes
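Ranking candidate books by distance to the user's theme profile can be sketched as below. Euclidean distance is an assumed choice, since the slide only says "distance", and the theme weights for the user and the books are hypothetical.

```python
import math

def recommend(user_profile, candidates, top_n=3):
    """Return the top_n book titles whose theme vectors are closest to the user."""
    def distance(vector):
        # Euclidean distance in theme space (assumed metric)
        return math.sqrt(sum((u - b) ** 2 for u, b in zip(user_profile, vector)))
    return sorted(candidates, key=lambda title: distance(candidates[title]))[:top_n]

# Hypothetical theme weights (three themes instead of seven, for brevity).
user = [0.6, 0.3, 0.1]
candidates = {
    "The Great Gatsby": [0.0, 0.1, 0.9],
    "Insurgent": [0.5, 0.2, 0.3],
    "Mockingjay": [0.6, 0.35, 0.05],
    "Unwind": [0.7, 0.3, 0.0],
}
print(recommend(user, candidates))
```

In practice, books the user has already read would be filtered out of `candidates` before ranking.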
Use Case 2: Recipe Recommendation
► Benefits
► Increase customer footfall
► Encourage discovery
► Apply approach similar to book recommendation
► But, using textual description of items doesn’t work
“apple nut bread” is bread, not apple or nuts
“orange winter squash” is a squash, not orange
“lime peel” is lime, not peel
Recipe Recommendation Using Purchase Tx Data
Identify latent preference groups based on ingredients purchased by customers
Assign each customer to the identified latent preferences
Compute likelihood of buying an ingredient given customer’s latent preferences
Compute likelihood of liking a recipe given likelihood of buying ingredients
Discern Customer Preferences from Purchase Tx Data
► Examples of latent preference groups described in terms of ingredients purchased by customers
“Asian stir fry” group
► Ethnic Greens 0.26
► Bok Choy 0.23
► Gingerroot 0.11
► Green Onions 0.08
► Tofu 0.04
“Vegetarian” group
► Tofu 0.25
► Deli Meat Alternative 0.22
► Veggie Burger 0.12
► Vegetarian Cheese 0.08
► Vegetarian 0.03
► If prior history had high “Asian stir fry” allocation, then “Tofu” reinforces it
► If prior history had high “Vegetarian” allocation, then “Tofu” reinforces different archetype
Compute Likelihood of Buying an Ingredient
“Asian stir fry” group
► Ethnic Greens 0.26
► Bok Choy 0.23
► Gingerroot 0.11
► Green Onions 0.08
► Tofu 0.04
“Vegetarian” group
► Tofu 0.25
► Deli Meat Alternative 0.22
► Veggie Burger 0.12
► Vegetarian Cheese 0.08
► Vegetarian 0.03
Customer   Asian stir fry   Vegetarian   P(Tofu)
Amy        Low              Low          Low
Bob        High             Low          Medium-Low
Carol      Low              High         Medium-High
Dave       High             High         High
Customers need not have bought Tofu in the past, as long as their preference group is inferred from their other purchases
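One way to read the table is as a mixture: the likelihood of buying an ingredient is the customer's allocation to each preference group times that group's weight for the ingredient, summed over groups. The sketch below assumes that reading. The Tofu weights 0.04 and 0.25 come from the slide, while the "Other" group and every customer allocation are hypothetical numbers standing in for Low and High.

```python
# P(Tofu | group). The two named groups use the slide's weights; "Other" is an
# assumed catch-all group so that each customer's allocations can sum to 1.
p_tofu_given_group = {"Asian stir fry": 0.04, "Vegetarian": 0.25, "Other": 0.0}

# P(group | customer): hypothetical allocations standing in for Low/High.
allocations = {
    "Amy":   {"Asian stir fry": 0.05, "Vegetarian": 0.05, "Other": 0.90},
    "Bob":   {"Asian stir fry": 0.40, "Vegetarian": 0.05, "Other": 0.55},
    "Carol": {"Asian stir fry": 0.05, "Vegetarian": 0.50, "Other": 0.45},
    "Dave":  {"Asian stir fry": 0.40, "Vegetarian": 0.55, "Other": 0.05},
}

def ingredient_likelihood(group_weights, p_ingredient_given_group):
    """P(ingredient | customer) = sum over groups of P(group | customer) * P(ingredient | group)."""
    return sum(w * p_ingredient_given_group[g] for g, w in group_weights.items())

p_tofu = {
    name: ingredient_likelihood(weights, p_tofu_given_group)
    for name, weights in allocations.items()
}
for name, p in p_tofu.items():
    print(name, round(p, 4))
```

With these numbers the ordering matches the table (Amy, then Bob, then Carol, then Dave), and no customer needs a past Tofu purchase to receive a non-zero likelihood.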
Feedback on Results
► A Blind Test
► Generated recipe recommendations for 1000 customers
► Some of them were top management of the client (not known to us)
► Only the top management of the client received the recommendations
They loved the recommendations!
► Large Scale Trial Approved and Now Underway
Benefits of Data Driven Latent Theme/Preference Approach
No explicit ratings data needed
Uses Tx data to extract themes/preferences
Empirical - no human input
Scales to large number of users and items
Summary of Our Recommendation Systems
With ratings data
► More predictive
► Computationally optimized
Without ratings data
► Relevant recommendations without explicit ratings/preference
► Great initial customer feedback
Thank You!
Shafi Rahman
[email protected]
+1 (858) 353-8280
+91 (804) 137-1768