Transcript PPTX

Personalizing Search
on Shared Devices
Ryen White and Ahmed Hassan Awadallah
Microsoft Research, USA
Contact: [email protected]
Shared Device Search
• 2011 Census: 75% of U.S. households have a computer
• In most homes that machine is shared among multiple people
• Search engines use machine identifiers based on cookies, IDs, etc.
• This assumes a 1:1 mapping from IDs to people for analysis and personalization
Shared devices in households
• Attributing activity to people (not machines) may improve personalization
• Some early indications of effectiveness in prior work (Singla et al., 2014)
Is Shared Device Searching Common?
• Analyzed comScore search data (all engines, en-US)
• Both machine identifiers and person identifiers (users self-identify)
• Fraction of machine identifiers that are multi-user varies with the profile window size:
• 6 months = 66%
• 3 months = 57%
• 1 month = 44%
• Aside: Within-session shared device search is less common: 97% of sessions involve a single user
Handling Shared Device Use
• Limited current solutions in search engines (user sign-in)
• However: requires user effort to sign in, and people don’t sign out, so their signals get mixed
• Some solutions in other domains, e.g., streaming media
• Ideally this would happen automatically without user needing to explicitly log in
• Search activity attribution methods can help with this …
Activity Attribution Challenge
• Given a stream of data from a machine identifier, attribute observed historic
and new behavior to the correct person
[Diagram: history of search activity on a machine interleaves User 1, User 2, User 1, User 3; a new query arrives and must be attributed to one of {k user clusters}]
• Related work in signal processing and fraud detection
• Applications for: Personalization, Advertising, Privacy protection
• Question: What is the upper bound on gain from attribution-based methods?
• We perform an ORACLE study with perfect knowledge of who is searching
“From devices to people: Attribution of search activity in multi-user settings” White et al., WWW2014
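As a minimal sketch of one plausible attribution scheme (not necessarily the method in the paper above), newly observed activity can be assigned to the nearest of the k user clusters by cosine similarity between category distributions; all names here are illustrative:

import math

def cosine(a, b):
    # Cosine similarity between two sparse ODP category distributions (dicts).
    num = sum(a[k] * b.get(k, 0.0) for k in a)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def attribute(new_activity, user_clusters):
    # new_activity: dict of ODP category -> weight for the newly observed activity.
    # user_clusters: dict of user id -> historic category distribution.
    # Returns the id of the best-matching user cluster.
    return max(user_clusters, key=lambda u: cosine(new_activity, user_clusters[u]))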
Key Contributions
• Introduce attribution-based personalization (ABP) and
estimate its value in ORACLE STUDY (perfect knowledge of who is searching)
• Show machine vs. person is meaningful for an important application:
predicting searchers’ future interests
• Identify properties of interest models and queries for which ABP is best
• Learn model to predict when to apply ABP on a per-query basis
Attribution-Based Personalization (ABP)
Three phases:
• Activity attribution and interest model construction for individuals from historic activity
• Attribution of newly-observed activity to the correct searcher
• Application of that searcher’s specific interest model for personalization
Building Interest Models
• Build machine and person interest profiles based on the ODP hierarchy
• Use result clicks
• Category distributions can differ between people and machines, e.g.,
• Sports/Tennis is largest in the machine model, but is the top category for only one searcher (B)
• Some topics have broad interest, e.g., all searchers are interested in movies
⇒ Individualized models could matter
• Question is how much and when do they matter most and least?
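A minimal sketch of interest model construction, assuming a hypothetical classify_click helper that maps a clicked URL to an ODP category:

from collections import Counter

def build_interest_model(result_clicks, classify_click):
    # result_clicks: iterable of clicked URLs for a person or machine.
    # classify_click: hypothetical helper mapping a URL to an ODP category.
    # Returns a normalized ODP category distribution.
    counts = Counter(classify_click(url) for url in result_clicks)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}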
Key Contributions
• Introduce attribution-based personalization (ABP) and
estimate its value in ORACLE STUDY (perfect knowledge of who is searching)
• Show machine vs. person is meaningful for an important application:
predicting searchers’ future interests
• Identify properties of interest models and queries for which ABP is best
• Learn model to predict when to apply ABP on a per-query basis
Dataset
• Two years of comScore logs
• Divided into two subsets:
[Timeline per machine or person: 6 months of interest model building, then 1 month of evaluation]
• Model building: 6 months of comScore search logs (Jan13–Jun13)
• Evaluation: the 1 month immediately following, used to evaluate predictions (Jul13)
• Result clicks from each person/machine used to construct interest models
• Machine click thresholds:
• MODEL BUILDING: ≥ 100 clicks
• EVALUATION: ≥ 15 clicks
Prediction Task
• Given a query and interest model, predict ODP categories of next click
• Vary identifier type and match type
• Identifier type: Machine- or person-based
• Match type: All historic activity or on-task activity only
Match type × Identifier type:
                     Machine-based   Person-based
All activity         (a)             (b)
On-task activity     (c)             (d)
• On-task historic activity: clicks associated with historic queries that share at least one non-stopword term with the current query
• On-task models more accurately reflect the state-of-the-art in personalization
(Bennett et al. SIGIR12; Teevan et al. WSDM11)
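A minimal sketch of the on-task match criterion, assuming a small illustrative stopword list:

# A small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "for", "in", "on", "and", "to"}

def terms(query):
    # Lowercased non-stopword terms of a query.
    return {t for t in query.lower().split() if t not in STOPWORDS}

def is_on_task(historic_query, current_query):
    # On-task criterion: at least one non-stopword term in common.
    return bool(terms(historic_query) & terms(current_query))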
Prediction Task: Evaluation Metrics
• Precision (P): 1 if the top predicted label matches the actual label, else 0
• Recall (R): 1 if the actual label appears anywhere among the predicted labels, else 0
• F1 score: Harmonic mean of P and R
• Reciprocal Rank (RR): 1 ⁄ r, where r is the rank position of the actual label among the predictions; 0 if it does not appear
• Averaged over all queries in evaluation dataset
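A minimal sketch of the per-query metrics (names illustrative):

def query_metrics(predicted_labels, actual_label):
    # predicted_labels: ranked list of predicted ODP categories.
    # Returns (precision, recall, f1, reciprocal_rank) for one query;
    # averaging over all evaluation queries gives the overall metrics.
    p = 1.0 if predicted_labels and predicted_labels[0] == actual_label else 0.0
    r = 1.0 if actual_label in predicted_labels else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    rr = 1.0 / (predicted_labels.index(actual_label) + 1) if actual_label in predicted_labels else 0.0
    return p, r, f1, rr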
Evaluation Method
Given our evaluation set (𝑄) = {timestamp, machine identifier, person identifier, query, {result clicks}} for each query (𝑞) in 𝑄:

For each identifier type in {machine, person}:
  For each match type in {all, on-task}:
    For each 𝑞 ∈ 𝑄:
      If match type = all: obtain all historic queries from the machine (identifier type = machine) or from the searcher (identifier type = person) in the model building dataset
      If match type = on-task: find all historic queries from the machine or the searcher with ≥ 1 non-stopword term in common with 𝑞 in the model building data
      • Get clicked results for each of the queries and assign ODP categories to the clicked results
      • Build an interest model (𝑢) comprising the normalized distribution of ODP categories from the assignment
      • Select the top-weighted predicted label in 𝑢, denoted 𝑝𝑙1
      • Compute the effectiveness of the method in relation to the ground truth
    • Average metric values for the match type across all 𝑞 ∈ 𝑄 to compute the overall performance metrics
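The loop above can be made concrete; this is a minimal runnable sketch assuming the hypothetical helpers from the earlier sketches (build_interest_model, is_on_task, query_metrics) and a hypothetical history lookup:

from statistics import mean

def evaluate(eval_set, history, classify_click):
    # eval_set: list of dicts with keys 'machine', 'person', 'query', 'clicks'.
    # history: hypothetical lookup, history[(id_type, id)] -> list of
    #          (historic query, clicked urls) pairs from the model building set.
    # Returns mean F1 for each (identifier type, match type) cell.
    results = {}
    for id_type in ("machine", "person"):
        for match_type in ("all", "on-task"):
            f1s = []
            for q in eval_set:
                past = history[(id_type, q[id_type])]
                if match_type == "on-task":
                    past = [(pq, urls) for pq, urls in past if is_on_task(pq, q["query"])]
                clicks = [url for _, urls in past for url in urls]
                model = build_interest_model(clicks, classify_click)
                ranked = sorted(model, key=model.get, reverse=True)
                actual = classify_click(q["clicks"][0])  # category of the next click
                f1s.append(query_metrics(ranked, actual)[2])
            results[(id_type, match_type)] = mean(f1s)
    return results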
Prediction Results
• Focus on machines w/ 2+ users in the rest of our analysis
• Shared device searching can be predicted accurately (White et al., WWW14)
• Machine-based models are our
baselines for each of the two
match types
• Gains in precision, F1, and RR
• 11-15% in overall perf.
• 19-43% for on-task perf.
• Focus on F1 for remainder of analysis
Recall is slightly higher for machine-based models because they are a superset of the person-based models
Key Contributions
• Introduce attribution-based personalization (ABP) and
estimate its value in ORACLE STUDY (perfect knowledge of who is searching)
• Show machine vs. person is meaningful for an important application:
predicting searchers’ future interests
• Identify properties of interest models and queries for which ABP is best
• Learn model to predict when to apply ABP on a per-query basis
Impact of Additional Factors
• Properties of the interest models and query can influence utility of ABP
• Model Properties
• Model entropy: Entropy of the interest model (low, medium, high)
• Relative model size: Fraction of machine-based model
• Number of searchers on machine
• Query Properties
• Click entropy: Diversity of clicks (low, medium, high)
• Popularity: Frequency of query (low, medium, high)
• Topic: Top-level ODP category
• Focus on two highlighted factors (see paper for rest)
• Control for task effects by focusing on on-task model variants
Impact of Additional Factors
• Compute the gains differentially based on features of the models and the queries, e.g.,
• Model entropy, i.e., diversity of the category (c) model on the machine (m):
  H(m) = − Σ_{c∈C} p(c|m) log p(c|m)
• Query topic, i.e., top-level ODP category of the top result for the query
• When the machine-based model is more diverse, person-based methods perform better ⇒ more benefit from focus
• Topics for which specific users are already well represented (only a small number of searchers interested)
• Others where interests are more broad
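A minimal sketch of the model entropy computation over a normalized interest model:

import math

def model_entropy(interest_model):
    # H(m) = -sum over c in C of p(c|m) * log p(c|m).
    # Higher entropy means a more diverse, less focused machine model.
    return -sum(p * math.log(p) for p in interest_model.values() if p > 0)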
Key Contributions
• Introduce attribution-based personalization (ABP) and
estimate its value in ORACLE STUDY (perfect knowledge of who is searching)
• Show machine vs. person is meaningful for an important application:
predicting searchers’ future interests
• Identify properties of interest models and queries for which ABP is best
• Learn model to predict when to apply ABP on a per-query basis
Applying Model and Query Properties
• Train a model to learn when to apply ABP on a per-query basis
• Featurized properties of the model and the query based on additional factors:
Feature Name          Description
MachineModelEntropy   Entropy of the interest model constructed from machine activity
RelativeModelSize     Fraction of the machine interest model occupied by classified historic clicks
NumberOfSearchers     Number of distinct searchers
QueryClickEntropy     Click entropy for the query
QueryPopularity       Query frequency, computed based on the held-out Bing search log data
QueryTopic            Top-level ODP category of the query
• 130k evaluation queries from 2.5k people (1k machines)
• 6mo/1mo build/test, MART-based classifier, 10-fold CV, 100 runs, compute F1
• Labels: Positives: ABP > Machine-level, Negatives: ABP ≤ Machine-level
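A minimal sketch of the per-query classifier, using scikit-learn's GradientBoostingClassifier as a stand-in for the MART learner; the feature files are hypothetical placeholders:

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# X: one row per query with the six features above (QueryTopic encoded as an integer).
# y: 1 if ABP beat the machine-level model for that query, else 0.
X = np.load("abp_features.npy")  # hypothetical precomputed feature matrix
y = np.load("abp_labels.npy")    # hypothetical labels

clf = GradientBoostingClassifier()  # stand-in for the paper's MART classifier
scores = cross_val_score(clf, X, y, cv=10, scoring="accuracy")
print(f"10-fold CV accuracy: {scores.mean():.3f}")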
Selective Application of ABP
• Best: 21% ABP, 9% baseline, 70% tied
• Applying prediction in personalization:
• Predict which model best:
• Strong predictive performance (acc.
= 0.918) > marginal baseline (0.791)
• ABP performance of 88-96% of the oracle (i.e., always applying the best model)
• Much better than always applying ABP
• Top features: MachineModelEntropy
(max), RelativeModelSize (0.699 of
max), QueryTopic (0.441 of max)
• Demonstrates the benefits of intelligently
applying ABP for each query
Discussion
• Shared device searching common
• Oracle study showed clear utility from ABP
• Focused on click prediction; Other applications need to be examined
• Need to assess performance with automated ABP methods
• Alternative self-identification methods need to be examined (e.g., sign-in)
• Closer link between people and devices ⇒ impact on shared device usage?
Summary and Takeaway
• Introduced attribution-based personalization, performed oracle study
• Observe an increased accuracy in future interest predictions (11-19% in the
F1-score, depending on match type) by applying this approach
• Gains vary by model/query properties, motivating selective application of the method
• Significant opportunities to enhance personalization via tailored models
• Future work:
• More (non-oracle) studies with different ABP methods
• ABP methods for truly personalized ranking and recommendation at scale
Shared Device Searching: Distribution
• Distribution of users searching per machine
• Generally one dominant searcher (44-83% of queries)
• The dominant searcher's share decreases as more users share the machine, but they remain by far the most active
• Plus many other less active searchers