Project Overview


“Beyond RFM”
Building the CRS Online Community
February 2005 DMFA Roundtable
Test #1 Email Campaign
Kevin Whorton,
Direct Response Fundraising Consultant
Catholic Relief Services
[email protected]
February 25, 2005
Modeling: Theory and Reality
Theory: RFM Has Weaknesses
Limited use of information: gift history only
Omits demographics, psychographics
Mostly provides decision support for marginal audiences
No prioritization: R<F<M? … M>R=F? … M=R=F?
Uses language of discrete, not continuous variables
Reality: RFM Works Well Enough Most Times
House file mailings—very strong, long histories
House file telemarketing
Could be improved but little incentive to do so:
» Can only be so efficient on mailings
» Beyond some point minimizing cost may minimize revenue
CRS:
Current Practices
Limitations
Future Applications
Applying Techniques at CRS
➢ House File Model Use
  • Target Analysis Group: affinity/other gift behavior
    ➢ Powerful for screening out the 50% waste; including lapsed donors in acquisition now outperforms a dedicated lapsed campaign
  • Genalytics: full-file scoring by half-decile
    ➢ Full house file, ranked by future probability of giving
➢ Acquisition Model
  • Selection criteria used during list selection
    ➢ Zip models and “Catholic Finder”
  • Full acquisition model
    ➢ Created household database from 45 million past contacts
    ➢ File scoring after merge-purge: typical 20% suppression
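A minimal sketch of half-decile scoring with a bottom-20% suppression, assuming each record already carries a model score. The record IDs, the scores, and the exact cut-off are illustrative assumptions, not CRS data or the Genalytics method.

```python
# Sketch: rank a scored file into half-deciles (20 bands), then suppress the bottom 20%.
# Scores and record IDs are made up for illustration.

def half_decile(rank: int, total: int) -> int:
    """Return a band from 1 (best) to 20 (worst) for a 0-based rank in a file of `total` records."""
    return rank * 20 // total + 1

def score_file(scores: dict) -> dict:
    """Map each record ID to its half-decile, ranking by predicted probability of giving."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {rec_id: half_decile(i, len(ranked)) for i, rec_id in enumerate(ranked)}

# Hypothetical scored file: 40 households with arbitrary probabilities.
scores = {f"HH{i:03d}": (i * 37 % 101) / 100 for i in range(40)}
bands = score_file(scores)

# Suppressing the bottom 20% means dropping half-deciles 17 through 20.
mail = [rec_id for rec_id, band in bands.items() if band <= 16]
suppressed = [rec_id for rec_id, band in bands.items() if band > 16]
print(len(mail), len(suppressed))
```

Working in 20 bands rather than 10 gives finer control over where the suppression line falls.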
Expanding Demographic Data
➢ Distinguishing between donors: marketing vs. DM
  • Profiling new donors: 62 years avg vs. “youth movement”
  • Drawing linkage between awareness and donation
  • Understanding relationship: first gift → ongoing behavior
➢ We now use data to categorize donors
  • By appeal: emergency, region, program area
  • By vehicle: catalog, calendar, newsletter, TM, e-mail
  • By timing: seasonality
  • By preferences: limited mailing, no mail, no TM
    ➢ Especially critical, post-Tsunami
➢ Data used to drive frequency
  • Segmenting beyond RFM, going deeper into files
    ➢ Often based on Interest Codes (next slide)
Example: Interest Codes
Used for Inclusions/Exclusions
Entire file
• Coded with a mix of Donor Service & DM codes
• Simplify our house file selection
• Behavior captured to:
- simplify ad hoc analysis
- extend RFM
- develop profiles
- crosstab “donor types”
Interest Code Description      Interest Code ID
Delivery Point Validation      DSF1
Emergency Donors               ED
Newer Donors                   ND
Renewal Donors                 RD
Catalog Overlay                CAO
Premium Donors                 PRD
Hispanic Indicator             HISIND
Wooden Bell Donors             WBD
Telemarketing Donors           TD
Score 0-4                      0S4
Score 95-99                    95S99
Fiscal Year 2003 counts as listed, largest first (12 values for the 11 codes shown): 734794, 663157, 363374, 324288, 103340, 87914, 82993, 67420, 65556, 65489, 56137, 55684
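To make "used for inclusions/exclusions" concrete, here is a small sketch of a selection driven by interest codes. Only the code IDs come from the slide; the donor IDs, the code assignments, and the `select` helper are hypothetical.

```python
# Sketch: treat interest codes as inclusion/exclusion flags in a house-file selection.
# Donor IDs and their code assignments are invented for illustration.

donors = {
    "D001": {"RD", "TD"},      # renewal donor who also gives by telemarketing
    "D002": {"ED", "ND"},      # newer donor who gave to an emergency
    "D003": {"RD", "95S99"},   # renewal donor with a top model score
    "D004": {"ND", "0S4"},     # newer donor with a bottom model score
}

def select(donors, include, exclude):
    """Keep donors carrying at least one include code and none of the exclude codes."""
    return sorted(
        donor_id
        for donor_id, codes in donors.items()
        if codes & include and not codes & exclude
    )

# Mail renewal and newer donors, but omit telemarketing donors and the lowest scores.
picked = select(donors, include={"RD", "ND"}, exclude={"TD", "0S4"})
print(picked)
```

Storing the codes as a set per donor keeps both the selection and ad hoc crosstabs of "donor types" down to simple set operations.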
Other (Non-Modeling) Data
➢ Simulations: gift arrays
  • Demographic overlays beyond DM: mid-level PG, MG
    ➢ Age & wealth trump typical RFM giving behavior
➢ Mail sensitivity analysis
  • Finding correlation between total mailings, gifts per donor
    ➢ Goal: maximize satisfaction without sacrificing revenue
➢ Maintaining "interest codes" library of preferences
➢ Merge-purge with greater control
  • Moved internally, staff analyst & FirstLogic software
➢ Conversion analysis
  • List life-cycle: tables showing LTV (2-year) by acq. list
  • Target Analysis: benchmarking/comparisons
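The mail sensitivity analysis above comes down to a correlation between mailings sent and gifts received per donor. A minimal sketch, assuming one row per donor; the six records are invented, and a correlation alone does not show that extra mailings cause extra gifts.

```python
# Sketch: Pearson correlation between mailings sent and gifts received per donor.
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical donors: pieces mailed in a year, and gifts received that year.
mailings = [2, 6, 12, 12, 18, 24]
gifts = [1, 1, 2, 3, 3, 3]

r = pearson(mailings, gifts)
print(round(r, 2))
```

A value near 1 would suggest gifts rise with mail volume; a flat or falling relationship at high volumes is where reduced-mail schedules become worth testing.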
Other Data: Research
➢ Donor research
  • Analyzing share of market/share of wallet
  • Knowing what else donors give to
➢ Qualitative/focus groups
  • Package/teaser/copy testing
  • Underlying motivations/drivers/perceptions
➢ Market research
  • Measuring aided/unaided recall, aficionados
  • Cluster models (segmentation studies)
  • Positioning studies (branding, relative message)
  • Competitive intelligence
Limitations: Analyzing Results
➢ Most segmentation built to drive reporting
  • Pledgemaker report writer
  • Occasional use of Business Objects/SAS for ad hoc analysis
➢ Most segmentation is by discrete RFM buckets
  • Segmentation continues in the "normal way"
    $25-$49, 0-12 months, F1+
    $50-$99, 0-12 months, F1+
    $100-$249, 0-12 months, F1+
  • Extending universe based on interest codes
  • Applying excludes
    ➢ Record types (PG, Corp, Spanish-language, Religious Orders)
    ➢ Individual preferences (1, 2, 6, 12x preferred mail schedules)
    ➢ Mutual omits from overlapping campaigns
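The "normal way" of segmenting can be sketched as a lookup into discrete cells followed by excludes. The bucket edges follow the slide's examples; the donor records, the field names, and the reading of the dollar range as the most recent gift are assumptions.

```python
# Sketch: assign donors to discrete RFM cells, then apply record-type excludes.

GIFT_BUCKETS = [(25, 49), (50, 99), (100, 249)]  # dollar ranges from the slide

def rfm_cell(last_gift, months_since, gift_count):
    """Return a label such as '$25-$49, 0-12 months, F1+', or None if outside the grid."""
    if gift_count < 1 or not 0 <= months_since <= 12:
        return None
    for low, high in GIFT_BUCKETS:
        if low <= last_gift < high + 1:  # lets $49.50 fall in the $25-$49 cell
            return f"${low}-${high}, 0-12 months, F1+"
    return None

# Hypothetical house-file records.
donors = [
    {"id": "A", "last_gift": 30, "months": 3, "gifts": 2, "record_type": "IND"},
    {"id": "B", "last_gift": 75, "months": 10, "gifts": 1, "record_type": "CORP"},
    {"id": "C", "last_gift": 150, "months": 14, "gifts": 5, "record_type": "IND"},
]
EXCLUDED_TYPES = {"PG", "CORP"}  # record types omitted from this mailing

selection = {}
for d in donors:
    if d["record_type"] in EXCLUDED_TYPES:
        continue
    cell = rfm_cell(d["last_gift"], d["months"], d["gifts"])
    if cell is not None:
        selection[d["id"]] = cell
print(selection)
```

Donor C drops out only because 14 months falls just past the recency edge, which is the kind of hard boundary that continuous scoring avoids.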
Best Intentions: Other Applications
➢ Original goal in 2003: "family of models"
  • Telemarketing
  • Early warnings of defection
  • Lapsed donors
  • Upgrade potential: mid-level program
➢ Reasons for using:
  • High cost per contact/good stewardship
  • Sensitivity to complaints
    » Predict positive and negative outcomes
    » Complaints seen as proxy for reduced lifetime value
➢ Reasons not pursued
  • Not a $$ limitation, but rather management time
Goal/Vision
➢ Want to be more "donor focused"
  • Finding constructive ways to avoid treating all donors the same
  • RFM often treats as identical:
    » $500 donor, every year, 1 gift very end of year
    » $500 cumulative donor, monthly frequency
    » $500 first-time donor
  • Goal: sufficiently flexible systems to tailor contact sequence
    ➢ Hard to implement CRM systems to reduce costs/maximize efficiency & donor satisfaction
Sample: Donor-focused Grid
Use the gift they give to this appeal
Consider lifetime seasonal giving activity
Sample Analysis: Years on File
• Graphing non-linear relationships: finding “sweet spots”
Analysis: Lifetime Avg. Gift
• And knowing when the relationships really are linear/predictive.
Quick Guide to Models/Techniques
Guide to Models
• Three major families:
  ➢ Parametric methods
    • Linear regression, logistic regression
  ➢ Recursive partitioning methods (e.g., CHAID)
    • Tree diagrams: easier to see interaction between variables. Most time consuming.
  ➢ Non-parametric methods
    • Neural networks, genetic/natural selection algorithms
    • Artificial intelligence: "learning models" used at CRS
• Results are far more important than the method chosen
  ➢ Results: more a function of data quality than technique
Source: Target Analysis Group: Jason Robbins, statisticians
Sophisticated Techniques, Simple Answers
Cross-tabulations
  ➢ Show simple relationships between variables, typically as percentages
  ➢ "Grids" allow easy audience selection, but are complex to review
Correlation: relationship between two variables
Regression:
  ➢ Y = f(x1, x2, x3), e.g., Membership = function of dues level, presence of competition, penetration, service mix
  ➢ R² “explains” relationship between one variable and everything driving it
  • Projections and forecast models
  • Logistic regressions: “yes/no” predictions
  • Logarithmic: coefficients = percentage contribution
  • Dummy variables: use to measure seasonality, time trends, effects of one-time shifts
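As an illustration of the cross-tabulation "grid" described above, the sketch below computes a response percentage for each combination of two variables. The mailing records and band labels are hypothetical, and only the standard library is used.

```python
# Sketch: cross-tabulate response rate by recency band and gift band.
from collections import defaultdict

records = [
    # (recency band, gift band, responded?)
    ("0-12 mo", "$25-$49", True),
    ("0-12 mo", "$25-$49", False),
    ("0-12 mo", "$50-$99", True),
    ("13-24 mo", "$25-$49", False),
    ("13-24 mo", "$25-$49", False),
    ("13-24 mo", "$50-$99", True),
    ("13-24 mo", "$50-$99", False),
]

mailed = defaultdict(int)
responded = defaultdict(int)
for recency, gift, did_respond in records:
    mailed[(recency, gift)] += 1
    responded[(recency, gift)] += did_respond  # True counts as 1

# Each grid cell holds a response percentage.
grid = {cell: 100 * responded[cell] / mailed[cell] for cell in mailed}
for (recency, gift), pct in sorted(grid.items()):
    print(f"{recency:9} {gift:8} {pct:5.1f}%")
```

With two variables the grid is easy to read; each added variable multiplies the number of cells, which is why grids become hard to review.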
Introducing Linear Regression
➢ Linear regression defined
  • PR = aR + bF + cM + dO
  • In English, “predicted revenue is a function of donor’s recency of giving, frequency, aggregate value, other stuff”
  • Model for a renewal program: with avg response rate 4.25%, avg gift $36.25, revenue/name mailed (RNM) of $1.54:

Equation term                        Avg Value   Coefficient   Contribution to RNM
Months since last gift               6.5         -0.068        -$0.44
Total gifts, relevant time period    2.4         0.215         $0.52
Aggregate total value of gifts       $156        0.00465       $0.73
Indexed wealth of donor              85          0.0087        $0.74

1.54 = -0.068(6.5) + 0.215(2.4) + 0.00465(156) + 0.0087(85)
Confusing, but potential "Holy Grail" tool for your house file program
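The arithmetic on this slide can be checked in a few lines. The coefficients and average values are the slide's own; the variable names are just labels chosen for this sketch.

```python
# Sketch: reproduce the slide's revenue-per-name-mailed (RNM) arithmetic.

terms = {
    # name: (coefficient, average value)
    "months_since_last_gift": (-0.068, 6.5),
    "total_gifts": (0.215, 2.4),
    "aggregate_gift_value": (0.00465, 156),
    "indexed_wealth": (0.0087, 85),
}

contributions = {name: coef * avg for name, (coef, avg) in terms.items()}
predicted_rnm = sum(contributions.values())

for name, value in contributions.items():
    print(f"{name:24} {value:+.2f}")
print(f"{'predicted RNM':24} {predicted_rnm:+.2f}")

# Cross-check against the program averages: response rate times average gift.
observed_rnm = 0.0425 * 36.25
print(f"{'observed RNM':24} {observed_rnm:+.2f}")
```

Both routes land on about $1.54, which is the point of the slide: at average values, the fitted equation reproduces the program's average revenue per name mailed.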
More Sense from Regressions
➢ Confusing exposition: briefly assume you know what this means!
➢ Alternative functional forms tell you more
➢ For example: logarithmic transformations of each independent variable (R, F, M, Wealth) put them on equal "dimensions"
➢ Average values will no longer make sense, but coefficients will!
➢ In last equation:
    0.182 Months Since
    0.215 Total Gifts
    0.300 Aggregate Gifts
    0.305 Indexed Wealth
  Means each value represents percentage contribution to results!!
➢ Note on last slide, many combinations of specific values would add to the average revenue per donor
  » The formula "predicts" it, because it represents the "best fit" expressing relationship between the dependent and independent variables
➢ This is an overly simple equation: it assumes only RFM plus wealth
  » Often there are other hidden variables that also influence results
  » Equation-level metrics (R-squared) and variable-level metrics (t ratios) tell you the degree of prediction and statistical significance
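One way to read the four values listed above: they appear close to each term's share of the total absolute contribution in the earlier renewal equation. This is an observation from the arithmetic, not something the slides state, and it is not the same as fitting a log-transformed model.

```python
# Sketch: each term's share of the total absolute contribution to RNM,
# using the coefficients and average values from the renewal equation.

contributions = {
    "Months Since": -0.068 * 6.5,
    "Total Gifts": 0.215 * 2.4,
    "Aggregate Gifts": 0.00465 * 156,
    "Indexed Wealth": 0.0087 * 85,
}

total_abs = sum(abs(v) for v in contributions.values())
shares = {name: abs(v) / total_abs for name, v in contributions.items()}

for name, share in shares.items():
    print(f"{name:16} {share:.3f}")
```

The shares sum to 1, so they can be read as percentages, with recency counted by the size of its effect rather than its sign.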
What You Should Know as a User
➢ When these techniques are used …
  • Generally statistical software runs these: SAS at CRS
  • Fast process: takes less time to run than to explain
  • Key: some staff need to understand what the results mean
    ➢ Younger staff are better, esp. if exposed to it in college: "data kids"
➢ Once a formula is derived, the real output is a scored file
  • "Plotting the residuals" starts from taking the best fit and multiplying it through each record; the residuals are the gaps between actual and predicted values
  • Output can be indexed/scored according to predicted Rev/M etc.

Donor profile                                 Predict.   Month   Tot gifts   Tot value   Wealth
Frequent donor, low gift, well-to-do          $3.66      2.5     10          $240        65
Lapsed occasional donor, big wealthy giver    $3.89      13      2           $750        98
Periodic giver, average gift, well-to-do      $1.58      6       4           $120        65
First-time donor, modest means                $0.47      2       1           $32         28

This typically falls on a curve, with an index ranging from the 0 to 99th percentile of predicted revenue per name mailed
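The four predicted values follow from multiplying the renewal equation's coefficients through each donor's values. A sketch of that scoring step, with a simple percentile-style index added; the index formula is an assumption about how such a ranking might be built, not the method used at CRS.

```python
# Sketch: score individual donors with the renewal equation, then index them.

COEF = {"months": -0.068, "gifts": 0.215, "value": 0.00465, "wealth": 0.0087}

donors = {
    "Frequent donor, low gift, well-to-do": {"months": 2.5, "gifts": 10, "value": 240, "wealth": 65},
    "Lapsed occasional donor, big wealthy giver": {"months": 13, "gifts": 2, "value": 750, "wealth": 98},
    "Periodic giver, average gift, well-to-do": {"months": 6, "gifts": 4, "value": 120, "wealth": 65},
    "First-time donor, modest means": {"months": 2, "gifts": 1, "value": 32, "wealth": 28},
}

def predict(profile):
    """Predicted revenue per name mailed: coefficient times value, summed over terms."""
    return sum(COEF[key] * profile[key] for key in COEF)

def percentile_index(scores):
    """Percent of records scoring strictly lower (0 up to 99 on a large file)."""
    ordered = sorted(scores.values())
    return {name: ordered.index(s) * 100 // len(ordered) for name, s in scores.items()}

scored = {name: predict(profile) for name, profile in donors.items()}
index = percentile_index(scored)
for name in sorted(scored, key=scored.get, reverse=True):
    print(f"${scored[name]:.2f}  index {index[name]:2d}  {name}")
```

On a real house file the same two steps run over every record, and selection works from the index rather than from RFM cells.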
Acquisition Modeling at CRS
Before: List Effectiveness
[Diagram labels: Campaign 1; Campaign 2]
• Targeting based on list effectiveness
• Focused on “finding more lists like these”
New Approach
• New analytic system to drive programs
  ➢ Build prospect universe of likely responders
  ➢ Overlay with demographic and census data
  ➢ Catalog interaction over time by person
  ➢ Develop insights over time with modeling
  ➢ Select/suppress based on predicted behavior
After: Prospect Behavior
[Diagram: Marketing History + Census & Specialty Demographics + List & Campaign Attributes]
• Targeting based on prospect behavior
• Focus on “finding more people like this”
Preparation
• Develop infrastructure
• Collect and organize data
[Diagram labels: External Demographics Data; Focused Lists; Prospect Lists; Matchcode and Geography; Campaign Data; Prospect Universe]
• Response behavior retained
• Other available information added
Applying Analytics to Discover Patterns
[Diagram labels: Structured Data; Model Ready Data; Proliferation of Models; Actionable Results; Equation ƒ(x); Prospect Universe; Suppression List]
The Final Solution
[Diagram labels: Sample; Scoring Equation ƒ(x); Acquisition Promotions; Donations; Data Mart; Census Demographics; Catholic Demographics; Suppress; Mailing Universe; Suppressed Mailing Universe; To Mail Production]
Results/Benefits
• Focused models on top segments rather than entire universe
  ➢ Suppressed mailing to bottom of prospect universe
  ➢ Discovered significant numbers of new prospects similar to existing donors
• Savings more than paid for entire analytics program by:
  ➢ Removing bottom portion of prospect universe that provides negative ROI
  ➢ Providing greater understanding of and insight into characteristics of prospects and donors
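A sketch of how "removing the bottom portion that provides negative ROI" might be decided once prospects are grouped into score bands. Every figure here is hypothetical, including the cost per piece and the choice to measure return on the first gift only; the slides do not give per-band results.

```python
# Sketch: suppress score bands whose predicted return does not cover mailing cost.

COST_PER_PIECE = 0.25  # assumed cost to mail one prospect, in dollars

# Hypothetical predicted response rate and average gift per score band, best first.
bands = [
    {"band": "top 20%", "response_rate": 0.020, "avg_gift": 30.0},
    {"band": "20-40%", "response_rate": 0.015, "avg_gift": 28.0},
    {"band": "40-60%", "response_rate": 0.012, "avg_gift": 27.0},
    {"band": "60-80%", "response_rate": 0.010, "avg_gift": 26.0},
    {"band": "bottom 20%", "response_rate": 0.004, "avg_gift": 25.0},
]

def net_per_piece(band):
    """Predicted revenue per piece mailed, minus the cost of mailing it."""
    return band["response_rate"] * band["avg_gift"] - COST_PER_PIECE

to_mail = [b["band"] for b in bands if net_per_piece(b) >= 0]
suppress = [b["band"] for b in bands if net_per_piece(b) < 0]
print("mail:", to_mail)
print("suppress:", suppress)
```

Swapping first-gift revenue for a longer-horizon value (such as 2-year LTV) would move the break-even line and usually keeps more bands in the mail.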