“Beyond RFM”
Building the CRS Online Community
February 2005 DMFA Roundtable
Kevin Whorton
Direct Response Fundraising Consultant
Catholic Relief Services
[email protected]
February 25, 2005
Modeling: Theory and Reality
Theory: RFM Has Weaknesses
Limited use of information: gift history only
Omits demographics, psychographics
Mostly provides decision support for marginal audiences
No prioritization: R<F<M? … M>R=F? … M=R=F?
Uses language of discrete, not continuous variables
Reality: RFM Works Well Enough Most Times
House file mailings—very strong, long histories
House file telemarketing
Could be improved but little incentive to do so:
» Can only be so efficient on mailings
» Beyond some point minimizing cost may minimize revenue
CRS:
Current Practices
Limitations
Future Applications
Applying Techniques at CRS
House File Model Use
• Target Analysis Group: affinity/other gift behavior
Powerful for screening out the 50% waste; including lapsed donors in
acquisition now outperforms a dedicated lapsed campaign
• Genalytics: full-file scoring by half-decile
Full house file, by future probability of giving
Acquisition Model
• Selection criteria used during list selection
Zip models and “Catholic Finder”
• Full acquisition model
Created household database from 45 million past contacts
File scoring after merge purge: typical 20% suppression
Expanding Demographic Data
Distinguishing between donors: marketing vs. DM
• Profiling new donors: 62 years avg vs. “youth movement”
• Drawing linkage between awareness and donation
• Understanding relationship between first gift and ongoing behavior
We now use data to categorize donors
• By appeal: emergency, region, program area
• By vehicle: catalog, calendar, newsletter, TM, e-mail
• By timing: seasonality
• By preferences: limited mailing, no mail, no TM
Especially critical, post-Tsunami
Data used to drive frequency
• Segmenting beyond RFM, going deeper into files
Often based on Interest Codes (next slide)
Example: Interest Codes
Used for Inclusions/Exclusions
Entire file
• Coded with a mix of
Donor Service &
DM codes
• Simplify our house file
selection
• Behavior captured to:
- simplify ad hoc analysis
- extend RFM
- develop profiles
- crosstab “donor types”
Interest Code Description      Interest Code
Delivery Point Validation      DSF1
Emergency Donors               ED
Newer Donors                   ND
Renewal Donors                 RD
Catalog Overlay                CAO
Premium Donors                 PRD
Hispanic Indicator             HISIND
Wooden Bell Donors             WBD
Telemarketing Donors           TD
Score 0-4                      0S4
Score 95-99                    95S99
Fiscal Year 2003 ID counts, in the order listed: 734794, 663157, 363374, 324288, 103340, 87914, 82993, 67420, 65556, 65489, 56137, 55684
Other (Non-Modeling) Data
Simulations: gift arrays
• Demographic overlays beyond DM: mid-level PG, MG
Age & wealth trump typical RFM giving behavior
Mail sensitivity analysis
• Finding correlation between total mailings, gifts per donor
Goal: maximize satisfaction without sacrificing revenue
Maintaining "interest codes" library of preferences
Merge-purge with greater control
• Moved internally, staff analyst & FirstLogic software
Conversion analysis
• List life-cycle: tables showing LTV (2-year) by acq. list
• Target Analysis: benchmarking/comparisons
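The mail sensitivity analysis on this slide looks for a correlation between total mailings and gifts per donor. A minimal Python sketch of that calculation follows. The `mailings` and `gifts` figures are made-up illustrations, not CRS data, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: mailings sent per donor per year, and gifts received.
mailings = [4, 6, 8, 12, 16, 20]
gifts = [1, 1, 2, 2, 3, 3]

r = pearson_r(mailings, gifts)
print(f"correlation between mailings and gifts: {r:.2f}")
```

A strong positive value says only that the two move together. Whether extra mailings cause extra gifts, or frequent givers simply get mailed more, needs a controlled test.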
Other Data: Research
Donor research
• Analyzing share of market/share of wallet
• Knowing what else donors give to
Qualitative/focus groups
• Package/teaser/copy testing
• Underlying motivations/drivers/perceptions
Market research
• Measuring aided/unaided recall, aficionados
• Cluster models (segmentation studies)
• Positioning studies (branding, relative message)
• Competitive intelligence
Limitations: Analyzing Results
Most segmentation is built to drive reporting
• Pledgemaker report writer
• Occasional use of Business Objects/SAS for ad hoc
Most segmentation is by discrete RFM buckets
• Segmentation continues in the "normal way"
$25-$49, 0-12 months, F1+
$50-$99, 0-12 months, F1+
$100-$249, 0-12 months, F1+
• Extending universe based on interest codes
• Applying excludes
Record types (PG, Corp, Spanish-language, Religious Orders)
Individual preferences (1, 2, 6, 12x preferred mail schedules)
Mutual omits from overlapping campaigns
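The discrete buckets listed on this slide can be expressed as a small lookup. This is a sketch in Python, assuming the three dollar ranges apply to a single gift amount per donor (the slide does not say which amount is used). `rfm_bucket` and its argument names are hypothetical.

```python
def rfm_bucket(amount, months_since_last_gift, gifts_in_period):
    """Return the segment label for a donor, or None if outside the buckets shown."""
    # Recency 0-12 months and frequency F1+ are common to all three buckets.
    if not (0 <= months_since_last_gift <= 12 and gifts_in_period >= 1):
        return None
    for low, high in ((25, 49), (50, 99), (100, 249)):
        # Treat each range as whole dollars: $25.00 up to $49.99, and so on.
        if low <= amount < high + 1:
            return f"${low}-${high}, 0-12 months, F1+"
    return None

print(rfm_bucket(35, 3, 2))
print(rfm_bucket(120, 11, 1))
print(rfm_bucket(60, 18, 4))  # outside the 0-12 month recency window
```

Every donor inside a bucket gets the same treatment, which is the limitation the later slides try to move past.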
Best Intentions: Other Applications
Original goal in 2003: "family of models"
• Telemarketing
• Early warnings of defection
• Lapsed donors
• Upgrade potential: mid-level program
Reasons for using:
• High cost per contact/good stewardship
• Sensitivity to complaints
Predict positive and negative outcomes
Complaints seen as proxy for reduced lifetime value
Reasons not pursued
• Not a $$ limitation, but rather management time
Goal/Vision
Want to be more "donor focused"
• Finding constructive ways to avoid treating all donors the
same
• RFM often treats as identical:
$500 donor, every year, 1 gift very end of year
$500 cumulative donor, monthly frequency
$500 first-time donor
• Goal: sufficiently flexible systems to tailor contact sequence
Hard to implement CRM systems to reduce costs/maximize
efficiency & donor satisfaction
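The three $500 donors above look identical on monetary value alone. The sketch below adds two further fields to tell them apart. The gift histories, the `years_on_file` values, and the rules inside `donor_type` are illustrative assumptions, not CRS's actual segmentation logic.

```python
def donor_type(gifts_last_12_months, years_on_file):
    """Classify a donor with a few illustrative rules (assumed, not CRS's)."""
    n = len(gifts_last_12_months)
    if years_on_file < 1 and n == 1:
        return "first-time"
    if n >= 10:
        return "monthly"
    if n == 1:
        return "annual single gift"
    return "other"

# Hypothetical histories: (gift amounts in the last 12 months, years on file).
donors = {
    "year-end donor": ([500.0], 8),
    "monthly donor": ([41.67] * 12, 5),
    "first-time donor": ([500.0], 0),
}
for name, (gifts, years) in donors.items():
    print(f"{name}: ${sum(gifts):.0f} total, {donor_type(gifts, years)}")
```

All three total about $500, yet each would plausibly warrant a different contact sequence.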
Sample: Donor-focused Grid
Use the gift they give to this appeal
Consider lifetime seasonal giving activity
Sample Analysis: Years on File
• Graphing non-linear relationships: finding “sweet spots”
Analysis: Lifetime Avg. Gift
• And knowing when the relationships really are linear/predictive.
Quick Guide to
Models/Techniques
Guide to Models
• Three major families:
Parametric Methods
• Linear regression, logistic regressions
Recursive Partitioning methods (i.e. CHAID)
• Tree diagrams—easier to see interaction between
variables. Most time consuming.
Non-parametric methods
• Neural networks, genetic/natural selection algorithms
• Artificial intelligence—"learning models" used at CRS
• Results are far more important
Results: more a function of data quality than technique
Source: Target Analysis Group: Jason Robbins, statisticians
Sophisticated Techniques,
Simple Answers
Cross-tabulations
Shows simple relationships between variables, typically
percentages
"Grids" allow easy audience selection, but complex to review
Correlation: relationships between two variables
Regression:
Y=f(x1, x2, ...), or Membership=function of dues level, presence of
competition, penetration, service mix
R-squared “explains” the relationship between one variable and everything
driving it
• Projections and forecast models
• Logistic regressions: “yes/no” predictions
• Logarithmic: coefficients=percentage contribution
• Dummy variables: use to measure seasonality, time trends,
effects of one-time shifts
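A cross-tabulation of the kind described at the top of this slide can be sketched in a few lines of Python. The records, the field names (`recency`, `vehicle`, `gave`), and the helper `crosstab_rates` are hypothetical, chosen only to show percentages by cell.

```python
from collections import defaultdict

def crosstab_rates(records, row_key, col_key, flag_key):
    """Percentage of records with flag_key true, for each (row, column) cell."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for rec in records:
        cell = (rec[row_key], rec[col_key])
        totals[cell] += 1
        if rec[flag_key]:
            hits[cell] += 1
    return {cell: 100.0 * hits[cell] / totals[cell] for cell in totals}

# Hypothetical contact records: recency bucket, vehicle, and whether a gift came in.
records = [
    {"recency": "0-12", "vehicle": "mail", "gave": True},
    {"recency": "0-12", "vehicle": "mail", "gave": False},
    {"recency": "0-12", "vehicle": "TM", "gave": True},
    {"recency": "13-24", "vehicle": "mail", "gave": False},
    {"recency": "13-24", "vehicle": "mail", "gave": False},
    {"recency": "13-24", "vehicle": "TM", "gave": True},
    {"recency": "13-24", "vehicle": "TM", "gave": False},
]
rates = crosstab_rates(records, "recency", "vehicle", "gave")
for (recency, vehicle), pct in sorted(rates.items()):
    print(f"{recency:>6} months  {vehicle:<5} {pct:5.1f}% response")
```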
Introducing Linear Regression
Linear regression defined
• PR=aR+bF+cM+dO
• In English, “predicted revenue is a function of donor’s
recency of giving, frequency, agg value, other stuff"
• Model for a renewal program: with avg response rate
4.25%, avg gift $36.25, revenue/name mailed of $1.54:
Equation term                         Avg Value   Coefficient   Contribution to RNM
Months since last gift                6.5         -0.068        -$0.44
Total gifts, relevant time period     2.4         0.215         $0.52
Aggregate total value of gifts        $156        0.00465       $0.73
Indexed wealth of donor               85          0.0087        $0.74
1.54=-0.068(6.5) + 0.215(2.4) + 0.00465(156) + 0.0087(85)
Confusing, but potential "Holy Grail" tool for your house file program
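The arithmetic behind the equation can be checked directly. The sketch below multiplies each coefficient by its average value, sums the contributions, and compares the result with the quoted response rate times average gift. The coefficients and averages are the slide's own; the variable names are assumptions.

```python
# Coefficient and average value for each term, as given on the slide.
TERMS = {
    "months since last gift": (-0.068, 6.5),
    "total gifts": (0.215, 2.4),
    "aggregate value of gifts": (0.00465, 156),
    "indexed wealth": (0.0087, 85),
}

contributions = {name: coef * avg for name, (coef, avg) in TERMS.items()}
predicted_rnm = sum(contributions.values())

for name, value in contributions.items():
    print(f"{name:26s} {value:+.2f}")
print(f"predicted revenue per name mailed: {predicted_rnm:.2f}")

# Cross-check: 4.25% response rate times $36.25 average gift.
program_rnm = 0.0425 * 36.25
print(f"response rate x average gift:      {program_rnm:.2f}")
```

Both routes land on roughly $1.54 per name mailed, which is what evaluating a fitted line at the average values should give.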
More Sense from Regressions
Confusing exposition: briefly assume you know what this means!
Alternative functional forms tell you more
For example: logarithmic transformations of each independent
variable (R, F, M, Wealth) put them on equal "dimensions"
Average values will no longer make sense, but coefficients will!
In last equation:
0.182 Months Since
0.215 Total Gifts
0.300 Aggregate Gifts
0.305 Indexed Wealth
Means each value represents percentage contribution to results!!
Note on last slide, many combinations of specific values would add
to the average revenue per donor
» The formula "predicts" it, because it represents the "best fit" expressing
relationship between the dependent and independent variables
This is an overly simple equation: it assumes only RFM plus wealth
» Often there are other hidden values that also influence
» Equation level metrics (R-squared) and variable-level (t ratios) tell you
the degree of prediction and statistical significance
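One arithmetic observation, offered as a check rather than as the author's method: the four percentage figures on this slide are close to each term's share of the total absolute contribution from the previous slide. The sketch below recomputes those shares. Treating the figures this way is an assumption, and it is not the logarithmic transformation the slide describes.

```python
# Coefficient and average value for each term, from the previous slide.
TERMS = {
    "Months Since": (-0.068, 6.5),
    "Total Gifts": (0.215, 2.4),
    "Aggregate Gifts": (0.00465, 156),
    "Indexed Wealth": (0.0087, 85),
}
# Figures listed on this slide, for comparison.
SLIDE_VALUES = {
    "Months Since": 0.182,
    "Total Gifts": 0.215,
    "Aggregate Gifts": 0.300,
    "Indexed Wealth": 0.305,
}

absolute = {name: abs(coef * avg) for name, (coef, avg) in TERMS.items()}
total = sum(absolute.values())
shares = {name: value / total for name, value in absolute.items()}

for name, share in shares.items():
    print(f"{name:16s} share {share:.3f}   slide {SLIDE_VALUES[name]:.3f}")
```

The shares agree with the slide's values to within about 0.002, so small differences may be rounding.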
What You Should Know as a User
When these techniques are used …
• Generally statistical software runs these: SAS at CRS
• Fast process: takes less time to run than to explain
• Key: some staff need to understand what the results mean
Younger staff are better, esp. if exposed to it in college—"data kids"
Once a formula is derived, the real output is a scored file
• "Plotting the residuals" means taking best fit, multiplying through
• Output can be indexed/scored according to predicted Rev/M etc.
Donor profile                                 Predict.   Month   Tot gifts   Tot value   Wealth
Frequent donor, low gift, well-to-do          $3.66      2.5     10          $240        65
Lapsed occasional donor, big wealthy giver    $3.89      13      2           $750        98
Periodic giver, average gift, well-to-do      $1.58      6       4           $120        65
First-time donor, modest means                $0.47      2       1           $32         28
This typically falls on a curve, with an index ranging from 0-99th percentile of predicted
revenue per name mailed
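Scoring a file means applying the fitted equation to every record. The sketch below applies the coefficients from the linear regression slide to the four sample donors, then assigns a rank-based 0-99 index. `predicted_rnm` and `percentile_index` are hypothetical helper names, and the indexing rule is one simple assumption among several possible.

```python
COEFFICIENTS = (-0.068, 0.215, 0.00465, 0.0087)  # months, gifts, value, wealth

def predicted_rnm(months, total_gifts, total_value, wealth):
    """Apply the slide's linear equation to one donor record."""
    inputs = (months, total_gifts, total_value, wealth)
    return sum(c * x for c, x in zip(COEFFICIENTS, inputs))

def percentile_index(scores):
    """Map each score to a 0-99 index by its rank within the file."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    index = [0] * len(scores)
    for rank, i in enumerate(order):
        index[i] = (100 * rank) // len(scores)
    return index

donors = [
    ("Frequent donor, low gift, well-to-do", 2.5, 10, 240, 65),
    ("Lapsed occasional donor, big wealthy giver", 13, 2, 750, 98),
    ("Periodic giver, average gift, well-to-do", 6, 4, 120, 65),
    ("First-time donor, modest means", 2, 1, 32, 28),
]
scores = [predicted_rnm(*d[1:]) for d in donors]
for (label, *_), score, idx in zip(donors, scores, percentile_index(scores)):
    print(f"${score:.2f}  index {idx:2d}  {label}")
```

With only four records the index is coarse. On a full house file the same ranking yields the half-decile or percentile bands used for selection.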
Acquisition Modeling
at CRS
Before:
List Effectiveness
Campaign 1
Campaign 2
• Targeting based on list effectiveness
• Focused on “finding more lists like these”
New Approach
• New analytic system to drive programs
Build prospect universe of likely responders
Overlay with demographic and census data
Catalog interaction over time by person
Develop insights over time with modeling
Select/suppress based on predicted behavior
After: Prospect Behavior
Marketing History + Census & Specialty Demographics + List & Campaign Attributes
• Targeting based on prospect behavior
• Focus on “finding more people like this”
Preparation
• Develop infrastructure
• Collect and organize data
[Diagram: External Demographics Data, Focused Lists, Prospect Lists, Matchcode and Geography, Campaign Data, Prospect Universe]
• Response behavior retained
• Other available information added
Applying Analytics to Discover
Patterns
[Diagram: Structured Data, Model Ready Data, Proliferation of Models, Actionable Results; Prospect Universe, Equation ƒ(x), Suppression List]
The Final Solution
[Diagram: Sample Scoring Equation ƒ(x); Acquisition Promotions, Donations, Data Mart, Census Demographics, Catholic Demographics; Suppress; Mailing Universe; Suppressed Mailing Universe; To Mail Production]
Results/Benefits
• Focused models on top segments rather than
entire universe
Suppressed mailing to bottom of prospect universe
Discovered significant numbers of new prospects similar
to existing donors
• Savings more than paid for entire analytics
program by:
Removing bottom portion of prospect universe that
provides negative ROI
Providing greater understanding of and insight into
characteristics of prospects and donors
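The suppression step described above can be sketched as a simple split on predicted return. The prospect scores, the cost per piece, and the break-even rule in `split_universe` are all hypothetical. The slides do not state the actual cutoff used.

```python
def split_universe(prospects, cost_per_piece):
    """Split scored prospects into (mail, suppress) lists by predicted ROI.

    prospects: iterable of (prospect_id, predicted_revenue_per_name).
    A prospect is suppressed when predicted revenue does not cover the cost.
    """
    mail, suppress = [], []
    for prospect_id, predicted_revenue in prospects:
        target = mail if predicted_revenue >= cost_per_piece else suppress
        target.append(prospect_id)
    return mail, suppress

# Hypothetical scored universe and mailing cost; not CRS figures.
universe = [("A", 0.95), ("B", 0.10), ("C", 0.42), ("D", 0.61), ("E", 0.38)]
mail, suppress = split_universe(universe, cost_per_piece=0.40)

print("mail:", mail)
print("suppress:", suppress, f"({len(suppress) / len(universe):.0%} of universe)")
```

A program that values long-term donor revenue over first-gift return would set the cutoff below the per-piece cost. The split itself stays the same.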