A Basic Guide to Modeling Techniques for All Direct Marketing Challenges
C. Olivia Rud
Executive Vice President
Data Square, LLC
Allison Cornia
Database Marketing Manager
Microsoft Corporation
Overview
• Types and Uses of Models
– Descriptive
• Segmentation
• Profiling
– Predictive
• Regression
• Trees
• Neural Networks
• Genetic Algorithms
• Association Rules
• Latent Class Variable
• Implementing Models
• Why Good Models Fail
– Scoring errors
– Backend failures
Segmentation Analysis
• Segmentation analysis groups variables with like characteristics.
• Can be market driven: analyst determines the segments.
• Can be data driven: data determines the segments (clustering)
                MALE                     FEMALE
Age       M    S    D    W         M    S    D    W
< 40      55   58   54   45        35   52   46   31
40-49     57   60   58   55        40   54   49   37
50-59     61   63   60   58        46   55   51   42
60+       58   60   61   55        44   46   36   27
Profiling: Credit Card Customers
[Figure: 2 x 2 grid of Risk (Low to High) by Revenue (Low to High), one customer segment per quadrant]
Best Customer
• Average Balance = $3,288
• Average APR = 13.7%
• Average Tenure = 3.7 Years
• Average Charge-off = $102
• Average Profits = $440
Good Potential
• Average Balance = $549
• Average APR = 8.4%
• Average Tenure = 1.2 Years
• Average Charge-off = $29
• Average Profits = $33
Cautious Potential
• Average Balance = $5,315
• Average APR = 15.8%
• Average Tenure = 2.8 Years
• Average Charge-off = $584
• Average Profits = $239
Low Potential
• Average Balance = $1,089
• Average APR = 12.3%
• Average Tenure = 2.4 Years
• Average Charge-off = $111
• Average Profits = $8
Profiling: Credit Card Customers
[Figure: 2 x 2 grid of Risk (Low to High) by Revenue (Low to High), one customer segment per quadrant]
Best Customer
• Best Customer Service
• No Annual Fee
• Automatic Line Increase
• Offer Cardholder Benefits
Good Potential
• Good Customer Service
• Decrease APR
• Offer Balance Transfers
• Offer Cardholder Benefits
Cautious Potential
• Charge Annual Fee
• Increase APR
• Monitor Payment Behavior
• Offer secured loan
Low Potential
• Charge Annual Fee
• Increase APR
• Low Priority Service
• No Solicitations
Association Rules
• Rules derived from past behavior such as movement on Website or purchase groupings.
• Used to enhance Website structure and modify Web traffic.
• Used to make ‘real time’ targeted offers.
Linear Regression
• Uses continuous values to predict continuous value.
• Explains variation in data using ordinary least squares (OLS).
• Useful in predicting:
– amount of sale ~ advertising, cost, demographics
– charge-off dollars ~ balance, financial risk profile, demographics
– amount of claim ~ age, health risk profile, geography
– dollar balance ~ financial risk profile, action to account, market pressure
– average profitability ~ financial risk profile, price sensitivity, demographics
Simple Linear Regression
[Figure: scatter plot of Sales ($0 to $6K) against Advertising ($0 to $600)]
Advertising   Sales
$120          $1,503
$160          $1,755
$205          $2,971
$210          $1,682
$225          $3,497
$230          $1,998
$290          $4,528
$315          $2,937
$375          $3,622
$390          $4,402
$440          $3,844
$475          $4,470
$490          $5,492
$550          $4,398
Simple Linear Regression
[Figure: scatter plot of Sales ($0 to $6K) against Advertising ($0 to $600), annotated "Minimize Squared Error"]
• Goal: characterize relationship between advertising and sales
• Result: equation that predicts sales dollars based on advertising dollars spent
Sales = B0 + B1*Advertising
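As a minimal sketch of the fit described above, the following Python estimates B0 and B1 by ordinary least squares from the 14 advertising and sales figures shown on the earlier slide. Pairing the two columns in the order they are listed is an assumption, and the function names are illustrative, not from the presentation.

```python
# Advertising and sales figures from the slide, paired in listed order (an assumption).
advertising = [120, 160, 205, 210, 225, 230, 290, 315, 375, 390, 440, 475, 490, 550]
sales = [1503, 1755, 2971, 1682, 3497, 1998, 4528, 2937, 3622, 4402, 3844, 4470, 5492, 4398]


def fit_simple_ols(x, y):
    """Return (b0, b1) that minimize the sum of squared errors of y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def sum_squared_error(x, y, b0, b1):
    """Squared error that the least-squares fit minimizes."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


b0, b1 = fit_simple_ols(advertising, sales)
print(f"Sales = {b0:.2f} + {b1:.4f}*Advertising")
print(f"Squared error at the fit: {sum_squared_error(advertising, sales, b0, b1):,.0f}")
```

Any other choice of B0 or B1 gives a larger squared error on the same data, which is what "minimize squared error" means here.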
Multiple Linear Regression
• Minimizes squared error in N-dimensional space
• Credit card balances
– payment amount
– years
– gender (0/1)
[Figure: scatter plot of the data points]
Balances = 2.1774 +.0966*Payment + 1.2494*Months + .4412*Gender
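To illustrate how a fitted equation like this one is used for scoring, here is a small Python sketch that evaluates it for one account. The coefficients are copied from the slide; the function name, the input values and the units are assumptions, since the slide does not state them.

```python
def predict_balance(payment, months, gender):
    """Score one account with the fitted equation from the slide.

    gender is the 0/1 indicator from the slide; which value maps to
    which gender is not stated there.
    """
    return 2.1774 + 0.0966 * payment + 1.2494 * months + 0.4412 * gender


# Hypothetical account: the input values below are made up for illustration.
score_0 = predict_balance(payment=100, months=12, gender=0)
score_1 = predict_balance(payment=100, months=12, gender=1)
print(score_0, score_1)
```

Holding payment and months fixed, switching the gender indicator from 0 to 1 moves the prediction by exactly the gender coefficient, which is how each coefficient in a multiple regression is read.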
Logistic Regression
• Uses continuous values to predict probability of discrete outcome
• Iterative method of minimizing error using method of maximum likelihood
• Useful in predicting probability of:
– response to loan offer ~ financial risk profile, demographics
– response to insurance offer ~ health risk profile, demographics
– activation ~ financial risk profile, demographics, market pressure
– charge-off ~ balance, financial risk profile, demographics
– claim ~ health risk profile, demographics
– fraud ~ financial risk profile, account activity
– account closure ~ account activity, market pressure
Logistic Regression
log(p/(1-p)) = B0 + B1*X1 + B2*X2 + … + Bn*Xn
p = 1/(1 + e^(-(B0 + B1*X1 + B2*X2 + … + Bn*Xn)))
• Predicts probability of event occurring using function of linear predictors
• p = probability of event occurring
• p/(1-p) is the odds of an event occurring.
• Log of the odds: log(p/(1-p)) is linear function of predictors.
• Uses s-shaped curve instead of linear function to fit the data.
[Figure: s-shaped curve fitted to data points, with probability running from 0 to 1]
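The relationship between the linear predictor, the odds and the probability can be sketched in a few lines of Python. The intercept, coefficients and predictor values below are hypothetical and chosen only to show the arithmetic; they are not from the presentation.

```python
import math


def linear_predictor(intercept, coefs, xs):
    """B0 + B1*X1 + ... + Bn*Xn, which equals the log of the odds."""
    return intercept + sum(b * x for b, x in zip(coefs, xs))


def probability(intercept, coefs, xs):
    """Convert the log-odds to a probability with the s-shaped logistic curve."""
    z = linear_predictor(intercept, coefs, xs)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical model with two predictors.
B0, B = -3.0, [0.8, 1.5]
x = [2.0, 0.5]

p = probability(B0, B, x)
odds = p / (1.0 - p)
print(f"p = {p:.4f}, odds = {odds:.4f}, log-odds = {math.log(odds):.4f}")
```

Taking the log of the odds recovers the linear predictor, so the model is linear on the log-odds scale even though the probability itself always stays between 0 and 1.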
Classification Trees
Mailed 10,000, Resp Rate 2.6%
– Male 4,677, Resp Rate 3.2%
• <$30K 1,290, Resp Rate 1.7%
• $30K-$45K 2,106, Resp Rate 3.6%
• >$45K 1,281, Resp Rate 4.1%
– Female 5,323, Resp Rate 2.1%
• Age < 40 3,112, Resp Rate 0.7%
• Age => 40 2,211, Resp Rate 4.3%
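One way to use a tree like this is to rank its terminal segments by response rate and check that segment rates roll up to their parent node. The Python sketch below does both with the counts and rates on the slide. Placing the income splits under Male and the age splits under Female is inferred from the counts (1,290 + 2,106 + 1,281 = 4,677 and 3,112 + 2,211 = 5,323), and because the slide's rates are rounded, rolled-up rates match the parents only approximately.

```python
# (segment, number mailed, response rate in percent), copied from the slide.
# Assumption: income splits belong to Male, age splits to Female (the counts add up that way).
leaves = [
    ("Male, <$30K", 1290, 1.7),
    ("Male, $30K-$45K", 2106, 3.6),
    ("Male, >$45K", 1281, 4.1),
    ("Female, Age < 40", 3112, 0.7),
    ("Female, Age => 40", 2211, 4.3),
]


def rolled_up_rate(nodes):
    """Size-weighted average response rate (percent) for a group of nodes."""
    total = sum(count for _, count, _ in nodes)
    return sum(count * rate for _, count, rate in nodes) / total


# Rank terminal segments from most to least responsive.
ranked = sorted(leaves, key=lambda leaf: leaf[2], reverse=True)
for name, count, rate in ranked:
    print(f"{name:20s} {count:6,d} {rate:4.1f}%")

male_rate = rolled_up_rate(leaves[:3])
print(f"Male rate rolled up from income segments: {male_rate:.2f}%")
```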
Decision Trees
Issue Loan? (Decision Node)
– No: Profit $0
– Yes (Chance Node):
• 97% x $728 (Interest) = $706
• 3% x $4872 (Loss) = ($146)
Allows you to quantify the best action.
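The arithmetic behind the tree can be written out as an expected-value calculation. This Python sketch uses the probabilities and dollar amounts on the slide; the function and variable names are illustrative.

```python
def expected_value(outcomes):
    """Probability-weighted sum of payoffs at a chance node."""
    return sum(prob * payoff for prob, payoff in outcomes)


# Each action leads to a list of (probability, profit) outcomes.
actions = {
    "No": [(1.00, 0.0)],
    "Yes": [(0.97, 728.0), (0.03, -4872.0)],  # interest earned vs. loss
}

values = {action: expected_value(outcomes) for action, outcomes in actions.items()}
best_action = max(values, key=values.get)

for action, value in values.items():
    print(f"Issue loan = {action}: expected profit ${value:,.2f}")
print(f"Best action: {best_action}")
```

Issuing the loan is worth about $706 - $146 = $560 in expected profit, compared with $0 for declining, so "Yes" is the better action at this decision node.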
Neural Networks
– amount of sale ~ advertising, cost, demographics
– charge-off dollars ~ balance, financial risk profile, demographics
– amount of claim ~ age, health risk profile, geography
– dollar balance ~ financial risk profile, action to account, market pressure
– average profitability ~ financial risk profile, price sensitivity, demographics
Artificial Neural Networks
• Multiple hidden nodes
• Each node is linear transformation of output from previous node
• Structure is too complex to interpret weights.
[Figure: network diagram showing the hidden layer and the output layer]
• Stopping rules
– Error threshold
– Time limit
– Change in error
Artificial Neural Networks
• Advantages
– Handles non-linearity
– Handles interactions
– Considered very accurate
– Useful for complex optimization
• Disadvantages
– Not interpretable
– CPU intensive
– Poor handling of missing data
– Sensitive to input variable selection
– Explodes categorical data
– Risk of over-fitting -> not robust
Genetic Algorithms
• Based on Darwin’s Principle of “Survival of the Fittest”.
• Genetic Operators
– Reproduction (Copying)
– Mating (Crossover)
– Mutation (Altering)
• Process starts with initial population of random models.
– Models with poor performance (fitness) “die out” - are deleted.
Genetic Algorithms
Methodology
• Fitness of the new population improves by:
1. Copying good models.
2. Mating good models to create better “offspring” models with improved fitness.
3. Altering good models to create “mutants” with improved fitness.
4. Repeat steps 1 - 3 until stopping rules are met.
• The Best Evolved model is the solution.
Genetic Algorithms
• Models are composed of
– Functions
• arithmetic (+ , - , * , ÷ )
• mathematical (log, exp, max, ... )
• trigonometric (sin, cos, tan, arcsin, ...)
• logical (and, or, not, gt, lt, eq, ...)
• conditional (if-then-else)
– Variables
• independent variables
• numeric values (constants, random numbers)
GA’s - Initialize Random Model
• Model's Objective
– Predict response
• Let the function set consist of + , - , * , ÷ , exp
• Let the variable set consist of X1, X2, b
[Figure: the five functions + , - , * , ÷ , exp, each shown with a 20% share]
GA’s - Initialize Random Model
• Models are displayed in trees.
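[Figure: the eight functions and variables + , - , * , ÷ , exp, X1, X2, b, each shown with a 12.5% share, next to an example model tree with Response at the top, the operator + below it, and b and X1 as its leaves]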
Repeat M times
GA’s – Generate M Models
[Figure: two example model trees, each with Response at the top]
Y = X1*X2
Y = b – exp(X1)
GA’s – Compare Fitness
Model     Equation            Fitness Value (r-square)   PTF
Model 1   Y = X1*(b + X2)     0.61                       0.29
Model 2   Y = b - exp(X1)     0.55                       0.26
Model 3   Y = X1 - X2         0.48                       0.23
Model 4   Y = X1*X2           0.36                       0.17
Model 5   Y = b + X1          0.12                       0.06
Total                         2.12                       1.00
[Figure: pie chart of PTF by model: M1 29%, M2 26%, M3 23%, M4 17%, M5 6%]
Genetic Methodology
• Fitness Improves by:
– Copying models based on PTF
– Mating models based on PTF
– Altering models based on PTF
– Continue above until stopping rules are met
• The best-evolved model is the solution
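The slides do not spell out PTF. In the fitness table it equals each model's fitness divided by the total (0.61 / 2.12 ≈ 0.29), so the sketch below reads it as a proportion of total fitness, which is an assumption. Under that reading, choosing models "based on PTF" can be sketched in Python as roulette-wheel selection, where fitter models are picked more often for copying, mating or altering. The function names and the random seed are illustrative.

```python
import random

# Fitness values (r-square) from the fitness comparison table.
fitness = {
    "Model 1": 0.61,
    "Model 2": 0.55,
    "Model 3": 0.48,
    "Model 4": 0.36,
    "Model 5": 0.12,
}


def proportion_of_total_fitness(fitness_by_model):
    """PTF read as fitness / total fitness (an assumption about the acronym)."""
    total = sum(fitness_by_model.values())
    return {name: value / total for name, value in fitness_by_model.items()}


def select_model(ptf, rng):
    """Roulette-wheel selection: pick one model with probability equal to its PTF."""
    names = list(ptf)
    weights = [ptf[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


ptf = proportion_of_total_fitness(fitness)
rng = random.Random(0)  # fixed seed so the sketch is repeatable
picks = [select_model(ptf, rng) for _ in range(10_000)]

for name in fitness:
    print(f"{name}: PTF = {ptf[name]:.2f}, selected {picks.count(name) / len(picks):.1%}")
```

Over many draws the selection frequencies approach the PTF shares, so weak models are rarely chosen and tend to die out of the population.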
Latent Class Models
• Used more in academic circles
• Software only allowed small sets and a small number of variables
• LatentGOLD developed by Statistical Innovations (Jay Magidson, inventor of CHAID)
• Scalable software
• Disparate sources of data
3 Kinds of Latent Class Models
• Traditional
– Applications in scaling and classification
• Factor
– Applications in exploratory and confirmatory factor analysis
• Regression
– Used for prediction and explanation when the population is not homogeneous
Traditional LCM vs. LC Factor
• Traditional Latent Class Models identify classes which group together persons who share similar interest/values/characteristics/behavior
• Latent Class Factor Models identify factors which group together variables sharing a common source of variation
Implementing Models
• How do we select based on model results?
• What is the impact to the bottom line?
Gains Table
Decile   Number of Accounts   Actives   Predicted   Actual   Cum Actual   Lift   Cum Lift
1        48,342               4,891     10.35%      10.12%   10.12%       3.57   3.57
2        48,343               3,945     8.44%       8.16%    9.14%        2.88   3.22
3        48,342               2,783     5.32%       5.76%    8.01%        2.03   2.83
4        48,342               1,151     2.16%       2.38%    6.60%        0.84   2.33
5        48,343               519       1.03%       1.07%    5.50%        0.38   1.94
6        48,342               269       0.48%       0.56%    4.67%        0.20   1.65
7        48,342               112       0.31%       0.23%    4.04%        0.08   1.43
8        48,343               25        0.06%       0.05%    3.54%        0.02   1.25
9        48,342               5         0.01%       0.01%    3.15%        0.00   1.11
10       48,342               1         0.00%       0.00%    2.83%        0.00   1.00
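The Actual, Cum Actual, Lift and Cum Lift columns can be derived from the two count columns, as the Python sketch below shows. It reads the second count column as the number of actives in each decile, which is an assumption (that count divided by the number of accounts does reproduce the Actual column), and it takes lift as a decile's rate divided by the overall rate. The function name and dictionary keys are illustrative.

```python
# Counts per decile from the gains table.
accounts = [48342, 48343, 48342, 48342, 48343, 48342, 48342, 48343, 48342, 48342]
actives = [4891, 3945, 2783, 1151, 519, 269, 112, 25, 5, 1]  # assumed meaning of column 2


def gains_table(accounts, actives):
    """Build per-decile actual rate, cumulative rate, lift and cumulative lift."""
    overall_rate = sum(actives) / sum(accounts)
    rows = []
    cum_accounts = 0
    cum_actives = 0
    for decile, (n, a) in enumerate(zip(accounts, actives), start=1):
        cum_accounts += n
        cum_actives += a
        rate = a / n
        cum_rate = cum_actives / cum_accounts
        rows.append({
            "decile": decile,
            "actual": rate,
            "cum_actual": cum_rate,
            "lift": rate / overall_rate,
            "cum_lift": cum_rate / overall_rate,
        })
    return rows


rows = gains_table(accounts, actives)
for r in rows:
    print(f"{r['decile']:2d}  {r['actual']:7.2%}  {r['cum_actual']:7.2%}  "
          f"{r['lift']:5.2f}  {r['cum_lift']:5.2f}")
```

A cumulative lift of 3.22 at decile 2 means that mailing the top 20% of scored accounts reaches actives at a little over three times the rate of mailing everyone.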
Gains Chart
[Figure: gains chart of Percent Active (0% to 100%) against Percent Mailed (0% to 100%)]
Modeling Lifetime Value
• Predict probability of activation for a life insurance offer using logistic regression, neural networks, genetic algorithms.
• Use probability to calculate Lifetime Value (LTV) for life insurance prospect for a five year period
• LTV = Pr(Activation)* Risk * (Product Profitability+ Cross Sales)*Lapse Indicator - Marketing Expense
– Activation - probability given by a model
– Risk - indices in matrix of gender * marital status * age group
– Product Profitability - present value of product specific 5 year profit measure
– Cross Sales – additional net revenues for five years following activation
– Lapse Indicator – adjustment based on payment method
– Marketing Expense - cost of package, postage & processing
Lifetime Value
Decile   Number   Active Rate   Cross Sell   Risk Index   Lapse Indicator   Product Profitability   Average LTV   Average Cum LTV   Sum Cum LTV
1        96,685   10.36%        $120         0.94         0.99              $553                    $64.76        $64.76            $6,261,266
2        96,685   8.63%         $104         0.99         1.00              $553                    $55.35        $60.06            $11,612,984
3        96,685   5.03%         $105         0.96         0.99              $553                    $30.99        $50.37            $14,609,591
4        96,685   1.94%         $107         0.93         0.97              $553                    $11.13        $40.56            $15,685,475
5        96,685   0.96%         $98          1.01         0.99              $553                    $5.53         $33.55            $16,220,346
6        96,685   0.28%         $101         1.02         1.00              $553                    $1.09         $28.14            $16,325,522
7        96,685   0.11%         $97          1.03         1.01              $553                    ($0.04)       $24.12            $16,321,311
8        96,685   0.08%         $98          0.99         1.01              $553                    ($0.26)       $21.07            $16,295,747
9        96,685   0.01%         $94          1.04         1.02              $553                    ($0.75)       $18.64            $16,223,586
10       96,685   0.00%         $95          1.09         1.02              $553                    ($0.78)       $16.70            $16,148,199
How many deciles do you mail?
LTV = Pr(Active)* Risk * (Cross Sell Profit + Product Profitability)* Lapse Indicator Index - Marketing Expense
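The formula and the mailing question can be worked through in a short Python sketch. The marketing expense of $0.78 per prospect is not stated on the slides; it is inferred here from decile 10, where the activation rate rounds to 0.00% and the average LTV is ($0.78), so treat it as an assumption. Because the table shows rounded decile averages, the formula applied to those averages lands close to, but not always exactly on, the Average LTV column. Function and variable names are illustrative.

```python
def lifetime_value(p_active, risk, product_profit, cross_sell, lapse, marketing_expense):
    """LTV = Pr(Active) * Risk * (Cross Sell + Product Profitability) * Lapse - Expense."""
    return p_active * risk * (cross_sell + product_profit) * lapse - marketing_expense


# Assumption: per-prospect marketing expense inferred from decile 10 of the table.
MARKETING_EXPENSE = 0.78

# Decile 2 inputs from the table: 8.63% active, risk 0.99, $553 product, $104 cross sell, lapse 1.00.
decile_2_ltv = lifetime_value(0.0863, 0.99, 553, 104, 1.00, MARKETING_EXPENSE)
print(f"Decile 2 LTV from the formula: ${decile_2_ltv:.2f}")

# Average LTV per decile, copied from the table (negative values were shown in parentheses).
average_ltv = [64.76, 55.35, 30.99, 11.13, 5.53, 1.09, -0.04, -0.26, -0.75, -0.78]
PROSPECTS_PER_DECILE = 96685


def best_mailing_depth(avg_ltv_by_decile, prospects_per_decile):
    """Return (depth, total) where cumulative total LTV is highest."""
    best_depth, best_total, running = 0, 0.0, 0.0
    for depth, avg in enumerate(avg_ltv_by_decile, start=1):
        running += avg * prospects_per_decile
        if running > best_total:
            best_depth, best_total = depth, running
    return best_depth, best_total


depth, total = best_mailing_depth(average_ltv, PROSPECTS_PER_DECILE)
print(f"Mail deciles 1 through {depth}; cumulative LTV is about ${total:,.0f}")
```

Cumulative LTV stops growing once the average LTV of the next decile turns negative, which in this table happens after decile 6.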
Why Good Models Fail
(Allison’s Top Ten for Troubleshooting)
1. Check the phones; make sure the site is functioning properly
2. Track the mail
3. Listen in on the call center
4. Implementation Issues
• Programming errors
• Inverted scoring
5. Did they pull the right group?
More of Why Good Models Fail
6. Practice crop rotation
7. External validity
8. Internal validity
9. Bad ingredients make for bad models
10. Old models, like old horses, have to be put out to pasture
“All models are wrong, but some are useful.”
George Box
C. Olivia Rud
Executive Vice President
DataSquare, LLC
733 Summer St.
Stamford, CT 06901
203 964-9733 x103
Specializing in Data Mining, Statistical Modeling and Marketing Strategy for Marketing, Risk and Customer Relationship Management
Allison Cornia
Database Marketing Manager
CRM/Home & Retail Division
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052
425-882-8080