
A Basic Guide to Modeling Techniques for All Direct Marketing Challenges


C. Olivia Rud
Executive Vice President
Data Square, LLC

Allison Cornia
Database Marketing Manager
Microsoft Corporation

(2)

Overview

• Types and Uses of Models
– Descriptive
    • Segmentation
    • Profiling
– Predictive
    • Regression
    • Trees
    • Neural Networks
    • Genetic Algorithms
    • Association Rules
    • Latent Class Variable
• Implementing Models
• Why Good Models Fail
– Scoring errors
– Backend failures

(3)

Segmentation Analysis

Segmentation analysis groups customers with like characteristics.

Can be market driven: the analyst determines the segments.

Can be data driven: the data determine the segments (clustering).

              MALE                FEMALE
Age      M    S    D    W      M    S    D    W
< 40     55   58   54   45     35   52   46   31
40-49    57   60   58   55     40   54   49   37
50-59    61   63   60   58     46   55   51   42
60+      58   60   61   55     44   46   36   27

(4)

Profiling: Credit Card Customers

[Figure: four-quadrant grid; axes are Risk (Low to High) and Revenue (Low to High).]

Best Customer

• Average Balance = $3,288

• Average APR = 13.7%

• Average Tenure = 3.7 Years

• Average Charge-off = $102

• Average Profits = $440

Good Potential

• Average Balance = $549

• Average APR = 8.4%

• Average Tenure = 1.2 Years

• Average Charge-off = $29

• Average Profits = $33

Cautious Potential

• Average Balance = $5,315

• Average APR = 15.8%

• Average Tenure = 2.8 Years

• Average Charge-off = $584

• Average Profits = $239

Low Potential

• Average Balance = $1,089

• Average APR = 12.3%

• Average Tenure = 2.4 Years

• Average Charge-off = $111

• Average Profits = $8

(5)

Profiling: Credit Card Customers

[Figure: four-quadrant grid; axes are Risk (Low to High) and Revenue (Low to High).]

Best Customer

• Best Customer Service

• No Annual Fee

• Automatic Line Increase

• Offer Cardholder Benefits

Good Potential

• Good Customer Service

• Decrease APR

• Offer Balance Transfers

• Offer Cardholder Benefits

Cautious Potential

• Charge Annual Fee

• Increase APR

• Monitor Payment Behavior

• Offer secured loan

Low Potential

• Charge Annual Fee

• Increase APR

• Low Priority Service

• No Solicitations

(6)

Association Rules

• Rules derived from past behavior such as movement on Website or purchase groupings.

• Used to enhance Website structure and modify Web traffic.

• Used to make ‘real time’ targeted offers.
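As a rough illustration of how a rule can be derived from purchase groupings, the sketch below computes two common rule measures, support and confidence, for the rule "printer implies ink". The baskets, item names, and function names are hypothetical and are not taken from the slides.

```python
# Hypothetical purchase baskets (illustrative data only, not from the slides).
baskets = [
    {"printer", "ink"},
    {"printer", "ink", "paper"},
    {"printer", "paper"},
    {"ink", "paper"},
    {"printer", "ink"},
]


def support(itemset, baskets):
    """Share of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Share of baskets containing the antecedent that also contain the consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


rule_support = support({"printer", "ink"}, baskets)
rule_confidence = confidence({"printer"}, {"ink"}, baskets)
print(f"printer -> ink: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A rule with high confidence is a candidate for a real-time offer: when the antecedent appears in a session or basket, the consequent is offered.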

(7)

Linear Regression

• Uses continuous values to predict continuous value.

• Explains variation in data using ordinary least squares (OLS).

• Useful in predicting:

– amount of sale ~ advertising, cost, demographics

– charge-off dollars ~ balance, financial risk profile, demographics

– amount of claim ~ age, health risk profile, geography

– dollar balance ~ financial risk profile, action to account, market pressure

– average profitability ~ financial risk profile, price sensitivity, demographics

(8)

Simple Linear Regression

[Figure: scatter plot of SALES ($0 to $6K) against ADVERTISING ($0 to $600).]

Advertising   Sales
$120          $1,503
$160          $1,755
$205          $2,971
$210          $1,682
$225          $3,497
$230          $1,998
$290          $4,528
$315          $2,937
$375          $3,622
$390          $4,402
$440          $3,844
$475          $4,470
$490          $5,492
$550          $4,398

(9)

Simple Linear Regression

[Figure: scatter plot of SALES ($0 to $6K) against ADVERTISING ($0 to $600).]

Minimize Squared Error

• Goal: characterize relationship between advertising and sales

• Result: equation that predicts sales dollars based on advertising dollars spent

Sales = B0 + B1*Advertising
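A minimal sketch of how B0 and B1 could be estimated by ordinary least squares. It uses the 14 advertising and sales figures listed on the previous slide and assumes they pair up in the order listed; the function name `fit_simple_ols` is illustrative.

```python
# Figures from the Simple Linear Regression slide; pairing by listed order is an assumption.
advertising = [120, 160, 205, 210, 225, 230, 290, 315, 375, 390, 440, 475, 490, 550]
sales = [1503, 1755, 2971, 1682, 3497, 1998, 4528, 2937, 3622, 4402, 3844, 4470, 5492, 4398]


def fit_simple_ols(x, y):
    """Return (b0, b1) that minimize the sum of squared errors for y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def sum_squared_error(b0, b1, x, y):
    """Squared error of the line y = b0 + b1*x over the data."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


b0, b1 = fit_simple_ols(advertising, sales)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(advertising, sales)]
print(f"Sales = {b0:.2f} + {b1:.4f} * Advertising")
```

Any other line through the data, for example one with a slightly different slope, gives a larger `sum_squared_error`, which is what "minimize squared error" means here.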

(10)

Multiple Linear Regression

• Minimizes squared error in N-dimensional space

• Credit card balances
– payment amount
– years
– gender (0/1)

Balances = 2.1774 +.0966*Payment + 1.2494*Months + .4412*Gender

(11)

Logistic Regression

• Uses continuous values to predict probability of discrete outcome

• Fits the coefficients iteratively by the method of maximum likelihood

• Useful in predicting probability of:

– response to loan offer ~ financial risk profile, demographics

– response to insurance offer ~ health risk profile, demographics

– activation ~ financial risk profile, demographics, market pressure

– charge-off ~ balance, financial risk profile, demographics

– claim ~ health risk profile, demographics

– fraud ~ financial risk profile, account activity

– account closure ~ account activity, market pressure

(12)

Logistic Regression

log(p/(1-p)) = B0 + B1*X1 + B2*X2 + … + Bn*Xn

p = 1/(1 + exp(-(B0 + B1*X1 + B2*X2 + … + Bn*Xn)))

[Figure: s-shaped curve fitted to data points, with probability running from 0 to 1.]

Predicts probability of event occurring using function of linear predictors.

p = probability of event occurring

p/(1-p) is the odds of an event occurring.

Log of the odds: log(p/(1-p)) is linear function of predictors.

Uses s-shaped curve instead of linear function to fit the data.
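The relationship between the log of the odds and the probability can be sketched as follows. The intercept and coefficient values are hypothetical placeholders, not fitted estimates from the presentation.

```python
import math


def logistic_probability(intercept, coefficients, values):
    """Convert a linear predictor (the log of the odds) into a probability."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical coefficients B0, B1, B2 and predictor values X1, X2.
b0 = -3.0
betas = [0.8, 0.05]
xs = [1.0, 20.0]

p = logistic_probability(b0, betas, xs)
odds = p / (1.0 - p)
print(f"p = {p:.4f}, odds = {odds:.4f}, log odds = {math.log(odds):.4f}")
```

Taking the log of the odds recovers the linear predictor, which is why the model is linear on the log-odds scale but s-shaped on the probability scale.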

(13)

Classification Trees

Mailed 10,000, Resp Rate 2.6%
– Male 4,677, Resp Rate 3.2%
    • <$30K 1,290, Resp Rate 1.7%
    • $30K-$45K 2,106, Resp Rate 3.6%
    • >$45K 1,281, Resp Rate 4.1%
– Female 5,323, Resp Rate 2.1%
    • Age < 40 3,112, Resp Rate 0.7%
    • Age >= 40 2,211, Resp Rate 4.3%

(14)

Decision Trees

Issue Loan (Decision Node)
– No: $0
– Yes (Chance Node):
    • 97% x $728 (Interest) = $706 Profit
    • 3% x $4,872 (Loss) = ($146) Profit

Allows you to quantify the best action.
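The arithmetic behind the loan decision can be sketched as an expected-value comparison. The figures come from the slide; the function and variable names are illustrative.

```python
def expected_value(outcomes):
    """Sum of probability-weighted payoffs at a chance node."""
    return sum(probability * payoff for probability, payoff in outcomes)


# Chance node for "Yes": 97% earn $728 interest, 3% lose $4,872.
issue_loan = [(0.97, 728.0), (0.03, -4872.0)]
# "No" branch: nothing is earned or lost.
decline_loan = [(1.0, 0.0)]

actions = {
    "Issue Loan: Yes": expected_value(issue_loan),
    "Issue Loan: No": expected_value(decline_loan),
}
best_action = max(actions, key=actions.get)
print(best_action, round(actions[best_action], 2))
```

Under these figures, issuing the loan is worth about $706 less $146, or roughly $560 per loan, against $0 for declining.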

(15)

Neural Networks

– amount of sale ~ advertising, cost, demographics

– charge-off dollars ~ balance, financial risk profile, demographics

– amount of claim ~ age, health risk profile, geography

– dollar balance ~ financial risk profile, action to account, market pressure

– average profitability ~ financial risk profile, price sensitivity, demographics

(16)

Artificial Neural Networks

Multiple hidden nodes

Each node transforms a weighted (linear) combination of the outputs from the previous layer.

Structure is too complex to interpret weights.

Hidden layer

Output layer

• Stopping rules
– Error threshold
– Time limit
– Change in error

(17)

Artificial Neural Networks

Advantages

– Handles non-linearity
– Handles interactions
– Considered very accurate
– Useful for complex optimization

Disadvantages

– Not interpretable
– CPU intensive
– Poor handling of missing data
– Sensitive to input variable selection
– Explodes categorical data
– Risk of over-fitting -> not robust

(18)

Genetic Algorithms

Based on Darwin’s Principle of “Survival of the Fittest”.

Genetic Operators

– Reproduction (Copying)
– Mating (Crossover)
– Mutation (Altering)

Process starts with initial population of random models.

– Models with poor performance (fitness) “die out” - are deleted.

(19)

Genetic Algorithms

Methodology

Fitness of the new population improves by:

1. Copying good models.

2. Mating good models to create better “offspring” models with improved fitness.

3. Altering good models to create “mutants” with improved fitness.

4. Repeat steps 1 - 3 until stopping rules are met.

The Best Evolved model is the solution.

(20)

Genetic Algorithms

Models are composed of

– Functions
    • arithmetic (+ , - , * , ÷ )
    • mathematical (log, exp, max, ... )
    • trigonometric (sin, cos, tan, arcsin, ...)
    • logical (and, or, not, gt, lt, eq, ...)
    • conditional (if-then-else)
– Variables
    • independent variables
    • numeric values (constants, random numbers)

(21)

GA’s - Initialize Random Model

• Model’s objective: predict response

• Let the function set consist of + , - , * , ÷ , exp

• Let the variable set consist of X1, X2, b

[Figure: pie chart giving each function an equal share: + 20%, - 20%, * 20%, ÷ 20%, exp 20%.]

(22)

GA’s - Initialize Random Model

• Models are displayed in trees.

[Figure: a tree for Response whose nodes are drawn from the eight symbols +, -, *, ÷, exp, X1, X2 and b, each shown with a 12.5% share.]

Repeat M times

(23)

GA’s – Generate M Models

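[Figure: two example trees for Response.]

– Tree 1: root * with leaves X1 and X2, giving Y = X1*X2
– Tree 2: root - with leaves b and exp(X1), giving Y = b - exp(X1)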

(24)

GA’s – Compare Fitness

Model     Equation            Fitness Value (r-square)   PTF
Model 1   Y = X1*(b + X2)     0.61                       0.29
Model 2   Y = b - exp(X1)     0.55                       0.26
Model 3   Y = X1 - X2         0.48                       0.23
Model 4   Y = X1*X2           0.36                       0.17
Model 5   Y = b + X1          0.12                       0.06
Total                         2.12                       1.00

[Figure: pie chart of PTF shares: M1 29%, M2 26%, M3 23%, M4 17%, M5 6%.]
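In the Compare Fitness figures, each PTF value equals the model's fitness divided by the total fitness (for example 0.61 / 2.12 ≈ 0.29). The sketch below reproduces that column and then draws models in proportion to it. Drawing with `random.choices` is one common way to do fitness-proportional selection and is an assumption here, not something the slides specify.

```python
import random

# r-square fitness values from the Compare Fitness slide.
fitness = {
    "Model 1": 0.61,
    "Model 2": 0.55,
    "Model 3": 0.48,
    "Model 4": 0.36,
    "Model 5": 0.12,
}

total_fitness = sum(fitness.values())
ptf = {name: value / total_fitness for name, value in fitness.items()}


def select_model(ptf, rng):
    """Pick one model name with probability equal to its PTF share."""
    names = list(ptf)
    weights = [ptf[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the illustration is repeatable
selected = [select_model(ptf, rng) for _ in range(10)]
for name, share in ptf.items():
    print(f"{name}: PTF = {share:.2f}")
```

Models with a larger share are copied, mated, or altered more often, so weak models such as Model 5 tend to drop out over successive generations.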

(25)

Genetic Methodology

• Fitness improves by:

– Copying models based on PTF
– Mating models based on PTF
– Altering models based on PTF
– Continue above until stopping rules are met

• The best-evolved model is the solution

(26)

Latent Class Models

• Used more in academic circles

• Software only allowed small data sets and a small number of variables

• LatentGOLD developed by Statistical Innovations (Jay Magidson, inventor of CHAID)

• Scalable software

• Disparate sources of data

(27)

3 Kinds of Latent Class Models

• Traditional

– Applications in scaling and classification

• Factor

– Applications in exploratory and confirmatory factor analysis

• Regression

– Uses are in prediction and explanation when the population is not homogeneous

(28)

Traditional LCM vs. LC Factor

• Traditional Latent Class Models identify classes which group together persons who share similar interests/values/characteristics/behavior

• Latent Class Factor Models identify factors which group together variables sharing a common source of variation

(29)

Implementing Models

• How do we select based on model results?

• What is the impact to the bottom line?

(30)

Gains Table

Decile   Number of Accounts   Actives   Predicted   Actual   Cum Actual   Lift   Cum Lift
1        48,342               4,891     10.35%      10.12%   10.12%       3.57   3.57
2        48,343               3,945     8.44%       8.16%    9.14%        2.88   3.22
3        48,342               2,783     5.32%       5.76%    8.01%        2.03   2.83
4        48,342               1,151     2.16%       2.38%    6.60%        0.84   2.33
5        48,343               519       1.03%       1.07%    5.50%        0.38   1.94
6        48,342               269       0.48%       0.56%    4.67%        0.20   1.65
7        48,342               112       0.31%       0.23%    4.04%        0.08   1.43
8        48,343               25        0.06%       0.05%    3.54%        0.02   1.25
9        48,342               5         0.01%       0.01%    3.15%        0.00   1.11
10       48,342               1         0.00%       0.00%    2.83%        0.00   1.00
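The Actual, Cum Actual, Lift, and Cum Lift columns can be recomputed from the two count columns, as sketched below. The sketch assumes the count column whose header did not survive (4,891, 3,945, ...) holds the number of actives per decile, which is consistent with the Actual percentages shown.

```python
# Counts per decile from the gains table; reading the second list as actives is an assumption.
accounts = [48342, 48343, 48342, 48342, 48343, 48342, 48342, 48343, 48342, 48342]
actives = [4891, 3945, 2783, 1151, 519, 269, 112, 25, 5, 1]


def gains_rows(accounts, actives):
    """Return (decile, actual, cum_actual, lift, cum_lift) for each decile."""
    overall_rate = sum(actives) / sum(accounts)
    rows = []
    cum_accounts = 0
    cum_actives = 0
    for decile, (n, a) in enumerate(zip(accounts, actives), start=1):
        cum_accounts += n
        cum_actives += a
        actual = a / n
        cum_actual = cum_actives / cum_accounts
        rows.append((decile, actual, cum_actual, actual / overall_rate, cum_actual / overall_rate))
    return rows


rows = gains_rows(accounts, actives)
for decile, actual, cum_actual, lift, cum_lift in rows:
    print(f"{decile:>2}  {actual:7.2%}  {cum_actual:7.2%}  {lift:5.2f}  {cum_lift:5.2f}")
```

Lift is the decile's rate divided by the overall rate, so the cumulative lift necessarily ends at 1.00 when every decile is included.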

(31)

Gains Chart

[Figure: gains chart; x-axis Percent Mailed (0% to 100%), y-axis Percent Active (0% to 100%).]

(32)

Modeling Lifetime Value

Predict probability of activation for a life insurance offer using logistic regression, neural networks, genetic algorithms.

Use probability to calculate Lifetime Value (LTV) for life insurance prospect for a five year period

LTV = Pr(Activation)* Risk * (Product Profitability+ Cross Sales)*Lapse Indicator - Marketing Expense

– Activation - probability given by a model

– Risk - indices in matrix of gender * marital status * age group

– Product Profitability - present value of product-specific 5 year profit measure

– Cross Sales - additional net revenues for five years following activation

– Lapse Indicator - adjustment based on payment method

– Marketing Expense - cost of package, postage & processing

(33)

Lifetime Value

Decile   Number   Active Rate   Cross Sell   Risk Index   Lapse Indicator   Product Profitability   Average LTV   Average Cum LTV   Sum Cum LTV
1        96,685   10.36%        $120         0.94         0.99              $553                    $64.76        $64.76            $6,261,266
2        96,685   8.63%         $104         0.99         1.00              $553                    $55.35        $60.06            $11,612,984
3        96,685   5.03%         $105         0.96         0.99              $553                    $30.99        $50.37            $14,609,591
4        96,685   1.94%         $107         0.93         0.97              $553                    $11.13        $40.56            $15,685,475
5        96,685   0.96%         $98          1.01         0.99              $553                    $5.53         $33.55            $16,220,346
6        96,685   0.28%         $101         1.02         1.00              $553                    $1.09         $28.14            $16,325,522
7        96,685   0.11%         $97          1.03         1.01              $553                    ($0.04)       $24.12            $16,321,311
8        96,685   0.08%         $98          0.99         1.01              $553                    ($0.26)       $21.07            $16,295,747
9        96,685   0.01%         $94          1.04         1.02              $553                    ($0.75)       $18.64            $16,223,586
10       96,685   0.00%         $95          1.09         1.02              $553                    ($0.78)       $16.70            $16,148,199

How many deciles do you mail?

LTV = Pr(Active)* Risk * (Cross Sell Profit + Product Profitability)* Lapse Indicator Index - Marketing Expense
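A minimal sketch of the LTV formula and of one way to answer the mailing-depth question. The marketing expense of $0.78 per piece is an assumption: the slides do not state it, and it is inferred here from decile 10, where a 0.00% active rate leaves an average LTV of ($0.78). Because the table inputs are rounded, recomputed values for other deciles may differ slightly from the printed ones.

```python
from itertools import accumulate


def lifetime_value(p_active, risk, cross_sell, product_profit, lapse, marketing_expense):
    """LTV = Pr(Active) * Risk * (Cross Sell + Product Profitability) * Lapse - Marketing Expense."""
    return p_active * risk * (cross_sell + product_profit) * lapse - marketing_expense


# Decile 2 inputs from the table; 0.78 is the assumed marketing expense per piece.
ltv_decile_2 = lifetime_value(0.0863, 0.99, 104.0, 553.0, 1.00, 0.78)

# Average LTV per decile as printed in the table (parentheses read as negatives).
average_ltv = [64.76, 55.35, 30.99, 11.13, 5.53, 1.09, -0.04, -0.26, -0.75, -0.78]

# Every decile holds the same number of names, so cumulative average LTV
# peaks at the same depth as the Sum Cum LTV column.
cumulative = list(accumulate(average_ltv))
deciles_to_mail = cumulative.index(max(cumulative)) + 1

print(f"Decile 2 LTV: {ltv_decile_2:.2f}")
print(f"Mail through decile {deciles_to_mail}")
```

On these figures total LTV peaks at decile 6, matching the largest Sum Cum LTV value in the table; mailing deeper adds names whose average LTV is negative.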

(34)

Why Good Models Fail

(Allison’s Top Ten for Troubleshooting)

1. Check the phones; make sure the site is functioning properly

2. Track the mail

3. Listen in on the call center

4. Implementation Issues

• Programming errors

• Inverted scoring

5. Did they pull the right group?

(35)

More of Why Good Models Fail

6. Practice crop rotation

7. External validity

8. Internal validity

9. Bad ingredients make for bad models

10. Old models, like old horses, have to be put out to pasture

(36)

“All models are wrong, but some are useful.”

George Box

(37)

C. Olivia Rud

Executive Vice President
DataSquare, LLC
733 Summer St.
Stamford, CT 06901
203 964-9733 x103

[email protected]

Specializing in Data Mining, Statistical Modeling and Marketing Strategy for Marketing, Risk and Customer Relationship Management

(38)

Allison Cornia

Database Marketing Manager
CRM/Home & Retail Division
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052
425-882-8080

[email protected]

