Data Mining Approaches to Collections and Case Closure. Background


Data Mining Approaches to Collections and Case Closure

Bill Haffey

Technical Director, SPSS Public Sector

Background

• Florida DOR has 500,000 sales accounts, of which ~35,000 are likely to be in the collections process in a given month
• Payment frequencies range from monthly to annually, based on expected tax amount
• Current collections process generally entails:
– Notice mailed after 30 days
– Phone call after another 15 days
– Visit after 54 days, or collection agency for low $
– Garnishment/lien after 120 days

• All accounts treated identically, and no costs have


Background (cont)

• Idea:
– Identify ‘paths’/maps composed of minimal/optimal sequences of actions that tend to result in delinquent case closure (for monthly payment accounts), perhaps unique to particular account types
– Deploy these paths into an automated recommendation engine designed to improve timeliness and efficiency of collections process

[Figure: Sequence Detection. Action sequences (notice, phone, close) for accounts Acct1, Acct2, Acct3 of Account Type B.]

[Figure: ‘Recommendation’ Engine. If/Then paths through Notice, Phone call, Visit/Collections, and Close for Account Types A, B, and C.]

But, in reality . . .

‘Best Contact’ Resolution not yet feasible:

• Actions made to the account are not separable:
• 1st notice sent on establishment of liability
• Phone call after another 15 days
• Sent to service center 54 days after 1st notice
• Garnishment/lien may be made
– What if notice rec’d and payment sent day 40, but not rec’d by Fla until day 45 – after phone call placed
– A phone call placed to account == a phone call rec’d from account w/promise to pay
• Not all actions made on the account are recorded:
– ‘Virtual agent campaigns’ (eg, Mosaix recorded msg) not


Instead . . .

• Model time to account closure (X days), broken into the following groups:
– X < 30
– 30 < X < 60
– X > 60

• Assumptions:

– X < 30

• Case will have entailed minimal contact

– 30 < X < 60

• Notice and/or phone call or automated message

– X > 60

• (and bill exceeds $250 threshold) handled by field service ctr
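The three time-to-closure groups can be sketched as a small binning function. This is only an illustration: the slide does not say which group the boundary values 30 and 60 belong to, so the sketch assumes both fall in the middle group, and the function name and the numbering of the groups as 1, 2, 3 are hypothetical.

```python
def closure_category(days_to_close: int) -> int:
    """Map days-to-closure X onto group 1 (X < 30), 2 (30 to 60), or 3 (X > 60)."""
    if days_to_close < 0:
        raise ValueError("days_to_close must be non-negative")
    if days_to_close < 30:
        return 1
    if days_to_close <= 60:
        # Assumption: the boundary values 30 and 60 are placed in the middle group.
        return 2
    return 3


# Illustrative values only.
for days in (17, 30, 46, 61):
    print(days, "->", closure_category(days))
```

Whatever boundary convention is chosen, it needs to be the same when building the training target and when scoring new accounts.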

Why?

• These time-to-closure groupings provide a reasonable proxy for the type of contact that resulted in closure
• The modeling and prediction of an account’s time-to-closure could provide such business rules as:
– If account is predicted as X < 30, consider not adding case to call queue for an additional period
– If account is predicted as X > 60, refer case directly to


Why Data Mining?

• Needed to ‘model’/predict the time-to-closure category
– As opposed to query/OLAP/report ‘snapshots’
• Lots of legacy data to ‘train’ the model (account characteristics and outcomes)
– Ability to scale procedures against large volumes of data
• Needed flexibility in types of data that could be modeled
– As opposed to traditional statistical procedures

Why Data Mining (cont)?

• In training the model, needed to minimize the probability of especially ‘bad’ predictions
– Predicting 30 < X < 60 for a case that would actually close in X < 30 isn’t as ‘bad’ as predicting X > 60 for that same case
• Needed to understand the model – why certain types of cases were predicted to close at X > 60
– As opposed to an opaque ‘black-box’ modeling methodology
• Chose the Rule-Induction data mining


The Data Mining Project followed the CRISP-DM Methodology.

Project Approach Methodology

CRISP-DM Approach

Benefits Provided
✓ Standard, proven process to guide data mining efforts
✓ Maximizes return on investment in data mining tools and processes
✓ Iterative process that incorporates business expertise and understanding as a key guide to analyses

[Figure: Predictive Evolution. Axes: Time and Business Value. Stages shown: OLAP, Query and Report, Data Mining and Forecasting, Real-Time Information Distribution.]
• Historical: “How many customers do we lose each month?”
• Historical: “Which cities did they live in?”
• Predictive: “Which ones are at risk of leaving?”
• Predictive/Proactive: “What should we offer this customer today?”

Cross Industry Standard Process for Data Mining: CRISP – Data Mining Methodology

• Developed by SPSS, NCR, Daimler-Chrysler, and OHRA in 1996
• Time tested and used worldwide
• Flexible and adaptable methodology
• Six Cyclic Stages:
– Business Understanding
– Data Understanding
– Data Preparation
– Modeling
– Evaluation
– Deployment


CRISP – DM: Project Approach

Project Goal: Develop a data model that will predict the time required for an account to close for both bills and delinquencies.

Steps:
• Step 1: Business Understanding
• Step 2: Data Understanding
• Step 3: Data Preparation
• Step 4: Modeling
• Step 5: Evaluation
• Step 6: Deployment

Objectives:
✓ Project objectives
✓ Goals definition
✓ Gain buy-in
✓ Determine status of data
✓ Conduct data collection process
✓ Prepare data for detailed analysis
✓ Determine missing data
✓ Model data to yield cross-sell insights
✓ Validate process and results with business goals
✓ Implement models and processes (Step 6: Deployment)

Activities:
• Define project goals
• Conduct interviews with key staff to define analytic and reporting processes
• Assess current processes
• Define success criteria
• Determine deployment method
• Collect data
• Data quality check
• Upload data into Clementine
• Select fields to be used in analyses
• Clean data
• Transform and derive calculated fields as required
• Conduct various modeling procedures on data
• Identify and implement highest-value modeling method
• Model data
• Revisit original business objectives
• Validate process and results with business goals
• Review results with Client and make any necessary modifications prior to delivery
• Plan and structure processes for deployment of model (Step 6: Deployment)
• Demonstration of models (Step 6: Deployment)

Deliverables:
• Interviews
• Definition of project goals
• Success criteria
• Data audit report
• Finished dataset to be used for analysis
• Documented analytical process as performed
• Analytical results tied to business objectives
• Additional input needed for Go/No Go decision
• Data quality improvement recommendations

First Round of Models

• Data Preparation Steps

– Take time to group SIC (first 2 digits) into meaningful categories
– Create time history for AGE and CASE_AGE
– Do not yet include time histories for other fields, such as contacts, bankrupt, etc.
• Modeling Steps
– Create decision trees and neural networks using available fields
– Use balanced samples for training the neural networks
– Select models that do the best job
• Predicting outcomes


Data Sources

COUNTY  ACCOUNT  APP_PERIOD  CREA_DATE1  STAT_DATE1  AGE  CREA_DATE2  STAT_DATE2  CONTACTS  RECNO    CASE_AGE
11      002501   200007      10/10/00    10/27/00    17   10/11/00    10/27/00    1         604516   16
11      002501   200009      11/29/00    12/14/00    15   11/30/00    12/14/00    0         670115   14
11      002501   200010      12/21/00    1/17/01     27   12/22/00    1/17/01     2         747129   26
11      002501   200011      1/24/01     2/14/01     21   1/25/01     2/14/01     2         809634   20
11      002501   200101      4/25/01     5/18/01     23   4/26/01     5/18/01     2         1042649  22
11      002501   200102      4/25/01     5/18/01     23   4/26/01     5/18/01     2         1042650  22
11      002501   200104      6/29/01     7/23/01     24   7/2/01      7/23/01     1         1197620  21
11      003003   199910      2/4/00      3/21/00     46   2/7/00      4/5/00      2         51626    58
11      003003   199912      2/25/00     4/5/00      40   2/7/00      4/5/00      4         88246    58
11      003003   199912      3/7/00      3/31/00     24   2/7/00      4/5/00      2         97479    58
11      003003   200001      4/21/00     5/25/00     34   4/19/00     6/8/00      4         249504   50
11      003003   200002      5/1/00      5/30/00     29   4/19/00     6/8/00      4         257720   50
11      003003   199912      5/11/00     6/8/00      28   4/19/00     6/8/00      1         291136   50
11      003003   200003      7/18/00     8/21/00     34   7/19/00     8/21/00     0         435945   33

Types of Features

• Create Time-Based Features
– AGE features
• Last AGE value
• Maximum AGE
• Average AGE for all modules, last 3 modules, last 5 modules, etc.
– CASE_AGE features
• Same kinds of features as AGE: last, max, average AGE
– Contacts
• Reduce large numbers of categories down to a smaller (more manageable) number
– Ex: County, ORG_CODE, SIC, KIND_CODE, STAT_CODE
– Reason: reduce redundant information, speed up modeling
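The time-based AGE features listed above (last, maximum, and windowed averages) can be sketched as follows. This is a minimal illustration, not the project's Clementine stream: the function and feature names are hypothetical, and it assumes the history passed in holds only AGE values already known at prediction time, in chronological order.

```python
from statistics import mean


def age_history_features(history, windows=(3, 5)):
    """Summarise an account's earlier AGE values (oldest first) into model inputs."""
    if not history:
        raise ValueError("history must contain at least one known AGE value")
    features = {
        "last_age": history[-1],
        "max_age": max(history),
        "avg_age_all": mean(history),
    }
    for k in windows:
        # A window longer than the history simply averages everything available.
        features[f"avg_age_last{k}"] = mean(history[-k:])
    return features


# AGE values for account 002501 as listed in the sample data.
known_ages = [17, 15, 27, 21, 23, 23, 24]
features = age_history_features(known_ages)
print(features)
```

The same function could be applied to CASE_AGE or CONTACTS histories, since the slide describes the same kinds of summaries for those fields.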


Data Preprocessing Stream

SIC 2-Digit Features

• Group SIC 2-digit Values

– Functionally (SIC 1-digit)


Age Category Distributions

• Split sample data into training and testing subsets

– Training for creating model

– Testing for assessing model performance
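A random train/test split of this kind might look like the sketch below. The slide does not state the split proportion or the sampling method, so the 30% test fraction, the fixed seed, and the function name are all assumptions made for illustration.

```python
import random


def split_train_test(records, test_fraction=0.3, seed=42):
    """Shuffle records reproducibly and return (training, testing) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split can be repeated
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in record ids; real input would be the prepared account records.
train, test = split_train_test(range(100))
print(len(train), "training records,", len(test), "testing records")
```

Keeping the testing subset out of model building is what makes the later accuracy figures an estimate of performance on unseen accounts.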


Template Modeling Stream

• Standard modeling stream

– Load data
– Create models

– Assess results for training subset and testing subset

AGE Neural Network Model Parameters and Results

• Sometimes the direct path to a model doesn’t work well.
• Create a model that predicts AGE, and use this model as input to the AGE_cat model (actually, created a model that predicted LOG10(AGE))
• Make sure no fields are allowed in the AGE model that cannot be included in the AGE_cat model
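The two-stage idea can be sketched in outline: train a first model on LOG10(AGE), then feed its prediction to the AGE_cat model as an extra field. Everything below is a hedged illustration. The field lists are hypothetical, the clamp for AGE values below 1 is an assumption (the slides do not say how zero ages were handled), and whether AGE_pred was stored in days or in log units is not stated, so days are assumed here.

```python
import math

# Hypothetical field lists, for illustration only.
AGE_CAT_FIELDS = {"COUNTY", "ORG_CODE", "SIC", "CONTACTS"}
AGE_MODEL_FIELDS = {"COUNTY", "ORG_CODE", "CONTACTS"}


def check_no_extra_fields(age_fields, age_cat_fields):
    """Fail if the AGE model uses a field the AGE_cat model is not allowed to use."""
    extra = set(age_fields) - set(age_cat_fields)
    if extra:
        raise ValueError(f"fields not allowed in AGE_cat model: {sorted(extra)}")


def log_age_target(age_days):
    """Training target for the first-stage model: LOG10(AGE)."""
    # Assumption: ages below 1 day are clamped to 1 so the logarithm is defined.
    return math.log10(max(age_days, 1))


def add_age_pred(record, predict_log_age):
    """Return a copy of record with the first-stage prediction added as AGE_pred."""
    enriched = dict(record)
    # Assumption: AGE_pred is expressed in days, so the log prediction is inverted.
    enriched["AGE_pred"] = 10 ** predict_log_age(record)
    return enriched


check_no_extra_fields(AGE_MODEL_FIELDS, AGE_CAT_FIELDS)
# A constant predictor stands in for the trained neural network.
print(add_age_pred({"COUNTY": 11, "CONTACTS": 2}, lambda record: 1.3))
```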


Neural Network Accuracy Predicting Age

• AGE model predicts AGE values with 69% correlation. A scatter plot shows predictions vs. actual AGE values.

• This doesn’t have to be perfect to provide good information for the AGE_cat models
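For reference, a correlation between predicted and actual values can be computed as in the sketch below. It assumes the reported 69% is a linear (Pearson) correlation, which the slide does not state, and the predicted values in the example are made up purely to exercise the function.

```python
import math


def pearson_r(xs, ys):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


actual = [17, 15, 27, 21, 23]
predicted = [19, 18, 22, 20, 25]  # made-up predictions, not model output
r = pearson_r(actual, predicted)
print(f"correlation: {r:.2f}")
```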

Rule Induction – Key Features

• Model output is intuitive – in the form of either decision trees or rulesets
• Flexibility in types of data
• Can ‘ransack’ a dataset to identify key data features

– The resultant model will utilize


– Decision trees:
• income < $40K
– job > 5 yrs then good risk
– job < 5 yrs then bad risk
• income > $40K
– high debt then bad risk
– low debt then good risk

– or Rule Sets:
• Rule #1 for good risk:
– if income > $40K
– if low debt

– if income < $40K
– if job > 5 years

Training Data

Cust  Risk  Income  Job  Debt
1     Good  50k     6    low
2     Bad   60k     3    high
3     Good  41k     7    low
. . .

Build the Model

Model
– if income > $40K
– if low debt

– if income < $40K
– if job > 5 years

Testing Data

Cust  Risk  Income  Job  Debt
78    Good  50k     6    low
79    Good  60k     3    high
80    Bad   41k     7    low
. . .

Test the Model
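The toy credit-risk tree from the slide can be written as a short function and scored against the rows that are visible in the training and testing tables. This is a sketch under stated assumptions: the slide leaves income of exactly $40K and job tenure of exactly 5 years unassigned, so the comparisons below pick one side arbitrarily, and only the rows shown on the slide are used (the elided rows are not reconstructed).

```python
def predict_risk(income_k, job_years, debt):
    """Apply the slide's example tree: split on income, then on job tenure or debt."""
    if income_k < 40:
        # Assumption: exactly 5 years is treated as not more than 5 years.
        return "Good" if job_years > 5 else "Bad"
    # Assumption: income of exactly 40K follows the higher-income branch.
    return "Good" if debt == "low" else "Bad"


# Rows as (cust, debt, job_years, income_k, risk), taken from the slide.
training = [(1, "low", 6, 50, "Good"), (2, "high", 3, 60, "Bad"), (3, "low", 7, 41, "Good")]
testing = [(78, "low", 6, 50, "Good"), (79, "high", 3, 60, "Good"), (80, "low", 7, 41, "Bad")]


def accuracy(rows):
    """Fraction of rows where the tree's prediction equals the recorded risk."""
    hits = sum(predict_risk(income, job, debt) == risk for _, debt, job, income, risk in rows)
    return hits / len(rows)


print("training accuracy:", accuracy(training))
print("testing accuracy:", accuracy(testing))
```

On these few rows the tree fits the training data but misses two of the three testing rows, which is the reason for holding out a testing set.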


Some model ‘misses’ are more critical than others . . .

[Figure: matrix of Actual Outcomes against Modeled Outcomes, each with categories Bad, Amb, Good.]

Changing Where the Errors Occur

• Change misclassification costs to change where errors occur.
• To ensure that category 3 records are classified correctly, change how the decision tree views errors on records with category 3.
• In this example, the classifier has 84.8% accuracy on testing data for category 3.
• However, we also get many category 1 and 2 records incorrectly called category 3 (false alarms).

No misclassification costs
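The effect of misclassification costs can be illustrated with a generic minimum-expected-cost decision. This is not a description of how Clementine applies costs while building a tree; it is a sketch of the general principle, and the probabilities and the cost of 5 are invented for the example.

```python
def min_cost_category(probs, cost):
    """Pick the category with the lowest expected cost.

    probs[a] is the estimated probability that the true category is a;
    cost[a][p] is the cost of predicting p when the true category is a.
    """
    categories = sorted(probs)
    expected = {p: sum(probs[a] * cost[a][p] for a in categories) for p in categories}
    return min(categories, key=lambda p: expected[p])


# Equal costs: every error costs 1, a correct prediction costs 0.
uniform = {a: {p: 0 if a == p else 1 for p in (1, 2, 3)} for a in (1, 2, 3)}

# Hypothetical weighting: missing a true category 3 costs five times as much.
weighted = {a: dict(row) for a, row in uniform.items()}
weighted[3][1] = weighted[3][2] = 5

probs = {1: 0.5, 2: 0.2, 3: 0.3}  # made-up class probabilities for one record
print(min_cost_category(probs, uniform), min_cost_category(probs, weighted))
```

With equal costs the record is called category 1; once errors on true category 3 records are made expensive, the same record is called category 3. That shift is what raises category 3 accuracy and also produces more false alarms.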


Decision Tree Accuracy on Testing Data

• Results for output field Age_cat
• Comparing $C-Age_cat with Age_cat
– Correct : 20759 ( 60.15%)
– Wrong : 13753 ( 39.85%)
– Total : 34512
• Coincidence Matrix (rows: Actual Age_cat; columns: Predicted $C-Age_cat)

           1      2      3
1      10296    769   3040
2       3665   1892   2463
3       3175    641   8571
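The summary figures can be re-derived from the coincidence matrix, which is a useful check on how to read it. The sketch assumes rows are actual categories and columns are predicted categories, as the Actual/Predicted labels suggest; the variable names are hypothetical.

```python
# matrix[actual][predicted], counts copied from the coincidence matrix.
matrix = {
    1: {1: 10296, 2: 769, 3: 3040},
    2: {1: 3665, 2: 1892, 3: 2463},
    3: {1: 3175, 2: 641, 3: 8571},
}

total = sum(sum(row.values()) for row in matrix.values())
correct = sum(matrix[c][c] for c in matrix)
accuracy = correct / total

# Share of each actual category that the model labels correctly.
recall = {c: matrix[c][c] / sum(matrix[c].values()) for c in matrix}

# Records of actual category 1 or 2 that were called category 3.
false_alarms_3 = matrix[1][3] + matrix[2][3]

print(f"correct {correct} of {total} ({accuracy:.2%})")
print({c: round(share, 3) for c, share in recall.items()})
print("false alarms for category 3:", false_alarms_3)
```

The diagonal sum and grand total reproduce the Correct and Total counts reported on the slide, which supports reading the diagonal as the correctly classified records.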

Key Variables in AGE_cat Decision Tree Model

• Decision tree rules for best tree.
• This is actually the third “boost” from a series of decision trees
• AGE_pred is first split


Some Interesting Rules

Rule #1 for 3:
  if WAR_FLAG == Y
  then -> 3 (1019.0, 0.777)

Rule #26 for 3:
  if WAR_FLAG == N
  and field50 > 3
  and field50 =< 6
  and last_caseage_know > 28
  and last_contacts_know =< 3.977
  and module_count > 11
  and COUNTY =< 54
  then -> 3 (83.0, 0.687)

Rule #27 for 3:
  if WAR_FLAG == N
  and field50 > 6
  and last_caseage_know > 28
  and last_contacts_know =< 3.977
  and max_known_age =< 44
  and ORG_CODE == [00 01 02 03 06 09 12 13 14 15 16 21 22 25 26 27 28 29 47]
  then -> 3 (228.0, 0.684)

Rule #6 for 1:
  if WAR_FLAG == N
  and field50 =< 1
  and last_caseage_know =< 31
  and TAX_STATUS == 1
  then -> 1 (13184.0, 0.605)

Rule #7 for 1:
  if WAR_FLAG == N
  and field50 > 0
  and field50 =< 1
  and last_caseage_know =< 31
  and TAX_STATUS == 1
  and last_age_know =< 27
  and ORG_CODE == [00 01 02 03 06 09 12 14 15 16 17 21 23 25 26 27 28 29 47]
  then -> 1 (49.0, 0.694)

Rule #8 for 1:
  if WAR_FLAG == N
  and field50 > 0
  and field50 =< 1
  and last_caseage_know =< 31
  and TAX_STATUS == 1
  and last_age_know =< 27
  and ORG_CODE == 11
  and SIC_2_groups == ['00_41_82_86' '01_15_42_53_84_91' '02_32_67' '07_25_30_48_56_75'
      '09_38_63_64_93' '10_29_31_34_45' '13' '14_23_78' '16_24_37_49_60' '17_52_54'
      '20_33_89' '22_43_61' '27_39' '28_50' '35_72_81_99' '36_47_51_55_58_79' '57_59'
      '65_70' '73_76_80' 100]
  then -> 1 (332.0, 0.651)
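Rules of this form translate directly into executable conditions. The sketch below encodes only Rule #1 and Rule #6 and applies them to a record. Several things are assumptions: that the second number in parentheses is the rule's confidence, that the highest-confidence matching rule wins (the modeling tool's actual scheme for combining rules is not described in the slides), that TAX_STATUS is stored as an integer, and that the example records are invented.

```python
# Each rule: (name, condition, predicted category, confidence from the slide).
RULES = [
    ("Rule #1 for 3", lambda r: r["WAR_FLAG"] == "Y", 3, 0.777),
    (
        "Rule #6 for 1",
        lambda r: r["WAR_FLAG"] == "N"
        and r["field50"] <= 1
        and r["last_caseage_know"] <= 31
        and r["TAX_STATUS"] == 1,
        1,
        0.605,
    ),
]


def classify(record, default=None):
    """Return the category of the highest-confidence matching rule, else default."""
    matches = [(conf, category) for _, condition, category, conf in RULES if condition(record)]
    if not matches:
        return default
    # Assumption: ties and conflicts are resolved by taking the highest confidence.
    return max(matches)[1]


# Invented example records.
warrant_case = {"WAR_FLAG": "Y", "field50": 4, "last_caseage_know": 40, "TAX_STATUS": 1}
quick_case = {"WAR_FLAG": "N", "field50": 1, "last_caseage_know": 20, "TAX_STATUS": 1}
print(classify(warrant_case), classify(quick_case))
```

Because each rule is a readable condition on named fields, a business user can check why an account was assigned to a category, which is the transparency argument made earlier for rule induction.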
