Session 60 PD, Predictive Modeling Real Applications in Life Insurance and Annuities. Moderator: Ricardo Trachtman, FSA, MAAA


(1)

Session 60 PD, Predictive Modeling – Real Applications in Life Insurance and Annuities

Moderator:

Ricardo Trachtman, FSA, MAAA

Presenters:

JJ Lane Carroll, FSA, MAAA
Allen M. Klein, FSA, MAAA
Scott Anthony Rushing, FSA, MAAA

(2)

Predictive Analytics and Life Underwriting

Al Klein

May 5, 2015

SOA Life and Annuity Symposium

Session 60: Predictive Modeling – Real Applications in Life Insurance and Annuities

(3)

Agenda

• Definitions
– Big data
– Predictive analytics
• Predictive analytics
– Process
– Why you should use it
• Milliman example of approach to underwriting using both traditional and non-traditional data
• The future

(4)

Definition – Big Data

Big data is like xxxxxxx xxx:

“Everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.”
Dan Ariely

• Big data is what can be used with predictive analytics to better analyze data and make decisions

(5)

Definition – Predictive Analytics

• Process in which current or historical data or information are used to predict future events or behaviors
• We have been doing this for years in life insurance:
– Underwriting assessment
– Preferred underwriting criteria
– Expected mortality assumptions
• What’s new?
– More sophisticated modeling techniques and capabilities

(6)

The Big Picture: Big Data Analytics in Financial Services

• May 2014 report by LIMRA from online survey of 44 companies
• Some of the findings:
– 9 of 10 life insurance companies reported using big data analytics
– One-third have had programs in place for more than 10 years
– Most companies have fewer than 10 people dedicated to their big data analytics program
– Implementation hurdles include:
  • Funding
  • Executive buy-in
  • Legacy systems
  • Staffing

(7)

Predictive Analytics Process

• Statistical / predictive models used, several examples:
– Classification and Regression Trees (CART): sorts data/populations into smaller branches/nodes, used to predict a response
– Cox Proportional Hazard: estimate of the relative value/risk
– Generalized Linear Model (GLM)
  • Expands on linear regression model – variable constant to observed values
  • Allows for a better understanding of the ways multiple variables interact in a non-linear way and that may not be obvious
– Neural Networks: model/function which uses interconnected “neurons” to compute values from a large number of inputs
– Regression splines
  • A function connected piecewise through polynomial functions to create smoothness where the polynomial pieces connect
  • Main purpose is to predict an outcome variable from a set of independent or predictor variables

(8)

Why use Predictive Analytics?

More sophisticated analysis usually provides better information and solutions

Better likelihood of optimizing desired outcome

May find new solutions or opportunities

Helps to find new and/or better customers

Helps to detect fraud

Other industries are using it

What are the downsides for life insurance purposes?

– Results need to be explainable

(9)

Are buying habits real?

From a well-known underwriter, “based on my purchasing habits and lifestyle, I may be:”
– Over age 70
– Socio-economically challenged
– Weight “challenged”
– Addicted to chocolate
– Drinking too much
– Smoking too much

(10)

Uses of Predictive Analytics in Life Insurance

Selection of agents

Lead generation

Underwriting

Product development

Policyholder retention

Detection of fraud at claims time

(11)

What Data is available today?

• Insurance data
– Gender
– Age
– Height/weight
– Geographic location
– Medical history
– Financial information
– Lifestyle information
– Driving record
• Consumer data
– Thousands of records on every individual
– While geographic data exists and is predictive, desire is generally to use individualized consumer data

(12)

Milliman Example

• A client asked us to develop a model, using consumer data, to determine who would qualify for the best preferred class
• Goal was to be able to waive paramedical exam
• A secondary task was to determine who would be most likely to be declined
• Reasons for this request:
– Reduce underwriting costs by eliminating need for medical exams, MVR, and other tests in some cases
– Reduce issue time
– Improve customer satisfaction

(13)

Milliman Example (cont’d)

• Worked closely with client, who provided us data (about 70,000 lives)
• Some data used to develop a model and rest saved to later validate the model
• Used a machine-learning program
– Finds non-linear behavior and interactions that a generalized linear model (GLM) cannot
– Recognizes variables that have strong fit
– Decision trees used
– At each node, many factors are evaluated and those that are the strongest drivers are used to split the policies at that node

(14)

Milliman Example (cont’d)

• Example of a node split

All Policies: 10,000
Probability of Best Preferred: 35.0%

BMI of 29 or more: 2,162 policies
Probability of Best Preferred: 6.6%

BMI less than 29: 7,838 policies
Probability of Best Preferred: 42.9%
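
The node split can be sanity-checked with a little arithmetic: the policy-weighted average of the two child probabilities should reproduce the parent node's probability. The sketch below uses only the figures from the slide; the helper name `pooled_rate` is hypothetical and is not part of the program described in the presentation.

```python
# Node-split figures taken from the slide; the helper name is hypothetical.
def pooled_rate(children):
    """Policy-weighted average probability across child nodes."""
    total = sum(n for n, _ in children)
    return sum(n * p for n, p in children) / total

children = [
    (2162, 0.066),  # BMI of 29 or more
    (7838, 0.429),  # BMI less than 29
]

total_policies = sum(n for n, _ in children)
parent_rate = pooled_rate(children)
print(f"{total_policies:,} policies, pooled rate {parent_rate:.4f}")
```

The pooled rate lands within a tenth of a percentage point of the 35.0% shown for all policies; the small gap comes from the child probabilities being rounded to one decimal place on the slide.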

(15)

Milliman Example (cont’d)

• Ultimate goal was to develop a model that produced a score for each applicant
• Score was used to determine if the applicant could be issued a policy without further underwriting
• Considered both traditional and non-traditional variables
• Examples of traditional variables
– Age
– BMI
– Gender
– MIB
– MVR
– Rx histories

(16)

Milliman Example (cont’d)

• Examples of non-traditional variables
– Prevalence of banking
– Prevalence of exercise
– Home assessed value
– Household income
– Net worth
– Propensity to buy brand-name medicine
– Prevalence of shopping
– Travel
• These variables were among 350 fields of data considered, which were culled from the thousands of pieces of data available
• Note that some of the variables were created from multiple pieces of data
• Some can move the scoring in either direction, depending on the circumstances (e.g., shopping)

(17)

Milliman Example – Scoring Process

• Want to determine whether the policy can be rapidly issued based on the score (without further underwriting)
• Need to establish cut-off points for bucketing the lives into each of the underwriting classes (e.g., best preferred, preferred, standard, decline)
• Limits were set to reduce the number of cases where the applicant received a lower rating than from the normal best preferred underwriting risk class
• We used two thresholds:
– No more than 20% of applicants one class below class being studied could score above the threshold being tested
– No more than 10% of applicants more than one class below class being studied could score above threshold being tested
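
One way to read the two limits is as a search over candidate cut-offs: take the lowest score cut-off at which both limits hold. The sketch below is an illustration under stated assumptions, not the method used in the study. It assumes a higher score means a better risk, and the function name, the candidate grid and the scores are all made up.

```python
def lowest_passing_cutoff(scores_one_below, scores_further_below,
                          candidates, max_one_below=0.20, max_further=0.10):
    """Return the lowest candidate cutoff that satisfies both limits, or None."""
    def share_above(scores, cutoff):
        return sum(s > cutoff for s in scores) / len(scores)

    for cutoff in sorted(candidates):
        if (share_above(scores_one_below, cutoff) <= max_one_below
                and share_above(scores_further_below, cutoff) <= max_further):
            return cutoff
    return None

# Made-up scores for illustration only (assumption: higher score = better risk).
one_below = [40, 55, 60, 62, 70, 71, 75, 80, 85, 90]      # one class below
further_below = [10, 20, 30, 35, 45, 50, 55, 60, 65, 88]  # more than one class below

cutoff = lowest_passing_cutoff(one_below, further_below, candidates=range(0, 101, 5))
print(cutoff)  # 80 for this made-up data
```

Choosing the lowest passing cut-off lets the most applicants through while still respecting both limits; a company could equally choose a stricter cut-off.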

(18)

Milliman Example – Validation Process

70% of data was used to construct the model and 30% was set aside to validate the model after it was constructed

Both models (probability of best preferred and probability of decline) had a validation correlation of over 99%

– Correlations over 90% are considered good fits to the underlying data
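
The slide does not say how the validation correlation was calculated. A minimal sketch, assuming it is a Pearson correlation between predicted and observed rates on the 30% holdout, might look like the following; the function names and all figures are hypothetical.

```python
import math
import random

def split_70_30(records, seed=0):
    """Shuffle a copy of the records and split them 70% build / 30% validation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Stand-in policy identifiers; the real study had about 70,000 lives.
build, validation = split_70_30(range(1000))

# Hypothetical bucket-level rates on the validation set (made-up numbers).
predicted = [0.05, 0.15, 0.30, 0.45, 0.60]
observed = [0.06, 0.14, 0.31, 0.44, 0.62]
print(len(build), len(validation), round(pearson(predicted, observed), 4))
```

Fixing the random seed keeps the split reproducible, so the same holdout can be reused when comparing models.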

(19)

Milliman Example – Scoring Results

A small percentage of policies that would otherwise have been declined would be issued under this program

However, this should be more than offset by the underwriting savings from the policies that are rapid issued

(20)

Milliman Example – Other Findings

• Traditional factors are stronger predictors for determining the best preferred class
• Consumer and financial factors are more influential in determining whether or not to decline
• It was estimated that almost 80% of the top applicants could be rapid issued under this program
• However, score level could be set wherever company chooses
• The additional non-insurance data proved predictive, but was most valuable when used with the insurance data (i.e., MIB, MVR, Rx)

(21)

The Future – Big Data, Predictive Analytics, and Life Insurance

• Electronic Health Records (EHR) and Electronic Medical Records (EMR)
• Social media
• “Using big data to fight dementia and Alzheimer’s”, The Globe and Mail, September 15, 2014
• J. Craig Venter plans to amass and electronically analyze medical, genomic, and metabolic data of 40,000 individuals every year
• Genetics

(22)

Health-Related Wearable Technology

• Some of these are here today. Some will be in the future.
– Wrist band that tracks fitness (steps, fuel, versus friends, light beams)
– Headband to calm your mind and keep you focused
– Do an x-ray, eye and ear exam, ultrasound through your phone
– App for measuring obstructive sleep apnea by putting your finger in a sensor and wearing it overnight
– Contact lens that measures glucose levels through tears
– Band-aid that records every heartbeat for two weeks
– Put a chip in your bloodstream to warn of a heart attack in the next few days to a couple of weeks
– Vest that has a defibrillator for those at risk for sudden cardiac arrest
– Bra that detects breast cancer
– Sweat-wicking gym shirt with 14 muscle-movement sensors, 2 heart rate sensors, and 2 breathing sensors

(23)

Concluding Thoughts

If not already doing so, begin to keep track of your own detailed data

– Collect everything (e.g., lab results, physical measurements, ratings, face amount purchased, birthdates, issue dates, claims dates, etc.)
– Look to see how you can best use it

If we, as an industry, do not use the information available to us, someone else will

(24)

Bio – Al Klein

• Al is a principal and consulting actuary with Milliman’s Bannockburn/Chicago office. He joined in 2009.

• Al’s primary responsibilities include industry experience studies and helping clients with mortality, longevity, and underwriting related issues. This may involve product development, assumption setting, and mergers and acquisitions. Al’s expertise on mortality and underwriting includes traditional products, simplified issue, final expense, older age, and preferred.

• Prior to joining Milliman, Al worked for a large stock life insurance company where he was responsible for experience studies across all lines of business. He has also worked for other life insurance companies, a reinsurer and consultant, where he has been responsible for strategic planning, product development and traditional reinsurance aspects of the business.

• Al is a frequent speaker at industry meetings and is currently involved with a number of industry activities, including:
– SOA representative and co-Vice Chair for the Mortality Working Group (MWG) of the International Actuarial Association
– MWG Underwriting Sub-group chair; goal is to study underwriting done around the world
– SOA Longevity Advisory Group
– SOA Mortality and Underwriting Survey Committee
– Joint American Academy of Actuaries (AAA) / Society of Actuaries (SOA) Preferred Mortality Oversight Group
– Joint AAA / SOA Underwriting Criteria Team
– 2014 SOA Valuation Basic Table (VBT) Development Team
– SOA Longevity Calculator Development Team
– Longer Life Foundation Advisory Board

• Al received a Bachelor of Science degree in Actuarial Science and Finance from the University of Illinois, Urbana.

(25)

Predictive Modeling – Real Applications in Life Insurance and Annuities

Credit Models for Life Insurance

SOA Life & Annuity Meeting

New York, NY
May 1, 2015

Scott Rushing, FSA, MAAA
RGA Reinsurance Company
Head of Global Research

(26)

Introduction

Purpose

• RGA & TransUnion partnered together to better understand the value of credit data to life insurers and potential applications

Background

• Credit-Based Insurance Scores (CBIS) used in P&C since the 1990s
• Wide adoption in pricing & underwriting for auto and home insurance
• Predictive models are built and validated using de-personalized credit data

Goals of the Model

• To predict mortality

(27)

Introduction

Consumer Credit Reporting Process

[Diagram: data flows from sources, through TransUnion, to users of credit reports]

• Data sources: collection agencies, courts, lenders / creditors (#1, #2, … #10), utilities, etc.
• TransUnion: consolidates data, builds models
• Output: comprehensive reports on individuals (scores, attributes or full file)
• Users: consumers, landlords, P&C insurers, life insurers, lenders, utilities, collection agencies, employers (new hires)

(28)

Model Creation

Building the Model

Process flow: Starting Data → Variable Selection → Modeling Process → External Validation of Model → TU TrueRisk Life Score

• Data comes from de-personalized 1998 credit archive (90% of US pop)
• Built the model on 44 million lives and >3 million deaths
• Started with >800 variables offering features of individual’s credit history
• Selected variables that were:
– Most predictive of the outcome
– Stable over time
– Non-gameable
– Not too correlated with the other variables
• Binary Logistic Regression
• Model calibrated to actual deaths occurring over a 12-year period
• Age, Gender and Region used as control variables
• Model validated internally using an additional 30 million lives
• Tested model using traditional mortality and lapse studies
• Used a random holdout dataset of another 18 million lives
• TransUnion TrueRisk Life presented as a score from 1 to 100 (low risk to high risk)
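
The slides do not describe how model output is converted to the 1 to 100 score. As a rough sketch only, one common approach is to rank each life's predicted probability against the population so that every score holds about the same share of lives. The coefficients, the single input variable and the function names below are hypothetical and do not represent the TransUnion model.

```python
import math
from bisect import bisect_right

def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def make_scorer(population_probs, n_buckets=100):
    """Build a function mapping a probability to a 1..n_buckets score so that
    each score holds roughly the same share of the reference population."""
    ordered = sorted(population_probs)
    n = len(ordered)

    def score(p):
        rank = bisect_right(ordered, p)  # number of lives with probability <= p
        return min(n_buckets, max(1, math.ceil(rank * n_buckets / n)))

    return score

# Hypothetical coefficients and one synthetic input variable, for illustration only.
INTERCEPT, BETA = -4.0, 0.8
population_x = [i / 100 for i in range(-300, 300)]  # 600 synthetic lives
population_probs = [logistic(INTERCEPT + BETA * x) for x in population_x]

score = make_scorer(population_probs)
print(score(min(population_probs)), score(max(population_probs)))  # 1 100
```

Under this assumed mapping a higher predicted mortality gives a higher score, which matches the low-risk to high-risk direction shown on the slide.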

(29)

Model Validation – Population Study

Population Study

• Mortality study performed on holdout sample of 18 million lives using a 1998 TransUnion archive and studying the lives during 1999-2010

• Score buckets are set to be uniform across the population • Study shows 5 times segmentation (96-100 compared to 1-5)

• SSDMF (Social Security Death Master File) used as source of deaths; used population mortality tables

[Chart: Overall Mortality, Population Study. A/E results (adjusted basis), on an axis from 0% to 250%, by TU TrueRisk Life Score in buckets of five from 1-5 to 96-100]
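
An actual-to-expected (A/E) ratio by score bucket, as plotted in this chart, divides observed deaths by the deaths expected from a mortality table within each bucket. The sketch below shows the calculation on made-up cells; the function name and every number are hypothetical, and the adjustment referred to by "adjusted basis" is not reproduced.

```python
from collections import defaultdict

def ae_by_bucket(cells, bucket_width=5):
    """Actual-to-expected death ratio per score bucket.

    Each cell is (score, actual_deaths, expected_deaths), where expected_deaths
    comes from applying table mortality rates to the exposure in that cell.
    """
    actual = defaultdict(float)
    expected = defaultdict(float)
    for score, deaths, exp in cells:
        low = ((score - 1) // bucket_width) * bucket_width + 1
        label = f"{low}-{low + bucket_width - 1}"
        actual[label] += deaths
        expected[label] += exp
    return {label: actual[label] / expected[label] for label in expected}

# Made-up cells for illustration: (score, actual deaths, expected deaths).
cells = [
    (3, 1, 1.0), (5, 0, 1.0),     # fall in bucket 1-5
    (97, 3, 1.0), (100, 2, 1.0),  # fall in bucket 96-100
]
ratios = ae_by_bucket(cells)
print(ratios)  # {'1-5': 0.5, '96-100': 2.5}
```

Dividing the ratio for the highest bucket by the ratio for the lowest gives the segmentation multiple; the made-up cells were chosen so that it comes out at 5, the figure quoted on the slide.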

(30)

Model Validation – Population Study

By Age (as of 1-1-1999)

• Similar shape curves by age band, but the 60-69 curve is slightly flatter than the others

[Chart: Mortality by Age Group, Population Study. A/E results (adjusted basis), on an axis from 0% to 300%, by TU TrueRisk Life Score in buckets of five from 1-5 to 96-100]

(31)

Model Validation – Population Study

By Duration

• Very similar results by duration

[Chart: Mortality by Duration, Population Study. A/E results (adjusted basis), on an axis from 0% to 300%, by TU TrueRisk Life Score in buckets of five from 1-5 to 96-100]

(32)

Model Validation – Insured Lives Study

Insured Data Study

• Important to test the value of TRL on an insured block of business

Details of the Study

• Business Studied: Full UW (term, UL, VUL) and small face WL
• Study Period: 2002-2013
• Mortality and lapse results studied on a count basis
• Relative mortality and relative lapse results reported

[Chart: Distribution of Insureds (Compared to Population), on an axis from 0% to 35%, by TU TrueRisk Life Score in buckets of ten from 1-10 to 91-100]

(33)

Model Validation – Insured Lives Study

Fully Underwritten Mortality Study

Details: Term, UL and VUL; Face Amount ≥ $100k; Issue Ages < 70

Results: Mortality of 91-100 group is 2.6 times higher than 1-10 group

[Chart: Overall Mortality, Issue Age < 70. Claim count (0 to 350) and relative mortality (0% to 250%) by TU TrueRisk Life Score in buckets of ten from 1-10 to 91-100]

(34)

Model Validation – Insured Lives Study

Fully Underwritten Mortality Study

Details: Term, UL and VUL; Face Amount ≥ $100k; Issue Ages < 70

Results: Segmentation exists within risk classes; mortality for the worst TRL scores (71-100) is about double that of the best risks (1-10); most relevant splits may vary by risk class; non-smokers are shown, but results are similar for smokers.

[Chart: Mortality by Underwriting Class, Issue Age < 70. Claim count (0 to 140) and relative mortality (0% to 250%) by TU TrueRisk Life Score in buckets of ten from 1-10 to 91-100, in three panels: Preferred NS, Non-Preferred NS, Substandard NS]

(35)

Model Validation – Insured Lives Study

Fully Underwritten Lapse Study

Details

• Term, UL and VUL

• Face Amount ≥ $100k

• Issue Ages < 70

Results

• Lapse rate of the 91-100 group is 6 times higher than that of the 1-10 group in durations 1-2
• Continued segmentation seen in later durations, but less dramatic
• Similar results seen when looking at the curves by issue age band

[Chart: Overall Lapse Results, Durations 1-2, Issue Age < 70. Lapse count (up to 9,000) and relative lapse rate (0% to 700%) by TU TrueRisk Life Score in buckets of ten from 1-10 to 91-100]

[Chart: Overall Lapse Results, Durations 3+, Issue Age < 70. Lapse count (up to 18,000) and relative lapse rate (0% to 250%) by TU TrueRisk Life Score in buckets of ten from 1-10 to 91-100]

(36)

Model Validation – Insured Lives Study

Fully Underwritten Lapse Study

Details: Term, UL and VUL; Face Amount ≥ $100k; Issue Ages < 70

Results: Segmentation of about 6 times seen in first two durations within given risk class; Non-Smokers are shown, but results are similar for smokers

[Chart: Lapse Results by Non-Smoker UW Class, Durations 1-2, Issue Age < 70. Lapse count (up to 4,000) and relative lapse rate (0% to 800%) by TU TrueRisk Life Score in buckets of ten from 1-10 to 91-100, in three panels: Preferred NS, Non-Preferred NS, Substandard Non-Smoker]

(37)

Model Validation – Insured Lives Study

Small Face Whole Life Mortality Study

Details

• Includes Whole Life products < $100k face; most of this business is under $25k or $50k
• Issue Ages < 70
• Scores above 90 are further split out

Results

• Mortality about 6 times higher for worst scores
• Segmentation at higher scores for this business
• 14% of exposure & 29% of claims have score > 95
• > 10% of the claims have a score of 100
• Value also seen beyond age 70

[Chart: Overall Mortality, Issue Age < 70. Claim count (0 to 300) and relative mortality (0% to 350%) by TU TrueRisk Life Score: buckets of ten from 1-10 to 81-90, then 91-95, 96-99 and 100]

(38)

Model Validation – Insured Lives Study

Small Face Whole Life Lapse Study

Details

• Includes Whole Life products < $100k face; most of this business is under $25k or $50k
• Issue Ages < 70

Results

• Significantly higher lapse rates at the higher scores
• Raw lapse rates are much lower for durations 3+, but there is little segmentation by score

[Chart: Overall Lapse Results, Durations 1-2, Issue Age < 70. Lapse count (up to 5,000) and relative lapse rate (0% to 400%) by TU TrueRisk Life Score: buckets of ten from 1-10 to 81-90, then 91-95, 96-99 and 100]

[Chart: Overall Lapse Results, Durations 3+, Issue Age < 70. Lapse count (up to 2,000) and relative lapse rate (0% to 200%) by TU TrueRisk Life Score: buckets of ten from 1-10 to 81-90, then 91-95, 96-99 and 100]

(39)

Sample of Applications

• Batch segmentation (“pre-approval”) for new firm life offers
• Underwriting Triage
• Risk Segmentation (beyond traditional medical factors)
• Modification of existing UW requirements
• Cross-sell or up-sell existing customers
• Inforce Policy Management (lapse & mortality)

(40)

Questions??

Scott Rushing FSA, MAAA

RGA Reinsurance Company

Vice President and Actuary – Global R&D
Head of Global Research

(41)

SOA Life and Annuity Symposium

Session 60: Predictive Modeling – Real Applications in Life Insurance and Annuities

JJ Lane Carroll

May 5, 2015

(42)

My favorite definition of predictive analytics

The State of Indiana stopped $88 million in attempted tax fraud in 2014

Indiana Department of Revenue

• Confirmed: regular processing of returns
• Verification: taxpayers required to take Identity Confirmation Quiz
• Fraudulent: clear fraudulent cases sent to special investigation unit

(43)

Predictive analytics examples

Case Study #1: Underwriting

Case Study #2: Marketing

Case Study #3: Epidemiology

(44)
(45)

Smoker model example

[Flow diagram: the model places each applicant in one of three groups. Non-smoker within defined threshold: fast track. Additional information needed: alternate tools / traditional process. Smoker within defined threshold: traditional process. Each path ends in either the non-smoker rate or the smoker rate.]

(46)

• Receiver operating characteristic (ROC) curves can be used to assess the absolute performance of predictive models or compare the performance of several models.
• The higher the Area Under the Curve (AUC), the more predictive the model. A value of 0.5 basically means the model predicts the event for a particular applicant no better than tossing a coin. An AUC above 0.9 is highly predictive.
• What does this mean for insurance decisions?
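
The AUC has a direct interpretation that a short calculation makes concrete: it is the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes it by pairwise comparison on made-up scores; the smoker labels and values are illustrative assumptions, not data from the presentation.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up model scores; label 1 = smoker, 0 = non-smoker.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
good_model = [0.1, 0.2, 0.3, 0.6, 0.5, 0.7, 0.8, 0.9]
coin_toss = [0.5] * 8  # a model that cannot tell applicants apart

print(auc(good_model, labels), auc(coin_toss, labels))  # 0.9375 0.5
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; for large portfolios a rank-based calculation gives the same value more efficiently.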

(47)
(48)

Information overload

Behavioral economics: information overload prevents decision making

Potential to increase sales simply by getting the:

• Right product
• Right message
• In the right way
• At the right time
• To the right person

(49)

Predictive model for marketing segmentation

• More art than science
• No clean breaks between segments
• Attitudes change

(50)

Case Study #3: Epidemiology

(51)

[Diagram: “Smoker” (repeated four times) and “Lung Cancer”]

Causation replaced by correlation

(52)

Social Media Chronic Diseases Map

[Map legend: Diseases]

(53)
(54)
(55)

Legal notice

©2015 Swiss Re. All rights reserved. You are not permitted to create any modifications or derivative works of this presentation or to use it for commercial or other public purposes without the prior written permission of Swiss Re.

The information and opinions contained in the presentation are provided as at the date of the presentation and are subject to change without notice. Although the information used was taken from reliable sources, Swiss Re does not accept any responsibility for the accuracy or comprehensiveness of the details given. All liability for the accuracy and completeness thereof or for any damage or loss resulting from the use of the information contained in this presentation is expressly excluded. Under no circumstances shall Swiss Re or its Group companies be liable for any financial or consequential loss relating to this presentation.
