Session 15 PD, Predictive Analytics for Actuaries: Predictive Modeling of Health Insurance Big Data. Moderator: Brian Matthew Hartman, ASA, Ph.D.

(1)

 

 

 

 

Session 15 PD, Predictive Analytics for Actuaries: Predictive Modeling of Health Insurance Big Data

Moderator: Brian Matthew Hartman, ASA, Ph.D.

Presenters: Brian Matthew Hartman, ASA, Ph.D.

 

(2)

Session 15: Predictive Analytics for Actuaries: Predictive Modeling of Health Insurance Big Data

Brian Hartman, ASA, Ph.D.

Chris Stehno

(3)

Big Data

What is it – and what does it mean for the insurance industry?

SOA Annual Meeting

Austin

October 12, 2015

Chris Stehno

Deloitte Consulting | US

(4)

Chris Stehno – Director - Advanced Analytics and Modeling

Chris Stehno is a Director at Deloitte Consulting in the United States, as well as a member of Deloitte’s Advanced Analytics and Modeling practice.

Chris has applied statistical and machine learning methods to such diverse business problems as healthcare utilization, customer and employee retention, talent management, insurance agent recruiting, customer segmentation, life and health insurance pricing and underwriting, medical malpractice and patient safety, claims management, preventive healthcare, suicide prevention and fraud detection. He is well known for the expansion of traditional health risk analysis through the use of non-traditional data sources and for developing behavioral tactics to promote wellness and preventative services.

Chris is a frequent author and conference speaker. Prior to Deloitte, he was the co-founder and President of MedAnalytics, the company that pioneered the field of Lifestyle Based Analytics in the healthcare arena.

Deloitte Consulting LLP

111 S Wacker Dr

Chicago, IL 60606

Chris Stehno, MBA

Director

Deloitte Consulting | US

Tel: +1 312 206 4024

[email protected]

(5)

Big Data is in the news

From the dawn of civilization until 2003, humankind generated 5 exabytes of data. Now we produce 5 exabytes every two days, and the pace is increasing.

Eric Schmidt, Executive Chairman, Google

Every century a new technology – steam power, electricity, atomic energy or microprocessors – has swept away the old world with a vision for a new one. Today, we seem to be entering the age of big data.

Michael Cohen, Author, Speaker, Broadcaster

We’ll see this as the time in history when the world’s information was transformed from an inert, passive state and put into a unified system that brings the information alive and lives on forever.

(8)

The evolution of data science in insurance

1990s

Credit Scoring - an early bellwether of the disruptive power of “big data” in insurance.

2000s

Predictive modeling becomes mainstream in non-life insurance.

• Personal insurance: rating plan and price optimization
• Commercial insurance: underwriting, prospecting, claim adjustment models

Today and tomorrow

• Health / Life insurance: underwriting/risk triage, application triage models, in-force management models.
• Use of analytics to better understand risks at individual (not just group) level
• Telematics and self-tracking devices link insureds to the Internet of Things [IoT].
• New data sources, new business models …

(9)

Big data:

(12)

Three definitions of big data

1. Data sets with sizes beyond the capability of standard IT tools to capture, process, and analyze in reasonable time frames.

2. Data with high Volume, Velocity, Variety

• Huge datasets
• … emanating continuously from smart phones, sensors, cameras, GPS devices, computers, TVs, …
• … involving all manner of numeric, text, photographic data

(13)

Emerging trends - social analytics, quantified self, etc.

“Digital Analytics” leverages a number of different tools to collect social conversations, then uses a combination of automated and manual processes to analyze the data.

“Quantified Self” applications such as Fitbit, Apple Watch and smartphone apps allow customers to monitor and share lifestyle/health data.

(14)

The development of Lifestyle Based Analytics (LBA)

Deloitte Consulting’s Proprietary Disease State Algorithms

Using only third-party data we have built algorithms to provide insights into individuals afflicted with 20 plus lifestyle diseases (e.g. diabetes, female cancer, tobacco related cancer, cardiovascular, depression, etc.) which impact morbidity. In addition we have used over 1 million paramedical exam results to identify individuals who are at extreme risk or have a condition that has not been otherwise detected or diagnosed.

3rd Party Marketing Data Types (inputs to the Disease State Algorithms¹)

Survey Data:
– Self-reported information collected over the last 18 months
– Contains many lifestyle elements

Observed Data:
– Basic individual and household demographics
  • Age, sex, number and ages of children, marital status
  • Occupation categories, education level
– Financial information
  • Income level, net worth, savings and investments
  • Home value, mortgage value
– Lifestyle data
  • Activity: running, golf, tennis, biking, hiking, soccer, tri-athlete
  • Inactivity: TV, mail-order, computers, video games, casino gambling
  • Diet, weight-loss, exercise, cooking, gardening, health foods, pets

Small Area Characteristics:
– Matched to carrier route modeled data
– Reports average data for that route
– Approximately two city blocks

¹ Deloitte Consulting proprietary method

Third party marketing datasets are used to develop health-related algorithms. These datasets include over 1,000 fields of data and the match rate with a client’s policyholders is typically around 95% based only on the individual’s name and address.

(15)

Lifestyle predictive analytics allow us to better understand individual/population health risks

Traditional data
• Beth: Female age 45; Employed; No significant claims
• Tom: Male age 46; Employed; Knee surgery
• Sarah: Female age 46; Unemployed; No significant claims

Lifestyle-based data set
• Beth: Renter / Owner; Commutes 45 miles; Bankruptcy indicator; Diet/weight loss purchases; Fast food purchaser; Self help books; High TV consumption
• Tom: Manager level position; Owns home; Has lived in hometown all his life; Married with two children; “Suburban Striver” Psychographic Cluster; Avid golfer
• Sarah: New to town; Reading: foreign travel-related magazines; Good credit; Healthy food choices; Little to no TV consumption; Running and yoga

Risk analysis
• Beth: Cost index: 1.3; Diabetes Prob: 2.5
• Tom: Average Cost Expectation
• Sarah: Cost Index: 0.75; Diabetes Prob: 0.30

(16)

EHR – A current federal mandate

• American Recovery and Reinvestment Act of 2009 (“stimulus package”)
• Established timeline for future incentives for health care providers to offer patient health records in electronic format
• Healthcare providers that demonstrate “meaningful use” of EHR receive increased levels of Medicaid and Medicare reimbursements
• Healthcare providers that fail to demonstrate “meaningful use” of EHR receive reduced levels of Medicaid and Medicare reimbursements

2009: New Legislation
2011 – 2014: Positive Rewards / Incentives for “Meaningful Use” of EHR
2015 +: Penalties for Lack of “Meaningful Use” of EHR

(17)

Future disruptors – electronic medical/health records

The next big disruptor in the insurance marketplace

113883.10.20.3" /><templateId root="2.16.840.1.113883.10.20.1" /><templateId root="2.16.840.1.113883.3.88.11.32.1" /><templateId root="1.3.6.1.4.1.19376.1.5.3.1.1.5" /><templateId root="1.3.6.1.4.1.19376.1.5.3.1.1.2" /><templateId root="1.3.6.1.4.1.19376.1.5.3.1.1.1" /><templateId root="1.2.840.114350.1.72.1.77896772" /><id assigningAuthorityName="EPC" root="1.2.840.114350.1.13.126.2.7.8.688883.4663085" /><code code="34133-9" codeSystem="2.16.840.1.113883.6.1" codeSystemName="LOINC" dispxxxxxxName="Summarization of Episode Note" /><title>Patient Health Summary</title><effectiveTime value="20150507131421-0700" /><confidentialityCode code="N" codeSystem="2.16.840.1.113883.5.25" /><languageCode code="en-US" /><setId assigningAuthorityName="EPC" extension="a5af8356-f4f5-11e4-91ee-f802765cdbf3:121222" root="1.2.840.114350.1.13.126.2.7.1.1" /><versionNumber value="1" /><recordTarget><patientRole><id root="1.2.840.114350.1.13.126.2.7.3.688884.100" extension="K263709407" /><addr

use="HP"><streetAddressLine>XXXXXXXXXXX</streetAddressLine><city>XXXXXXXXXXX</city><state>CA</state><postalCode>91311-5538</postalCode><country>USA</country></addr><telecom use="HP" value="tel:+1-NNNXXXX-7743" /><telecom use="WP" value="tel:+1-NNNXXXX-2544" /><telecom use="MC" value="tel:+1-NNNXXXX-7743" /><telecom value="tel:+1-000-000-0000" /><telecom value="mailto://[email protected]" /><patient><name

use="L"><given>First</given><family>XXXXXX</family></name><administrativeGenderCode code="F" codeSystem="2.16.840.1.113883.5.1" codeSystemName="AdministrativeGenderCode" displayName="Female" /><birthTime value="19770101" /><maritalStatusCode code="M" codeSystem="2.16.840.1.113883.5.2" codeSystemName="MaritalStatusCode" displayName="Married" /><religiousAffiliationCode code="970" codeSystem="2.16.840.1.113883.5.1076" codeSystemName="ReligiousAffiliation" /><raceCode code="2106-3" codeSystem="2.16.840.1.113883.6.238" codeSystemName="CDC Race and Ethnicity" displayName="White" /><ethnicGroupCode code="2186-5" codeSystem="2.16.840.1.113883.6.238" codeSystemName="CDC Race and Ethnicity" displayName="American/United States" /><languageCommunication><languageCode code="eng" /><preferenceInd value="true"

/></languageCommunication><languageCommunication><languageCode code="eng" /><modeCode code="EWR" codeSystem="2.16.840.1.113883.5.60" displayName="Expressed Written" /></languageCommunication></patient><providerOrganization><id root="1.2.840.114350.1.13.126.2.7.2.688879" extension="6100" /><name>Kaiser Permanente Southern California</name><telecom use="WP" value="tel:+1-858-614-3333" /><addr

nullFlavor="NA"><city>Pasadena</city><state>CA</state><postalCode>91103</postalCode></addr></providerOrganization></patientRole></recordTarget><author><time

value="20150507131421" /><assignedAuthor><id nullFlavor="NA" /><addr nullFlavor="NA"><streetAddressLine nullFlavor="UNK" /><city nullFlavor="UNK" /></addr><telecom nullFlavor="NA" /><assignedPerson><name nullFlavor="NA" /></assignedPerson><representedOrganization><id root="1.2.840.114350.1.13.126.2.7.2.688879" extension="6100" /><name>Kaiser Permanente Southern California</name><telecom use="WP" value="tel:+1-858-614-3333" /><addr

nullFlavor="NA"><city>Pasadena</city><state>CA</state><postalCode>91103</postalCode></addr></representedOrganization></assignedAuthor></author><author><time value="20150507131421" /><assignedAuthor><id root="1.2.840.114350.1.1" extension="8.1" /><addr nullFlavor="NA"><streetAddressLine nullFlavor="UNK" /><city nullFlavor="UNK" /></addr><telecom nullFlavor="NA" /><assignedAuthoringDevice><manufacturerModelName>Epic - Version 8.1</manufacturerModelName><softwareName>Epic - Version 8.1</softwareName></assignedAuthoringDevice><representedOrganization><id root="1.2.840.114350.1.13.126.2.7.2.688879" extension="6100" /><name>Kaiser Permanente Southern California</name><telecom use="WP" value="tel:+1-858-614-3333" /><addr

nullFlavor="NA"><city>Pasadena</city><state>CA</state><postalCode>91103</postalCode></addr></representedOrganization></assignedAuthor></author><custodian><assignedCustodian> <representedCustodianOrganization><id root="1.2.840.114350.1.13.126.2.7.2.688879" extension="6100" /><name>Kaiser Permanente Southern California</name><telecom use="WP" value="tel:+1-858-614-3333" /><addr

nullFlavor="NA"><city>Pasadena</city><state>CA</state><postalCode>91103</postalCode></addr></representedCustodianOrganization></assignedCustodian></custodian><participant typeCode="IND"><templateId root="2.16.840.1.113883.3.88.11.32.3" /><templateId root="2.16.840.1.113883.3.88.11.83.3" /><templateId root="1.3.6.1.4.1.19376.1.5.3.1.2.4" /><time value="20130207" /><associatedEntity classCode="ECON"><id nullFlavor="UNK" /><code

nullFlavor="OTH"><originalText>Husband</originalText></code><addr><country>USA</country></addr><telecom use="MC" value="tel:+1-NNN-421-7000" /><associatedPerson><name use="P"><given>GIANN</given></name></associatedPerson></associatedEntity></participant><documentationOf typeCode="DOC"><serviceEvent classCode="PCPR"><effectiveTime><low value="19770101" /><high value="20150507" /></effectiveTime><performer typeCode="PRF"><templateId root="2.16.840.1.113883.3.88.11.32.4" /><templateId root="2.16.840
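A record like the fragment above can be read with a standard XML parser. The following is only a minimal sketch under stated assumptions: it uses a small hand-written sample that mimics a few elements visible in the fragment (`patientRole`, `administrativeGenderCode`, `birthTime`, `postalCode`), it omits the HL7 namespace that real CCD documents declare, and the helper name `extract_demographics` is hypothetical rather than part of any pilot tooling described in this deck.

```python
import xml.etree.ElementTree as ET

# Hand-written sample mimicking a few elements of the fragment above.
# Assumption: no XML namespace; real CCD files declare xmlns="urn:hl7-org:v3".
SAMPLE = """
<ClinicalDocument>
  <recordTarget>
    <patientRole>
      <addr use="HP"><state>CA</state><postalCode>91311-5538</postalCode></addr>
      <patient>
        <administrativeGenderCode code="F" displayName="Female" />
        <birthTime value="19770101" />
        <maritalStatusCode code="M" displayName="Married" />
      </patient>
    </patientRole>
  </recordTarget>
</ClinicalDocument>
"""


def extract_demographics(xml_text):
    """Return a flat dict of a few demographic fields (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    role = root.find("./recordTarget/patientRole")
    patient = role.find("patient")
    birth = patient.find("birthTime").get("value")  # formatted as YYYYMMDD
    return {
        "sex": patient.find("administrativeGenderCode").get("code"),
        "birth_year": int(birth[:4]),
        "marital_status": patient.find("maritalStatusCode").get("displayName"),
        "state": role.findtext("addr/state"),
        "zip5": role.findtext("addr/postalCode")[:5],
    }


demographics = extract_demographics(SAMPLE)
print(demographics)
```

Flattening the nested document into a few named fields is roughly what a mapping and normalization step would need to do before the data can feed a model; a production parser would also have to handle namespaces, missing elements, and `nullFlavor` values.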

(18)

EHR pilot project scope

The scope of the pilot project is to implement the processes necessary to extract the raw data from various TPP websites and deliver it to the client. Afterwards, there are various phases of work around utilizing EHR data to improve risk analysis.

[Process flow diagram, divided into “Pilot Project Scope” and “Post Pilot Phase”. Box labels:]

• Applicant/agent is notified via the Client landing page
• Redirected to Validation Landing Page
• On the validation landing page, the user will input their credentials used to access their EHR.
• Login Credentials are Validated
• ETL using Validated Credentials
• CCD
• Mapping and Normalization
• Discrete Client Database
• De-identified Industry Database
• Deliver Raw Data via FTP
• EHR Data Extract for Underwriting
• Automated Underwriting
• Producer
• Order medical requirements
• Approved. No medical requirements needed
• Does not qualify for life insurance
• Reporting, Analytics, and Benchmarking
• Big Data

An automated scripting tool is

(19)

Big data:

(20)

The pulse of the nation

(21)

More on Google Flu Trends:

(22)
(23)

Vendors and Data Available for Predictive Analytics

Representative Data Categories

• National Indices: Wage Data, Wealth Indicators, Unemployment Stats, EEOC Complaints DB, Ec. Freedom Index, Aggregated IRS Data, Occupational Codes
• Deloitte Disease States: 17 lifestyle diseases, including Diabetes, Cancers, Cardiovascular, Depression/Mental, Hypertension, etc.
• Purchase Behaviors: Purchase Propensities, Spend by Category, DTC Spend by Retailer, Brand Usage Statistics, Retailer Trans Data, Purchase Triggers
• Ailment & Discharge: Disability Data, US Hospital Directory, Nursing Home Data, Medical Provider Data, Hospital Visit Statistics, Doctor Practice Data, Health Interest Data
• Automobile: Auto Data, Carfax Vehicle History, Motor Vehicle Reports, Auto Injury / Loss Data, Driver Device Usage, Road Rage Survey, VIN Decoding Data
• Lifestyle Clusters: Lifestyle and Life Traits, Working Mothers, Active Seniors, High-Tech Segments, Life Stage Clustering, Demo. Census Data
• Geographical Sets: Crime Statistics, Hail Vector Data, Storm Events DB, Climate Data, Geographic Mapping, Firehouse Data, Fire Incident Data
• Judicial / Legal: Fed. Case Law DB, Florida Tax Records, Lit. Trends Survey, Lawsuit Climate Data, DUI/DWI Laws, CA/FL Lawyer Data, Tort Liability Index
• FCRA / Credit: Credit score, Bankruptcy, Liens and Loans, Delinquencies, Credit Available, Credit Lines Open, Payment Patterns

Data Vendors

Acxiom
AM Best
AMA
American Housing Survey
American Tort Reform Foundation
Bureau of Labor Statistics
Cap Index
Carfax
CDS Hail Database
Census Point
Choicepoint
Corporate Research Board
DataLister
Directory of US Hospitals
Dun & Bradstreet
EASI Analytics
EEOC
Equifax
ESRI
Experian
Fastcase Legal Research System
Florida Tax Assessment Records
Fulbright Litigation Trends Survey
Insurance Information Institute
Insurance Institute for Highway Safety
Internal Revenue Service
Knowledge Based Marketing (KBM)
Lawyer Data – Florida & California
LexisNexis
Martindale-Hubbell Attorney Listing
MRI Purchasing Propensities
NFIRS – National Fire Reporting
NHTSA
OSHA

Companies that are succeeding in advanced analytics are doing so through their commitment to exploring new data. This commitment has resulted in an approach that leverages both internal and external data to achieve maximum segmentation. We have established relationships with over 100 external data vendors/sets.

(24)

What are we Missing in Accurate Mortality Assessment?

For a 30-year-old buying a 20-year term policy, death from unintentional injury, suicide and homicide makes up over 35% of the total deaths from issue to age 54. Yet how are we accounting for this?

(25)

Health plans must quickly switch from patient centric view to a consumer centric view

• Propensity to engage: Understand individual’s preferred mode for engagement and their likelihood for engaging
• Propensity to change: Understand whether the individual is in the right mindset for change
• High impact engagement opportunity
• Targeted and personalized interactions and interventions

(26)

Customer Insights dataset

We have scored the entire United States adult population (220 million lives), giving clients the ability to identify the markets and individuals within those markets who are most likely to buy and most likely to qualify – driving to higher response and approval rates

150+: Our analytics is powered by 150+ algorithms including:
• Disease state algorithms
• Lifestyle clusters
• Purchase behaviors
• Propensity to need and purchase insurance

50+ TB: We have access to over 50 terabytes of third-party data that provides individual-level lifestyle and purchasing habit insights across the entire United States population

We can apply advanced analytic techniques to get closer to the actual health, needs, behaviours and value of your customers

(27)

The art of the possible for prediction and prevention strategies

John Smith
• 48 year old male
• Married to Jane with 3 children at home
• Undergraduate degree; employed in white collar job
• Group health insurance
• Hobbies include college and professional sports viewing and craft beers
Health risks include
• Increased risks for diabetes, cardiovascular and hypertension
• Low risk for tobacco related illnesses
Actions: Schedule full annual including complete lab with Dr. Smith

Jane Smith
• 45 year old female
• Married to John with 3 children ages 14, 12 and 9
• Master’s degree in business; employed as stay-at-home parent
• Hobbies include team tennis, yoga, and 5k and 10k running
Health risks include
• Increased risks for skin cancer
• Low risk for diabetes, cardiovascular, and maternity event
Actions: Send a sun-blocking tennis hat, skin cancer educational brochure and coupon for sunscreen

Smith children
• Multiple pets in the household
Health risks include
• Increased risks for asthma
Actions: Administer spirometer test at next physical

(28)
(29)

Digital footprints

• How we drive
• What we buy
• What we eat
• What we watch, read
• What / how we opine
• Where we travel
• Whom we know / networks
• How we socialize
• How we surf the web

The resulting data can be a major source of operational improvements and business innovation…

… and societal change…

(30)

The real reason why big data is a big deal

“I believe that the power of Big Data is that it is information about people's behavior instead of information about their beliefs… This sort of Big Data comes from things like location data off of your cell phone or credit card, it's the little data breadcrumbs that you leave behind you as you move around in the world.

…those breadcrumbs tell… the story of your life... Big data is increasingly about real behavior, and by analyzing this sort of data, scientists can tell an enormous amount about you. They can tell whether you are the sort of person who will pay back loans. They can tell you if you're likely to get diabetes”

Sandy Pentland, MIT Media Lab, “Reinventing Society in the Wake of Big Data”

(31)

The Last Mile Problem

(32)

A new focus: “the last mile problem”

Predictive models can point us in the right direction… they can tell us whom to target… but they don’t tell us how to prompt the desired

(34)

Yes they did

Motivating example: the 2012 Obama reelection campaign used predictive models to identify whom to target. It also used behavioral insights to more effectively act upon the predictive model indications.

(36)

Resisting the siren song – commitment devices

Self-tracking devices help us quantify different aspects of our diet, exercise, and sleep behavior

But what to do with all of this data?

Self-tracking data can be fed into commitment contract apps that help nudge our “present self” to take actions that our “future self” will be happy with

(37)

Lifestyle data to predict lifestyle diseases

Lifestyle and medical data can be used to predict individuals’ healthcare utilization and likelihood of various disease states.

(38)

The last mile problem

But once we’ve identified the highest risks, what can be done to change behavior?

(40)

House calls and health coaching

Promising behavioral strategies:
• Health coaches for comorbid seniors
• “House calls” for root cause analysis of hospital ER “frequent fliers”
• Workplace health initiatives
• “Social Physics”

Another Thought: analytics could be used to guide the hiring and matching of health coaches with patients.

(41)
(42)
(43)

Achieve consumer centric goals in long term sustainability and success in the marketplace

Financial Sustainability: Growing enrollment and achieving long term sustainability
Health Management: Improving health risk of the enrolled population
Product Management: Provide consumer choice and affordability
Consumer Experience: Enhance the consumer experience from end-to-end

• Target the right population segment: Identification of potential new members and uninsured profiles in the Individual market and identify potential employer participation in Small and Mid-sized Group market and preferred benefit levels
• Retain enrollment using retention analytics to identify patterns of disenrollment
• Expand market segments by developing strategies for group expansions (e.g. inclusion of unions, Taft-
• Provide deeper understanding of health improvement outcomes year over year
• Target population with high prevalence and response rate for change management
• Develop cost trends to identify rate level drivers such as population risk, pent up demand and disease conditions prevalence
• Develop marketing campaigns with outreach methodology suitable for target consumers
• Provide guided whole consumer shopping experience tool (i.e. plan selection tools) for enhanced consumer education and shopping experience
• Support pro-active engagement throughout the coverage period
• Improve member and provider relationship using provider matching
• Identify consumer demands on new plans (e.g. HIX metallic level plans)
• Develop price elasticity of demand to support rate making review
• Provide quantifiable measures on population health risk for rate setting and review support

(44)

Model Selection and Averaging of Health Costs in Episode Treatment Groups

Brian Hartman, Brigham Young University-Provo

Joint work with Shujuan Huang, Liberty Mutual Seattle

(45)

Motivation

Health insurance companies are interested in predicting the severity of claims by disease

Critical illness insurance

(46)

Data for selected ETGs (320 total)

ETG    Frequency   ETG Description                        Mean      Std. Dev.
1301   13,534      AIDS                                   15,570    25,246
1635   2,679       Hyper-functioning adrenal gland        2,035     8,963
1640   1,162       Hypo-functioning parathyroid gland     1,704     6,314
2068   16,554      Agranulocytosis                        4,677     17,923
2070   822         Hemophilia                             94,343    303,552
2080   944         Anemia of chronic diseases             2,434     10,943
2082   49,409      Iron deficiency anemia                 1,772     5,208
2394   1,550       Personality disorder                   1,718     5,263
3868   42,401      Congestive heart failure               10,870    56,777
4370   50          Lung transplant                        461,226   338,683

(47)

Justification

These ETGs are not going to be properly fit by the same distribution.

(48)

Candidate distributions

All the distributions we consider are continuous and defined for positive numbers.

Gamma – Standard distribution in severity modeling, whether in P&C or health. Relatively light tail. Two parameters, so relatively inflexible.

Lognormal – Standard distribution in financial modeling. Relatively light tail. Two parameters.

Lomax – Heavy-tailed distribution, two parameters.
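To make the tail comparison concrete, here is a minimal sketch of the three log-densities named above, written with the Python standard library only. The parameter values are illustrative assumptions (chosen so that each distribution has a mean of 1,000) and are not taken from the ETG data; the function names are likewise hypothetical.

```python
import math


def gamma_logpdf(x, shape, scale):
    """Log-density of a Gamma(shape, scale) distribution at x > 0."""
    return ((shape - 1) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))


def lognormal_logpdf(x, mu, sigma):
    """Log-density of a Lognormal(mu, sigma) distribution at x > 0."""
    z = (math.log(x) - mu) / sigma
    return -math.log(x * sigma * math.sqrt(2 * math.pi)) - 0.5 * z * z


def lomax_logpdf(x, alpha, lam):
    """Log-density of a Lomax (Pareto type II) with shape alpha, scale lam."""
    return math.log(alpha / lam) - (alpha + 1) * math.log1p(x / lam)


# Illustrative parameters (assumptions), each giving a mean of 1,000:
#   Gamma: shape * scale = 2 * 500
#   Lognormal: exp(mu + sigma^2 / 2) with sigma = 1
#   Lomax: lam / (alpha - 1) = 2000 / 2
tail_point = 100_000.0
log_density = {
    "gamma": gamma_logpdf(tail_point, shape=2.0, scale=500.0),
    "lognormal": lognormal_logpdf(tail_point, mu=math.log(1000.0) - 0.5, sigma=1.0),
    "lomax": lomax_logpdf(tail_point, alpha=3.0, lam=2000.0),
}
for name, value in log_density.items():
    print(f"{name:10s} log-density at {tail_point:,.0f}: {value:.2f}")
```

With these particular parameters, the Lomax places the most density far out in the tail and the gamma the least, which is the kind of difference that matters for ETGs such as hemophilia, where the standard deviation is several times the mean.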

(49)

Model Selection/Averaging Methodologies

1. AIC/BIC weights

2. Bayesian Model Averaging through Parallel Model Selection

3. Random Forest Feature Classification

Moment-based characteristics (e.g., mean, standard deviation, coefficient of variation, skewness, and kurtosis) for raw data and the same measures for log-data.

Percentile-based characteristics (e.g., 10th, 25th, 50th, 75th, 90th percentiles, median absolute deviation, and interquartile range) for raw data and the same measures for log-data.
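The slides do not spell out how the AIC/BIC weights in item 1 are computed. A common form, assumed here, is the Akaike-weight formula: each model's weight is proportional to exp(-Δ/2), where Δ is the gap between its criterion value and the best (smallest) one. The sketch below uses made-up log-likelihood values purely for illustration; the function names are hypothetical.

```python
import math


def information_criteria(loglik, n_params, n_obs):
    """Return (AIC, BIC) for a fitted model with maximized log-likelihood."""
    aic = 2 * n_params - 2 * loglik
    bic = n_params * math.log(n_obs) - 2 * loglik
    return aic, bic


def ic_weights(ic_values):
    """Turn criterion values (smaller is better) into weights summing to 1."""
    best = min(ic_values.values())
    raw = {m: math.exp(-0.5 * (v - best)) for m, v in ic_values.items()}
    total = sum(raw.values())
    return {m: r / total for m, r in raw.items()}


# Hypothetical fits: model -> (maximized log-likelihood, number of parameters).
fits = {
    "gamma": (-4150.0, 2),
    "lognormal": (-4142.5, 2),
    "lomax": (-4144.0, 2),
}
n_obs = 500

aic = {m: information_criteria(ll, k, n_obs)[0] for m, (ll, k) in fits.items()}
bic = {m: information_criteria(ll, k, n_obs)[1] for m, (ll, k) in fits.items()}
aic_weights = ic_weights(aic)
bic_weights = ic_weights(bic)
print(aic_weights)
```

Model selection then picks the model with the largest weight, while model averaging combines the candidate models' predictions using the weights. Because all three candidates here have two parameters, the AIC and BIC weights coincide; they would differ once a candidate with a different parameter count is included.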

(50)

Simulation Study

1. Pick a random ETG from our dataset.

2. Fit each of the four candidate models to the ETG data.

3. From each model fit, generate 400 datasets of 500 observations each.

4. Use each of our three techniques to select the best model.

5. Compare the results to the true model.
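The steps above can be sketched in a much-reduced form. The version below is an illustration under several assumptions that differ from the study: it uses only two candidates (lognormal and gamma), fixed illustrative parameters instead of fits to a real ETG, selection by AIC alone, and 40 datasets per true model rather than 400 to keep the run short. The gamma fit maximizes the profile likelihood over the shape parameter with a simple ternary search, which is one of several ways to obtain the MLE.

```python
import math
import random


def fit_lognormal(xs):
    """Closed-form MLE; returns (maximized log-likelihood, n_params)."""
    logs = [math.log(x) for x in xs]
    n = len(xs)
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n
    return -sum(logs) - 0.5 * n * (math.log(2 * math.pi * var) + 1), 2


def fit_gamma(xs):
    """MLE via the profile likelihood in the shape k (scale = mean / k)."""
    n = len(xs)
    mean = sum(xs) / n
    sum_log = sum(math.log(x) for x in xs)

    def profile(log_k):
        k = math.exp(log_k)
        return ((k - 1) * sum_log - n * k - n * math.lgamma(k)
                - n * k * math.log(mean / k))

    lo, hi = math.log(1e-3), math.log(1e3)
    for _ in range(100):  # ternary search on a unimodal function
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if profile(m1) < profile(m2):
            lo = m1
        else:
            hi = m2
    return profile((lo + hi) / 2), 2


def simulate(true_model, n_obs, rng):
    """Draw one dataset; parameters are illustrative assumptions."""
    if true_model == "lognormal":
        return [rng.lognormvariate(7.0, 1.2) for _ in range(n_obs)]
    return [rng.gammavariate(2.0, 1500.0) for _ in range(n_obs)]


def select_by_aic(xs):
    fits = {"lognormal": fit_lognormal(xs), "gamma": fit_gamma(xs)}
    return min(fits, key=lambda m: 2 * fits[m][1] - 2 * fits[m][0])


rng = random.Random(2015)
n_datasets, n_obs = 40, 500
hits = {"lognormal": 0, "gamma": 0}
for true_model in hits:
    for _ in range(n_datasets):
        if select_by_aic(simulate(true_model, n_obs, rng)) == true_model:
            hits[true_model] += 1

accuracy = {m: h / n_datasets for m, h in hits.items()}
print(accuracy)
```

The printed dictionary gives, for each true model, the share of simulated datasets in which AIC picked that model. Extending the sketch to the full study would mean adding the remaining candidates and the other two selection techniques.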

(51)
(52)
(53)

Data Analysis

We obtained de-identified data from a major health insurer.

33 million claim amounts from 9 million policyholders.

(54)
(55)

Model Selection and Averaging of Health Costs in Episode Treatment Groups

Shujuan Huang, Liberty Mutual Seattle

Brian Hartman, Brigham Young University-Provo

Vytaras Brazauskas, University of Wisconsin-Milwaukee
