
How To Build A Predictive Model In Insurance


(1)

University of Minnesota

November 9th, 2012

Nathan Hubbell, FCAS

Katy Micek, Ph.D.

The “Do’s & Don’ts” of Building

(2)

Agenda

• Travelers

– Broad Overview

– Actuarial & Analytics Career Opportunities

• Actuarial & Analytics Leadership Development Program (AALDP)

• Do’s and Don’ts of Generalized Linear Models (GLMs)

– Insurance Background and Motivation

– Failings of Multiple Linear Regression

– Basics of GLMs

– Over-fitting

(3)

• Offers property and casualty solutions to individuals and companies

• Second-largest commercial insurer in the U.S.

• Second-largest personal insurer through the independent agency channel

• Representatives in every U.S. state, Canada, Ireland and the United Kingdom

• A member of the Dow Jones Industrial Average – the only insurance company on the list

(4)

Analytics at Travelers – Who are we?

Across Travelers, we form a large (300+) and diverse community of Ph.D., Masters and Bachelors holders in the following disciplines:

mathematics, statistics, physics, actuarial science, computer science, business … and more!

(5)

Actuarial & Analytics Leadership Development Program (AALDP)

• 5-yr program for actuarial students and advanced analytics

• Actuarial track offers exam support

• Analytics participants learn insurance on the job through work projects and seminars

– Exams not required, but support is provided for those interested

• Leadership development opportunities

• Career exploration opportunities through rotations

• Networking opportunities (mentor program, committees)

• 2012 – Pilot Class for Advanced Analytics

(6)
(7)

Basics of Insurance

Insurance companies sell insurance policies, which are promises to pay in the event that a customer experiences a loss.

The unique challenge in insurance is that we don’t know what the cost of insuring a customer is when we sell the policy.

Example: The cost to insure an auto customer

For an individual customer, it’s impossible to predict:

• Whether they will get into an accident

• The type of accident (hit a telephone pole, hit another vehicle, bodily injury)

• How bad (costly) the accident will be

(8)

Business Impact of Loss Experience

To estimate the cost of insuring policyholders, we must predict losses.

Two fundamental questions we must answer are:

1. Ratemaking: looking to the future

• Setting rates for policies

• How much do we need to charge customers for a policy in order to reach our target profit?

Basic idea: price = cost + profit

2. Reserving: looking at the impact of past experience

• Setting aside reserve money

• How much money do we need to set aside to pay for claims?

Note: We cannot predict losses for each individual. However, if we group our customers together, we can build statistical models to predict average loss over a group.
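A minimal simulation sketch of why grouping helps. It assumes, purely for illustration, that each policy has a 5% chance of exactly one claim costing $10,000 (figures borrowed from the under-fitting example later in the deck). The function name `simulate_losses` and the seed are hypothetical.

```python
import random

def simulate_losses(n_policies, claim_prob=0.05, claim_cost=10_000, seed=2012):
    """Return one simulated annual loss per policy (0 or claim_cost)."""
    rng = random.Random(seed)
    return [claim_cost if rng.random() < claim_prob else 0
            for _ in range(n_policies)]

losses = simulate_losses(100_000)

# Any single policy's loss is all-or-nothing and cannot be predicted.
print("first ten policies:", losses[:10])

# The group average is stable and lands near 0.05 * 10,000 = 500.
average_loss = sum(losses) / len(losses)
print("average loss per policy:", round(average_loss, 2))
```

Individual outcomes are either 0 or 10,000, yet the average over a large group settles close to the expected value, which is the quantity a statistical model can target.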

(9)

First Model Attempt:

Multiple Linear Regression / Ordinary Least Squares

E[Y] = a0 + a1X1 + …+ anXn

Goal: Fit a linear relationship between the predictors (X1, …, Xn) and the response variable Y.

Assumptions:

1. Y is normally distributed.

2. The variance of Y is constant.

Approach: The parameters (a0, a1, …, an) can be estimated by minimizing the sum of squared errors.

[Figure: scatter plot with fitted line; horizontal axis: X, the predictor; vertical axis: Y, the response]
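As a sketch of the least-squares approach, the case of a single predictor has a closed-form solution. The data below are made up for illustration and the helper name `fit_simple_ols` is hypothetical.

```python
def fit_simple_ols(x, y):
    """Return (a0, a1) minimizing sum((y - a0 - a1*x)^2) for one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares around the means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    a1 = s_xy / s_xx          # slope
    a0 = y_bar - a1 * x_bar   # intercept
    return a0, a1

# Made-up data, roughly y = 2x
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a0, a1 = fit_simple_ols(x, y)
print(f"a0 = {a0:.3f}, a1 = {a1:.3f}")
```

With several predictors the same idea applies, but the minimization is usually solved with matrix methods rather than these scalar formulas.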

(10)

Oops - DON’T assume Y is normally distributed

In insurance, we study loss experience in terms of claims.

Two aspects of claims must be considered.

1. Frequency: at what rate are claims being made?

2. Severity: what is the average size of a claim?

The underlying distribution in the model depends on which aspect of the loss experience we’re investigating.
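The two quantities can be computed from summary data as in the sketch below. The exposure, claim count and loss total are invented so that the results match the 5% and $10,000 figures used later in the deck. The terms "exposure" (for example, car-years insured) and "pure premium" (expected loss per unit of exposure) are standard actuarial vocabulary rather than the slide's own wording.

```python
def loss_summary(exposures, claim_count, total_loss):
    """Return (frequency, severity, pure premium) from aggregate data."""
    frequency = claim_count / exposures    # claims per unit of exposure
    severity = total_loss / claim_count    # average cost per claim
    pure_premium = frequency * severity    # expected loss per unit of exposure
    return frequency, severity, pure_premium

# Hypothetical book: 2,000 car-years, 100 claims, $1,000,000 paid
frequency, severity, pure_premium = loss_summary(
    exposures=2_000, claim_count=100, total_loss=1_000_000
)
print(f"frequency = {frequency:.2%}")
print(f"severity = ${severity:,.0f}")
print(f"pure premium = ${pure_premium:,.0f}")
```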

(11)

Double Oops - DON’T assume the Variance of Y is constant

– Varies by expected loss.

• High frequency losses have less variance.

(12)

DO assume an exponential family distribution for Y

• Poisson - claim frequency

– Discrete distribution

– Time-invariant

– Variance equals the mean (m = E[Y])

• Gamma - severity

– Continuous distribution

– Variance is proportional to the mean squared (m^2 = E[Y]^2)

Note: Non-normal distributions are more suited to highly skewed claim data

[Figure: Gamma Distribution]
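The variance statements above can be checked by simulation. The sketch below uses only the Python standard library. The Poisson sampler is Knuth's multiplication method, and the rates, shape and scales are arbitrary illustrative values.

```python
import math
import random

rng = random.Random(2012)

def sample_poisson(lam):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def mean_and_variance(values):
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, variance

n = 100_000

# Poisson: variance / mean should stay near 1 whatever the rate.
poisson_ratios = []
for lam in (0.5, 4.0):
    m, v = mean_and_variance([sample_poisson(lam) for _ in range(n)])
    poisson_ratios.append(v / m)
    print(f"Poisson rate {lam}: mean {m:.3f}, variance/mean {v / m:.3f}")

# Gamma with fixed shape: variance / mean^2 should stay near 1 / shape.
shape = 2.0
gamma_ratios = []
for scale in (100.0, 1000.0):
    m, v = mean_and_variance([rng.gammavariate(shape, scale) for _ in range(n)])
    gamma_ratios.append(v / m ** 2)
    print(f"Gamma scale {scale}: mean {m:.1f}, variance/mean^2 {v / m ** 2:.3f}")
```

For the gamma case the ratio is constant but not 1, which is why "proportional to the mean squared" is the safer reading.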

(13)

Suitable Model Framework:

Generalized Linear Models (GLMs)

E[Y] = g^{-1}(a0 + a1X1 + … + anXn)

where g(x) is the link function.

Goal: fit a non-linear relationship between the predictors (X1, …, Xn) and the response variable Y.

Assumptions:

1. Y can follow any distribution from the exponential family.

2. The variance of Y depends on its expected value.

Approach: The parameters (a0, a1, …, an) can be estimated using maximum likelihood when
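A minimal sketch of maximum-likelihood fitting for one specific GLM: a Poisson response with a log link and a single predictor, solved by Newton-Raphson. The data (a hypothetical "young driver" indicator and claim counts) and the function name `fit_poisson_glm` are invented for illustration. In practice a statistics package would be used.

```python
import math

def fit_poisson_glm(x, y, iterations=25):
    """Fit E[Y] = exp(a0 + a1*x) by Newton-Raphson on the Poisson likelihood."""
    a0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    a1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(a0 + a1 * xi) for xi in x]
        # Score vector: X^T (y - mu)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information: X^T diag(mu) X, a symmetric 2x2 matrix
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly
        a0 += (h11 * g0 - h01 * g1) / det
        a1 += (h00 * g1 - h01 * g0) / det
    return a0, a1

# Hypothetical data: x = 1 marks a young driver, y is the claim count
x = [0] * 10 + [1] * 10
y = [0] * 9 + [1] + [0] * 7 + [1] * 3

a0, a1 = fit_poisson_glm(x, y)
print(f"baseline frequency: {math.exp(a0):.3f}")
print(f"young-driver relativity: {math.exp(a1):.3f}")
```

Because the log link makes effects multiplicative, `exp(a1)` reads directly as a rating relativity. Here the fitted values reproduce the observed group frequencies of 0.1 and 0.3.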

(14)

Exponential Family ABCs

(15)

Exponential Family – Mean and Variance
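The content of these two slides did not survive extraction. For orientation, the standard textbook form of the exponential family, written with the functions a, b and c that the "ABCs" title presumably refers to, is sketched below. The notation is an assumption and may differ from what the slides showed.

```latex
\[
  f(y;\theta,\phi) = \exp\!\left\{ \frac{y\,\theta - b(\theta)}{a(\phi)} + c(y,\phi) \right\}
\]
\[
  \mathrm{E}[Y] = b'(\theta), \qquad \mathrm{Var}[Y] = a(\phi)\, b''(\theta)
\]
\[
  \text{Poisson: } \theta = \log m,\; b(\theta) = e^{\theta},\; a(\phi) = 1
  \;\Rightarrow\; \mathrm{Var}[Y] = m
\]
\[
  \text{Gamma: } \theta = -\tfrac{1}{m},\; b(\theta) = -\log(-\theta),\; a(\phi) = \phi
  \;\Rightarrow\; \mathrm{Var}[Y] = \phi\, m^{2}
\]
```

Here m = E[Y], matching the notation used for the Poisson and Gamma distributions earlier in the deck.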

(16)

Frequency & Severity All-in-One: Uncle Tweedie
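The slide body is missing, but a Tweedie distribution with power parameter between 1 and 2 corresponds to a compound Poisson-gamma model: a Poisson number of claims, each with a gamma-distributed size. The sketch below simulates that construction with assumed parameters (5% claim rate, gamma severity with mean $10,000). All names and values are illustrative.

```python
import math
import random

rng = random.Random(2012)

def sample_poisson(lam):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate_policy_loss(claim_rate=0.05, sev_shape=2.0, sev_scale=5_000.0):
    """Total loss for one policy: a Poisson number of gamma-sized claims."""
    n_claims = sample_poisson(claim_rate)
    return sum(rng.gammavariate(sev_shape, sev_scale) for _ in range(n_claims))

losses = [simulate_policy_loss() for _ in range(200_000)]

# Most policies have no claim, so the distribution has a point mass at zero.
share_zero = sum(1 for loss in losses if loss == 0) / len(losses)
# The mean is frequency * severity = 0.05 * (2.0 * 5,000) = 500.
average_loss = sum(losses) / len(losses)

print(f"share of policies with zero loss: {share_zero:.3f}")
print(f"average loss per policy: {average_loss:.1f}")
```

Modeling total loss directly this way handles the mass at zero and the skewed positive part in a single response variable.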

(17)
(18)

DON’T Underfit

• First let’s start with under-fitting

– Expected Auto Claim Frequency: 5%

– Expected Auto Loss (per claim): $10,000

– Expected Premium = 5% * $10,000 = $500

– Competitors use a segmented rate plan

• What happens next?

– We will “win” all of the business where competitors charge more than $500

– We “lose” all of the business where competitors charge less than $500

– Now why would a competitor want to charge more than the average?

– Hmmm… perhaps we need a better approach
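The arithmetic behind this scenario can be sketched as follows. The two segments, their claim frequencies (3% and 7%) and the assumption that the competitor prices each segment exactly at expected cost are hypothetical. Only the 5% average frequency and the $10,000 loss come from the slide.

```python
SEVERITY = 10_000  # expected cost of a claim, from the slide

# Hypothetical segments: (name, claim frequency, share of the market)
segments = [("low risk", 0.03, 0.5), ("high risk", 0.07, 0.5)]

flat_price = 0.05 * SEVERITY  # our single, unsegmented rate: $500

won = []
for name, frequency, share in segments:
    expected_cost = frequency * SEVERITY
    competitor_price = expected_cost  # assumed: competitor charges cost per segment
    # Customers buy from whoever is cheaper.
    if flat_price < competitor_price:
        won.append((name, expected_cost, share))

won_names = [name for name, _, _ in won]
won_share = sum(share for _, _, share in won)
avg_cost_of_book = sum(cost * share for _, cost, share in won) / won_share
shortfall = avg_cost_of_book - flat_price

print("segments we win:", won_names)
print(f"average cost of the business we win: ${avg_cost_of_book:,.0f}")
print(f"shortfall per policy at a $500 rate: ${shortfall:,.0f}")
```

Under these assumptions the flat rate attracts only the customers who cost more than $500 to insure, so the competitor charging "more than the average" is simply pricing its worse risks correctly.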

(19)
(20)

However, DON’T Overfit

• A noisy model = similar problems to under-fitting

• Some things you might use to look at fit:

– Summary Statistics (R², AIC, BIC)

– Deviance Tests / Type III

– P-value / Parameter Estimate Standard Errors

– ROC / Lift / Gain Charts

• However, let’s keep our eyes on the prize:

(21)

DO Fit on Non-Training (Future!) Data

• Cross Validation

– Validation and Test / Hold out

– Bias / Variance Tradeoff

(22)

Cross Validation

• Training: Model Fit

• Validation: Model Structure

• Testing (Holdout): Final Model Testing
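A minimal sketch of a three-way split into training, validation and testing (holdout) sets. The 60/20/20 proportions, the seed and the function name are assumptions, not figures from the deck.

```python
import random

def train_validation_test_split(records, train_frac=0.6, validation_frac=0.2, seed=2012):
    """Shuffle the records once, then cut them into three disjoint parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_validation = round(len(shuffled) * validation_frac)
    train = shuffled[:n_train]                              # fit parameters
    validation = shuffled[n_train:n_train + n_validation]   # choose model structure
    test = shuffled[n_train + n_validation:]                # final check, used once
    return train, validation, test

policy_ids = range(1_000)  # hypothetical policy identifiers
train, validation, test = train_validation_test_split(policy_ids)
print(len(train), len(validation), len(test))
```

A random split is the simplest option. Given the slide's emphasis on future data, holding out the most recent policy period instead may be a closer match, though the deck does not say which approach it used.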

(23)
(24)

Bias/variance: How would you fit this model?

[Figure: scatter plot with Price on the vertical axis]

(25)

Bias vs. Variance

[Figure: three fits of Price (vertical axis) against Size (horizontal axis): high bias (under-fit), high variance (over-fit), and “just right”]
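The trade-off can be illustrated by comparing training and validation error for models of increasing flexibility. The sketch below is an assumption-laden stand-in for the figure: the data are simulated from a made-up linear price-size relationship, and the three models (a constant, a straight line, and a lookup that memorizes the nearest training point) are chosen for simplicity rather than taken from the slide.

```python
import random

rng = random.Random(2012)

def make_data(n):
    """Hypothetical truth: price = 50 + 0.1 * size, plus noise."""
    sizes = [rng.uniform(500, 3000) for _ in range(n)]
    prices = [50 + 0.1 * s + rng.gauss(0, 30) for s in sizes]
    return sizes, prices

def fit_constant(sizes, prices):
    """High bias: ignore size and always predict the mean price."""
    mean_price = sum(prices) / len(prices)
    return lambda size: mean_price

def fit_line(sizes, prices):
    """Matches the true structure: least-squares straight line."""
    n = len(sizes)
    s_bar, p_bar = sum(sizes) / n, sum(prices) / n
    slope = (sum((s - s_bar) * (p - p_bar) for s, p in zip(sizes, prices))
             / sum((s - s_bar) ** 2 for s in sizes))
    return lambda size: p_bar + slope * (size - s_bar)

def fit_memorizer(sizes, prices):
    """High variance: return the price of the nearest training point."""
    pairs = list(zip(sizes, prices))
    return lambda size: min(pairs, key=lambda pair: abs(pair[0] - size))[1]

def mse(model, sizes, prices):
    return sum((model(s) - p) ** 2 for s, p in zip(sizes, prices)) / len(sizes)

train = make_data(500)
validation = make_data(500)

results = {}
for name, fit in [("constant", fit_constant), ("line", fit_line),
                  ("memorizer", fit_memorizer)]:
    model = fit(*train)
    results[name] = (mse(model, *train), mse(model, *validation))
    print(f"{name:10s} train MSE {results[name][0]:8.1f}   "
          f"validation MSE {results[name][1]:8.1f}")
```

The memorizer has zero training error but does worse than the line on validation data, while the constant does poorly on both. That is the pattern the three panels describe.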

(26)

Regularization

[Figure: high variance (over-fit) curve of Price (vertical axis) against Size (horizontal axis)]
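The slide does not say which form of regularization it had in mind. As one common example, the sketch below shows ridge (L2) regularization for a single predictor, where a penalty on the squared slope shrinks the estimate toward zero. The data, the penalty values and the function name `fit_ridge` are invented for illustration.

```python
def fit_ridge(x, y, penalty):
    """Minimize sum((y - a0 - a1*x)^2) + penalty * a1^2; a0 is not penalized."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    a1 = s_xy / (s_xx + penalty)  # the penalty inflates the denominator
    a0 = y_bar - a1 * x_bar
    return a0, a1

# Made-up data, roughly y = 2x
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

penalties = (0, 10, 100)
slopes = [fit_ridge(x, y, penalty)[1] for penalty in penalties]
for penalty, slope in zip(penalties, slopes):
    print(f"penalty {penalty:>3}: slope {slope:.3f}")
```

A penalty of zero reproduces ordinary least squares. Larger penalties trade a little bias for lower variance, and the penalty itself is typically chosen on validation data.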

(27)

References and Resources

• Actuarial Exams and Career Information

http://www.beanactuary.org/

http://www.casact.org/

http://www.soa.org/

• Travelers Careers

http://www.travelers.com/careers

– Actuarial and Analytics Research Internship and Full Time

• A Practitioner's Guide to Generalized Linear Models

http://www.towerswatson.com/assets/pdf/2380/Anderson_et_al_Edition_3.pdf

• Statistical Modeling: The Two Cultures (Leo Breiman)

http://www.recognition.su/wiki/images/8/85/Breiman01stat-ml.pdf

• Elements of Statistical Learning (Hastie, Tibshirani, Friedman)

http://www-stat.stanford.edu/~tibs/ElemStatLearn/download.html
