2
Big Data
I. Definition
II. Growth in Storage, Computing Power and Data III. Effect on the Organization -- Tasks and Roles IV. Big Data’s Impact on Other Industries
V. Data has always been crucial to insurance. How is Big Data different?
VI. Important Elements in the Insurance Environment VII. Insurance Big Data Example – Usage Based
Definition
• At what point does:
4
Definition
• At what point does:
“Lots of” become “Big”?
• Typically definition is operational, i.e. working with Big Data requires the coordination of multiple
Definition
• At what point does:
“Lots of” become “Big”?
• Typically definition is operational, i.e. working with Big Data requires the coordination of multiple
computers
6
McKinsey View on Big Data
Effectiveness
Significant Improvement is Possible
8
Trends Affecting Big Data
Cost to process and store data is shrinking Process - Moore’s Law
Storage - 2010
9 7 exabytes in disk storage
9 6 exabytes on PCs, notebooks and mobile devices
9 Average company 200 terabytes
2010 World Data Storage Capabilities
Other
Disk
10
Database Size
Name Power of Ten Example
Kilobyte 3 Xmas card list is about 500 kb in Access Megabyte 6 This PowerPoint is 2.9 meg
Gigabyte 9 Clients routinely pass my team 5 Gigabyte files representing a month’s worth of customer data
Terabyte 12 Colleague described their UBI database as 7 terabytes -- Library of Congress is 235
Petabyte 15 Sensor readouts for oil exploration companies
Exabyte 18 5 of these would equal the number of “seconds since the Big Bang”
Avogadrobyte*
*term I invented 24
2 times the number of molecules in 22.41 L of any gas
Trends Affecting Big Data
Shrinking: Cost to process and store data
• Process - Moore’s Law • Storage - 2010
Growing: Data
12
Trends Affecting Big Data
Shrinking: Cost to process and store data
• Process - Moore’s Law • Storage - 2010
Growing: Data
• NewAge.com (Q4 2011) 1.2b social media users over the age of 15 are on for long periods every day
• 800 exabytes in 2009 (60 times storage) • Annual growth rate of 45%
Data
• Growing array of sensors that connect machine-to-machine create 30% more data each year
9Satellite imagery
9Telematics
9Radio Frequency Identifiers (RFIDs)
9Event Date Recorders (EDRs)
9Electronic Health Records (EHRs)
14
Data
• Growing array of sensors that connect machine-to-machine create 30% more data each year
9Satellite imagery
9Telematics
9Radio Frequency Identifiers (RFIDs)
9Event Date Recorders (EDRs)
9Electronic Health Records (EHRs) • Not all digital – text, video, speech
• Unclear how to reconcile 30% and 45% annual growth estimates for data. Both represent an avalanche.
• As the data set grows organization and flexible tools become more critical.
Data
• Growing array of sensors that connect machine-to-machine create 30% more data each year
9Satellite imagery
9Telematics
9Radio Frequency Identifiers (RFIDs)
9Event Date Recorders (EDRs)
9Electronic Health Records (EHRs)
16
Data
• Growing array of sensors that connect machine-to-machine create 30% more data each year
9Satellite imagery
9Telematics
9Radio Frequency Identifiers (RFIDs)
9Event Date Recorders (EDRs)
9Electronic Health Records (EHRs)
• Not all digital – text, video, speech
• Unclear how to reconcile 30% and 45% annual growth estimates for data. Both represent an avalanche
• As the data set grows organization and flexible tools
become more critical
Data
• Growing array of sensors that connect machine-to-machine create 30% more data each year
9Satellite imagery
9Telematics
9Radio Frequency Identifiers (RFIDs)
9Event Date Recorders (EDRs)
9Electronic Health Records (EHRs)
18
Data
• Prioritizing what to
collect is a key skill
• Decisions to retain
or discard should
be periodically
Data
• Prioritizing what to
collect is a key skill
• Decisions to retain
or discard should
be periodically
20
Mae West
“Too much of a good
thing…”
is wonderful
22
Mae West
“Too much of a good thing
is wonderful.”
Herbert Simon
“A wealth of information
creates a poverty of attention
and a need to allocate that
attention efficiently among
the overabundance of
information sources that
might consume it.”
24
Big Data’s Effect on the Organization
Functions and Their Roles • Information Technologists
• Data Management
• Actuaries, Statisticians and Product Developers
Big Data’s Effect on the Organization
Functions and Their Roles
• Information Technologists
• Data Management
• Actuaries, Statisticians and Product Developers
26
Big Data’s Effect on the Organization
Functions and Their Roles
• Information Technologists • Data Management
• Actuaries, Statisticians and Product Developers
Big Data’s Effect on the Organization
Functions and Their Roles
• Information Technologists • Data Management
• Actuaries, Statisticians and Product Developers
28
Key Tasks
Identify:
• Data to collect
• Technological and analytic capabilities necessary to access/analyze data.
• Management processes to evaluate and use the results.
Key Tasks
Identify:
• Data to collect
• Technological and analytic capabilities necessary to
access/analyze data
• Management processes to evaluate and use the results.
30
Key Tasks
Identify:
• Data to collect
• Technological and analytic capabilities necessary to access/analyze data
• Management processes to evaluate and use the
Challenges
• Lead time for the collection of data. Process to identify what needs to be collected including data from 3rd parties
• Data base organization.
32
Challenges
• Lead time for the collection of data. Process to identify what needs to be collected including
data from 3rd parties
• Data base organization.
Metadata – easy to access the data for an analysis
Challenges
• Lead time for the collection of data. Process to identify what needs to be collected including
data from 3rd parties
• Data base organization.
34
Big Data IT Requirements
• Large database storagecapabilities 9Big Table 9Cassandra 9Cloud Computing • Distributed processing strategies 9Hadoop 9Stream Processing
Big Data IT Requirements
• Large database storage capabilities
9Big Table
9Cassandra
9Cloud Computing • Distributed processing
36
Big Data Technologies
McKinsey Study June 2011
Big Table Business Intelligence Cassandra Cloud Computing Datamart Data Warehouse Distributed System Dynamo
Extract, Transform and Load Google File System
Hadoop HBase MapReduce Mashup Metadata Non-relational Database R Relational Database Semi-structured Data SQL Stream Processing Structured Data Unstructured Data Visualization
Big Data Analytical Techniques
McKinsey Study June 2011
A/B Testing
Association Rule Learning Classification
Crowdsourcing
Data Fusion and Data Integration Data Mining Ensemble Learning Optimization Pattern Recognition Predictive Modeling Regression Sentiment Analysis Signal Processing Spatial Analysis
38
Management
• Set priorities for data and analysis
• Supervise main actors
• Approve changes supported by analyses • Involvement is key
9 Fast turnarounds and willingness to work with data.
9 Recognize that views may change as additional data emerges.
9 Slick is not always a positive attribute.
Management
• Set priorities for data and analysis
• Supervise main actors
• Approve changes supported by analyses • Involvement is key
9 Fast turnarounds and willingness to work with data.
40
Management
• Set priorities for data and analysis • Supervise main actors
• Approve changes supported by analyses
• Involvement is key
9 Fast turnarounds and willingness to work with data.
9 Recognize that views may change as additional data emerges.
9 Slick is not always a positive attribute.
Management
• Set priorities for data and analysis • Supervise main actors
• Approve changes supported by analyses
• Involvement is key
9 Fast turnarounds and willingness to work with data 9 Recognize that views may change as additional data
42
Actuaries
• Identify the data
elements to collect and reevaluate these
decisions at regular intervals
Actuaries
• Identify the data elements to collect and reevaluate these decisions at regular intervals
• Integrate Big Data into rate making and reserving
cross-44
Actuaries
• Identify the data elements to collect and reevaluate these decisions at regular intervals
• Integrate Big Data into rate making and reserving
• Participate as technical experts on company cross-functional teams for process improvement
• Help decision makers quantify their decisions. For example, the asset value of renewals.
Actuaries
• Identify the data elements to collect and reevaluate these decisions at regular intervals
• Integrate Big Data into rate making and reserving
cross-46
Big Data Impact on Other Industries
• Airlines
• Oil Exploration
• Retailers
Airlines
– mature applicationsSouthwest and Continental
Goal is to maximize ticket revenue while minimizing overbooking. Lots of 3rd party tools to choose from:
• Teradata Airline Decisions • PROS and PROS Cargo
Teradata’s features:
9 Leverages your Integrated Passenger Name Record (iPNR) data warehouse with sophisticated analytics on overbooking and passenger no-shows
9 Highlights the impact of customer behavior on demand and show-rate trends hidden in PNR details
48
Airlines
– mature applicationsSouthwest and Continental
Goal is to maximize ticket revenue while minimizing overbooking. Lots of 3rd party
tools to choose from:
• Teradata Airline Decisions
• PROS and PROS Cargo
Teradata’s features:
9 Leverages your Integrated Passenger Name Record (iPNR) data warehouse with sophisticated analytics on overbooking and passenger no-shows
9 Highlights the impact of customer behavior on demand and show-rate trends hidden in PNR details
9 Provides detailed descriptions of each metric and report layout, including the value and audience for the report
PROS was a pioneer in the field:
9 Aligning your RM (Revenue Management) department with other departments
9 Forecasting fluctuations in passenger demand
9 Managing revenue in markets without fences or price restrictions
9 Differentiating customer value in real-time
9 Accepting group requests without displacing high-value demand
Airlines
– mature applicationsSouthwest and Continental
Goal is to maximize ticket revenue while minimizing overbooking. Lots of 3rd party
tools to choose from:
• Teradata Airline Decisions
• PROS and PROS Cargo
Teradata’s features:
9 Leverages your Integrated Passenger Name Record (iPNR) data warehouse with sophisticated analytics on overbooking and passenger no-shows
9 Highlights the impact of customer behavior on demand and show-rate trends hidden in PNR details
50
Oil Exploration
– enormous dataChevron reported:
9 Industry wide estimates of 8 percent higher production rates and
9 6 percent higher recovery from a "fully optimized" digital oil field.
9 Internal IT traffic of 1.5 terabytes a day
Seismic exploration develops huge datasets:
• Petabyte
• Elastic wave propagation is measured through
• Land based acoustic receivers
• Marine based hydrophones Operational Control Systems:
• Distributed sensors and • High-speed communications
• Explored with data-mining techniques to monitor and fine-tune remote drilling operations.
The aim is to use real-time data to make better decisions and predict glitches
Oil Exploration
– enormous dataChevron reported:
9 Industry wide estimates of 8 percent higher production rates and
9 6 percent higher recovery from a "fully optimized" digital oil field.
9 Internal IT traffic of 1.5 terabytes a day
Operational Control Systems:
• Distributed sensors and • High-speed communications
• Explored with data-mining techniques to monitor and fine-tune remote drilling operations.
The aim is to use real-time data to make better decisions and predict glitches
52
Oil Exploration
– enormous dataChevron reported:
9 Industry wide estimates of 8 percent higher production rates and
9 6 percent higher recovery from a "fully optimized" digital oil field.
9 Internal IT traffic of 1.5 terabytes a day Seismic exploration develops huge datasets:
• Petabyte
• Elastic wave propagation is measured through • Land based acoustic receivers
• Marine based hydrophones Operational Control Systems:
• Distributed sensors and
• High-speed communications
• Explored with data-mining techniques to monitor and fine-tune remote drilling operations. The aim is to use real-time data to make better decisions and predict glitches
Retailers
– emphasis on customersAmazon
Largest online retailer in the US.
Collects huge volumes of customer choice data elements
9 Products researched
9 Products purchased
Use this data to develop their recommendation engine
9 “You might be interested in…” recommendations
54 We Have Spent Our Work Lives Knee Deep in Data – What’s Really New Here?
• New data sources and dramatically increased volumes
• Allow the use of innovative analytical techniques
• Applicable to a wider array of issues and processes
• Actuarial experience with data and analysis is a key resource to every insurance
company
Each of us knows someone like M. Jourdain in the Middleclass Aristocrat (Bourgeois Gentilhomme) who earns our scorn for his smugness at the discovery that he has “spoken prose all of his life.”
We Have Spent Our Work Lives Knee Deep in Data – What’s Really New Here?
• New data sources and dramatically increased volumes
• Allow the use of innovative analytical techniques
• Applicable to a wider array of issues and processes
56 We Have Spent Our Work Lives Knee Deep in Data – What’s Really New Here?
• New data sources and dramatically increased volumes
• Allow the use of innovative analytical techniques
• Applicable to a wider array of issues and processes
• Actuarial experience with data and analysis is a key resource to every insurance
company
Each of us knows someone like M. Jourdain in the Middleclass Aristocrat (Bourgeois Gentilhomme) who earns our scorn for his smugness at the discovery that he has “spoken prose all of his life.”
We Have Spent Our Work Lives Knee Deep in Data – What’s Really New Here?
• New data sources and dramatically increased volumes
• Allow the use of innovative analytical techniques
• Applicable to a wider array of issues and processes
58 We Have Spent Our Work Lives Knee Deep in Data – What’s Really New Here?
• New data sources and dramatically increased volumes
• Allow the use of innovative analytical techniques
• Applicable to a wider array of issues and processes
• Actuarial experience with data and analysis is a key resource to every insurance
company
Each of us knows someone like M. Jourdain in the Middleclass Aristocrat (Bourgeois Gentilhomme) who earns our scorn for his smugness at the discovery that he has “spoken prose all of his life.”
Special Characteristics of Insurance
60
Value of Renewal Book
Combined Ratio of 94
20% New Business at 120 CR 80% Renewal at 88 CR
U/W Asset Value (in Written Premium) of Renewals
80% of Premium is Renewal 12 point U/W Margin
2 Year Average Additional Life
– 19.2% of EP
– Each additional month of life adds .83%, e.g. 3 years 28.8%
Value of Renewal Book
Combined Ratio of 94
20% New Business at 120 CR 80% Renewal at 88 CR
U/W Asset Value (in Written Premium) of Renewals
80% of Premium is Renewal 12 point U/W Margin
62
Special Characteristics of Insurance
• Legacy Book is most valuable asset
• Like all relationship businesses
client
retention and cross-selling is key
Need for Customer Lifetime Value
• Individual client relationships
represent dramatically different values to the company. Value range is more than an order of magnitude from
lowest to highest
• Decompose renewal value and
assign it at the individual customer relationship level
64
Need for Customer Lifetime Value
• Individual client relationships
represent dramatically different values to the company. Value range is more than an order of magnitude from
lowest to highest
• Decompose renewal value and
assign it at the individual customer relationship level
9PLE or Policy Life Expectancy
9Include the value from all insurance relationships
• Critical element for the objective function on many projects
Need for Customer Lifetime Value
• Individual client relationships
represent dramatically different values to the company. Value range is more than an order of magnitude from
lowest to highest
• Decompose renewal value and
assign it at the individual customer relationship level
66
Special Characteristics of Insurance
• Legacy Book is most valuable asset
• Like all relationship businesses client retention and cross-selling is key.
• Regulatory realities
– Limit intervals between changes – More disclosure
– Identical risks get the same rate, no AB experiments
Usage Based Insurance
Big Data Example in our Industry
Auto Market ($166 B Personal; $24 B Commercial)
– 187 million personal vehicles
– Around 10% of vehicles because of age or other factors don’t support telematics.
– 90% of US vehicle population is 168 million.
– Potential market is 29% of 168 million or 49 million vehicles UBI offered on “opt-in” basis. Market Research and
68
Usage Based Insurance
Big Data Example in our Industry
Auto Market ($166 B Personal; $24 B Commercial)
– 187 million personal vehicles
– Around 10% of vehicles because of age or other factors don’t support telematics.
– 90% of US vehicle population is 168 million.
– Potential market is 29% of 168 million or 49 million vehicles
UBI offered on “opt-in” basis. Market Research and Progressive experience
– 40% “interested in UBI” – 75% enroll in Snapshot
70
UBI – Tripsense
Progressive Tripsense 2005
Safe Driver Neutral Unsafe Discount Before Mileage
Adjustment 20% 15% 10%
Discount After Mileage Adjustment
0 to 2,500 18.4% 13.4% 8.4%
2,501 to 5,000 15.1% 10.1% 5.1%
5,001 to 7,500 11.8% 6.8% 1.8%
72
UBI Data Records
Once Per Second• trip_id • update_time • latitude • longitude • altitude • gps_speed • obd_speed • engine_speed • event_code • coolant_temp • trip_odometer
Four Times Per Second
• accel_lat • accel_lon • accel_vert
UBI Market Composition
Program Segments:
1. Reusable Device with Limited Service Capabilities
2. Installed Device Provides Driver Feedback and Value Added Services
3. Specialized Programs For Unique High Service Markets
74
Value of UBI
Simplified Illustration
Benefit: 20% improvement in pure premium
Costs
Average Discount: 12.5%
Average Cost for Technology and Communication Costs: 3.5%
Value of UBI
Simplified Illustration
Benefit: 20% improvement in pure premium
Costs
Average Discount: 12.5%
Average Cost for Technology and Communication Costs: 3.5%
76
Value of UBI
Simplified Illustration
Benefit: 20% improvement in pure premium
Costs
Average Discount: 12.5%
Average Cost for Technology and
Communication Costs: 3.5%
Impact of UBI on Conversion
Elasticity example relevant to UBI
– Assume elasticity of 2, before intro of UBI program
– 15% Discount should result in a 30% increase in conversion
– Prior to the introduction of the discount quotes converted at a 20% rate
– The 30% increase in conversion will increase the conversion rate to 26%
78
Elasticity Depends on Expected
Competitiveness
Example:
– Applicant considers a quote from Company A that includes UBI
– Rate, before UBI, is 5% less than the applicant’s renewal offer. – Once the UBI data is collected it will likely result in an additional
10% saving.
Key Question
Is the conversion rate improvement like we would expect to observe at 5% or 15%?
Elasticity Depends on Expected
Competitiveness
Example:
– Applicant considers a quote from Company A that includes UBI
– Rate, before UBI, is 5% less than the applicant’s renewal offer
– Once the UBI data is collected it will likely result in an additional 10% saving
80
Elasticity Depends on Expected
Competitiveness
Example:
– Applicant considers a quote from Company A that includes UBI. – Rate, before UBI, is 5% less than the applicant’s renewal offer.
– Once the UBI data is collected it will likely result in an additional 10% saving.
Key Question
Is the conversion rate improvement like we would expect to observe at 5% or 15%?
Elasticity Depends on Expected
Competitiveness
Example:
– Applicant considers a quote from Company A that includes UBI. – Rate, before UBI, is 5% less than the applicant’s renewal offer. – Once the UBI data is collected it will likely result in an additional
82
Big Data
• Numerous insurance applications
• Actuaries
– Increase their impact
– Skills and experience to play a leadership
role in insurance
Big Data
• Numerous insurance applications
• Actuaries
– Increase their impact
– Skills and experience to play a leadership
role in insurance
84
Ed Combs
Ed Combs is currently the principal at G. Edward Combs Consulting, LLC.
Ed has had over 30 years experience in the Personal Lines Insurance Industry. He has consulted for many of the major Property and Casualty insurers and has been an executive at both Progressive and 21st Century Insurance.
His certifications include:
CPCU - Chartered Property and Casualty Underwriter, and ARM - Associate in Risk Management.
He is a former board member of the Insurance Institute for Highway Safety, the Highway Loss Data Institute and the Advocates for Highway and Auto Safety.