Using Text Analytics to Accurately Segment
Workers’ Compensation Injuries
Presented by: Amir Biagi
Associate, Predictive Modeling March 7, 2012
Insurance is a data-driven industry, but analyses
usually rely only on structured data.
Key information captured in text notes is
typically overlooked.
Text analytics can be used to draw insights from
unstructured data to enrich traditional methods.
We will review the case study in three steps.
Background
Solution
Results
A small percentage of injuries are responsible
for the majority of workers’ compensation loss.
Percentage of Loss Costs by injury quintile (best 20% to worst 20%): 1%, 3%, 6%, 14%, 77%
Insurers can use predictive models to identify
high-cost injuries early on.
A predictive model can use claimant attributes
to segment injuries based on outcome potential.
Scoring Equation:
w1(Age) + w2(Gender) + w3(Injury) + w4(Co-morbidities) + ...
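The scoring equation is a weighted sum of claimant attributes. A minimal sketch, assuming each attribute has already been encoded as a number; the weights, encodings, and the names `WEIGHTS` and `score` are hypothetical and not taken from the case study:

```python
# Hypothetical weights and numeric encodings, for illustration only.
WEIGHTS = {"age": 0.02, "gender": 0.1, "injury": 0.5, "comorbidities": 0.8}


def score(claimant: dict, weights: dict = WEIGHTS) -> float:
    """Return the weighted sum w1*x1 + w2*x2 + ... over the claimant's attributes."""
    return sum(weight * claimant[name] for name, weight in weights.items())


# Example claimant with made-up encoded values.
claimant = {"age": 50, "gender": 1, "injury": 2, "comorbidities": 1}
claimant_score = score(claimant)
print(claimant_score)
```

A higher score would flag an injury with greater outcome potential; in practice the weights would be fitted by the predictive model rather than set by hand.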
Injured workers’ secondary conditions (i.e.,
co-morbidities) can influence the overall outcome.
Co-morbidities are traditionally captured in claim
notes.
Clmt indicated that next month, he will have large piece of abdomen removed, cosmetic in nature because clmt is large person. Clmt indicated that he is a slow healer, not diabetic and does not have high blood pressure. Clmt indicated he is doing dietary adjustment to try to impact weight. Weight has mushroomed because of inactivity. Clmt is considering gastric bypass to work with weight also.
There can be multiple notes per injured worker.
Our task is to develop co-morbidity indicators
based on claim notes.
We will outline the solution in six steps.
1 Identify the co-morbidities that are relevant.
2 Work with clinicians to define synonyms for each co-morbidity.
3 Determine how false positives will be handled.
4 Apply the text analytics algorithm to all notes.
5 Develop note-based indicators.
6 Develop claim-level indicators.
Step 1:
Work with the clinical team to identify
the co-morbidities that are relevant.
Step 2:
Work with clinicians to define synonyms
for each co-morbidity.
Example synonyms for diabetes: Diabetes, Diabetic, Neuropathy, Insulin, Blood Sugar, Mellitus, Diabetes Mellitus, Glucose, Type 2 Diabetes, Actos, Glucophage, Hypoglycemia, IDDM, Lantus, A1C, Blood Glucose
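One way to put a synonym list to work is a case-insensitive keyword search. A minimal sketch, assuming whole-word matching with a regular expression; `DIABETES_TERMS` is an illustrative subset of the list, and `build_pattern` and `find_terms` are hypothetical helper names, not the tooling used in the case study:

```python
import re

# Illustrative subset of the synonym list; the full list would come from clinicians.
DIABETES_TERMS = ["diabetes", "diabetic", "insulin", "blood sugar", "glucose", "a1c"]


def build_pattern(terms):
    """Compile one case-insensitive, whole-word pattern covering every term."""
    # Longest terms first so multi-word terms are preferred over shorter ones.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


DIABETES_PATTERN = build_pattern(DIABETES_TERMS)


def find_terms(note):
    """Return every synonym found in the note, lowercased, in order of appearance."""
    return [match.group(0).lower() for match in DIABETES_PATTERN.finditer(note)]


print(find_terms("Clmt takes Insulin daily; blood sugar is stable."))
```

Note that a plain keyword search also fires on "not diabetic", which is why false positives need their own handling.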
Step 3:
Determine how false positives will be
handled.
Clmt indicated that next month, he will have large piece of abdomen removed, cosmetic in nature because clmt is large person. Clmt indicated that he is a slow healer, not diabetic and does not have high blood pressure. Clmt indicated he is doing dietary adjustment to try to impact weight. Weight has mushroomed because of inactivity. Clmt is considering gastric bypass to work with weight also.
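The note above mentions "not diabetic", so a keyword hit alone would be misleading. The deck does not say which rule was used; the sketch below assumes a simple one, treating a hit as a false positive when a negation cue appears within a few words before it in the same sentence. `NEGATION_CUES`, `WINDOW`, and `classify_note` are hypothetical:

```python
import re

TERM = re.compile(r"\b(?:diabetes|diabetic)\b", re.IGNORECASE)
NEGATION_CUES = {"not", "no", "non", "denies", "without"}  # assumed cue words
WINDOW = 3  # assumed number of preceding words to inspect


def classify_note(note):
    """Return 'identified', 'false_positive', or 'no_reference' for one note."""
    result = "no_reference"
    for sentence in re.split(r"[.!?]", note):
        for match in TERM.finditer(sentence):
            words_before = re.findall(r"[a-z']+", sentence[:match.start()].lower())
            if NEGATION_CUES.intersection(words_before[-WINDOW:]):
                # Negated mention: remember it unless an affirmed mention turns up.
                result = "false_positive"
            else:
                # Any affirmed mention settles the classification.
                return "identified"
    return result


note = "Clmt indicated that he is a slow healer, not diabetic and does not have high blood pressure."
print(classify_note(note))
```

A fixed word window is crude; it is shown only to make the idea of separating negated mentions from affirmed ones concrete.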
Step 4:
Apply the text analytics algorithm to all
notes.
Step 5:
Develop note-based indicators.
Claim   Note ID   Diabetes
A       1         0
A       2         1
A       3         0
B       2         1
C       1         0
C       2         0
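A note-level table like the one above can be produced by running the keyword test over every note. A minimal sketch; the note texts are invented placeholders chosen to reproduce the indicators in the table, and `has_diabetes_reference` is a hypothetical, simplified test that ignores false-positive handling:

```python
import re

TERM = re.compile(r"\b(?:diabetes|diabetic|insulin)\b", re.IGNORECASE)


def has_diabetes_reference(text):
    """Return 1 if the note mentions any diabetes keyword, otherwise 0."""
    return int(bool(TERM.search(text)))


# (claim, note ID, note text); the texts are made up for illustration.
notes = [
    ("A", 1, "Clmt reports lower back pain."),
    ("A", 2, "Clmt is diabetic and takes insulin."),
    ("A", 3, "Physical therapy scheduled."),
    ("B", 2, "History of diabetes noted."),
    ("C", 1, "Clmt returned to light duty."),
    ("C", 2, "No change since last contact."),
]

note_indicators = [
    (claim, note_id, has_diabetes_reference(text)) for claim, note_id, text in notes
]

for row in note_indicators:
    print(*row)
```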
Step 6:
Develop claim-level indicators.
Claim   Diabetes
A       1
B       1
C       0
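Rolling note-level indicators up to the claim can be sketched as taking the maximum indicator per claim, so a claim is flagged when any of its notes is flagged. That rule is an assumption, though it is consistent with the example tables; `claim_level` is a hypothetical name:

```python
from collections import defaultdict

# Note-level indicators from the Step 5 example: (claim, note ID, diabetes flag).
note_indicators = [
    ("A", 1, 0),
    ("A", 2, 1),
    ("A", 3, 0),
    ("B", 2, 1),
    ("C", 1, 0),
    ("C", 2, 0),
]


def claim_level(rows):
    """Flag a claim with 1 if any of its notes is flagged, otherwise 0."""
    flags = defaultdict(int)
    for claim, _note_id, flag in rows:
        flags[claim] = max(flags[claim], flag)
    return dict(flags)


claim_indicators = claim_level(note_indicators)
print(claim_indicators)
```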
The presence of diabetes is strongly related to
overall outcome.
Average Cost per Claimant

Key Word Reference     % of Claimants   Average Cost
No References          96.5%            $23,000
False Positives        0.9%             $38,000
Diabetes Identified    2.6%             $101,000
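The size of the effect is easier to see as ratios. A small sketch of the arithmetic, assuming the percentages are shares of claimants; the variable names are illustrative:

```python
# Figures from the results table: (share of claimants, average cost in dollars).
RESULTS = {
    "No References": (0.965, 23_000),
    "False Positives": (0.009, 38_000),
    "Diabetes Identified": (0.026, 101_000),
}

baseline_cost = RESULTS["No References"][1]

# Average cost of each group relative to claimants with no diabetes references.
cost_ratio = {group: cost / baseline_cost for group, (_share, cost) in RESULTS.items()}

# Overall average cost per claimant, weighting each group by its share.
overall_average = sum(share * cost for share, cost in RESULTS.values())

# Each group's share of total cost under the same weighting.
loss_share = {
    group: share * cost / overall_average for group, (share, cost) in RESULTS.items()
}

for group in RESULTS:
    print(f"{group}: {cost_ratio[group]:.1f}x baseline, {loss_share[group]:.1%} of cost")
```

Under these assumptions, claimants with diabetes identified cost about 4.4 times as much on average as those with no references, and the 2.6% of claimants in that group account for roughly 10% of total cost.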