Chapter 4: Scoping the potential of using data-driven segmentation analysis in healthcare - a

5.3 Data manipulation: Demographic data

While some attributes could be obtained directly from the raw dataset, like gender, the majority of patient-level variables had to be constructed from care episode data. Based on existing literature, a large number of variables were created. Afterwards, they were further reviewed in the data cleaning, reduction and normalisation stages.

Basic person data that can be extracted from administrative databases are age and gender.

Age was recorded as at the end of the study period, calculated from the year of birth. For this reason, the youngest age in the dataset was 5, reflecting new-borns included at the beginning of the five-year study period. The Townsend 2001 deprivation score was also included in the demographic data, on 5-, 10- and 20-step scales. Although the Index of Multiple Deprivation (IMD) score is also available as linked data in CPRD, Townsend was chosen because it does not include any predictors related to health.107 The IMD does include a health predictor (mortality) and could therefore overestimate the impact of deprivation on health outcomes.

[Figure: variables created per person. Labels and variables recoverable from the figure:

• Person: # long term conditions; multimorbidity flag; risk score; residential care flag

• Clinical: long-term condition flags

• Inpatient: # ELDCs; # ELIPs

• Consultation: # GP surgery attendances; # GP clinic attendances; # GP telephone contacts; # GP home visits; total cost GP surgery; total cost GP clinic; total cost GP telephone; total cost GP home visits

• Therapy: # prescriptions; # unique prescriptions; total prescription cost

• Outpatient: # OP appointments; # unique OP specialties; total cost OP

• Cost: total cost

ELDC: elective day case; ELIP: elective inpatient; NEIP: non-elective inpatient; RA: regular attender; ALoS: average length of stay; OP: outpatient; GP: general practice/practitioner]

Secondly, the presence of selected mental health diagnoses was used to identify mental health conditions. Finally, learning disabilities were identified. Together these form the long-term condition (LTC) flags.

5.3.1 Chronic conditions

This study considers the impact of chronic conditions on the healthcare needs of the patient.

To define a set of chronic conditions that significantly impact care needs, a comorbidity index was used. Comorbidity indices are based on a list of coexisting illnesses that may impact a patient’s prognosis. By combining the impact of these conditions in a single index, an overall score can be generated.108 These scores can be used to correct for case mix differences or to predict outcomes like mortality.109 There exist a large number of comorbidity indices, all based on different conditions and some providing weightings for specific diseases based on severity.

However, not all indices can be derived from administrative data of the kind used in this study; some require case note review to determine severity or non-diagnosis-based factors.

HES uses the International Classification of Diseases (ICD) version 10. The Charlson Index has been proven to work using ICD-based diagnosis information from administrative data, and is one of the most widely used comorbidity indices.109-112 Moreover, while it is originally an index predicting mortality,113 it has been shown to also correlate with avoidable hospital admissions,110 health-related quality of life,114 and healthcare cost.115 The Charlson Index combines 16 conditions and assigns them a numerical standard weighting to create the overall score.113
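As a minimal illustration of how a weighted comorbidity index produces a single score, the sketch below sums condition weights for one patient. The weights are those of the original 1987 Charlson publication for a subset of conditions; the dictionary and function names are hypothetical, and the sketch is not a description of any scoring carried out in this study.

```python
# Illustrative subset of the original (1987) Charlson weights.
# All names are hypothetical; this is a sketch, not this study's method.
CHARLSON_WEIGHTS = {
    "myocardial_infarct": 1,
    "congestive_heart_failure": 1,
    "dementia": 1,
    "diabetes": 1,
    "hemiplegia": 2,
    "any_tumour": 2,
    "moderate_or_severe_liver_disease": 3,
    "metastatic_solid_tumour": 6,
    "aids": 6,
}


def charlson_score(conditions):
    """Sum the weights of the distinct conditions present for one patient."""
    return sum(CHARLSON_WEIGHTS[c] for c in set(conditions))


patient = ["diabetes", "congestive_heart_failure", "metastatic_solid_tumour"]
print(charlson_score(patient))
```

Converting the input to a set means a condition recorded in several episodes is only counted once.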

This study used the individual conditions specified by the Charlson Index as variables rather than their combined score, to enable the exploration of different patterns at the condition level. There exist different versions of ICD-10 translations of the original ICD-9 codes that were developed for this index (see Table 11).110, 112, 116-119 This research used the translation developed by Aylin and Bottle119, 120 because it has been adapted to English coding practices, and because the HSCIC includes it in statistical guidance to NHS institutions.118 In addition, a translation to Read codes has also been created specifically for use in primary care datasets like CPRD, which use Read codes rather than ICD-10.121

For the purpose of this study, the condition rather than its state (e.g. diabetes versus diabetes with complications) was used, because patient characteristics should remain stable over time. Therefore ‘diabetes’ and ‘diabetes with complications’ were combined, as were ‘mild liver disease’ and ‘severe liver disease’, and ‘cancer’ and ‘metastatic cancer’.
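The merging of condition states described above could be sketched as follows. The mapping and function names are hypothetical, and the label chosen for each merged condition is an assumption made for illustration only.

```python
# Hypothetical mapping from state-specific categories to the single
# condition used as a flag; the pairings follow the text above.
MERGED_CONDITION = {
    "diabetes": "diabetes",
    "diabetes with complications": "diabetes",
    "mild liver disease": "liver disease",
    "severe liver disease": "liver disease",
    "cancer": "cancer",
    "metastatic cancer": "cancer",
}


def merge_states(categories):
    """Return the set of conditions after merging severity states.

    Categories without a state variant are passed through unchanged.
    """
    return {MERGED_CONDITION.get(c, c) for c in categories}


print(merge_states(["diabetes with complications", "mild liver disease", "dementia"]))
```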

Table 11: Overview of Charlson comorbidity ICD coding

| Condition | Deyo et al.111 ICD-9 adaptation (ICD-9 codes) | Sundararajan et al.112 ICD-10 translation (ICD-10-AM codes) | Bottle and Aylin120 English coding adaptation (ICD-10 codes) |
| --- | --- | --- | --- |
| Myocardial infarct | 410-410.9 Acute myocardial infarction; 412 Old myocardial infarction | Acute myocardial infarction: I21, I22, I252 | I21, I22, I252, I258 |
| Congestive heart failure | 428-428.9 Heart failure | I50 | I50 |
| Peripheral vascular disease | 443.9 Peripheral vascular disease inc. intermittent claudication; 441-441.9 Aortic aneurysm; 785.4 Gangrene; V43.4 Blood vessel replaced by prosthesis; Procedure 38.48 Resection and replacement of lower limb arteries | I71, I739, I790, R02, Z958, Z959 | I71, I739, I790, R02, Z958, Z959 |
| Cerebrovascular disease | 430-438 Cerebrovascular disease | G450-G452, G454, G458, G459, G46, I60-I66, I670-I672, I674-I679, I681, I682, I688, I69 | G450-G452, G454, G458, G459, G46, I60-I69 |
| Dementia | 290-290.9 Senile and presenile dementia | F00-F02, F051 | F00-F03, F051 |
| Chronic pulmonary disease | 490-496 Chronic obstructive pulmonary disease; 500-505 Pneumoconioses; 506.4 Chronic respiratory conditions due to fumes and vapours | Pulmonary disease: J40-J42, J44-J47, J60-J67 | J40-J47, J60-J67 |
| Connective tissue disease | Rheumatologic disease: 710.0 Systemic lupus erythematosus; 710.1 Systemic sclerosis; 710.4 Polymyositis; 714.0-714.2 Adult rheumatoid arthritis; 714.81 Rheumatoid lung; 725 Polymyalgia rheumatica | Connective tissue disorder: M050-M053, M058-M060, M063, M069, M32, M332, M34, M353 | M05, M060, M063, M069, M32, M332, M34, M353 |
| Ulcer disease | Peptic ulcer disease: 531-534.9 Gastric, duodenal and gastrojejunal ulcers | Peptic ulcer: K25-K28 | K25-K28 |
| Mild liver disease | 571.2 Alcoholic cirrhosis; 571.5 Cirrhosis without mention of alcohol; 571.6 Biliary cirrhosis; 571.4-571.49 Chronic hepatitis | Liver disease: K702, K703, K717, K73, K740, K742-K746 | K702, K703, K717, K73, K74 |
| Diabetes | 250-250.3 Diabetes with or without acute metabolic disturbances; 250.7 Diabetes with peripheral circulatory disorders | E101, E105, E109, E111, E115, E119, E131, E135, E139, E141, E145, E149 | E101, E105, E106, E108, E109, E111, E115, E116, E118, E119, E131, E135, E136, E138, E139, E141, E145, E146, E148, E149 |
| Hemiplegia | Hemiplegia or paraplegia: 344.1 Paraplegia; 342-342.9 Hemiplegia | | |
| Renal disease | Renal failure: 582-582.9 Chronic glomerulonephritis; 583-583.7 Nephritis and nephropathy; 585 Chronic renal failure; 586 Renal failure, unspecified; 588-588.9 Disorders resulting from impaired renal function | Renal disease: N01, N03, N052-N056, N072-N074, N18, N19, N25 | I12, I13, N01, N03, N052-N056, N072-N074, N18, N19, N25 |
| Diabetes with end organ damage | Diabetes with chronic complications: 250.4-250.6 Diabetes with renal, ophthalmic, or neurological manifestation | Diabetes complications: [...] | |
| Any tumour | Any malignancy, including leukaemia and lymphoma: 140-172.9 Malignant neoplasm; 174-195.8 Malignant neoplasm; 200-208.9 Leukaemia and lymphoma | Cancer: C0-C3, C40, C41, C43, C45-C49, C5, C6, C70-C76, C80-C85, C883, C887, C889-C901, C91-C93, C940-C943, C9451, C947, C95, C96 | C00-C67, C80-C97 |
| Leukaemia | | | |
| Lymphoma | | | |
| Moderate or severe liver disease | 572.2-572.8 Hepatic coma, portal hypertension, other sequelae of chronic liver disease; 456.0-456.21 Esophageal varices | Severe liver disease: K721, K729, K766, K767 | K721, K729, K766, K767 |
| Metastatic solid tumour | 196-199.1 Secondary malignant neoplasm of lymph nodes and other organs | Metastatic cancer: C77-C80 | C77-C79 |
| AIDS | 042-044.9 HIV infection with related specified conditions | HIV: B20-B24 | B20-B24 |
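To show how code lists such as those in Table 11 can be applied to a recorded diagnosis, the sketch below matches an ICD-10 code against a few rows of the English coding adaptation column. The dictionary and function names are hypothetical, the ranges have been expanded by hand, and it is an assumption that codes are compared without the decimal point.

```python
# A few rows of the English coding adaptation column of Table 11, written
# as code prefixes (ranges expanded by hand). Names are hypothetical, and
# codes are assumed to be compared without the decimal point.
CONDITION_PREFIXES = {
    "congestive heart failure": ("I50",),
    "dementia": ("F00", "F01", "F02", "F03", "F051"),
    "ulcer disease": ("K25", "K26", "K27", "K28"),
    "AIDS": ("B20", "B21", "B22", "B23", "B24"),
}


def conditions_for_code(icd10_code):
    """Return the conditions whose code list covers the given ICD-10 code."""
    code = icd10_code.replace(".", "").strip().upper()
    return {
        name
        for name, prefixes in CONDITION_PREFIXES.items()
        if code.startswith(prefixes)  # str.startswith accepts a tuple
    }


print(conditions_for_code("I50.0"))
print(conditions_for_code("F05.0"))
```

Prefix matching means a three-character entry such as I50 covers all of its four-character subcodes, while a four-character entry such as F051 matches only that subcode.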

5.3.2 Mental health

While mental health conditions have not been included in the Charlson index, they have a significant impact on a patient's care needs, such as higher overall utilisation of care,122 more unplanned and potentially preventable hospital admissions,123 and more readmissions.124 Moreover, patients with mental health conditions require a care model which integrates with mental health services.125 While this study did not have access to linked mental healthcare provider data, the physical healthcare needs of these patients were explored by creating a mental illness flag, similar to the chronic condition flags.

There is no standard classification for severe mental illness; however, most studies include psychosis (including or limited to schizophrenia and schizoaffective disorders) and bipolar disorder (see Table 12). This research used the codes defined by White et al.,126 as they provide a wide definition of psychosis and bipolar disorders that includes the criteria set out by Chang et al.,127 while excluding the drug-induced and depression-related psychosis that NHS England includes.128 The latter two may be temporary states rather than enduring mental illnesses and were therefore excluded.

Table 12: Overview of severe mental illness definitions

| Source | Definition | Conditions and ICD-10 codes |
| --- | --- | --- |
| White et al.126 | Severe mental illness are “a range of serious and chronic conditions including [...] | [...] |
| Chang et al.127 | “SMI [Severe mental illness] which might include schizophrenia, [...] | [...] Bipolar affective disorder: F31; Substance use disorder: F10-F19; Depressive episode: F32; Recurrent depressive disorder: F33 |
| NHS England128 | Severe mental illness are “patients with psychoses, including schizophrenia and bipolar affective disorder” | Psychosis: F20-F29; Drug induced psychosis: F105, F115, F125, F135, F145, F155, F165, F195; Bipolar disorder: F302, F312, F315; Depressive episodes (with psychosis): F323 and F333 |
| HSCIC129 | Mental health prevalence based on “people with schizophrenia, bipolar disorder and other psychoses” | N/A |

5.3.3 Learning disabilities

Learning disabilities are another group of conditions that significantly impact a person’s care needs but are not included in general morbidity indices. Like mental health, they are included in the NHS Quality and Outcomes Framework; however, no specific conditions are listed for this metric.129 While the term learning disabilities is common in England, the World Health Organisation uses “mental retardation”.130 This group of conditions is covered by ICD-10 codes F70-F79,131 and was included as such in this research.
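A range such as F70-F79 can be checked on the three-character ICD-10 category rather than by listing every subcode. The following is a minimal sketch under that assumption; the function name is hypothetical.

```python
def has_learning_disability(diagnosis_codes):
    """True if any ICD-10 code falls in the three-character range F70-F79."""
    for raw in diagnosis_codes:
        # Reduce each code to its three-character category, e.g. "F71.1" -> "F71".
        category = raw.replace(".", "").strip().upper()[:3]
        # String comparison is sufficient here because both bounds share
        # the same letter and have two digits.
        if "F70" <= category <= "F79":
            return True
    return False


print(has_learning_disability(["I50", "F71.1"]))
```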

5.3.4 Creating the LTC flags

To create the chronic condition, mental health and learning disabilities flags in the acute dataset, all diagnosis fields in all inpatient hospital episodes during the study period were reviewed for the relevant ICD-10 codes. In primary care, where Read codes are used, the ICD codes for mental health and learning disabilities were mapped to Read codes according to the ICD-10/Read Cross Mapping (Version 3) created by the UK Terminology Centre.132 CPRD uses Medcodes rather than Read codes in its databases, so the Read codes for the various conditions were translated to Medcodes using the CPRD Medical Dictionary.
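The two-step translation described above (ICD-10 to Read codes, then Read codes to Medcodes) can be sketched as a pair of lookups. The lookup tables below are invented placeholders, not real entries from the cross mapping or the CPRD Medical Dictionary, and the function name is hypothetical.

```python
# Placeholder lookup tables: the Read codes and Medcodes below are invented
# stand-ins, not real entries from the cross mapping or the CPRD dictionary.
ICD10_TO_READ = {
    "F70": ["READ_A", "READ_B"],  # one ICD-10 code may map to several Read codes
    "F71": ["READ_C"],
}
READ_TO_MEDCODE = {
    "READ_A": 1001,
    "READ_B": 1002,
    "READ_C": 1003,
}


def medcodes_for_icd10(icd10_codes):
    """Translate ICD-10 codes to Medcodes via Read codes, skipping unmapped codes."""
    medcodes = set()
    for icd in icd10_codes:
        for read in ICD10_TO_READ.get(icd, []):
            if read in READ_TO_MEDCODE:
                medcodes.add(READ_TO_MEDCODE[read])
    return medcodes


print(sorted(medcodes_for_icd10(["F70", "F71"])))
```

Codes without an entry in either table are silently skipped in this sketch; in practice unmapped codes would probably need to be logged and reviewed.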

Some conditions can also be derived from fields other than diagnosis, for example diabetes from recorded HbA1c test scores. However, this is not possible for all conditions, so only diagnosis codes were used to avoid bias. If the condition was diagnosed at any point during the study period, in any dataset, the patient was given a flag for that condition.

In addition to the various LTC flags, the database also included a metric specifying whether the patient had multiple LTCs and a count of LTCs, both based on the conditions specified above.
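The flag, count and multimorbidity logic can be sketched as below, assuming the conditions found for a patient in each dataset have already been collected into sets. The function and key names are hypothetical, and reading "multiple LTCs" as two or more conditions is an assumption.

```python
def ltc_summary(acute_conditions, primary_conditions):
    """Combine per-dataset condition sets into flags, a count and a multimorbidity flag."""
    # A condition is flagged if it was diagnosed in either dataset.
    flagged = set(acute_conditions) | set(primary_conditions)
    return {
        "ltc_flags": flagged,
        "ltc_count": len(flagged),
        # Assumption: 'multiple LTCs' is taken to mean two or more conditions.
        "multimorbidity": len(flagged) >= 2,
    }


summary = ltc_summary({"diabetes"}, {"diabetes", "dementia"})
print(summary["ltc_count"], summary["multimorbidity"])
```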

5.3.5 Missing data

For both acute and primary care, identifying chronic conditions will be subject to missing data.

Doctors are more likely to record the chronic conditions they treat, rather than describing a patient’s full health status. However, it is likely that if a condition is related to any healthcare need, it will have been recorded in either dataset. If a patient technically has a chronic condition but never requires care, this does not affect the healthcare system and there will be little impact from not recording the condition.