Source electronic health databases

3.1 Summary

3.2 The Clinical Practice Research Datalink

Across the UK, numerous primary care databases bring together electronic patient records; however, most cover small geographical areas or small numbers of general practices. The Clinical Practice Research Datalink (CPRD, formerly GPRD) is one of three clinical research databases that provide patient data from practices across the UK, allowing research to be undertaken on samples generalisable to the whole population. The other two are the Health Improvement Network (THIN) database and the QRESEARCH database.

The CPRD was initially set up in 1987 as a commercial databank by the company VAMP (Value Added Medical Products). Now run by the Medicines and Healthcare products Regulatory Agency (MHRA), the CPRD is the largest primary care database in the UK, covering just over 8% of the UK population.(73–75)

The UK has the advantage of near-universal registration with general practitioners, covering around 98% of the population. As such, analyses of the registered patient population are broadly representative of the UK population, though notable exceptions include asylum seekers, the homeless, prison populations and those in the armed services, who are less likely to access GP services.(76–78) Additional linkages to secondary care data, disease registries, surveys and vital statistics give these databases unique value for observational studies and increasingly for pragmatic clinical trials.(79–82)

The CPRD currently contains longitudinal primary care records for approximately 13.5 million patients, of whom 5.5 million are currently active. Continuous observational data have been collected in most practices for over six years, yielding over 30 million patient-years of observation.(78) Patients contributing to the CPRD have been shown to be representative of the UK population in terms of age and gender, though in terms of regional representation the north of England is slightly under-represented.(74) Importantly, the validity of a wide range of diagnostic and clinical measures has been established, with a 2010 systematic review demonstrating a mean positive predictive value of 88% across a range of 183 diagnoses.(83–86) The distribution of general practices contributing to the CPRD compared with the distribution of all general practices in the UK in July 2012 is shown in table 3.1.

Table 3.1 Regional distribution of practices contributing to the July 2012 CPRD compared with the UK distribution

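| Region | CPRD practices, July 2012 | % | UK practices, April 2012 | % |
|---|---|---|---|---|
| England | 483 | 77% | 8123 | 82% |
| Scotland | 69 | 11% | 998 | 10% |
| Wales | 50 | 8% | 474 | 5% |
| NI | 22 | 4% | 354 | 4% |
| Total | 624 | 100% | 9949 | 100% |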
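The percentages in Table 3.1 can be reproduced from the practice counts. The sketch below is illustrative only: the counts are copied from the table, while the function and variable names are assumptions introduced here.

```python
# Practice counts copied from Table 3.1.
cprd = {"England": 483, "Scotland": 69, "Wales": 50, "NI": 22}
uk = {"England": 8123, "Scotland": 998, "Wales": 474, "NI": 354}


def shares(counts):
    """Return each region's share of the total as a whole-number percentage."""
    total = sum(counts.values())
    return {region: round(100 * n / total) for region, n in counts.items()}


cprd_share = shares(cprd)
uk_share = shares(uk)

for region in cprd:
    print(f"{region}: CPRD {cprd_share[region]}%, UK {uk_share[region]}%")
```

Because each share is rounded independently, the rounded UK percentages add up to 101 rather than 100; the 100% shown in the total row refers to the unrounded shares.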

All patients contributing to the CPRD are registered with one of 624 practices, all of which use the Vision clinical software system. Vision is one of several clinical software systems recommended for use in primary care by the GP Systems of Choice (GPSoC) Initiative, which supplies information technology systems to general practices across the UK. Other software systems include Egton Medical Information Systems (EMIS) and TPP SystmOne, amongst others.(87)

General practitioners and practice staff record data onto their clinical systems and send anonymised patient data every 6 weeks to the CPRD. These data are then appended to the continually growing database, which contains information on diagnoses, symptoms, referrals, test results, medications, consultations, demographics, and lifestyle factors. Fifty per cent of English practices contributing to the CPRD also allow linkage to other data sources, such as the Hospital Episode Statistics for England and the Office for National Statistics (ONS) Mortality Data.

The quality of research data is audited at both the patient and practice level by the CPRD team. Individual patient data are defined as being of ‘Acceptable Research Quality’ (ARQ) if they are free of gaps or inconsistencies which cast doubt on the accuracy of the data recorded. Practices are required to record a minimum of 95% of prescribing and relevant patient encounter events. Data from practices are routinely validated by internal checks. Practice-level data are defined as being ‘Up to Standard’ (UTS) if they conform to a set of 10 metrics, including having a high proportion of patients with ARQ data, and having rates of prescriptions, deaths, pregnancies and referrals comparable to other practices. The first practice to meet these quality criteria did so in 1987, with most other general practices reaching the same level of quality by 1991.
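As an illustration of how the two quality markers might be combined when selecting data for a study, the sketch below returns the earliest date from which a patient's record could be treated as research quality. The field names (`acceptable`, `uts_date`) and the rule of taking the later of the registration date and the practice UTS date are assumptions made for this example, not the CPRD's own specification.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Patient:
    patient_id: int
    practice_id: int
    acceptable: bool          # hypothetical flag: record meets ARQ
    registration_date: date


@dataclass
class Practice:
    practice_id: int
    uts_date: date            # hypothetical field: date practice became UTS


def research_start(patient: Patient, practice: Practice) -> Optional[date]:
    """Earliest date with research-quality data, or None if the patient is not ARQ."""
    if not patient.acceptable:
        return None
    # Assumed rule: data count only once the patient is registered
    # and the practice has reached the UTS standard.
    return max(patient.registration_date, practice.uts_date)


practice = Practice(practice_id=10, uts_date=date(1991, 1, 1))
early = Patient(1, 10, True, date(1989, 5, 1))
late = Patient(2, 10, True, date(1995, 6, 30))
excluded = Patient(3, 10, False, date(1990, 2, 1))

start_early = research_start(early, practice)
start_late = research_start(late, practice)
start_excluded = research_start(excluded, practice)
```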

For research purposes, individual patient data are anonymised, with identifying information such as NHS number, name, date of birth, address and postcode removed. Information such as gender and year of birth is retained in order to conduct stratified analyses. In addition to these demographic data, researchers can access coded data pertaining to diagnoses, symptoms and processes of care. Free text entered by the primary care team is not routinely available to researchers, as it may contain identifiable information. Coded data are entered according to the Read clinical coding system, a hierarchical system of medical coding used across UK primary care.(88,89)
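The removal of identifiers described above can be pictured with a small sketch. It is a simplified illustration rather than the CPRD's actual procedure: the record layout, the field names and the `anonymise` function are all hypothetical.

```python
from datetime import date

# Hypothetical names for the identifying fields listed in the text.
IDENTIFYING_FIELDS = {"nhs_number", "name", "date_of_birth", "address", "postcode"}


def anonymise(record):
    """Drop identifying fields, keeping only the year of birth from the birth date."""
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    cleaned["year_of_birth"] = record["date_of_birth"].year
    return cleaned


# Entirely made-up example record.
raw = {
    "nhs_number": "0000000000",
    "name": "Example Patient",
    "date_of_birth": date(1950, 4, 12),
    "address": "1 Example Street",
    "postcode": "ZZ1 1ZZ",
    "gender": "F",
    "clinical_code": "example-code",
}

anonymised = anonymise(raw)
```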

The CPRD is organised into 10 file types, each of which contains a subset of the patient record. For research purposes, information from these files can be joined using the anonymised patient or practice identifier. The file types are described in table 3.2.

Table 3.2 Description of Clinical Practice Research Datalink file types

| File type | Contents | Example fields |
|---|---|---|
| Patient | Basic demographics and registration details | Anonymised identifier, year of birth, registration date, transfer out date, death date |
| Practice | Details for all participating practices | Practice identifier, geographical region, date of becoming “up to standard”, date of last data collection |
| Staff | Practice staff details | Staff identifier, gender, role |
| Consultation | Information about consultation type as entered by the GP | Consultation identifier, consultation type, consultation date, staff identifier, consultation duration |
| Clinical | All medical history including symptoms, signs and diagnoses | Date of clinical event, date of data entry, clinical code, episode type, additional details identifier |
| Additional | Details relating to events coded in the clinical file | Patient identifier, entity type, data fields (depends on entity type) |
| Referral | Information about referrals to external care centres | Referral method, referral specialty, referral type, attendance type, referral urgency |
| Immunisation | Details of immunisation records | Immunisation reason, type, stage, status, compound used, location, reason for immunisation, route of administration |
| Test | Test results linked to events coded in the clinical file | Type of test, result, normal range for result, unit of measure |
| Therapy | All prescriptions issued by the GP | CPRD product code, British National Formulary code, product name, dosage, quantity, pack size, number of days prescribed |
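Joining file types on the anonymised patient identifier, as described for table 3.2, might look like the following sketch. The column names (`patid`, `yob`, `eventdate`, `code`) and the rows are invented for illustration and should not be read as the CPRD's actual file layout.

```python
# Hypothetical rows from the Patient file.
patients = [
    {"patid": 1001, "yob": 1950},
    {"patid": 1002, "yob": 1972},
]

# Hypothetical rows from the Clinical file; patid 1003 has no Patient row.
clinical = [
    {"patid": 1001, "eventdate": "2005-03-14", "code": "A"},
    {"patid": 1001, "eventdate": "2007-11-02", "code": "B"},
    {"patid": 1002, "eventdate": "2006-01-20", "code": "A"},
    {"patid": 1003, "eventdate": "2006-05-05", "code": "C"},
]


def join_on_patient(patient_rows, event_rows):
    """Inner join: attach patient demographics to each event with a matching patid."""
    by_id = {p["patid"]: p for p in patient_rows}
    joined = []
    for event in event_rows:
        patient = by_id.get(event["patid"])
        if patient is not None:
            joined.append({**patient, **event})
    return joined


events_with_demographics = join_on_patient(patients, clinical)
```

Building a lookup keyed on the patient identifier keeps the join to a single pass over the event rows, which matters because event files hold many rows per patient.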