3.3 Development of the study cohort
3.3.2 Database development
3.3.2.1 Inclusion criteria
Data were requested from the MHRA using the following inclusion criteria:
1. Patients with a recorded first prescription of one of the nine following drug classes in general practice between 1 January 2000 and 31 December 2003 (first prescription was defined as no prior prescription of any drug from the nine drug classes within the preceding year):
I. Angiotensin-converting enzyme (ACE) inhibitors
II. Angiotensin-II (AT-II) receptor antagonists
III. Calcium-channel blockers (Ca-channel blockers)
IV. Thiazide diuretics
V. Potassium-sparing or loop diuretics
VI. Alpha-adrenoceptor antagonists (alpha-blockers)
VII. Beta-adrenoceptor antagonists (beta-blockers)
VIII. Aldosterone antagonists/potassium-sparing diuretics
IX. Mixed class (e.g. beta-blocker and thiazide diuretic combination)
2. Patients with a diagnosis of hypertension, as identified by a Read/OXMIS code for hypertension (Appendix 7) or by three blood pressure measurements greater than 160/100 mmHg (the threshold at the time for initiating antihypertensive treatment in the absence of other patient factors) on, or in the 365 days before, the date of the first antihypertensive prescription;
3. Patients must have at least one year of available data prior to the first prescription of an antihypertensive drug;
4. Patients must be aged 18 years and older on the day of the first prescription.
The day of the first prescription of the antihypertensive drug is referred to as the index date.
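The four inclusion criteria can be read as a single eligibility check per patient. The following is a minimal sketch only, not the extraction logic used by the MHRA. All function and parameter names are hypothetical, and three assumptions are made: the index date is taken to be the earliest prescription in the study window with no antihypertensive prescription in the preceding 365 days; a reading counts as raised when either the systolic value exceeds 160 or the diastolic value exceeds 100; and the presence of a hypertension Read/OXMIS code is supplied as a precomputed flag.

```python
from datetime import date, timedelta

STUDY_START = date(2000, 1, 1)
STUDY_END = date(2003, 12, 31)
ONE_YEAR = timedelta(days=365)


def find_index_date(prescription_dates):
    """Earliest in-window prescription with no prescription in the prior 365 days."""
    dates = sorted(set(prescription_dates))
    for i, d in enumerate(dates):
        if not (STUDY_START <= d <= STUDY_END):
            continue
        # Assumption: "first" means no earlier prescription within the preceding year.
        if i == 0 or d - dates[i - 1] > ONE_YEAR:
            return d
    return None


def has_hypertension(index_date, has_htn_code, bp_readings):
    """bp_readings is an iterable of (date, systolic, diastolic) tuples."""
    if has_htn_code:
        return True
    # Assumption: "greater than 160/100" is read as systolic > 160 OR diastolic > 100.
    raised = [
        d for d, sbp, dbp in bp_readings
        if index_date - ONE_YEAR <= d <= index_date and (sbp > 160 or dbp > 100)
    ]
    return len(raised) >= 3


def eligible_index_date(prescription_dates, has_htn_code, bp_readings,
                        data_start, birth_date):
    """Return the index date if all four criteria are met, otherwise None."""
    index_date = find_index_date(prescription_dates)
    if index_date is None:
        return None
    if not has_hypertension(index_date, has_htn_code, bp_readings):
        return None
    # Criterion 3: at least one year of data before the index date.
    if index_date - data_start < ONE_YEAR:
        return None
    # Criterion 4: aged 18 years or older on the index date.
    had_birthday = (index_date.month, index_date.day) >= (birth_date.month, birth_date.day)
    age = index_date.year - birth_date.year - (0 if had_birthday else 1)
    if age < 18:
        return None
    return index_date
```

For a patient with prescriptions on 1 June 1999 and 1 March 2001, the second prescription would be returned as the index date under these assumptions, because more than 365 days separate the two.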
Data were obtained from the MHRA as 38 flat text files linked by a unique patient identification number. Look-up files were also provided to allow for the linkage between data files (Table 3.1). Several of the text files were not deemed relevant to the aims of the study, including those relating to asthma, diet, exercise, immunization, passive smoking, residence, and sleeping habits.
Table 3.1 – GPRD data and look-up files obtained from the MHRA

File name                 Data description

Data files
ADR                       Allergy and intolerance information
Agencies                  Information about health agency involvement
Alcohol                   Details relating to alcohol use including the number of units of alcohol per week
Asthma                    Data recorded via the asthma disease management structured data area
Blood pressure            Current and historic blood pressure records
BMI                       Historic measurements of height, weight, and BMI
Clinical                  All the medical history data entered on the GP practice system, including symptoms, signs, and diagnoses (split into 3 files due to size of data set)
Consultation              Data relating to the type of consultation as entered by the GP (split into 2 files due to size of data set)
Death administration      Death data
Diabetes                  Data entered via the diabetes disease management structured data area
Diet                      Data relating to the patient’s diet
Exercise                  Data relating to the patient’s exercise pattern
Height                    Patient height data
Historical registration   Current and historical registration details
Immunization              Data on patient immunizations
Maternity                 Data entered via the maternity structured data area
Passive smoking           Information about whether the patient is exposed to passive smoking
Patient                   Basic patient demographics and patient registration details
Practice                  Practice registration details
Referral                  Information involving patient referrals to external care centres such as hospitals
Residence                 Information about the patient’s residential arrangements
Sleeping patterns         Data about the patient’s sleeping habits
Smoking                   Current and historic smoking details
Status                    Current and historic records for the patient health status
Test                      Data on the type of test (e.g. biochemical investigation) and results (split into 3 files due to size of data set)
Therapy                   Data relating to all prescriptions issued by the GP (split into 6 files due to size of data set)
Treatment compliance      Data on the level to which the patient complies with the treatment issued
Weight                    Weight data

Look-up files
Medical codes             Read/OXMIS codes and associated code name
Product codes             Product codes for treatments including drug name, dose, and British National Formulary chapter and header
Test codes                Test codes and associated test name
Dose conversion table     Data that convert the dose provided to a numeric dose (e.g. one tablet per day converted to 1)
3.3.2.2 Data cleaning
The GPRD is a large and well-validated database, but extreme and implausible values can still exist within the database. When determining the patient baseline covariates described below in section 3.3.3, the recorded values were assessed for plausibility.
Impossible values (such as a weight of 1000 kg) were recoded as an error or excluded.
In determining baseline values, if the implausible value was the value recorded closest to the index date, this record was excluded and the next closest record was used as the baseline value.
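The baseline selection rule described above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the function name is hypothetical, the plausibility limits passed in are placeholders rather than the limits actually applied, and "closest" is taken to mean the smallest absolute number of days from the index date.

```python
from datetime import date


def baseline_value(records, index_date, low, high):
    """Return the (date, value) of the plausible record closest to the index date.

    records is an iterable of (record_date, value) tuples. Values outside the
    closed interval [low, high] are treated as implausible and skipped, so the
    next closest plausible record is used instead. Returns None if no
    plausible record exists.
    """
    plausible = [(d, v) for d, v in records if low <= v <= high]
    if not plausible:
        return None
    # Assumption: distance is measured in absolute days from the index date.
    return min(plausible, key=lambda rec: abs((rec[0] - index_date).days))


# Illustrative weight records (kg); the limits 20 and 300 are placeholders only.
weights = [
    (date(2001, 2, 20), 1000.0),  # implausible, skipped
    (date(2001, 1, 10), 82.0),
    (date(2000, 6, 1), 80.0),
]
chosen = baseline_value(weights, date(2001, 3, 1), low=20.0, high=300.0)
```

Here the 1000 kg record is the closest to the index date but is skipped, so the record from 10 January 2001 is selected.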
The test file was examined in order to identify biochemical laboratory test results that were not clinically possible. In consultation with clinicians, decisions were made to exclude implausible laboratory test results. For example, a record of a serum creatinine concentration of 1800 µmol/l was excluded from the analysis, because such a concentration would not have been clinically possible.
3.3.2.3 Exclusion of pregnant women
Hypertension diagnosed during pregnancy may be a result of the pregnancy itself or may reflect pre-existing hypertension. Women who were pregnant during the study period were excluded because of possible differences in the condition, in the treatment of the patient, and in how the GP monitors the patient for both drug efficacy and drug safety.
An algorithm was developed based on the work of Hardy and colleagues (2004) for identifying pregnant women using Read/OXMIS codes. The algorithm used Read/OXMIS codes that represented either a pregnancy marker or a pregnancy outcome.
Types of pregnancy markers included:
1. Lab tests and procedures;
2. GP practice visits related to pregnancy;
3. Threatened abortion;
4. Abortion referral;
5. Obstetric hospitalization.
Types of pregnancy outcomes included:
1. Elective termination;
2. Fetal death;
3. Hydatidiform mole/blighted ovum;
4. Live births;
5. Delivery outcome unclear;
6. Delivery booking;
7. Multi-fetus delivery.
The algorithm developed by Hardy and colleagues (2004) was designed to identify definite pregnancies and to accurately determine dates of conception. My goal was broader: to identify women who could possibly have been pregnant during the study period. Therefore, 804 women were excluded from my analysis because they met any one of the following conditions:
1. A pregnancy marker from 01/04/1999 to 31/12/2003;
2. A pregnancy outcome from 01/01/2000 to 30/09/2004;
3. An expected date of delivery in the maternity file from 01/01/2000 to 30/09/2004.
Patients who had a pregnancy marker from 01/01/2004 to 30/09/2004 and no pregnancy outcome were also examined. The decision was taken to exclude these 339 patients on the basis that they were potentially pregnant during the study time period.
In total, therefore, 1143 women (804 plus 339), who represented 3.0% of the female population, were excluded from subsequent analyses.
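The exclusion rules above can be expressed as a check on three lists of dates per woman. The sketch below is illustrative only and is not the algorithm of Hardy and colleagues (2004) or the code used in this study. The function name and return labels are hypothetical, and "no pregnancy outcome" in the second rule is assumed to mean no outcome recorded on or after the earliest marker in 2004.

```python
from datetime import date

# Date windows taken from the three exclusion conditions in the text.
MARKER_WINDOW = (date(1999, 4, 1), date(2003, 12, 31))
OUTCOME_WINDOW = (date(2000, 1, 1), date(2004, 9, 30))
EDD_WINDOW = (date(2000, 1, 1), date(2004, 9, 30))
# Window for the additional group of markers without an outcome.
LATE_MARKER_WINDOW = (date(2004, 1, 1), date(2004, 9, 30))


def _any_in(dates, window):
    start, end = window
    return any(start <= d <= end for d in dates)


def exclusion_reason(marker_dates, outcome_dates, edd_dates):
    """Return 'primary', 'late_marker', or None if the woman is retained."""
    if (_any_in(marker_dates, MARKER_WINDOW)
            or _any_in(outcome_dates, OUTCOME_WINDOW)
            or _any_in(edd_dates, EDD_WINDOW)):
        return "primary"
    late = [d for d in marker_dates
            if LATE_MARKER_WINDOW[0] <= d <= LATE_MARKER_WINDOW[1]]
    # Assumption: "no pregnancy outcome" means none on or after the first late marker.
    if late and not any(o >= min(late) for o in outcome_dates):
        return "late_marker"
    return None
```

Under these assumptions, a woman whose only record is a pregnancy marker on 1 March 2004 falls into the second group, whereas a marker on 31 March 1999 alone does not lead to exclusion.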