Chapter 1: Introduction and background
2.3.3 Data extraction and quality assessment
Having identified the studies for inclusion in this review, the next step was to extract the relevant data and assess the quality of each included study. To do this, it was necessary to establish the most appropriate data extraction and quality appraisal tools. The Cochrane Handbook for systematic reviews (Higgins & Green, 2011) advises that quality assessment should check for validity and sources of bias. I judged the tools used in a systematic review of adherence to medication in Parkinson’s disease (Daley et al., 2012) to be the most suitable for assessing the quality of studies in this systematic review. Hall et al. (2009) used appraisal tools developed by Crombie (1996) and Poppay et al. (1998). I chose to use the more recently developed tools by Daley et al. (2012) because they focused more on methodological rigor and the risk of bias.
The full text of the included articles was reviewed and data were extracted using a slightly adapted version of the pre-determined form taken from Daley et al. (2012). The use of a pre-determined form can reduce the likelihood of bias and helps to ensure that the data extraction process is systematic. I adapted the data extraction sheet slightly by including the percentage of participants who were adherent/non-adherent and by including the factors that affect adherence in both a negative and a positive way (rather than the factors affecting non-adherence only). I also decided to include the country where the study was conducted, as this could be used to explain any conflicts in the evidence (for example, the availability of GFF may differ between countries).
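The adapted extraction sheet described above can be pictured as one record per study. The following is a minimal sketch only: the class name, field names and types are illustrative assumptions, not the actual layout of the form taken from Daley et al. (2012).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExtractionRecord:
    """One row of the adapted extraction sheet (hypothetical field names)."""

    study: str    # citation label for the included study
    country: str  # country where the study was conducted
    # Percentages are optional because a study may not report them.
    percent_adherent: Optional[float] = None
    percent_non_adherent: Optional[float] = None
    # Factors are recorded in both directions, not for non-adherence only.
    positive_factors: List[str] = field(default_factory=list)
    negative_factors: List[str] = field(default_factory=list)

    def all_factors(self) -> List[str]:
        """Return every recorded factor, positive ones first."""
        return self.positive_factors + self.negative_factors
```

Keeping the positive and negative factors in separate fields mirrors the adaptation described above, and the country field allows conflicting findings to be compared by setting.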
The extracted data were tabulated and the emerging adherence factors were grouped together according to the six themes identified in the systematic review by Hall et al. (2009):
1. Sociodemographic factors
2. Knowledge, attitudes and beliefs
3. Illness and symptom factors
4. Treatment factors
5. Socio-cultural/Environmental factors
6. Quality of life and psychological well-being
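The tabulation step can be sketched as follows. This is an illustration under stated assumptions: the function name and the input format (pairs of factor and theme) are hypothetical, and assigning a factor to a theme remains a reviewer judgement that the code does not attempt to make.

```python
from typing import Dict, Iterable, List, Tuple

# The six themes from Hall et al. (2009), as listed above.
THEMES = [
    "Sociodemographic factors",
    "Knowledge, attitudes and beliefs",
    "Illness and symptom factors",
    "Treatment factors",
    "Socio-cultural/Environmental factors",
    "Quality of life and psychological well-being",
]


def group_by_theme(coded: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (factor, theme) pairs under the six themes.

    Every theme appears in the result, even if no factor was coded to it.
    """
    grouped: Dict[str, List[str]] = {theme: [] for theme in THEMES}
    for factor, theme in coded:
        if theme not in grouped:
            raise ValueError(f"Unknown theme: {theme!r}")
        # Record each factor once per theme, however many studies report it.
        if factor not in grouped[theme]:
            grouped[theme].append(factor)
    return grouped
```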
Quality assessment for risk of bias/internal validity
The purpose of the quality assessment is to check for bias and internal validity. The terms ‘systematic error’ and ‘bias’ are used interchangeably, and a study that is free from them is said to have ‘internal validity’. Internal validity is the extent to which the design and conduct of a research study are likely to have prevented bias in its results (Higgins & Green, 2011).
Daley et al. (2012) developed a quality appraisal tool to assess the risk of bias in the observational studies included in their systematic review. The assessment tool included an overall summation of the risk of bias for each study (Daley et al., 2012). Although summary scores are not recommended by some reviewers (Higgins & Green, 2011) because the influence (weighting) of each quality item is not equal, I used this because it helped with descriptive clarity in appraising which study’s results were trustworthy or not. The quality checking form used by Daley et al. (2012) was adapted slightly for this study. For example, Daley et al. (2012) specified their diagnostic criterion for Parkinson’s disease, and I specified that CD should be biopsy confirmed.
The studies included in my review were assessed against five potential sources of bias using this adapted version of the quality checking criteria developed by Daley et al. (2012):
1) Selection bias
Diagnosis of CD using an intestinal biopsy is considered to be the gold standard (Leeds et al., 2008). I believe the accuracy of CD diagnosis is an important factor in this review because patients who have been incorrectly diagnosed with CD may behave differently in relation to following a GFD. Papers that included CD patients who were diagnosed by intestinal biopsy were judged to have a low risk of selection bias in relation to diagnostic accuracy. Where any other method of diagnosis was used, the risk of bias in relation to diagnostic accuracy was deemed to be high.
Studies that included a population representative of the wider population of people with CD were regarded as low risk in relation to participant representativeness. Where participants were recruited from coeliac support groups, whose members are known not to be representative of the wider population (Butterworth et al., 2004), or from other non-representative groups, the risk of selection bias was judged to be high.
Studies were judged on whether or not they employed an appropriate sampling method to discount selection bias, such as random sampling. Both the source and the method of sampling were considered in this assessment. Samples that were not likely to be representative of the wider population were given a lower rating than samples that provided the opportunity for a more representative sample to be selected.
2) Random variation/chance
Sample size calculation
Studies that reported a sample size calculation and reached the target sample size were judged to be low risk with regard to random variation/chance.
3) Detection bias
Validity of adherence measures
In assessing the validity of adherence measures I considered the methods of measuring adherence to a GFD. Serological testing and intestinal biopsy were considered to be valid measures, whereas self-reporting, interview or assessment by a healthcare professional (based on the patient’s self-reporting) were considered not to be valid measures of adherence.
Was follow-up the same for cases and controls (where applicable)? Were appropriate measures taken at follow-up?
4) Attrition bias
Loss to follow up
Loss of participants to follow-up (where applicable) can introduce bias and so compromise the validity of the results, because participants who drop out of a study may differ in some way from those who continue to participate. Studies that showed a high loss to follow-up were rated as having a high risk of attrition bias.
5) Reporting bias
Appropriateness of analysis
Reporting bias was judged on whether the authors used an appropriate method of analysing the data. I also considered whether significance was likely to be a result of chance and whether missing data were dealt with appropriately.
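The more rule-like parts of the five-domain assessment above can be sketched in code. This is an illustration only, not the quality checking form itself: the function names and the string labels are assumptions, rating a study as high risk whenever the low-risk condition for sample size is not met is also an assumption, and the tally of ratings is a hypothetical stand-in for the overall summation, whose exact rule is not reproduced here. Domains that rest on reviewer judgement (for example, appropriateness of analysis) are entered directly as ratings.

```python
from collections import Counter
from typing import Dict

# The five potential sources of bias assessed for each study.
DOMAINS = [
    "selection",
    "random variation/chance",
    "detection",
    "attrition",
    "reporting",
]

# Adherence measures treated as valid in this review.
VALID_ADHERENCE_MEASURES = {"serological testing", "intestinal biopsy"}


def diagnostic_accuracy_risk(diagnosis_method: str) -> str:
    """Low risk only when CD was confirmed by intestinal biopsy."""
    return "low" if diagnosis_method == "intestinal biopsy" else "high"


def adherence_measure_risk(measure: str) -> str:
    """Low risk for serology or biopsy; high for self-report based measures."""
    return "low" if measure in VALID_ADHERENCE_MEASURES else "high"


def sample_size_risk(calculation_reported: bool, target_reached: bool) -> str:
    """Low risk when a calculation was reported and the target was reached.

    Treating every other case as high risk is an assumption of this sketch.
    """
    return "low" if calculation_reported and target_reached else "high"


def summarise(ratings: Dict[str, str]) -> Counter:
    """Tally low/high ratings across the five domains (hypothetical summary)."""
    missing = [d for d in DOMAINS if d not in ratings]
    if missing:
        raise ValueError(f"Missing ratings for: {missing}")
    return Counter(ratings[d] for d in DOMAINS)
```

A simple tally gives every domain equal weight, which is the limitation of summary scores noted by Higgins & Green (2011); it is used here for descriptive clarity only.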