3 Materials and Methods
3.9 Data collection, handling and analysis
3.9.1 Data collection forms
Field data were collected using four forms, which have been highlighted previously: the Initial Home Visit form, the regular Home Visit form, the Clinic Visit form and the Household Risk Survey Questionnaire.
i) The Initial Home Visit Form
This form was developed by the project manager before the start of the study and was administered immediately after the household consented. The form captured baseline demographic characteristics at household and individual level. Data collected included the household head’s name, marital status and highest educational level, and the number of families in the household. In addition, participant-specific details such as date of birth (mainly from a birth certificate or national identity card), occupation, education status, relationship to the study infant, and living arrangement (whether living or sleeping in the same house as the study infant) were collected. The field workers collected these data.
ii) Regular Home and Clinic Visit Forms
The two forms were developed and piloted during the early phase of the study. Following feedback from the participants, field workers and the clinician administering the forms, the questions were revised. For the Home Visit form, which was completed twice a week for every participant, the priority was to make the data collection exercise brief and targeted. Thus the form was restricted to collecting data on specimen (NPS and OF) collection, a quick illness assessment for the presence of respiratory symptoms (cough, runny or blocked nose, difficulty in breathing etc.) or other complaints, and recording of vital signs such as temperature and respiratory rate for the under-five-year-olds. Reasons for not collecting a specimen, and any other complaints, were also captured. For ease of data collection each household had a customised form with the list of members already included and the required data in a tabular format on one side, with space for additional comments on the back of the form. The field workers were required to record a ‘yes’ or ‘no’ response for most of the questions, though in some instances a value was required. The Clinic Visit form was similar to the Home Visit form except that the illness assessment was more elaborate. The form was adopted from earlier studies (Okiro 2007; Munywoki et al. 2011). The clinician collected the data during the participant’s attendance at the study clinic. Additional data on anthropometric measures (weight, mid-upper arm circumference (MUAC), height), oxygen saturation (using a pulse oximeter), heart rate, any laboratory tests done, diagnosis and treatment given were also recorded.
iii) Household Risk Survey Questionnaire
The questionnaire was also modified from a previous study at our site (Okiro 2007; Okiro et al. 2008). Data on potential risk factors for virus transmission and infection in the household were collected. These included details on who took care of the study infant in the household and how, ownership of property and quality of the household head’s house (for assessing socio-economic status), presence of a toilet and waste management, as well as source of water for domestic use. Individual-level characteristics, such as school attendance and smoking status, and anthropometric measures such as weight, mid-upper arm circumference (MUAC) and height were also recorded.
3.9.2 Data handling and entry
The project manager, assisted by the study coordinator, reviewed the forms at the field office. If any anomalies were identified at this stage, corrective action was taken, mainly involving a revisit of the household. The completed forms were forwarded to KWTRP in Kilifi on a daily basis for data entry. All the field and laboratory data were double-entered in a FileMaker database specially designed and coded by the project manager, in consultation with the Centre’s programmers (FileMaker Pro version 9, FileMaker Inc, US). In addition to the strict data checks at entry (e.g. set date formats, decimals allowed and provision of drop-down menus), random checks were regularly conducted on the database to ensure the data were accurate and up to date. The regular checks involved randomly selecting ~10 data collection forms every week and cross-checking them against the entered data. Appendix O shows screenshots of the database. All the source documents were sorted by household identity and stored chronologically in a cabinet accessible only to the study team.
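The weekly random check described above can be illustrated with a short Python sketch. This is only an assumption about how such a draw could be made reproducibly; the study does not state how forms were selected, and the form identifiers, the function name `draw_audit_sample` and the seed are hypothetical.

```python
import random


def draw_audit_sample(form_ids, k=10, seed=None):
    """Pick up to k distinct forms at random for cross-checking.

    A fixed seed makes the weekly draw reproducible (hypothetical design choice).
    """
    rng = random.Random(seed)
    return sorted(rng.sample(form_ids, min(k, len(form_ids))))


# Hypothetical identifiers: 5 households x 8 visits entered in one week.
week_forms = [f"HH{h:02d}-V{v:02d}" for h in range(1, 6) for v in range(1, 9)]
audit = draw_audit_sample(week_forms, k=10, seed=2010)
```

Each selected form would then be compared by hand against the corresponding database record.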
3.9.3 Data cleaning and analysis
The double-entered data were exported as comma-separated values files and loaded into STATA (version 11.2, StataCorp, College Station, Texas, US) for data cleaning. Original forms were used to resolve any disparities between the two entries. The clean data with the household-, participant-, visit- and sample-level variables and laboratory results were used for subsequent analysis. Table 3.1 in this chapter shows the definitions of terms that guided the structuring of the data in readiness for analysis. The specific analysis plans are presented in the methods sections of the subsequent chapters. All analyses were done in STATA version 11.2 unless otherwise stated.
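As an illustration of the double-entry comparison, the following Python sketch lists every field on which two exported entries disagree, so that the original form can be consulted. It is not the STATA code used in the study; the column names (`record_id`, `temperature`, `cough`), the function name and the example values are invented for the sketch.

```python
import csv
import io


def find_discrepancies(first_csv, second_csv, key="record_id"):
    """Return (record, field, first_value, second_value) for each mismatch."""
    first = {r[key]: r for r in csv.DictReader(io.StringIO(first_csv))}
    second = {r[key]: r for r in csv.DictReader(io.StringIO(second_csv))}
    issues = []
    for rec_id in sorted(set(first) | set(second)):
        a, b = first.get(rec_id), second.get(rec_id)
        if a is None or b is None:
            # Record keyed in only one of the two entries.
            issues.append((rec_id, "<missing record>", a is not None, b is not None))
            continue
        for field in a:
            va = (a.get(field) or "").strip()
            vb = (b.get(field) or "").strip()
            if va != vb:
                issues.append((rec_id, field, va, vb))
    return issues


# Invented example: the second entry mistypes the temperature of record 002.
ENTRY_1 = "record_id,temperature,cough\n001,37.2,yes\n002,38.5,no\n"
ENTRY_2 = "record_id,temperature,cough\n001,37.2,yes\n002,35.8,no\n"

discrepancies = find_discrepancies(ENTRY_1, ENTRY_2)
```

Each tuple in `discrepancies` points to one value that would need to be checked against the paper form.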
In this chapter, we analysed data arising from samples collected in a study conducted at Pingilikani Health Centre, an outpatient clinic, where the original objective was to assess the diagnostic performance of NPS relative to NW (Munywoki et al. 2011). In the outpatient study, a total of 299 paired samples were collected, and RNA was extracted using a Qiagen kit and tested using the M-PCR assay. For the current work, we selected 30 archived nasal samples at random and extracted RNA using three different kits (QIAamp Viral RNA Mini Kit (Qiagen), MagNA Pure LC Total Nucleic Acid (TNA) and MagNA Pure LC High Performance (HP) kits). This was to assess the sensitivity of the various kits in the detection of RSV and other respiratory viruses. Another set of 112 nasal specimens was also randomly selected, RNA was extracted using the HP kit, and each extract was divided into two aliquots: one was used for uniplex and the other for triplex real-time RT-PCR to test for RSV A, RSV B and adenoviruses. This second set of samples was used to assess the effect of multiplexing on the detection of RSV and other respiratory viruses. Specimens were classified as positive for a particular pathogen if the Ct value was <35.0. A sample was considered a true positive if any of the extraction methods gave a positive result, and comparisons were made using McNemar’s chi-square test. The Binomial Exact method was used to determine 95% confidence intervals (CI) for the sensitivities (a one-sided 97.5% CI was reported if sensitivity was 100%). The mean (95% CI) of the Ct values by extraction or screening method was calculated and comparisons were made using a paired t-test for each virus.
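The classification and comparison steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the STATA analysis used in the study: the Ct values are invented, a missing Ct is treated as negative, the exact interval is computed by bisection on the binomial distribution (Clopper-Pearson), and the McNemar statistic is computed without a continuity correction. The paired t-test on Ct values is not shown.

```python
from math import comb, erfc, sqrt

CT_CUTOFF = 35.0  # from the text: positive if Ct < 35.0


def is_positive(ct):
    """A missing Ct (None, no amplification) counts as negative (assumption)."""
    return ct is not None and ct < CT_CUTOFF


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root(g):
    """Bisection root of g, assumed to rise from g(0) < 0 to g(1) > 0."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Exact binomial (Clopper-Pearson) interval for x successes out of n.

    When x == n the upper limit is 1 and the lower limit is the bound of a
    one-sided 97.5% interval, matching the reporting rule in the text.
    """
    lower = 0.0 if x == 0 else _root(lambda p: 1 - binom_cdf(x - 1, n, p) - alpha / 2)
    upper = 1.0 if x == n else _root(lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


def mcnemar_chi2(b, c):
    """McNemar chi-square (1 df, no continuity correction) and its p-value."""
    if b + c == 0:
        return 0.0, 1.0
    stat = (b - c) ** 2 / (b + c)
    return stat, erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 df


# Invented Ct values for two methods applied to the same six specimens.
ct_a = [22.1, 36.4, 30.0, None, 34.9, 28.3]
ct_b = [23.0, 33.8, 31.2, None, 35.0, None]

pos_a = [is_positive(ct) for ct in ct_a]
pos_b = [is_positive(ct) for ct in ct_b]
reference = [a or b for a, b in zip(pos_a, pos_b)]  # positive by either method
n_ref = sum(reference)

hits_a, hits_b = sum(pos_a), sum(pos_b)
sens_a, sens_b = hits_a / n_ref, hits_b / n_ref
ci_a, ci_b = exact_ci(hits_a, n_ref), exact_ci(hits_b, n_ref)

# Discordant pairs drive McNemar's test.
only_a = sum(a and not b for a, b in zip(pos_a, pos_b))
only_b = sum(b and not a for a, b in zip(pos_a, pos_b))
chi2, p_value = mcnemar_chi2(only_a, only_b)
```

Because the reference standard is positivity by either method, every positive result from one method is automatically a reference positive, so each sensitivity is simply that method's positives divided by `n_ref`.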