General Data Management and Statistical methods

3 Patients, Materials and Methods

3.2 General Data Management and Statistical methods

3.2.1 Data Management Methods

3.2.1.1 Laboratory database

All culture-positive isolates were logged by date, isolate, and patient sex and age. Between 2000 and 2010, these data were recorded in laboratory books and entered manually into a computer spreadsheet for analysis. In 2010, the Laboratory Information Management System (LIMS) was introduced to the MLW laboratory, replacing the paper system with an electronic sample management system. These electronic data were imported directly from the laboratory system to the MLW data department in an Excel spreadsheet (Microsoft Office 2007). The data from the manual and electronic databases were merged prior to analysis. Details of the data cleaning process can be found in Chapter 4, Section 4.4.2.

3.2.1.2 Clinical database

All clinical trials and observational studies (published and unpublished) undertaken at the Department of Medicine, College of Medicine, QECH from 1990 to the present, in which adults with ABM were recruited prospectively and for which data were freely available, were identified. Published studies were found using an online search of the medical literature database PubMed from the National Institutes of Health in the United States. Unpublished studies were identified by consultation with the previous and current MLW directors, Professors Malcolm Molyneux and Rob Heyderman, and with the heads of the Department of Medicine, Professors Ed Zijlstra and Theresa Allain. Free access to each study database was obtained from the principal investigator of that study. Selected variables were standardised with matching names, and re-coded for unification. Data from these variables were then merged from each individual database into a single database with the help of Philip Gichiru, statistician at LSTM, using IBM SPSS version 17 for Windows. The clinical database was cleaned, removing cases where no mortality data at day 10 or day 40 were available. Cases that did not meet the criteria for the diagnosis of bacterial meningitis (CSF WCC >100 cells/mm³ in HIV-negative patients, or >5 cells/mm³ in HIV-positive patients, or proven evidence of ABM) were also removed. Details of the full inclusion criteria for this database can be found in Chapter 4, Section 4.4.2. The database was cross-checked for coding errors using basic frequency analysis, and missing data were identified. Where possible, the contributing clinician was contacted for details on coding and missing data.
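The cleaning rules described above can be illustrated with a minimal sketch. It is not the SPSS procedure used for the thesis: the field names (`csf_wcc`, `hiv_positive`, `proven_abm`, `dead_day10`, `dead_day40`) are hypothetical, the sketch assumes a case is kept when at least one of the two mortality endpoints is recorded, and it does not handle unknown HIV status.

```python
from typing import Optional


def meets_meningitis_criteria(csf_wcc: Optional[float],
                              hiv_positive: bool,
                              proven_abm: bool) -> bool:
    """Apply the CSF white cell count thresholds stated in the text."""
    if proven_abm:
        return True          # proven ABM qualifies regardless of cell count
    if csf_wcc is None:
        return False         # no cell count and no proof: cannot qualify
    threshold = 5 if hiv_positive else 100   # cells/mm3
    return csf_wcc > threshold


def has_mortality_data(case: dict) -> bool:
    """Assumption: one recorded endpoint (day 10 or day 40) is enough."""
    return (case.get("dead_day10") is not None
            or case.get("dead_day40") is not None)


def clean_cases(cases: list) -> list:
    """Keep cases with outcome data that meet the diagnostic criteria."""
    return [
        c for c in cases
        if has_mortality_data(c)
        and meets_meningitis_criteria(c.get("csf_wcc"),
                                      c["hiv_positive"],
                                      c["proven_abm"])
    ]
```

If the intended rule were that both endpoints must be present, only `has_mortality_data` would need to change (`or` becomes `and`).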

3.2.2 Epidemiology of bacterial meningitis

The laboratory database was cleaned, removing duplicate cases and all cases whose isolate was not consistent with bacterial meningitis. A separate database was kept of all culture-negative samples. The database was first interrogated for pathogen seasonality and incidence. All isolates were then divided into age groups, and the incidence, trends and frequency of each pathogen over time were calculated for each age group. Detailed methods, including data cleaning, can be found in Chapter 4, Section 4.2.2. All analysis was undertaken using Stata version 10 (StataCorp, USA); tables and charts were generated using Microsoft Excel.
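As a rough illustration of the de-duplication and age-group tabulation, the sketch below counts isolates by year, age band and pathogen. It is not the Stata code used in the thesis. The age bands, the field names and the choice of de-duplication key (date, pathogen, sex, age) are all assumptions made for the example. It produces frequencies only, because incidence would also require population denominators that are not described in this section.

```python
from collections import Counter

# Hypothetical age bands in years: (lower bound, upper bound, label).
AGE_BANDS = [(0, 4, "0-4"), (5, 14, "5-14"), (15, 200, "15+")]


def age_band(age):
    """Map an age in years to a band label; missing ages become 'unknown'."""
    if age is None:
        return "unknown"
    for low, high, label in AGE_BANDS:
        if low <= age <= high:
            return label
    return "unknown"


def deduplicate(isolates):
    """Drop repeat rows sharing date, pathogen, sex and age (assumed key)."""
    seen = set()
    unique = []
    for row in isolates:
        key = (row["date"], row["pathogen"], row["sex"], row["age"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def counts_by_year_age_pathogen(isolates):
    """Count unique isolates per (year, age band, pathogen)."""
    counts = Counter()
    for row in deduplicate(isolates):
        year = row["date"][:4]          # dates assumed to be ISO strings
        counts[(year, age_band(row["age"]), row["pathogen"])] += 1
    return counts
```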

3.2.3 Predictors of poor outcome using logistic regression

The clinical database was interrogated using univariate and multivariate logistic regression to test for variables that independently predicted poor outcome at 10 days and 40 days after admission with bacterial meningitis. Data were analysed using IBM SPSS/PASW versions 17-20. For continuous variables, parametric (Student's t) and non-parametric (Mann-Whitney U) tests were used to compare survivors and non-survivors; Fisher's exact and chi-square tests were used to compare categorical variables. Backwards stepwise logistic regression methods were used to estimate the influence of different variables on clinical outcome; each model was compared at each step with the previous model, and only variables that remained significant at the 90% level were retained in the analysis for the next step. The initial analysis plan specified that only variables reaching statistical significance at or above the 95% level on univariate analysis were to be included in multivariate analyses. However, variables that were predicted to be significant from previously published data but were found to be non-significant on univariate analysis were subsequently included in the multivariate analysis to test the strength of the negative association. The strength of relationships was expressed using odds ratios with 95% confidence intervals. All statistical tests were two-tailed, and a p value of <0.05 was used to denote statistical significance.
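For a single binary predictor, the odds ratio and its 95% confidence interval can be computed directly from a 2x2 table, which is a convenient way to see what the reported figures mean. The sketch below uses the standard log (Woolf) method; it is an illustration rather than a reproduction of the SPSS output, and the function name and the cell counts in the usage note are invented for the example.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald-type confidence interval from a 2x2 table.

    a: exposed, poor outcome      b: exposed, good outcome
    c: unexposed, poor outcome    d: unexposed, good outcome
    z: normal quantile (1.96 gives a two-sided 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With illustrative counts `odds_ratio_ci(20, 10, 10, 20)`, the odds ratio is 4.0 with an interval of roughly 1.4 to 11.7; because the interval excludes 1, the association would be significant at the 5% level. In a multivariate model the same quantities are obtained by exponentiating each coefficient and its confidence limits.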

This approach was used to analyse the data from the historical clinical database (Chapter 4), to synthesise the predictive outcome score (Chapter 5), to compare data in Phase 1 and Phase 2 of BAM (Chapter 6) and to identify predictors of poor survival outcomes from BAM (Chapter 7). Selection methods for the variables used in the multivariate models for each chapter can be found in the relevant methods sections of those chapters.

3.2.4 Statistical methods for specific chapters

a) Derivation and validation of a score to predict poor outcome from bacterial meningitis

The clinical database was divided randomly into separate derivation and validation cohorts. The BAM Phase 1 data were added to the validation cohort to obtain a case ratio of 2:1 between the two cohorts. The severity score was derived first by using methods detailed in section 1.4.3 to determine predictors of poor outcome using logistic regression. Variables were then put forward into a nomogram to calculate a total score with a predictive index. The nomogram was then applied to the validation cohort, and the sensitivity, specificity and predictive index of the nomogram were calculated. Detailed methods are available in Chapter 5, Section 5.2.
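The two mechanical steps in this procedure, a random split into cohorts and the calculation of sensitivity and specificity for a score cut-off, can be sketched as follows. This is a minimal illustration under stated assumptions: the split fraction, the seed, the function names and the rule that a score at or above the cut-off predicts poor outcome are all hypothetical, and the predictive index is not computed because it is not defined in this section.

```python
import random


def split_cohorts(case_ids, derivation_fraction=0.5, seed=0):
    """Randomly split case identifiers into derivation and validation sets."""
    rng = random.Random(seed)        # fixed seed makes the split reproducible
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * derivation_fraction)
    return shuffled[:cut], shuffled[cut:]


def sensitivity_specificity(scores, poor_outcomes, cutoff):
    """Sensitivity and specificity when score >= cutoff predicts poor outcome."""
    tp = fn = tn = fp = 0
    for score, poor in zip(scores, poor_outcomes):
        predicted_poor = score >= cutoff
        if poor and predicted_poor:
            tp += 1
        elif poor:
            fn += 1
        elif predicted_poor:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

The cut-off would be chosen in the derivation cohort and then applied unchanged to the validation cohort, so that the validation estimates are not tuned to the data they are measured on.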

b) Methods to analyse the BAM study before/after design

Standard approaches to clinical trial analysis were applied to the BAM data to compare Phase 1 and Phase 2 data for each endpoint. These included Kaplan-Meier curves, Fisher’s exact and chi-square tests, logistic regression and composite analysis of individual elements of the clinical care bundle to test for the proportion of targets achieved. Full statistical methods are detailed in this chapter, Section 3.3.12.
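Of the methods listed, the Kaplan-Meier curve is the least obvious to compute by hand, so a minimal sketch of the product-limit estimator is given here. It is a generic textbook implementation rather than the software routine used for the BAM analysis, and the function name and the follow-up times in the usage note are invented for illustration.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient (for example, days)
    events: True if the patient died at that time, False if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving   # deaths and censored cases both leave the risk set
    return curve
```

For example, `kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False])` gives survival estimates of about 0.8, 0.6 and 0.3 at times 2, 3 and 5. Censored patients contribute to the number at risk until their follow-up ends, but they do not produce a step in the curve.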

3.3 Ethical approvals for studies on human subjects

In document Early goal directed therapy for adult meningitis in Malawi (Page 107-110)