Chapter 3 Observational Research
3.2 Observational Studies
3.2.3 Methods for Dealing with the Limitations of Observational Studies
Many design and analysis features can address the limitations of observational research, although they cannot completely eliminate them 173. These include, but are not limited to, matching (in case-control studies), stratification on known confounding factors, direct adjustment when the relationship between the response, exposure and confounders is well established, and multivariable regression analysis. More advanced techniques include propensity scoring, instrumental variable methods, sample selection models and marginal structural models 184, 185.
3.2.3.1 Regression Analysis
Regression analysis is one of the most commonly used analytical techniques to control for confounding. It is based on modelling the mathematical relationships between two or more variables to give an approximate description of the observed data. After the relationship between the exposure of interest and the outcome is first examined (giving a “crude” or “unadjusted” result), known confounders are added to the model to provide an effect that is “adjusted” for them 185. Multiple logistic regression is commonly used in observational research to assess the relationship between certain exposures or treatments and a binary outcome while controlling for confounders and effect modifiers 198. This analysis allows the association between dependent and independent variables to be estimated while controlling for the influence of other independent variables 199. Multivariable regression allows for models that include many confounders. However, including many covariates may reduce statistical power. Moreover, regression cannot adjust for a confounder that is unknown or unmeasured.
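The difference between a crude and an adjusted estimate can be illustrated with a minimal sketch on simulated data. Everything here is an assumption made for illustration: the variable names, the simulated probabilities, and the small Newton-Raphson fitting routine `fit_logistic` (a statistical package would normally be used instead). The simulated exposure has no true effect on the outcome, so any crude association arises from the confounder alone.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson.
    X must already contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))      # fitted probabilities
        grad = X.T @ (y - p)                     # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated data (hypothetical): the confounder raises both the chance of
# exposure and the chance of the outcome; the exposure itself has no effect.
rng = np.random.default_rng(0)
n = 20000
confounder = rng.binomial(1, 0.5, n)
exposure = rng.binomial(1, 0.2 + 0.5 * confounder)
outcome = rng.binomial(1, 0.1 + 0.4 * confounder)

ones = np.ones(n)
crude = fit_logistic(np.column_stack([ones, exposure]), outcome)
adjusted = fit_logistic(np.column_stack([ones, exposure, confounder]), outcome)

crude_or = np.exp(crude[1])        # unadjusted odds ratio for exposure
adjusted_or = np.exp(adjusted[1])  # odds ratio adjusted for the confounder
print(f"crude OR = {crude_or:.2f}, adjusted OR = {adjusted_or:.2f}")
```

In this simulation the crude odds ratio is well above 1, while the adjusted odds ratio is close to 1, which is the pattern expected when the association is explained by a measured confounder.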
3.2.3.2 Propensity Score Analysis
Adjustment with conventional multivariable techniques may not adequately balance the groups, and the remaining bias may limit valid causal inferences 198. Propensity score (PS) analysis, which will be applied in this thesis, is another approach that is used with increasing frequency to account for confounding 200, 201. It is a statistical method developed for estimating treatment effects with non-experimental or observational data 202. While it is by no means regarded as the best alternative to randomised experiments, there is a consensus among prominent researchers that the PS approach has reached a mature level and has much to offer researchers. It may balance covariates between the two groups and reduce bias better than conventional multivariable modelling.
This method involves the generation of a score that summarises the confounding by multiple variables. The PS has been described as the conditional probability, between 0 and 1, that a subject will be treated given an observed set of covariates. The larger collection of confounding covariates available in the dataset is collapsed into one score, the propensity to have received one treatment over the other 198. The PS is known as a balancing score: conditional on the PS, the distribution of measured baseline covariates is similar between treated and untreated subjects. Thus, in a set of subjects who all have the same propensity score, the distribution of observed baseline covariates will be the same between the treated and untreated subjects 203. When patients in one treatment group differ systematically from those in other treatment groups, minimising the bias through the use of a PS can make direct comparisons more meaningful 198, 199. In practice, the PS is often estimated using a logistic regression model, in which treatment status (i.e. untreated or treated) is regressed on observed baseline characteristics. The estimated PS is the predicted probability of treatment derived from the fitted regression model. Four methods of using the PS have been described in the literature: propensity score matching, stratification (also known as subclassification), inverse probability of treatment weighting (IPTW) and covariate adjustment using the PS 202-204.
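The definition and estimation of the PS described above can be written compactly as follows. The notation is an assumption introduced here for illustration and does not come from the cited sources: Z_i is the treatment indicator (1 = treated, 0 = untreated), X_i the vector of observed baseline covariates, and e(x_i) the propensity score.

```latex
\begin{align}
  % Propensity score: conditional probability of treatment given covariates
  e(x_i) &= \Pr(Z_i = 1 \mid X_i = x_i), \qquad 0 < e(x_i) < 1 \\
  % Logistic regression of treatment status on baseline covariates
  \log\frac{e(x_i)}{1 - e(x_i)} &= \beta_0 + \beta^{\top} x_i \\
  % Estimated PS: predicted probability from the fitted model
  \hat{e}(x_i) &= \frac{1}{1 + \exp\{-(\hat{\beta}_0 + \hat{\beta}^{\top} x_i)\}}
\end{align}
```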
PS matching entails forming matched sets of treated and untreated subjects with similar values of the propensity score. Although there are different approaches to matching, the most common approach in the medical literature is nearest neighbour matching without replacement within a specified caliper of the PS, and we will use this methodology in this thesis 201, 204-206. Using this approach, pairs of treated and untreated subjects are formed such that the PSs of the matched subjects lie within a specified distance of one another (the caliper width). Naïve matching, in which no limit is placed on the caliper width, is also possible: all treated subjects are matched to untreated subjects until every subject has been matched. This method assumes that, within the matched sample, treated and untreated subjects have similar distributions of baseline covariates.
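Nearest neighbour matching without replacement within a caliper can be sketched as below. This is a simplified, greedy illustration under stated assumptions rather than the procedure used in this thesis: the function name `match_nearest_neighbour`, the example scores, the caliper of 0.05 on the PS scale, and the choice to process treated subjects in the order given are all hypothetical.

```python
import numpy as np

def match_nearest_neighbour(ps_treated, ps_control, caliper):
    """Greedy 1:1 nearest-neighbour matching without replacement.
    Returns a list of (treated_index, control_index) pairs."""
    available = set(range(len(ps_control)))  # controls not yet used
    pairs = []
    for i, ps in enumerate(ps_treated):
        if not available:
            break
        # Closest remaining control on the propensity score
        j = min(available, key=lambda k: abs(ps_control[k] - ps))
        # Accept the pair only if it lies within the caliper width
        if abs(ps_control[j] - ps) <= caliper:
            pairs.append((i, j))
            available.remove(j)  # without replacement
    return pairs

ps_treated = np.array([0.30, 0.62, 0.90])
ps_control = np.array([0.28, 0.33, 0.60, 0.10])
pairs = match_nearest_neighbour(ps_treated, ps_control, caliper=0.05)
print(pairs)
```

Here the third treated subject (PS 0.90) has no control within the caliper and is left unmatched, which illustrates how matching can exclude treated subjects. Passing `caliper=float("inf")` corresponds to naïve matching with no limit on the caliper width.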
Stratification on the PS involves comparing outcomes between treated and untreated subjects within strata 204, 205. The most common approach is to use five approximately equal-sized strata defined by the quintiles of the PS. Rosenbaum and Rubin demonstrated that stratifying on the quintiles of the estimated PS eliminates approximately 90% of the bias due to measured confounders 207. Treated and untreated subjects are compared within each stratum. This method assumes that, within each stratum, treated and untreated subjects have a similar distribution of baseline covariates 204, 205.
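Stratification on PS quintiles can be sketched as follows, using simulated data in which the true risk difference is 0.1. The function name, the simulated probabilities, and the choice to pool stratum-specific risk differences weighted by stratum size are assumptions made for illustration; the source does not prescribe a pooling rule.

```python
import numpy as np

def stratified_risk_difference(ps, treated, outcome, n_strata=5):
    """Pool within-stratum risk differences across PS quantile strata."""
    edges = np.quantile(ps, np.linspace(0, 1, n_strata + 1))
    # Stratum index 0..n_strata-1, assigned from the interior cut points
    stratum = np.searchsorted(edges[1:-1], ps, side="right")
    estimates, weights = [], []
    for s in range(n_strata):
        in_s = stratum == s
        t = in_s & (treated == 1)
        c = in_s & (treated == 0)
        if t.sum() == 0 or c.sum() == 0:
            continue  # a stratum lacking one group cannot be compared
        estimates.append(outcome[t].mean() - outcome[c].mean())
        weights.append(in_s.sum())
    return np.average(estimates, weights=weights)

# Simulated data (hypothetical): the PS drives both treatment and outcome
rng = np.random.default_rng(2)
n = 50000
ps = rng.uniform(0.1, 0.9, n)
treated = rng.binomial(1, ps)
outcome = rng.binomial(1, 0.2 + 0.3 * ps + 0.1 * treated)

crude = outcome[treated == 1].mean() - outcome[treated == 0].mean()
stratified = stratified_risk_difference(ps, treated, outcome)
print(f"crude = {crude:.3f}, stratified = {stratified:.3f}")
```

The crude difference overstates the treatment effect, whereas the stratified estimate lies close to the simulated value of 0.1, with a small residual bias remaining within strata.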
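Inverse probability of treatment weighting (IPTW) weights each subject by the inverse of the probability of receiving the treatment actually received, creating a weighted sample in which the distribution of measured baseline covariates is the same in treated and untreated subjects. The distribution of covariates is changed in both the treated and untreated subjects so that it matches the distribution in the entire sample 203.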
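A minimal sketch of IPTW is given below. It assumes the commonly used weights of 1/PS for treated subjects and 1/(1 - PS) for untreated subjects, which reweight both groups towards the entire sample; the function name and the simulated data (one binary covariate and a known, true PS) are hypothetical.

```python
import numpy as np

def iptw_weights(treated, ps):
    """Weights of 1/ps for treated and 1/(1 - ps) for untreated subjects."""
    return treated / ps + (1 - treated) / (1 - ps)

# Simulated data (hypothetical): one binary covariate that raises the
# probability of treatment from 0.2 to 0.8
rng = np.random.default_rng(1)
n = 100000
x = rng.binomial(1, 0.5, n)
ps = 0.2 + 0.6 * x  # true PS, known here only because the data are simulated
treated = rng.binomial(1, ps)
w = iptw_weights(treated, ps)

is_t = treated == 1
# Difference in covariate prevalence between groups, before and after weighting
raw_gap = x[is_t].mean() - x[~is_t].mean()
weighted_gap = (np.average(x[is_t], weights=w[is_t])
                - np.average(x[~is_t], weights=w[~is_t]))
print(f"raw gap = {raw_gap:.3f}, weighted gap = {weighted_gap:.3f}")
```

Before weighting, the covariate is far more common among treated subjects; after weighting, its prevalence in both groups is close to that of the whole sample. Subjects with a PS near 0 or 1 receive very large weights, which is one reason weighted estimates can be unstable.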
The final method of propensity score analysis is covariate adjustment. In this method, the outcome variable, which for this research would be ‘SVR achieved’ or ‘SVR not achieved’, is regressed on a dummy variable denoting treatment status and the estimated propensity score, with or without other covariates. When the outcome is binary, a logistic regression model is used. The effect of treatment is determined using the estimated regression coefficient from the fitted regression model 203, 204.
Each method of propensity score analysis has advantages and disadvantages 208. PS matching (PSM) is regarded as the superior method for reducing bias. Additionally, it is a transparent and easy-to-follow method and, as in an RCT, the outcomes of treated and control participants in PSM can be compared directly. PSM is also considered to be the method most robust to misspecification of the propensity score model (e.g. excluding a covariate or including too many covariates). However, the exclusion of unmatched treated participants (and possibly unmatched controls) can decrease the precision of the estimates 203, 209, 210. The major advantage of PS stratification is that it includes all available study subjects and therefore increases the precision of the estimates; it is, again, robust to misspecification. The main disadvantage of this method is that, although division of the propensity scores into five strata has been shown to eliminate 90% of the bias from measured confounders, stratification reduces bias less than other methods 203, 209, 211, 212. As in PS stratification, covariate adjustment uses all
included study subjects. However, this technique is not considered ideal for several reasons. The model produces odds ratios and hazard ratios but does not allow estimation of the number needed to treat, and has been shown to produce more biased estimates of these ratios. Additionally, the assessment of balance in baseline covariates between treatment groups is more cumbersome than with other methods 203, 212-214. And finally, while IPTW
reduces bias more than stratification and covariate adjustment, misspecification of the propensity score model may have a substantial impact on the weights applied, and the resulting estimates can therefore become unstable 203, 209, 211, 212.
PSM and PS stratification were applied to our observational dataset for several reasons: the large number of covariates in our data that needed to be included in the propensity score model, the recognised robustness of PSM, the ability of PS stratification to include all data in the analysis, and the fact that these two methods are the most commonly reported in the published literature 215-217.
Covariates were selected for inclusion following a review of the literature and consultation with clinicians 218-222. Prior knowledge of covariates that affect the outcome is important. Only covariates that simultaneously influence the treatment assignment and the outcome should be included. There is debate over the effect of including too few or too many variables in the estimation of the PS, with some suggesting that it may affect the variance around the outcomes, but there is broad agreement that the choice of variables should be based on previous findings and expert advice 223, 224.
For each method, the relative balance in measured baseline covariates is compared using the standardised difference 204. The standardised difference compares the mean of a covariate between treatment groups in units of the pooled standard deviation and is not influenced by sample size. To assess the balance in covariates after PS matching, the balance in measured baseline covariates between treated and untreated subjects within the PS matched sample was compared for all three approaches using standardised differences. For stratification, within-strata standardised differences are computed to compare the distribution of baseline covariates between treated and untreated subjects within the same PS stratum. The mean standardised difference is then determined across the strata 204. Although there is no universal criterion as to what threshold of
standardised difference can be used to indicate important imbalance, a standardised difference that is less than 0.1 has been taken to indicate a negligible difference in the mean of a covariate between treated and untreated subjects 203.
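The standardised difference for a continuous covariate can be sketched as below. The formula assumed here is the difference in group means divided by the square root of the average of the two group variances; the function name and the example ages are hypothetical.

```python
import numpy as np

def standardised_difference(x_treated, x_control):
    """Difference in means in units of the pooled standard deviation."""
    pooled_sd = np.sqrt(
        (np.var(x_treated, ddof=1) + np.var(x_control, ddof=1)) / 2
    )
    return (np.mean(x_treated) - np.mean(x_control)) / pooled_sd

# Hypothetical ages in a small matched sample
age_treated = np.array([50.0, 54.0, 58.0])
age_control = np.array([49.0, 53.0, 57.0])
d = standardised_difference(age_treated, age_control)
print(f"standardised difference = {d:.2f}")
```

In this example the means differ by 1 year and the pooled standard deviation is 4 years, giving a standardised difference of 0.25, which exceeds the 0.1 threshold and would be read as an imbalance in age.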
3.2.3.3 Multiple Imputation
Missing data are unavoidable in clinical research 225. Missing data are unrecorded values which, if recorded, would be meaningful for the analysis and interpretation of a study 226, 227. They can pose a threat to the validity of observational outcome analysis 228. The intent of any analysis is to make valid inferences from the data, and therefore meaningful assessment of patients’ outcomes in observational studies requires a reliable and accurate measure of the outcome itself.
Researchers have often addressed missing data by including only complete cases in the analysis, that is, those individuals who have no missing data in any of the variables required for the analysis. However, this often leads to the exclusion of a substantial proportion of the original sample, which in turn can cause bias and loss of precision and power 225. The risk of bias due to missing data depends on the reasons why data are missing.
3.2.3.4 Dealing with Missing Data
Given that nearly all statistical methods presume complete information for all variables included in the analysis, it is essential that appropriate methods are employed to deal with missing data 229. It is possible to distinguish between two patterns of missingness.
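A missing-data pattern is monotone if the variables can be ordered so that, whenever a variable is missing for a subject, all subsequent variables are also missing for that subject. The pattern is arbitrary if there is no way to order the variables to produce such a pattern (Figure 12).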
Sourced by Soley-Bari 229
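The distinction between monotone and arbitrary patterns can be checked programmatically. The sketch below assumes the usual definition of a monotone pattern (the variables can be ordered so that once a value is missing, all later variables are missing too); the function name and the small example matrices are hypothetical.

```python
import numpy as np

def is_monotone(missing):
    """missing: 2-D boolean array, rows = subjects, columns = variables,
    True where a value is missing."""
    # Order variables from least to most missing; if any monotone ordering
    # exists, this ordering is one of them.
    order = np.argsort(missing.sum(axis=0), kind="stable")
    ordered = missing[:, order]
    # Monotone: within each row, a missing value is never followed by an
    # observed one.
    return bool(np.all(ordered[:, :-1] <= ordered[:, 1:]))

monotone_pattern = np.array([[False, False, False],
                             [False, False, True],
                             [False, True, True]])
arbitrary_pattern = np.array([[False, True, False],
                              [True, False, False],
                              [False, False, False]])
print(is_monotone(monotone_pattern), is_monotone(arbitrary_pattern))
```

The first matrix resembles dropout in a longitudinal study, where a subject who misses one measurement misses all later ones. In the second, no reordering of the variables produces that staircase shape.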
Statistical imputation techniques should be used to estimate or approximate missing values by modelling the characteristics of cases with missing data on those of cases with complete data. There are several methods to deal with missing data, including ad hoc methods such as case-finding, case deletion and mean substitution, and more principled methods such as maximum-likelihood methods, multiple imputation (MI) and others. When it is plausible that data are missing arbitrarily, analysis based on complete cases may be biased. Such biases can be overcome using methods such as MI that allow individuals with incomplete data to be included in the analysis 230. In this thesis, we will employ the MI methodology to deal with missing data, where appropriate.
The goal of MI is to provide valid inference when the data are incomplete and the reasons for missing values are not known. One of the basic objectives of MI is to enable the use of complete-data methods in the analysis 230. MI is a general method that incorporates uncertainty into the imputation process. MI comprises three stages: an imputation stage, in which missing data are imputed; an analysis stage, in which each completed dataset is analysed using complete-data techniques; and a final stage, in which the results from the analyses are combined to yield a final result that reflects both the uncertainty in the data and the uncertainty due to missing values. During the imputation stage, imputed values are drawn from a distribution and therefore inherently contain some variation. MI thus addresses the limitations of single imputation by introducing an additional source of error based on the variation in parameter estimates across the imputations. It replaces each missing value with two or more acceptable values, representing a distribution of possibilities 225.
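The final combination stage can be sketched with Rubin's rules, which are assumed here as the pooling method because they are the standard approach; the source does not name the rule used. The function name and the three example estimates are hypothetical.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Combine estimates from m imputed datasets using Rubin's rules.
    Returns the pooled estimate and its total variance."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    pooled = estimates.mean()          # average of the m estimates
    within = variances.mean()          # uncertainty in the data
    between = estimates.var(ddof=1)    # uncertainty due to missing values
    total = within + (1 + 1 / m) * between
    return pooled, total

# Hypothetical estimates and squared standard errors from m = 3 imputations
pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(f"pooled estimate = {pooled:.2f}, total variance = {total:.3f}")
```

The total variance exceeds the average within-imputation variance whenever the estimates differ across imputed datasets, which is how the extra uncertainty from the missing values enters the final result.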
The advantages of this method are 231-233:
It produces unbiased estimates of standard errors and thus helps to preserve the original distribution of the available data, providing more validity than ad hoc approaches to missing data.
It uses all available data, preserving sample size and statistical power.
It introduces random variation, which improves the chance of obtaining unbiased estimates of all parameters.
The results are readily interpretable.
It is suitable for dealing with data missing arbitrarily.