Chapter 2 Patients and methods
2.5 Statistical analysis
2.5.4 Statistical models
2.5.4.3 Latent class growth analysis.
Latent growth curve models describe change in outcome over time, but the sample population is treated as one large group and the results represent
mean change for the whole cohort. Clinical experience demonstrates that inter-individual variation in RA outcome is likely to be significant. For example, some patients may have more aggressive forms of RA and experience greater disease activity than others. Similarly, patients respond differently to
treatment with the same medication. Latent class growth analysis (LCGA) can be applied to test the hypothesis that there are distinct groups of cases within a population, whereby each group follows a defined growth curve, or
trajectory. LCGA is a specific form of growth mixture modelling (GMM), in which groups of patients following a specific trajectory are identified, but the variation within groups is assumed to be zero. That is, the values of the mean intercept, linear and quadratic slopes are estimated for each group, but the variances of these parameters are constrained to zero (Dalhy, 2012). There is some debate surrounding the most suitable indicator of the optimal number of growth trajectories, but the Bayesian Information Criterion (BIC) is popular as it has been shown to perform well (Jung and Wickrama, 2008). Furthermore, the number of trajectories should be chosen with consideration of additional factors, including parsimony and interpretability of the model.
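The distinction between LCGA and GMM can be written out as a worked equation. The following is a sketch only; the symbols (y for the outcome, c for latent class, beta for the growth parameters, pi for the class proportions) are notation assumed here for illustration and are not taken from the cited sources. For patient i at time t, conditional on membership of class k out of K classes:

```latex
% Quadratic trajectory for class k (assumed notation)
y_{it} \mid (c_i = k) = \beta_{0k} + \beta_{1k}\, t + \beta_{2k}\, t^{2} + \varepsilon_{it},
\qquad \varepsilon_{it} \sim N(0, \sigma^{2}_{t})

% LCGA constraint: no between-patient variation in growth parameters within a class
\operatorname{Var}(\beta_{0k}) = \operatorname{Var}(\beta_{1k}) = \operatorname{Var}(\beta_{2k}) = 0

% Marginal density of the observed outcomes as a mixture over classes
f(\mathbf{y}_i) = \sum_{k=1}^{K} \pi_k \, f(\mathbf{y}_i \mid c_i = k),
\qquad \sum_{k=1}^{K} \pi_k = 1
```

In a GMM the growth parameters would additionally carry patient-specific random effects with non-zero variances within each class; under LCGA all patients in a class share the same trajectory and differ only through the residual term.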
The present analyses were performed to test the hypotheses that there are distinct groups of patients with early RA, with each group following a defined growth curve, or trajectory, for DAS28 and HAQ-DI. The values of the Akaike information criterion (AIC) and BIC were compared for models with successive numbers of trajectories, and smaller values of BIC and AIC indicated improved fit. Entropy, the Lo, Mendell and Rubin likelihood ratio test (LMR-LRT) and the bootstrap likelihood ratio test (BLRT) were also considered. For the latter two tests, a significance level of p<0.05 was selected to indicate that the model fit was superior to that of a model with one less trajectory. Entropy is not strictly an indicator of model fit, but indicates how well subjects from the sample are classified into the trajectories described. It ranges from 0 to 1, with values approaching 1 indicating superior classification (Celeux and Soromenho, 1996).
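The indices described above can be illustrated with a minimal Python sketch. It assumes the standard definitions of AIC and BIC and the relative entropy statistic commonly reported for mixture models; the function names and input layout are hypothetical and do not represent the Mplus implementation used in this thesis.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: smaller values indicate better fit."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_subjects):
    """Bayesian information criterion: penalises parameters by ln(sample size)."""
    return -2.0 * log_likelihood + n_params * math.log(n_subjects)


def relative_entropy(posteriors):
    """Relative entropy of classification, ranging from 0 to 1.

    posteriors: one row per subject, each row holding that subject's
    posterior probabilities of belonging to each of K classes (assumed layout).
    """
    n_subjects = len(posteriors)
    n_classes = len(posteriors[0])
    if n_classes < 2:
        raise ValueError("relative entropy requires at least two classes")
    uncertainty = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # the limit of p * ln(p) as p approaches 0 is 0
                uncertainty -= p * math.log(p)
    return 1.0 - uncertainty / (n_subjects * math.log(n_classes))
```

When every subject is assigned to a single class with probability 1 the statistic equals 1, and when posterior probabilities are spread evenly across classes it equals 0.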
The results of the LCGA were reported in terms of:
The number of trajectory classes identified
Average latent class probabilities for most likely latent class membership
A description of each trajectory
A graphical representation of each trajectory, compared to the mean values observed for cases grouped into each class
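The average latent class probabilities for most likely latent class membership listed above can be derived from the posterior class probabilities, as in the following sketch. It assumes each patient is assigned to the class with the highest posterior probability; the function name and data layout are hypothetical.

```python
def average_class_probabilities(posteriors):
    """Average posterior probabilities, grouped by most likely class.

    posteriors: one row per subject with K posterior class probabilities
    (assumed layout). Returns a K x K matrix in which row i holds the mean
    posterior probabilities of subjects whose most likely class is i.
    """
    n_classes = len(posteriors[0])
    sums = [[0.0] * n_classes for _ in range(n_classes)]
    counts = [0] * n_classes
    for row in posteriors:
        # Modal assignment: the class with the highest posterior probability
        assigned = max(range(n_classes), key=lambda j: row[j])
        counts[assigned] += 1
        for j in range(n_classes):
            sums[assigned][j] += row[j]
    return [
        [s / counts[i] if counts[i] else float("nan") for s in sums[i]]
        for i in range(n_classes)
    ]
```

Diagonal values close to 1 indicate that patients assigned to a class belong to it with little ambiguity.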
Once the number of trajectories was identified and each trajectory class described as above, variables predictive of class membership were tested using multinomial logistic regression. Predictor variables applied to this analysis were: YEAR cohort, gender, RF positivity, ACPA positivity, HLA-DRB1 shared epitope positivity, and IMD quartile. A sub-analysis of YEAR C data was also carried out to test the hypothesis that DAS28 and HAQ-DI are influenced by the three additional variables: smoking (pack years), obesity (BMI), and comorbidities (present or absent).
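For illustration, the way a multinomial logistic regression converts covariate values into probabilities of class membership can be sketched as follows. The function, the coefficient layout and the choice of the first class as the reference category are assumptions made for this example only, and no coefficient values from the present analyses are implied.

```python
import math


def class_probabilities(coefficients, covariates):
    """Class membership probabilities from a multinomial logit model.

    coefficients: one (intercept, slopes) pair per non-reference class
    (hypothetical layout); the reference class has a linear predictor of 0.
    covariates: covariate values for one patient, in the same order as slopes.
    Returns probabilities with the reference class first.
    """
    logits = [0.0]  # reference class
    for intercept, slopes in coefficients:
        logits.append(intercept + sum(b * x for b, x in zip(slopes, covariates)))
    largest = max(logits)  # subtracted for numerical stability
    exps = [math.exp(value - largest) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

Each slope is the change in the log odds of membership of that class, relative to the reference class, per unit increase in the covariate.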
Using the numbers of trajectories for DAS28 and HAQ-DI identified by the latent class growth analyses, a dual trajectory analysis (Nagin and Tremblay, 2001) was conducted. That is, the trajectories of DAS28 and HAQ-DI were estimated simultaneously by a model that combined the parallel process model described in Section 2.5.4.2 and LCGA. The trajectories estimated by the model were described and compared to previous trajectories of DAS28 and HAQ-DI identified by the LCGA. It was then possible to determine the probability of a patient being assigned to one HAQ-DI trajectory group, based upon their DAS28 trajectory group. Predictors of DAS28 / HAQ-DI dual trajectory class were obtained using multinomial logistic regression.
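The conditional probabilities referred to above follow from the joint probabilities of dual trajectory membership. The sketch below assumes a table of joint probabilities indexed by DAS28 group (rows) and HAQ-DI group (columns); the function name and the example values in the test are hypothetical and are not results from this study.

```python
def conditional_probabilities(joint):
    """P(HAQ-DI group j | DAS28 group i) from joint membership probabilities.

    joint[i][j]: probability of belonging to DAS28 group i and HAQ-DI group j
    (assumed layout). Each returned row sums to 1.
    """
    conditional = []
    for row in joint:
        row_total = sum(row)  # marginal probability of the DAS28 group
        if row_total <= 0.0:
            raise ValueError("each DAS28 group must have positive probability")
        conditional.append([p / row_total for p in row])
    return conditional
```

Dividing by column totals instead of row totals would give the reverse conditional probabilities, that is, the probability of each DAS28 group given the HAQ-DI group.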
2.5.4.3.1 Post-hoc analysis: non-inflammatory causes of patient reported disease activity and disability
Once the trajectory groups were identified, consideration was given to potential factors contributing to HAQ-DI, other than inflammation due to RA. These included increased age, comorbidity and non-inflammatory causes of pain and disability such as psychological distress and passive coping. Whilst age in years was readily available in this cohort, data were not collected on indices of psychological distress and data on comorbidity were limited. As an indicator of non-inflammatory components of disease, a ‘DAS28-P’ index was calculated, which was similar to the index described by McWilliams and
colleagues (reviewed in Chapter 1, Section 1.8.1). In YEAR, baseline ESR was missing in 36% of cases, compared to 8% missing for CRP, so a DAS28-P index was calculated using CRP instead of ESR, by modifying the formula described by McWilliams et al. Thus, the DAS28-P(CRP) index was calculated as follows:
$$\text{DAS28-P(CRP)} = \frac{0.56\sqrt{\text{TJC28}} + 0.014 \times \text{VAS}}{0.28\sqrt{\text{SJC28}} + 0.56\sqrt{\text{TJC28}} + 0.36\ln(\text{CRP}+1) + 0.014 \times \text{VAS} + 0.96}$$

The DAS28-P(CRP) index was included as a covariate in the multinomial logistic regression model of predictors of DAS28/HAQ-DI dual trajectory. Due to the limited information collected on comorbidities (described in Table 2-2), and the large quantity of missing data on this variable (described in Chapter 3, Section 3.4), it was not possible to add this covariate into the
regression models. Instead, numbers of comorbidities were compared across DAS28 / HAQ-DI dual trajectory groups.
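The DAS28-P(CRP) calculation given above can be expressed as a short Python sketch. The function names are hypothetical, and the units (CRP in mg/L and VAS on a 0 to 100 mm scale) are assumed from the conventional DAS28-CRP definition rather than stated in this section.

```python
import math


def das28_crp_components(tjc28, sjc28, crp, vas):
    """Return (patient-reported part, total DAS28-CRP) for one patient.

    tjc28, sjc28: tender and swollen joint counts (0 to 28).
    crp: C-reactive protein, assumed to be in mg/L.
    vas: patient global assessment, assumed to be on a 0 to 100 mm scale.
    """
    patient_reported = 0.56 * math.sqrt(tjc28) + 0.014 * vas
    total = (
        0.28 * math.sqrt(sjc28)
        + patient_reported
        + 0.36 * math.log(crp + 1.0)
        + 0.96
    )
    return patient_reported, total


def das28_p_crp(tjc28, sjc28, crp, vas):
    """Proportion of DAS28-CRP attributable to tender joints and VAS."""
    patient_reported, total = das28_crp_components(tjc28, sjc28, crp, vas)
    return patient_reported / total
```

Because the denominator always includes the constant 0.96, the index is defined for all non-negative inputs and lies between 0 and 1, with higher values indicating a greater contribution from the patient-reported components.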
The LCGA, multinomial logistic regression analyses and dual trajectory
analysis were carried out using Mplus version 6.1 [Muthén, L. K., & Muthén, B. O. (1998-2011). Mplus User's Guide. Sixth Edition. Los Angeles, CA: Muthén & Muthén.].