Formal model - The study of substance use in longitudinal research

5.1.1 Formal model

Latent growth models (LGM) are the first and simplest step in the study of the development of a particular outcome over time. These models are estimated within the more general statistical framework of structural equation models (SEM) (Bollen, 1989), which allows the simultaneous estimation of models with observed and latent variables as well as the inclusion of measurement error (Schumacker & Lomax, 2004). Since the main purpose of latent growth models is to define a mean trajectory that summarizes the development of a particular behavior over time, the availability of longitudinal panel data is a necessary condition. In the case of drug use, for instance, the interest lies in estimating a developmental pattern that summarizes the drug use behavior of the entire sample. The first step consists in estimating an individual trajectory for each subject. This is done by means of the equation of a line (a simple regression equation), in which the behavior of interest is a function of time:

$$y_t = \alpha + \lambda_t \beta + \epsilon_t \tag{5.1}$$

$y_t$ is the measured frequency of drug use at each time point $t$, $\lambda_t$ is the variable “time” (here the independent variable), $\alpha$ is the intercept and measures the level of drug use at time 1, $\beta$ is the slope, which measures the steepness of the line and thus the speed of the increase or decrease in consumption across time, and $\epsilon_t$ is the error term at time $t$. Thus, if we assume that the relationship between drug use and time is linear, we can easily estimate a line for each individual that represents his or her individual development in consumption across the measured time points. If the relationship is not linear but rather curvilinear, a quadratic term can be included in the equation:

$$y_t = \alpha + \lambda_t \beta_1 + \lambda_t^2 \beta_2 + \epsilon_t \tag{5.2}$$

The new $\beta_2$ coefficient estimates the curvature of the trajectory, and $\lambda_t^2$ is the square of the time variable. This formula represents the equation of a curve. The values for $\lambda_t$ are used to specify the expected shape of the trajectory, and for this purpose are generally fixed as follows: for the intercept, the factor loadings are all fixed at 1; for the linear slope, they are fixed to 0, 1, 2, 3, 4 to represent a linear development, and because the first of these loadings is 0, $\alpha$ represents for each individual the $y$ value at the first time point; finally, for the quadratic slope, they are fixed to 0, 1, 4, 9, 16 to represent a curvilinear growth in the equation. In some cases, when a perfectly linear or curvilinear development is not expected, some of these restrictions can be relaxed, and some of the $\lambda_t$ can be freely estimated in the model (see Bollen & Curran, 2006). In sum, once a trajectory is estimated, the parameters of interest are:

• $\alpha$ (I) intercept: level of the outcome at time 1

• $\beta_1$ (S) linear slope: steepness of the development

• $\beta_2$ (Q) quadratic slope: rate of curvature
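The fixed loadings and the three parameters listed above can be illustrated with a minimal sketch for a single individual. It assumes ordinary least squares as the estimator for one trajectory, and the outcome values in `y` are hypothetical numbers chosen for illustration, not data from the study.

```python
import numpy as np

# Time scores for five waves, coded 0..4 so that the intercept
# is the level of the outcome at the first time point.
time = np.arange(5)

# Fixed loadings: a column of 1s (intercept), the linear time scores
# 0,1,2,3,4 and the squared time scores 0,1,4,9,16.
loadings = np.column_stack([np.ones(5), time, time ** 2])

# Hypothetical frequencies of use for one individual (illustration only).
y = np.array([0.0, 1.0, 3.0, 4.0, 3.0])

# Least squares fit of y_t = alpha + lambda_t*beta1 + lambda_t^2*beta2 + e_t.
(alpha, beta1, beta2), *_ = np.linalg.lstsq(loadings, y, rcond=None)

# Fitted individual trajectory at the five time points.
fitted = loadings @ np.array([alpha, beta1, beta2])
```

Because the linear and quadratic loadings are both 0 at the first wave, `fitted[0]` equals `alpha`, which is why the intercept can be read as the level of the outcome at time 1.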

In the case of drug use, with 1552 individuals observed at five time points, 1552 trajectories can be estimated. Figure 5.1 reports the trajectories for five individuals taken at random.

5.1. Latent growth models

Figure 5.1: Five random individual trajectories

The x axis reports time (five time points), and the y axis represents the frequencies of the outcome variable. There is heterogeneity in the development of the five random subjects. For instance, the subject at the bottom of the figure does not report drug consumption across the whole time span. By contrast, the two individuals at the top show, respectively, a bell-shaped development that peaks at time four and decreases thereafter, and a fairly constant high level of use across the whole time span.

Figure 5.2: 200 random individual trajectories

The figure above shows how LGM estimates an individual developmental trajectory for each subject in the sample; these trajectories are allowed to differ from each other on the developmental parameters presented above. In the case of a quadratic developmental process, individual trajectories can differ in their intercept, linear slope and quadratic slope values. This can result, as shown in Figure 5.2, in large heterogeneity in the shape of the estimated individual curves. As a consequence, the results become quite difficult to interpret.

In a second step, the goal of latent growth models is to summarize the intra-individual information about development and to reproduce it mathematically. For this purpose, a single mean trajectory can be estimated as the mean of the individual developmental parameters. It summarizes the development across individuals (the inter-individual level). In the case of a quadratic (curvilinear) development, the measurement and structural equations are:

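$$y_{it} = \alpha_i + \lambda_t \beta_{1i} + \lambda_t^2 \beta_{2i} + \epsilon_{it} \tag{5.3}$$

$$\alpha_i = \mu_{\alpha} + \zeta_{\alpha i} \tag{5.4}$$

$$\beta_{1i} = \mu_{\beta_1} + \zeta_{\beta_1 i} \tag{5.5}$$

$$\beta_{2i} = \mu_{\beta_2} + \zeta_{\beta_2 i} \tag{5.6}$$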

The first equation represents the measurement part of the model and describes the intra-individual development of the outcome variable across time. The subscript $i$ indicates that an individual trajectory is estimated for each subject in the sample, based on the curvilinear equation presented above.

The other three equations represent the structural part of the SEM model and are used to estimate the average intercept and slopes of the mean sample trajectory. Each latent variable consists of the mean value of the individual developmental parameter plus a residual ($\zeta$). Since there are no covariates in the model, the residual represents the deviation of an individual's latent variable from the sample mean. These values can be used to calculate the variance/covariance matrix of the latent variables, which also represents the variance around the mean trajectory:

$$\Psi = \begin{pmatrix} \psi_{11} & & \\ \psi_{21} & \psi_{22} & \\ \psi_{31} & \psi_{32} & \psi_{33} \end{pmatrix} \tag{5.7}$$
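The relation between the individual parameters, the mean trajectory and the matrix Ψ can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population values in `mu` and `psi` and the error standard deviation of 0.3 are hypothetical, and the two-step procedure (one least squares curve per individual, then means and covariances of the estimated parameters) is a simplification chosen for illustration. It is not the simultaneous estimation that SEM software performs.

```python
import numpy as np

rng = np.random.default_rng(0)

n, waves = 1552, 5                       # sample size and waves as in the text
time = np.arange(waves)
loadings = np.column_stack([np.ones(waves), time, time ** 2])

# Hypothetical population values (illustration only, not study estimates).
mu = np.array([1.0, 0.8, -0.15])         # mean intercept, linear and quadratic slope
psi = np.array([[0.50, 0.05, -0.01],
                [0.05, 0.20, -0.03],
                [-0.01, -0.03, 0.01]])   # covariance matrix of the residuals zeta

# Structural part (5.4 to 5.6): individual parameter = mean + deviation.
eta = rng.multivariate_normal(mu, psi, size=n)             # shape (n, 3)

# Measurement part (5.3): individual curve plus time-specific error.
y = eta @ loadings.T + rng.normal(0.0, 0.3, size=(n, waves))

# Step 1: one least squares trajectory per individual.
coef, *_ = np.linalg.lstsq(loadings, y.T, rcond=None)      # shape (3, n)
eta_hat = coef.T

# Step 2: mean trajectory and variance/covariance around it.
mu_hat = eta_hat.mean(axis=0)
psi_hat = np.cov(eta_hat, rowvar=False)
mean_trajectory = loadings @ mu_hat
```

In this sketch `psi_hat` also absorbs the sampling error of the individual least squares fits, so its variances tend to be somewhat larger than those in `psi`; estimating all parameters simultaneously within the SEM framework separates the error variance from the variance of the latent factors.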

Within the SEM framework, a latent growth model can easily be represented graphically as a path diagram. Such representations of models have become widely used in SEM research. An example for an LGM is given in Figure 5.3:

Figure 5.3: Latent curve model

The oval boxes represent the latent variables (in this case latent factors); the boxes are the observed variables used to define the latent factors (e.g. drug use measured at five time points). The linear and quadratic development is captured by the factor loadings ($\lambda_t$ and $\lambda_t^2$ in the equation, and the arrows in the graphic), which are fixed to represent a quadratic trajectory as specified above. Error terms for the observed outcomes and variances for the latent factors are estimated as well (small arrows pointing to the boxes in the figure), and, if needed, equality restrictions can also be applied to them (Bollen & Curran, 2006; T. Duncan et al., 2006).
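The model described above can also be written compactly in matrix form for five time points. The symbols $\Lambda$ (matrix of fixed loadings), $\boldsymbol{\eta}_i$ (vector of the three latent factors), $\boldsymbol{\mu}_\eta$ (vector of factor means) and $\Theta$ (covariance matrix of the error terms) are notation introduced here for illustration and do not appear in the text above; the expression for the covariance additionally assumes that the residuals $\boldsymbol{\zeta}_i$ and the errors $\boldsymbol{\epsilon}_i$ are uncorrelated.

```latex
\begin{aligned}
\mathbf{y}_i &= \Lambda \boldsymbol{\eta}_i + \boldsymbol{\epsilon}_i,
\qquad
\Lambda =
\begin{pmatrix}
1 & 0 & 0 \\
1 & 1 & 1 \\
1 & 2 & 4 \\
1 & 3 & 9 \\
1 & 4 & 16
\end{pmatrix},
\qquad
\boldsymbol{\eta}_i =
\begin{pmatrix} \alpha_i \\ \beta_{1i} \\ \beta_{2i} \end{pmatrix} \\
\boldsymbol{\eta}_i &= \boldsymbol{\mu}_\eta + \boldsymbol{\zeta}_i \\
E(\mathbf{y}_i) &= \Lambda \boldsymbol{\mu}_\eta,
\qquad
\operatorname{Cov}(\mathbf{y}_i) = \Lambda \Psi \Lambda^{\top} + \Theta
\end{aligned}
```

Read this way, each column of $\Lambda$ corresponds to one set of arrows from a latent factor to the five observed variables, and equality restrictions on the error variances amount to constraining the diagonal elements of $\Theta$ to be equal.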

In document The study of substance use in longitudinal research (Page 74-77)