Latent Variable Modeling of Differences and Changes with Longitudinal Data



John J. McArdle

Department of Psychology, University of Southern California, Los Angeles, California 90089-1061; email: [email protected]

Annu. Rev. Psychol. 2009. 60:577–605

First published online as a Review in Advance on September 25, 2008

The Annual Review of Psychology is online at psych.annualreviews.org

This article’s doi:

0066-4308/09/0110-0577$20.00

Key Words

linear structural equations, repeated measures

Abstract

This review considers a common question in data analysis: What is the most useful way to analyze longitudinal repeated measures data? We discuss some contemporary forms of structural equation models (SEMs) based on the inclusion of latent variables. The specific goals of this review are to clarify basic SEM definitions, consider relations to classical models, focus on testable features of the new models, and provide recent references to more complete presentations. A broader goal is to illustrate why so many researchers are enthusiastic about the SEM approach to data analysis. We first outline some classic problems in longitudinal data analysis, consider definitions of differences and changes, and raise issues about measurement errors. We then present several classic SEMs based on the inclusion of invariant common factors and explain why these are so important. This leads to newer SEMs based on latent change scores, and we explain why these are useful.


Contents

INTRODUCTION: LONGITUDINAL DATA AND THE STRUCTURAL EQUATION MODELING APPROACH

Separating Differences from Changes

The Structural Equation Modeling Approach

STRUCTURAL EQUATION MODELS FOR REPRESENTING CHANGES

Auto-Regression Models

Change Score Models

Change-Regression Models

STRUCTURAL EQUATION MODELS FOR ADDING GROUP DIFFERENCES

Group Information as Contrast Codes

Multiple-Group Latent-Difference Models

Multiple Group Structural Equation Model Estimation with Incomplete Data

STRUCTURAL EQUATION MODELS FOR INCLUDING LATENT COMMON FACTORS

Regression with Common Factors

Latent Changes in Common-Factor Scores

Questions of Factorial Invariance Over Time

STRUCTURAL EQUATION MODELS USING TIME-SERIES CONCEPTS

Crossed-Lagged Regression of Factors

Extending Time-Series Factor Models to Multiple Occasions

Extending Cross-Lagged Factor Regression to Multiple Occasions

STRUCTURAL EQUATION MODELS USING LATENT-CURVE CONCEPTS

Latent Growth-Curve Models

Fitting Latent-Curve Hypotheses

Considering Multiple Latent Curves

STRUCTURAL EQUATION MODELS USING LATENT-CHANGE CONCEPTS

Mixing Models for Means and Covariances

Latent Change Score Models

Multiple Latent Change Score Models

ADDITIONAL RELEVANT RESEARCH ISSUES

New Dynamic Structural Equation Model Applications

Other Promising Directions

Final Comments

INTRODUCTION: LONGITUDINAL DATA AND THE STRUCTURAL EQUATION MODELING APPROACH

This review describes contemporary methods for longitudinal data analysis. Let us start by considering situations in which individuals (N) from different groups (G) have been measured over several discrete periods of time (T < N) on a repeated set of measurements (Y_m). In an experimental design, this could represent a classical layout wherein we randomize individuals to different conditions and measure everyone at multiple time points on multiple variables (Bock 1975). In an observational study, or longitudinal panel design, we could measure different demographic groups on multiple occasions over many years (Hsiao 2003). Of course, a variety of classical techniques are used for analyzing such data, including repeated measures analysis of variance, time series analysis, and growth curve analysis (Nesselroade & Baltes 1979). There are also many newer techniques that serve similar purposes (Hedeker & Gibbons 2006, Muller & Stewart 2006, Singer & Willett 2003, Verbeke & Molenberghs 2000, Walls & Schafer 2006).

A brief glance at the past few decades of Annual Review volumes shows the importance placed on statistical models for data analyses, and a few previous reviews have already focused on SEM principles and techniques (Bentler 1980, Bentler & Dudgeon 1996, Bollen 2002, MacCallum & Austin 2000, Tomarken & Waller 2005). These reviews discuss technical aspects of SEM, the nature of latent variables, SEM hypothesis formation and statistical testing, and even the misuse of SEM. Other Annual Review articles provide informative discussions about the key issues in repeated measures analysis (e.g., Collins 2006, Cudeck & Harring 2007, MacKinnon et al. 2007, Maxwell et al. 2008, Raudenbush 2001). The many good general references to SEM (e.g., Kline 2005, McDonald 1985) include several new books specifically about SEM for repeated measures (e.g., Bollen & Curran 2006, Duncan et al. 2006). In contemporary work, it has become popular to focus on the trajectory over time as the key feature of a repeated measures analysis (e.g., MacCallum et al. 1997; McArdle 1986, 1989; Raudenbush 2001). The trajectory approach has gained popularity, and for the most part, it nicely matches the scientific goals of longitudinal research (Nesselroade & Baltes 1979). Adding something new and useful to this impressive collection is not so easy.

In this review, we survey a variety of different ideas about data analysis, but we try to bring these together by our explicit focus on what we term the “latent change score” model. It is expected that readers well versed in the techniques of regression analysis and factor analysis will find this review to be elementary reading and probably to be missing many technical details. But this is intentional. To reach a wider audience, we do not present the typical algebraic expressions or computer program SEM scripts, and all the new models are explained using simple plots or path diagrams. Because we survey models from a wide variety of scientific disciplines, where alternative terms are often used for the same mathematical and statistical concepts, we do not include all aspects of these models. Our hope is that our extensive use of path diagrams will help the reader see the common features of these theoretically appealing and practically useful ideas. And we hope this approach will clarify our main recommendation: When thinking about any repeated measures analysis it is best to ask first, what is your model for change?

Separating Differences from Changes

The many semantic conventions and colloquialisms used by psychological researchers do not always precisely match their formal definitions. A first issue here is the distinction between inferences about (a) differences between people and (b) changes within people. We use the plots of fictional scores over time in Figure 1 to clarify this key distinction.

In Figure 1a, we have plotted six pairs of X-Y data points to show hypothetical data obtained from six different individuals. To provide a substantive context, we label the Y-axis “Alert” and we label the X-axis “Time of Day.” We add a regression line where X→Y (with intercept β_0 and slope β_1). Because this regression line has a negative slope, our substantive interpretation might be that alertness decreases as the day goes on. In more technical terms, the difference between people on predictor X leads to an expected difference in outcome Y. To avoid causal language, we can express the slope (β_1) as the expected change in Y for a one-unit change in X. Notice we use the word “change” to describe the “difference” between persons.

Next consider the different data layout of Figure 1b. Here we view the same six pieces of information as coming from just two people measured at three time points on the same measure (i.e., Y[1], Y[2], and Y[3]). To indicate the same person over time, we can connect the dots with lines, and we now see both lines go down as the time of day increases. A simple subtraction of any two scores for any person is termed a “difference score” (e.g., D = Y[2] − Y[1]), and these are changes within a person over occasions. Now we use the word “difference” to define a “change.”

Figure 1

Alternative plots of cross-sectional and longitudinal data. (a) Cross-sectional measurements, (b) longitudinal measurements, (c) one longitudinal alternative, and (d) another longitudinal alternative.

The six data points in Figures 1a and 1b are in exactly the same positions, and there is no alteration of the inference about X producing Y. Of course, in Figure 1b we measure fewer people. To anticipate a reasonable statistical question, we can state (without proof) that, given the same number of data points, there is typically a gain in statistical precision by adding more occasions of measurement with the same people; i.e., repeated measures lead to increased power (Bock 1975, Hertzog et al. 2006, Muthén & Curran 1997).

In Figures 1c and 1d, the same six data points are plotted, but we connect these data across two people in a different way. In Figure 1c, it appears as if the two people shift in their position from Time 1 to Time 2: one line goes down while the other line goes up. In Figure 1d, we connect other points, and now both lines appear to be fluctuating between ups and downs. By using the same data points, we can see that many different longitudinal patterns are possible underneath any set of cross-sectional scores. This highlights the key purpose of most longitudinal repeated measures data: to detect differences in the patterns of individual changes.

In theory, we can calculate the change scores and write another regression equation in which change is the dependent variable predicted by some difference between the people. This is an elementary description of the well-known mixed model, which attempts to identify the “between-person differences in within-person changes” (Nesselroade & Baltes 1979). It is worth noting that seminal statements made by some of the most important leaders of our field strongly advocated the need to avoid change scores (e.g., Cronbach & Furby 1970, Lord 1958). These statements focused primarily on the very real problems of measurement error in the change scores. In contrast, other researchers who investigated these statistical issues emphasized the benefits of using change scores (e.g., Allison 1990, Nesselroade & Cable 1974, Rogosa 1979, Rogosa & Willett 1983). It is not surprising that the appropriate use of change scores remains a conundrum for many researchers.

The Structural Equation Modeling Approach

It is well known that SEMs are used to express a theoretical model in terms of linear and nonlinear expressions with observed and unobserved variables (Goldberger & Duncan 1973). SEM expressions lead to predictions or expectations for the means, standard deviations, and correlations, and these can then be compared to observed statistics. A series of alternative models, often based on radically different ideas, can be organized in this way and then compared with one another using a variety of goodness-of-fit tests. In the early years of SEM, only a few reliable computer programs were able to carry out these calculations (e.g., ACOVSM, Jöreskog et al. 1971; LISREL, Jöreskog & Sörbom 1979; COSAN, McDonald 1985). Today, many SEM programs exist, ranging from the most flexible (Mplus; Muthén & Muthén 2002), to the most graphic (AMOS; Arbuckle & Wothke 2004), to the least expensive (Mx; Neale et al. 1999). SEM programs are often hard to choose among, and alternative computer scripts can be very helpful (e.g., Ferrer et al. 2004). All SEM programs can carry out the calculations for the models described here, so computer program differences are not highlighted.

Important lessons have been learned from statistical research on the classical tests of mean differences using repeated measures analysis of variance (ANOVA). Research in the early 1970s showed that the most popular ANOVA tests were based on an assumption of an equal variance and an equal correlation over time, a pattern termed “compound symmetry” (e.g., McCall & Appelbaum 1973). Unfortunately, these assumptions seemed highly improbable with real longitudinal data. As it turned out, features of the tests of the mean differences over time were influenced by the adequacy of the covariance structure assumptions. In standard testing of the mean differences, (a) the Type I error rate (e.g., α = 0.05) is inflated if the simple covariance assumptions are not met, but (b) the Type II error rate (i.e., 1 − power) is inflated if no structure is placed on the covariances. The suggested correction at the time was to alter the degrees of freedom by a coefficient (termed ε) computed from the covariance matrix or to use the unstructured but less-powerful multivariate approach (MANOVA; McCall & Appelbaum 1973, O’Brien & Kaiser 1985). At the same time, other researchers were promoting a different kind of data analysis approach, eventually termed SEM (e.g., Jöreskog et al. 1971), wherein the mean and the covariance hypotheses could be considered jointly by what are now termed “shared parameters” (McArdle et al. 2005).

The approach presented here emphasizes the need for explicit structural hypotheses about means and covariances and for the direct inclusion of latent change scores to express specific developmental hypotheses about individuals and groups (e.g., McArdle & Nesselroade 1994, Nesselroade & Baltes 1979). SEM techniques are used to translate these specific hypotheses into structural expectations for the means and covariances over time so these expectations can be compared with a real set of longitudinal data. SEM path diagrams are used as a shorthand to convey aspects of the required matrix algebra. These diagrams highlight the key parameters we can test as well as the assumptions we cannot test. The path diagrams used here are intentionally more elaborate than are other SEM representations because these diagrams express every algebraic relationship among the scores.

STRUCTURAL EQUATION MODELS FOR REPRESENTING CHANGES

We first consider some of the key longitudinal questions about change using popular models for two occasions of data. The basic techniques described here are used in most of the other SEM examples to follow.

Auto-Regression Models

The first kind of model to be considered here is the familiar auto-regression model, drawn in Figure 2a. We label the two repeated scores as Y[1] and Y[2], and we presume Y[1] “precedes” Y[2] in time. This order of events suggests we add a regression (β) where Y[1] is used to “predict” Y[2] at the later time (i.e., Y[2] is regressed on Y[1]). As in any regression calculation, we further assume that the unobserved residual term e is uncorrelated with the initial score Y[1]. In the traditional path diagram, there are no intercepts, and variables are often assumed to be standardized. But in this diagram, we explicitly draw all model parameters: (a) observed variables Y[1] and Y[2] as squares, (b) the unobserved variable e as a circle, (c) the implied constant of 1 as a triangle, (d) one-headed arrows to represent “fixed” or group effects (μ_1, β_0, β_1), and (e) two-headed arrows to represent “random” or individual effects (σ_1², ψ_e², 1). Although these parameters complicate this simple figure, they prove very useful in subsequent model comparisons.

Figure 2

Alternative structural equation models for two-occasion data. (a) Traditional regression, (b) latent change score, and (c) change regression.

We can now use any SEM program to estimate values for this model. We start by calculating the means and covariances (or average cross-products) formed from observed raw scores. Numerical estimates for all unknown parameters (i.e., Greek letters) are obtained as well as a single index of goodness-of-fit of the model to the data (i.e., the likelihood L). To create a formal test of this model, we need to compare it to an alternative model, usually with different parameters. The SEM approach proves remarkably flexible here because parameters can be (a) free to vary, (b) fixed at any known value, or (c) set equal to any other parameter. For example, one typical alternative to the model of Figure 2a is another in which there is “no stability” over time, and this can be formed by restricting the regression slope to be fixed at zero (β_1 = 0). Under standard regularity conditions (e.g., normality of the residuals), the difference in fit between the two models (L_d = L_a − L_b) is distributed as a chi-square (χ²) variate with one degree of freedom (df_d = df_a − df_b). This SEM approach is not novel and it yields the same results we could obtain using any standard linear regression program.
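The comparison of a free and a restricted model can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the output of any SEM program: the six pairs of scores are invented, the function name `fit_autoregression` is hypothetical, and the misfit index is taken to be −2 times the normal-theory log-likelihood of Y[2] given Y[1]. With so few cases the chi-square approximation is rough at best.

```python
import math

def mean(values):
    return sum(values) / len(values)

def fit_autoregression(y1, y2, slope_fixed_at_zero=False):
    """Maximum-likelihood fit of Y[2] = b0 + b1*Y[1] + e.

    Returns (b0, b1, residual_variance, misfit), where misfit is
    -2 * log-likelihood of Y[2] given Y[1] under normal residuals.
    """
    n = len(y1)
    m1, m2 = mean(y1), mean(y2)
    var1 = sum((a - m1) ** 2 for a in y1) / n
    cov12 = sum((a - m1) * (b - m2) for a, b in zip(y1, y2)) / n
    b1 = 0.0 if slope_fixed_at_zero else cov12 / var1
    b0 = m2 - b1 * m1
    resid_var = sum((b - b0 - b1 * a) ** 2 for a, b in zip(y1, y2)) / n
    misfit = n * (math.log(2 * math.pi * resid_var) + 1)
    return b0, b1, resid_var, misfit

# Hypothetical two-occasion scores for six people (illustration only).
y1 = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
y2 = [11.0, 12.0, 16.0, 15.0, 19.0, 22.0]

free = fit_autoregression(y1, y2)                                  # beta_1 free
restricted = fit_autoregression(y1, y2, slope_fixed_at_zero=True)  # beta_1 = 0

chi_square = restricted[3] - free[3]   # difference in misfit, df = 1
CRITICAL_05_DF1 = 3.841                # chi-square critical value, alpha = .05, df = 1
reject_no_stability = chi_square > CRITICAL_05_DF1
print(round(chi_square, 3), reject_no_stability)
```

Fixing the slope at zero removes one free parameter, so the difference in misfit carries one degree of freedom, as described in the text.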

Change Score Models

It is simple to subtract the two scores for each person using the observed data (D = Y[2] − Y[1]), with the results known as “gain scores” or “difference scores.” But, as basic as this seems, calculating differences from the raw data is not the most promising route to take here. Instead, the model of Figure 2b is a change score model for the same initial observations. We start with the same data (Y[1] and Y[2]), but we add an unobserved variable labeled Δ. To this we add a set of fixed values (=1) on the specific arrows so we can mimic the result of a subtraction (Y[2] = 1*Y[1] + 1*Δ). This change score (Δ) is now explicitly defined as “the part of the score of Y[2] that is not identical to Y[1].” This change score is not directly measured, so it can be considered as our first latent change score (McArdle & Nesselroade 1994; cf. Bollen 2002).

We can now use SEM software to estimate and test questions about changes directly from the original two-occasion data. The traditional statistical features of the change score are all included as model parameters: the mean of the changes (μ_Δ), the variance of the changes (σ_Δ²), and the covariance of the initial scores with the changes (σ_1Δ). For example, we can now test the hypothesis of “no mean differences over time” by forcing the mean of the difference to be zero (μ_Δ = 0). This model leads to expectations of equal means over time, and the difference in fit is indexed by a chi-square test (χ²). We can use the same model to test hypotheses about individual differences in change (σ_Δ² = 0, σ_1Δ = 0).

We do not need to calculate the change scores directly to examine their statistical properties when we use the model of Figure 2b. Instead, we define a latent change score by using fixed unit values; this simple SEM technique also proves valuable in the more-complex models presented below. The auto-regression model of Figure 2a is fit to observed data that are identical to those of the change score model of Figure 2b, and both models have the same number of parameters and achieve the same fit; i.e., these models are not testable alternatives of one another. Instead, the fundamental difference between auto-regression and change score models is in the way we represent and test hypotheses about the within-person changes. The change statistics are nearly impossible to describe in Figure 2a, but these are explicit parameters of Figure 2b.
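For readers who want the algebra behind this equivalence, the expectations implied by the unit-weighted definition of Δ can be written out as a short sketch. The symbols μ_2, σ_2², and σ_12 (the Time 2 mean, the Time 2 variance, and the covariance between occasions) are labels introduced here for illustration and do not appear in the path diagrams.

```latex
\begin{aligned}
Y[2] &= 1 \cdot Y[1] + 1 \cdot \Delta \\
\mu_2 &= \mu_1 + \mu_\Delta \\
\sigma_2^2 &= \sigma_1^2 + \sigma_\Delta^2 + 2\,\sigma_{1\Delta} \\
\sigma_{12} &= \sigma_1^2 + \sigma_{1\Delta}
\end{aligned}
```

The five observed statistics (two means, two variances, one covariance) are reproduced exactly by the five parameters of the change score model (μ_1, σ_1², μ_Δ, σ_Δ², σ_1Δ), just as they are by the five parameters of the auto-regression model (μ_1, σ_1², β_0, β_1, ψ_e²). This is why the two models achieve the same fit.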

Change-Regression Models

Further consideration of the two models discussed above leads to another question in change research: Should we remove the part of the individual change that is related to the initial level? In Figure 2c, we draw a slightly revised version of Figure 2b, in which we transform the covariance (σ_1Δ) into a regression coefficient (δ_1) to estimate a model with a base-free measure of change. This is also a transformation of the parameters in Figure 2a (i.e., with δ_0 = β_0, δ_1 = β_1 − 1) and is useful to formalize this simple interpretation. In addition, the transformation provides one way to deal with the classic problem known as Lord’s Paradox (Lord 1967): the difference in results obtained from the regression in Figure 2a and the change model in Figure 2b is avoided if we use the change-regression model in Figure 2c (with group information).
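The parameter transformation stated above (δ_0 = β_0, δ_1 = β_1 − 1) can be checked numerically. The following is a minimal sketch using ordinary least squares on invented scores; the helper `simple_regression` and the data are hypothetical, and observed difference scores stand in for the latent change score, which is harmless here because the two-occasion model is just identified.

```python
def mean(values):
    return sum(values) / len(values)

def simple_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical two-occasion scores (illustration only).
y1 = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
y2 = [11.0, 12.0, 16.0, 15.0, 19.0, 22.0]
change = [b - a for a, b in zip(y1, y2)]   # D = Y[2] - Y[1]

beta0, beta1 = simple_regression(y1, y2)        # auto-regression (Figure 2a)
delta0, delta1 = simple_regression(y1, change)  # change regression (Figure 2c)

print(beta0, beta1)
print(delta0, delta1)   # same intercept, slope lowered by exactly 1
```

Because the change regression is only a re-expression of the auto-regression, any difference in substantive conclusions comes from how the parameters are interpreted, not from the fit.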


Estimating this change score regression (δ_1) is mainly useful when the changes have not taken place by the time of the initial occasion. This is assured, for example, in an experiment wherein a manipulation occurs between Time 1 and Time 2. In contrast, in observational research, the two occasions may be arbitrary selections from an ongoing process unfolding over time, possibly in different ways for different persons. In this case, the changes may already be apparent at the time of the initial data collection, so this change regression is only an arbitrary transformation. This is a case where SEM regressions yield parameter estimates that may be very difficult to interpret.

STRUCTURAL EQUATION MODELS FOR ADDING GROUP DIFFERENCES

Group differences are important aspects of both experimental and observational longitudinal studies. As is well known, the use of random assignment to groups provides a direct basis for causal inference. In observational studies, a frequent goal is to separate groups that are not following the same process (i.e., heterogeneity). Group information may be considered in many ways, and the techniques described in this section are relevant to all SEMs in the discussion that follows.

Group Information as Contrast Codes

Let us assume that an important difference exists between groups of people (e.g., due to a manipulation, based on gender, linked to high test scores), and we want to examine how these differences impact some outcome. One popular model for this purpose is drawn as Figure 3a. This is a change score path model that also includes the group information as a measured variable (G) using dummy codes (G = 0 or 1) or effect codes (G = −1/2 or +1/2). This typical use of group differences as a coded variable allows standard regression parameters to estimate mean differences between groups. With dummy codes, these path coefficients represent a 2-by-2 ANOVA with four parameters: (a) an initial mean (β_0 = 1→Y[1]), (b) a between-group effect (β_1 = G→Y[1]), (c) a within-group effect (α_0 = 1→Δ), and (d) a within-by-between effect (α_1 = G→Δ). Other aspects of the ANOVA include the variances and covariance of the residuals of the level and change score (ψ_e², ψ_z², ψ_ez).
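The four ANOVA-style parameters can be illustrated with two simple regressions on a dummy-coded group variable. This is a hedged sketch: the data and the helper `simple_regression` are invented for illustration, and observed difference scores are used in place of the latent change score of Figure 3a.

```python
def mean(values):
    return sum(values) / len(values)

def simple_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical rows: (dummy code G, Y[1], Y[2]).
data = [
    (0, 10.0, 11.0), (0, 12.0, 12.0), (0, 14.0, 16.0),
    (1, 16.0, 15.0), (1, 18.0, 19.0), (1, 20.0, 22.0),
]
g = [row[0] for row in data]
y1 = [row[1] for row in data]
change = [row[2] - row[1] for row in data]

beta0, beta1 = simple_regression(g, y1)        # 1 -> Y[1] and G -> Y[1]
alpha0, alpha1 = simple_regression(g, change)  # 1 -> change and G -> change

print(beta0, beta1)    # initial mean of group 0, between-group effect
print(alpha0, alpha1)  # mean change of group 0, within-by-between effect
```

With dummy codes, each intercept is the mean of the group coded 0 and each slope is the difference between the two group means, which is the sense in which the path coefficients reproduce the ANOVA effects.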

As in any regression formulation of ANOVA among K independent groups, we need K − 1 contrasts to fully represent the mean differences. A common variation of this model is the use of “adjusted” change parameters in the analysis of covariance (ANCOVA). This is drawn in the same way as Figure 3a, but with the addition of a continuous variable X as an observed predictor of both the initial level and changes. In ANCOVA, the model parameters are conditional on the expected values of the measured X variable. Many researchers use the term “controlled” for this form of statistical adjustment, and this is reasonable in some cases. We must recognize that the statistics are under our control but the individuals are not. The ANCOVA interaction term, representing potential differences in the slopes of the covariate between groups, can be introduced in path models using product terms as measured variables (P = G*X). In this way, any ANCOVA can be carried out as an SEM.

Multiple-Group Latent-Difference Models

This previous use of group coding is limiting in a number of ways. The focus of this kind of analysis is on differences in the mean changes over groups, and other forms of group differences in change processes are not typically considered. Different groups of people may have different means, but they may also have different amounts of variability in their changes (σ_Δ²). The SEM approach expands our options for considering aspects of group differences.

Figure 3b represents a latent change score model in which we have assumed there are two independent groups of individuals, perhaps differentiated by an experimental treatment (e.g., treatment versus controls), a demographic difference (e.g., males versus females), or an observational difference (e.g., high versus low math scores). This organization of the data into groups allows for tests of group differences by using a multiple-group SEM. In this example, the group means are represented as regressions from the constant within each group (1→Y[1] = μ_1(a), μ_1(b)). A test of the equality of these coefficients is termed “invariance over groups” (μ_1(a) = μ_1(b)), and this can be carried out using the SEM programs. These invariance constraints will result in a misfit (χ²) of the same magnitude as tests of no mean differences between coded groups (β_1 = 0 in Figure 3a). Next, the mean of the changes (μ_Δ(a), μ_Δ(b)) can be tested for invariance over groups (μ_Δ(a) = μ_Δ(b)) as a test of group-by-time interaction.

Figure 3

Alternative two-occasion structural equation models with group differences. (a) Adding group codes, (b) multiple group model, and (c) incomplete data groups.

This multiple-group SEM allows testing the invariance of any model parameter. In this case, we might want to add a test of the equality-of-change variation over groups (σ_Δ²(a) = σ_Δ²(b)) to see if there are group differences in the amount of changes (using the χ²). A more complex expression can be formed to test the equality of the “coefficient of variation” or effect sizes (μ_Δ(a)/σ_Δ(a) = μ_Δ(b)/σ_Δ(b)). Following a similar logic, we can easily represent and test interactions, even interactions including latent variables, without creating product variables. In the typical MANOVA analyses, we require complete homogeneity of covariance over groups (e.g., Bock 1975, O’Brien & Kaiser 1985), but some SEM alternatives, with less-extreme forms of invariance, may be more realistic and useful.
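As a rough illustration of an invariance test on the change variance, the sketch below compares a model with group-specific variances to one with a single shared variance. Several simplifying assumptions are made that go beyond the text: observed difference scores are used in place of latent changes, only the change scores are modeled (not the full two-occasion structure), the scores are treated as normally distributed, and all data and function names are invented.

```python
import math

def normal_misfit(values, mu, var):
    """-2 * log-likelihood of the values under a normal(mu, var) model."""
    return sum(math.log(2 * math.pi * var) + (v - mu) ** 2 / var for v in values)

def ml_mean_var(values):
    """Maximum-likelihood mean and variance (divisor n)."""
    n = len(values)
    mu = sum(values) / n
    return mu, sum((v - mu) ** 2 for v in values) / n

def equal_change_variance_test(changes_a, changes_b):
    """Chi-square (df = 1) for equal change variance over two groups,
    with the group means left free in both models."""
    mu_a, var_a = ml_mean_var(changes_a)
    mu_b, var_b = ml_mean_var(changes_b)
    free = (normal_misfit(changes_a, mu_a, var_a)
            + normal_misfit(changes_b, mu_b, var_b))
    n_a, n_b = len(changes_a), len(changes_b)
    pooled = (n_a * var_a + n_b * var_b) / (n_a + n_b)  # ML estimate under equality
    constrained = (normal_misfit(changes_a, mu_a, pooled)
                   + normal_misfit(changes_b, mu_b, pooled))
    return constrained - free

# Hypothetical change scores for two groups (illustration only).
changes_a = [1.0, 0.0, 2.0, 1.5, 0.5]
changes_b = [-1.0, 4.0, 2.0, -2.0, 3.0]

chi_square = equal_change_variance_test(changes_a, changes_b)
print(round(chi_square, 3))
```

The same pattern (fit the model with the parameter free in each group, refit with it constrained equal, and compare the misfit) applies to any parameter in a multiple-group model; a dedicated SEM program handles the general case.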

Multiple Group Structural Equation Model Estimation with Incomplete Data

In practical situations, we often have repeated measures data wherein some individuals are not measured at all occasions. In some designed experiments, we may plan not to measure some of the subjects to estimate the impact of measurement (e.g., incomplete blocks design, Solomon 4-group design). But in most observational studies, some participants drop out after the first occasion, usually for a variety of different reasons. As pointed out above, it is difficult to estimate changes when only one measurement occasion is available. So unless there is a compelling reason to do otherwise, persons who drop out of the study are typically dropped from all subsequent data analysis. This use of complete cases often seems to be the only possible analysis, and we generally view this as a conservative approach that avoids overstating our results.

Recent statistical research has focused on this incomplete data problem and has demonstrated how the previous statements about complete-case analysis are not typically true (Enders 2001, Little & Rubin 2002). In fact, well-intentioned complete-case analyses are likely to yield unintentionally biased results. A typical indicator of attrition bias due to dropouts is expressed as the mean difference at Time 1 between (a) the group that participates at both occasions and (b) the group that is not available at the second occasion. When we find mean differences between these groups we have selection bias, and the question becomes, what inferences are now possible?

One of the more popular features of SEM is the ability to deal directly with common problems of incomplete data. Following the well-developed lead of many statisticians (Hsiao 2003, Jöreskog & Sörbom 1979, Little & Rubin 2002), any change model can be written in terms of a sum of misfits (L_g) for multiple groups, where groups are defined as persons with the identical pattern of complete data. In Figure 3c, we present a change-score model for two occasions for one group with complete data (Y[1] and Y[2]) and a second change-score model for the group of individuals who are missing data at Time 2. The difference between the groups is only that Y[2] is observed in group A (drawn as a square) but is unobserved in group B (drawn as a circle; Horn & McArdle 1980, McArdle & Bell 2000).
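The idea of a misfit summed over groups with different patterns of available data can be sketched as follows. This is an illustration under stated assumptions, not the routine of any particular program: the scores are taken to be bivariate normal, the misfit is −2 times the log-likelihood, the parameter values and data are invented, and all function names are hypothetical. The code only evaluates the misfit at given parameter values; a real analysis would also search for the values that minimize it.

```python
import math

def implied_moments(mu1, mu_d, var1, var_d, cov_1d):
    """Means, variances, and covariance of Y[1], Y[2] implied by Y[2] = Y[1] + change."""
    mu2 = mu1 + mu_d
    var2 = var1 + var_d + 2 * cov_1d
    cov12 = var1 + cov_1d
    return mu1, mu2, var1, var2, cov12

def misfit_complete(rows, params):
    """-2 log-likelihood for persons measured at both occasions (group A)."""
    mu1, mu2, v1, v2, c12 = implied_moments(*params)
    det = v1 * v2 - c12 ** 2
    total = 0.0
    for y1, y2 in rows:
        d1, d2 = y1 - mu1, y2 - mu2
        quad = (v2 * d1 * d1 - 2 * c12 * d1 * d2 + v1 * d2 * d2) / det
        total += 2 * math.log(2 * math.pi) + math.log(det) + quad
    return total

def misfit_time1_only(values, params):
    """-2 log-likelihood for persons measured only at Time 1 (group B)."""
    mu1, _, v1, _, _ = implied_moments(*params)
    return sum(math.log(2 * math.pi * v1) + (y - mu1) ** 2 / v1 for y in values)

def total_misfit(complete_rows, time1_only, params):
    """Sum of group misfits with every parameter held invariant over groups."""
    return misfit_complete(complete_rows, params) + misfit_time1_only(time1_only, params)

# Hypothetical data: four complete cases and two dropouts (illustration only).
complete_rows = [(10.0, 11.0), (12.0, 12.0), (14.0, 16.0), (16.0, 15.0)]
time1_only = [18.0, 20.0]
# Hypothetical values for (mu_1, mu_change, var_1, var_change, cov_1,change).
params = (15.0, 1.0, 12.0, 2.0, 0.5)

print(round(total_misfit(complete_rows, time1_only, params), 3))
```

Because the same parameters enter both group functions, the Time 1 scores of the dropouts still inform the estimates of the initial mean and variance, which is how the incomplete cases contribute to the analysis.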

Many alternative estimation techniques are available to deal with these problems (e.g., multiple imputation; Little & Rubin 2002), but the SEM approach is relatively easy to understand (McArdle & Bell 2000, Enders 2001). If we want to make an inference about all people as if they were from the same population of interest, we must assume invariance of all parameters over all groups (e.g., μ_1(a) = μ_1(b), μ_Δ(a) = μ_Δ(b), σ_Δ²(a) = σ_Δ²(b)). If these invariance assumptions yield a reasonable fit, we may conclude the incomplete data are missing completely at random (MCAR). However, if the multiple-group invariance constraints do not fit well (i.e., a significant χ²), we may conclude that they are missing at random (MAR). In either event, requiring invariance of all parameters provides the best estimate of the population parameters of the latent differences as if everyone had continued to participate. Thus, we accept any loss of fit associated with this form of invariance, and we compare alternative models with this misfit as our new baseline.

In general, this multiple-group SEM approach uses all available data on any measured variable, so it is a reasonable starting point for all further change analysis. The inclusion of all the cases, both complete and incomplete, allows us to examine the impact of attrition and possibly to correct for these biases. MAR results represent a convenient starting point, but many more techniques are available for dealing with incomplete data. We should try to measure the reasons why people do not participate, because nonrandom selection can create additional biases (e.g., McArdle et al. 2005, McArdle & Bell 2000, Raudenbush 2001). We are able to analyze all the data collected using these and other incomplete-data techniques.

STRUCTURAL EQUATION MODELS FOR INCLUDING LATENT COMMON FACTORS

The SEMs described above do not attempt to solve the potential problems of compounding measurement error in using change scores. To deal with these problems, we rely on multiple measurements of the same construct within each occasion. With multiple measures, we first examine the hypotheses about common factors (McArdle 2007b, McDonald 1985, Meredith & Horn 2001), and we then expand these common factors into more complete mean and covariance structures.

Regression with Common Factors

The path diagram in Figure 4a represents a structural hypothesis for multivariate observations (squares X[t], Y[t], Z[t]) repeated over two occasions of measurement. Within each occasion, we include a latent variable (circles f[1] and f[2]) with factor loadings (one-headed arrows labeled λ_m). The unique variation for each variable is also included (ψ_m² as double-headed arrows). Following classical factor-analysis theory, unique factor scores are thought to be decomposable into two parts: one part that is specific to the test and represents valid measurement, and a second part that is random error. We assume each unique factor contributes variation at a given time but is independent of other scores within and across occasions (Meredith & Horn 2001).

Common factors are used to represent the testable hypothesis that a single unobserved variable can account for the covariation among the observed scores within each occasion. We next require the factor loading for each variable to be the same value at all time points, a restriction known as factor loading invariance (λ_m[1] = λ_m[2]). This is a formal way to assert that this factor score has the same substantive meaning at each time of measurement. It is typical in SEM to introduce a regression in which common factors at later times are regressed on common factors at earlier times (β_f). In theory, these factor scores reflect only the common variance, and they do not contain measurement error. Thus, to the degree the invariant common-factor model is correct, this factor score regression represents the stability of only the reliable components of our measures.
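To make the covariance implications of this model concrete, the following is a minimal sketch that builds the model-implied covariance matrix for three indicators measured at two occasions. The function name, the numeric values, and the label omega_sq for the disturbance variance of f[2] are assumptions made for illustration (the text does not name that variance), and the unique factors are taken to be uncorrelated within and across occasions, as described above.

```python
import numpy as np

def two_occasion_factor_covariance(lam, psi_sq, phi1_sq, beta_f, omega_sq):
    """Implied covariance of [Y[1], Y[2]] when f[2] = beta_f * f[1] + d.

    lam      : invariant factor loadings (one per indicator)
    psi_sq   : unique variances (one per indicator, assumed equal over time)
    phi1_sq  : variance of the Time 1 common factor
    omega_sq : variance of the disturbance d (hypothetical label)
    """
    lam = np.asarray(lam, dtype=float).reshape(-1, 1)
    psi_sq = np.asarray(psi_sq, dtype=float)
    # Covariance matrix of the two common factors (f[1], f[2]).
    phi = np.array([
        [phi1_sq, beta_f * phi1_sq],
        [beta_f * phi1_sq, beta_f**2 * phi1_sq + omega_sq],
    ])
    # Loading invariance: the same column of loadings is reused at each occasion.
    big_lam = np.kron(np.eye(2), lam)
    # Unique variances on the diagonal; no unique covariances are included.
    big_psi = np.diag(np.tile(psi_sq, 2))
    return big_lam @ phi @ big_lam.T + big_psi

# Illustrative values only.
sigma = two_occasion_factor_covariance(
    lam=[1.0, 0.8, 0.6], psi_sq=[0.5, 0.4, 0.3],
    phi1_sq=2.0, beta_f=0.7, omega_sq=0.5,
)
print(np.round(sigma, 3))
```

In this sketch, every cross-occasion covariance among the observed scores is carried by the single factor covariance (beta_f times phi1_sq), which is one way to see why the factor regression is free of the unique variances.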

Latent Changes in Common-Factor Scores

In cases in which the factor loading invariance restrictions are reasonable, we can write an alternative form of the model. In Figure 4b, we introduce a third latent score (Δf) that represents the latent change between the two common-factor scores. In SEM, we typically do not estimate the factor scores, so we cannot calculate this true change score directly. Instead, we follow the logic of the SEM in Figure 2b and include a set of fixed unit-valued coefficients (=1), so the second latent factor (f[2]) is defined as a simple sum of the other two (f[1] + Δf). Because the latent change score (Δf) now is part of the model, the model parameters include the variation in latent changes across individuals (φ_Δ²) as well as covariation of change with the initial common factor (φ_1Δ). As in the factor-regression model of Figure 5a, these common factors do not include errors of measurement, and this variance in latent change score is not confounded by errors of measurement. When used in this way, this multivariate SEM avoids the classical problems of using inherently unreliable difference scores, and the random errors cannot create regression to the mean (McArdle & Nesselroade 1994, Nesselroade et al. 1980).

Figure 4

Alternative two-occasion structural equation models for multivariate data. (a) Common-factor regression, (b) common-factor latent change score, and (c) multiple-common-factors crossed-lagged regression.

This SEM also allows us to test hypotheses about mean changes over time in the reliable common-factor scores. By including observed variable means, we can additionally estimate a latent level mean (θ_1) and a latent change score mean (θ_Δ). From this multivariate SEM, we can calculate a nonstandard repeated-measure t-test among the common-factor scores and examine whether the mean of the latent change factor is zero (θ_Δ = θ_2 − θ_1 = 0?). A more complete description of these tests would include intercepts for each variable (ν_m), but these are not drawn here.
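The contrast between a latent change score and an observed difference score can be sketched with a few lines of arithmetic. The example below assumes a single indicator with loading lam and unique variance psi_sq, unique scores that are uncorrelated over time, and purely illustrative parameter values; none of these numbers come from the text.

```python
import numpy as np

# Hypothetical parameter values for illustration.
phi1_sq = 2.0        # variance of the initial common factor f[1]
phi_delta_sq = 0.5   # variance of the latent change score
phi_1delta = -0.2    # covariance of f[1] with the latent change
theta_1 = 10.0       # latent level mean
theta_delta = 1.5    # latent change mean
lam, psi_sq = 1.0, 0.8  # loading and unique variance of one indicator

# Because f[2] = f[1] + change, its moments follow from the unit coefficients.
var_f2 = phi1_sq + phi_delta_sq + 2.0 * phi_1delta
cov_f1_f2 = phi1_sq + phi_1delta
mean_f2 = theta_1 + theta_delta

# An observed difference Y[2] - Y[1] carries the unique variance twice.
var_observed_diff = lam**2 * phi_delta_sq + 2.0 * psi_sq
reliability_of_diff = lam**2 * phi_delta_sq / var_observed_diff

print(var_f2, cov_f1_f2, mean_f2)
print(var_observed_diff, round(reliability_of_diff, 3))
```

Under these assumed values, less than a quarter of the observed difference-score variance reflects latent change, whereas the model parameter for the latent change variance is not inflated by the unique variances.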

This new SEM offers a powerful way to answer questions typically asked by both the classical ANOVA and factor analysis techniques. In MANOVA, we estimate the linear combination weights that maximize the mean differences over time using canonical variates, which do not attempt to account for the correlations within time or across residuals. In contrast, the SEM in Figure 4b provides a highly structured approach for the repeated-measures ANOVA question. We are asking whether all the mean changes over time in this set of variables (W[t], X[t], Y[t]) are accounted for by mean changes in the common factors (f[t]). This is often exactly the question we want to answer.

Questions of Factorial Invariance Over Time

The search for factorial invariance over time is viewed by many as an empirical issue (Meredith & Horn 2001). As a first question, we typically ask whether the number of factors is equal over time. Assuming no substantial misfit, we can then ask questions about the invariance of all factor loadings over time: Does Λ[1] = Λ[2]? Other questions of factor equivalence over time can be asked, such as whether the person’s unobserved factor scores are equal over time (f[1]_n = f[2]_n). This is a more difficult question that is examined indirectly by asking if the factor means are equal (θ[1] = θ[2]), if the factor variances are equal (φ[1]² = φ[2]²), and if the factors are perfectly correlated (ρ[1,2] = 1). In repeated measures data, it is also reasonable to add specific covariances for each measurement over time (ψ_mm; not drawn) to remove additional confounds.

Further relaxations of the factor invariance model (Figure 4a) can be tested and may fit the data better, but the results may not be easy to interpret. In the absence of the same number of factors, we would need to interpret each factor separately. In the absence of factor loading invariance, we cannot assert the same common factors are measured at each measurement occasion. Although we might be interested in this kind of evidence for qualitative change, it is difficult to go much further in SEM because we do not have analytic tools to compare apples and oranges. Using repeated measures with the same number of factors and invariant factor loadings allows us to say we have repeated constructs.

Because factor invariance is both practical and desirable, it seems appropriate to search for a metric invariant model of measurement until such a solution is found. In Figure 4c, we assume six variables are measured at each of two occasions and are indicators of two factors (g[t] and h[t]). Each pair of factor scores is assumed to be correlated with each other and with the Time 1 factor scores (as drawn here). Here the factor loadings are invariant over time, but the factor pattern within each time is complex. The pattern is simple for the first two variables (U[t] and V[t] load on g[t]) as well as for the last two (Y[t] and Z[t] load on h[t]). But the invariant pattern is more complicated for the middle two variables, which have two loadings each (W[t] and X[t] load on both g[t] and h[t]). Typically, a variable with multiple loadings does not contribute to the factorial description, but this is helpful because these multiple loadings are the same over time. So, although all variables do not exhibit a simple structure, a complex but invariant factor pattern may end up being more practically useful because it establishes the identity of factors across occasions.

STRUCTURAL EQUATION MODELS USING TIME-SERIES CONCEPTS

Many SEMs for repeated measures data come from the time-series literature (e.g., Browne & Nesselroade 2005, Nesselroade et al. 2001). These models typically do not deal with group averages, or even invariant common factors, but are based solely on time-to-time dependencies indicated by the covariance structures. Here we discuss several popular variations based on time-series regressions among invariant common factors over time.

Crossed-Lagged Regression of Factors

The introduction of multiple constructs within each longitudinal occasion of measurement leads naturally to questions about time-dependent relationships among changes in these factors. A classical SEM for multiple factors over time is based on a latent variable cross-lagged regression model (Gollob & Reichardt 1987, Rogosa 1979, Shadish et al. 2002). In Figure 4c, we assumed that each common factor influences itself over time with lagged auto-regressions (β_g and β_h) and that each factor crosses over to influence the other factor at subsequent times (γ_g and γ_h).

This basic two-occasion two-factor model is used in an attempt to isolate the pattern of influences across the constructs over time. Indeed, this cross-lagged setup inspired the optimistic label of “causal modeling” for all SEMs (Bentler 1980; cf. McDonald 1985). However, for proper time-series causal inference, the variances and covariances of the factors (φ_g², φ_h², φ_gh) are required to be equal over time; these restrictions imply the common factors have reached a stationary state or a point of equilibrium. As with most other invariance hypotheses, these tests are fitted to raw score covariances and not merely to correlations (Meredith & Horn 2001). These important tests require a complex set of model constraints, so they are often simply ignored (Browne & Nesselroade 2005).

The lagged coefficients (β_g and β_h) provide information about the general stability within each variable, and the crossed coefficients (γ_g and γ_h) give information about the impact of one factor upon the other. If we can force one of these crossed coefficients to zero without a large misfit (γ_g = 0), then we can say that this factor (g[t]) is not a leading indicator over time of the other factor (h[t+1]). It is also possible to fit a model in which both influences are zero (γ_g = γ_h = 0), and if this fits well, then we can assert that the common factors do not influence one another. Except in rare cases (e.g., dyads), it is not reasonable to examine the exact equality of the processes (γ_g = γ_h) because different common factors are not in the same scale of measurement. A simpler alternative that also needs to be rejected is that only one common factor is needed over all times so there are no crossed effects at all (e.g., Figure 4a).
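As a rough illustration of how the lagged and crossed coefficients generate the Time 2 factor covariances, the sketch below collects them in a coefficient matrix and compares the implied Time 2 covariances with the Time 1 values. Following the labeling above, gamma_g is taken to carry the effect of g[1] on h[2]; the disturbance covariance matrix Omega and all numeric values are assumptions introduced for the example.

```python
import numpy as np

# Hypothetical coefficients for illustration.
beta_g, beta_h = 0.6, 0.5    # lagged auto-regressions
gamma_g, gamma_h = 0.3, 0.0  # gamma_g: g[1] -> h[2]; gamma_h: h[1] -> g[2]

# Rows are outcomes (g[2], h[2]); columns are predictors (g[1], h[1]).
B = np.array([[beta_g, gamma_h],
              [gamma_g, beta_h]])

# Time 1 factor covariances and disturbance covariances (assumed values).
Phi1 = np.array([[1.0, 0.4],
                 [0.4, 1.0]])
Omega = np.array([[0.5, 0.1],
                  [0.1, 0.6]])

# Implied covariances of the Time 2 factors and of Time 2 with Time 1.
Phi2 = B @ Phi1 @ B.T + Omega
Phi21 = B @ Phi1

# The stationarity restriction requires Phi2 to equal Phi1.
is_stationary = np.allclose(Phi2, Phi1)
print(np.round(Phi2, 3), is_stationary)
```

With these assumed values the factor variances change from Time 1 to Time 2, so the equilibrium restriction would be rejected even though the factor covariance happens to stay at 0.4.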

Under this set of assumptions, any significant cross-regression indicates a prediction over time independent of the outcome variable’s own history, and this is a classic definition of a causal influence in observational data (e.g., Hsiao 2003). It has recently been pointed out that the longitudinal cross-lagged coefficient provides a reasonable test of a mediation hypothesis, and longitudinal data may be necessary for mediation theory (Cole & Maxwell 2003, MacKinnon et al. 2007). Nevertheless, the main problem with making any such causal assertions from longitudinal data is that they may be wrong, and we might not know it from our model fitting (Shadish et al. 2002). These inferences may be wrong because the variables have not reached equilibrium, or other variables are missing from the model that alter the influences, or these common factors are not invariant, and so on. These are not easy problems to overcome in longitudinal observational data.

It may be useful to point out that the latent change score model (Figure 4b) can be extended for use with multiple constructs with sets of latent changes (Δg and Δh). These change equations can directly represent the parameters of most interest, but they have the misfortune of appearing far more complicated than the cross-lagged model so they are not drawn here. Still, these SEMs can directly represent common questions, such as whether a change in X produces a change in Y, by turning a passive factor covariance (i.e., Δg ↔ Δh) into an active regression of factor changes (i.e., Δg → Δh). We must consider if this question might be best represented using a model with regressions of changes on levels (as in Figure 2c), and we return to this issue below. In either case, it is certainly possible to specify some additional change questions as formal hypotheses.

The overall benefit of any crossed-lagged SEM comes when the model alternatives are clear and testable or when they can suggest a need for the collection of additional data. Patterns of causal influences are more complex and are not so easy to test, especially shorter-term feedback loops, different patterns of causal influence at different times, or different influences for subgroups of persons. In many cases, we can use SEM to ensure that these tests are meaningful statements of the hypotheses, but these models certainly cannot deal with all threats to the validity of all time-based causal assertions (see Shadish et al. 2002).

Extending Time-Series Factor Models to Multiple Occasions

We next consider more than two occasions of repeated measures data. Let’s assume the common-factor scores at a current time (f[t]) are fully predicted by scores on the same factor at an earlier time (f[t−1]) with the classical inclusion of a disturbance term (z[t]). This type of model is presented in Figure 5a for four time points with invariant factor loadings (Λ) for the observed variables (Y[t]). As in most time-series models, the means are not usually restricted, so neither intercepts nor changes in the means are considered here.

Figure 5

Alternative multiple-occasion structural equation models based on time-series concepts. (a) Quasi-Markov simplex with one common factor and (b) cross-lagged regression over many occasions.

In classical time-series analysis, the specific time of observation does not matter, but we do focus on the interval of time (Δt) between observations. For this reason, we only draw one regression weight (β) for a specific unit of time, and one disturbance variance (ω²), and we assume this is invariant (stationary) over time. This is a highly restricted structure over time, a Markov simplex, wherein each covariance is a function of these parameters and the time interval (for occasions j and k, σ_jk = φ_1² β^(k−j)). As a start, we assume that the current scores are based only on immediately past behaviors, and this is a testable hypothesis. In early work on this topic, it was shown how all the errors of measurement could be separately estimated from observed variables with only a few occasions of measurement (i.e., T = 4). In later research, models with more-complete common factors were included (f[t] in a quasi-Markov simplex; Jöreskog & Sörbom 1979). Assuming the common-factor loadings are invariant over time, we can test a broad hypothesis of equilibrium by asking if there is equality (i.e., stationarity) of the common-factor covariances within each time.
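The simplex structure described above can be sketched numerically. The function below fills in the factor covariance matrix from the covariance expression given in the text, using the absolute lag so that the matrix is symmetric; the function name and values are hypothetical. The relation used for omega_sq is the standard condition for a constant factor variance under a first-order auto-regression, which is derived here rather than quoted from the text.

```python
import numpy as np

def markov_simplex_covariance(phi1_sq, beta, n_occasions):
    """Covariance of f[1..T] under a stationary first-order auto-regression."""
    occasions = np.arange(n_occasions)
    # Absolute time lag |k - j| between every pair of occasions.
    lag = np.abs(occasions[:, None] - occasions[None, :])
    return phi1_sq * beta ** lag

# Illustrative values only.
phi1_sq, beta = 2.0, 0.5
sigma_f = markov_simplex_covariance(phi1_sq, beta, n_occasions=4)

# Disturbance variance that keeps the factor variance constant over time:
# phi1_sq = beta**2 * phi1_sq + omega_sq.
omega_sq = (1.0 - beta**2) * phi1_sq

print(np.round(sigma_f, 3))
print(omega_sq)
```

Every element of the four-occasion covariance matrix is determined by just two parameters in this sketch, which illustrates why the simplex is described as highly restricted.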

This time-series framework leads to a highly restricted covariance structure, so the simple auto-regressive factor model of Figure 5a does not always provide a good fit to the data, and more complexity may be needed (Nesselroade et al. 2001). Many alternatives can be considered at this point, including the introduction of more predictors from earlier times (i.e., using multiple back-shifts or lags f[t−2], f[t−3], etc.), and other concepts about correlations of nearest points (i.e., latent moving average terms; Browne & Nesselroade 2005). However, it is clear that most SEM researchers avoid these issues and simply allow some or all of these parameters to vary, especially the auto-regressions (i.e., β[1], β[2], etc.) and disturbances (i.e., ω[1]², ω[2]², etc.). In real longitudinal data collections, the time period sampled may span different causal systems, and the crossed-lagged coefficients across time may need to vary, but this may lead to a more complex causal interpretation (Gollob & Reichardt 1987). This is another case in which standard SEM programs can be used to estimate parameters outside the bounds of the usual time-series interpretation.

Extending Cross-Lagged Factor Regression to Multiple Occasions

In Figure 5b, we presume each common factor is predicted by itself with lagged regressions (β) and by the other common factor with crossed regressions (γ). This simplified model extends the formal basis of cross-lagged common factors (Figure 4c) to many more occasions (T > 2). As a starting point, the effects over time are only included for factor scores at the immediately preceding time point. This framework allows us to evaluate whether any variable (g[t]) is an outcome of both itself at an earlier time (g[t−1]) and also an outcome of a different variable (f[t−1]) at an earlier time. As stated above, we may need to consider more complex models that include effects from other time lags (e.g., t−2, t−3). In multivariate time-series analysis, we assume all common factors have reached a state of equilibrium, and we assume the pattern of causal influences is identical over equal distances in time. The invariance constraints of the factor loadings and the cross-lagged coefficients lead to a simplicity that requires empirical evaluation, but these simplifications can be effective when dealing with a lot of data.
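The equilibrium assumption can be illustrated by solving for the factor covariance matrix that reproduces itself from one occasion to the next. The sketch below assumes an invariant coefficient matrix B (lagged weights on the diagonal, crossed weights off the diagonal) and a disturbance covariance matrix Omega; the names and values are hypothetical, and the Kronecker-product solution is a standard linear-algebra device rather than a procedure taken from the text.

```python
import numpy as np

def stationary_factor_covariance(B, Omega):
    """Solve Phi = B Phi B' + Omega for the equilibrium factor covariance."""
    B = np.asarray(B, dtype=float)
    Omega = np.asarray(Omega, dtype=float)
    k = B.shape[0]
    # An equilibrium exists only if all eigenvalues of B lie inside the unit circle.
    if np.max(np.abs(np.linalg.eigvals(B))) >= 1.0:
        raise ValueError("B does not imply a stationary process")
    # vec(B Phi B') = (B kron B) vec(Phi), so (I - B kron B) vec(Phi) = vec(Omega).
    vec_phi = np.linalg.solve(np.eye(k * k) - np.kron(B, B), Omega.reshape(-1))
    return vec_phi.reshape(k, k)

# Illustrative values only.
B = np.array([[0.6, 0.0],
              [0.3, 0.5]])
Omega = np.array([[0.5, 0.1],
                  [0.1, 0.6]])
Phi = stationary_factor_covariance(B, Omega)

# Iterating the same regressions over many occasions approaches the same matrix.
Phi_t = np.eye(2)
for _ in range(200):
    Phi_t = B @ Phi_t @ B.T + Omega

print(np.round(Phi, 4))
```

Comparing estimated factor covariances at each occasion against such an equilibrium matrix is one informal way to think about what the stationarity constraints are asking of the data.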

This multi-occasion cross-lagged factors model provides a rigorous way to evaluate whether the phenomena under study are linked in a stationary (i.e., nonevolving) process. In practice, this model is often applied in observational panel studies without a time-series foundation, and a multitude of additional coefficients are needed for each occasion (e.g., Cillessen & Mayeux 2004). These kinds of longitudinal analyses might isolate causal/control features among the factors, but the resulting effects must be further studied in more rigorously controlled experiments before we could be certain about the true causal influences (Shadish et al. 2002). Of course, this is not to imply that randomization solves all problems: a randomized treatment may affect any model parameter, so group differences in the cross-lagged coefficients may be key (McArdle 2007a).

LATENT-CURVE CONCEPTS

The popular time-series models do not deal with the group averages over time, but previous SEM research has considered many models that include both means and covariances (Harris 1963, Horn 1972, Horn & McArdle 1980, Jöreskog 1973, Jöreskog & Sörbom 1979). In a novel and comprehensive approach to this problem, Meredith & Tisak (1990) demonstrated how classical growth-curve models could be represented and fitted using a standard SEM based on restricted common factors for means and covariances. These representations of latent-curve SEMs were critical because they offered a wide range of alternatives to stationarity and equal-interval assumptions. This new SEM approach quickly spawned methodological and substantive applications (e.g., Bollen & Curran 2006; Duncan et al. 2006; McArdle 1986, 2001).

Latent Growth-Curve Models

One form of the latent-curve model is depicted in Figure 6a. In this model, we assume that each set of observed variables (Y[t]) reflects a set of invariant common factors (f[t]) separated from unique factors. Here, the common factors are organized to have three unobserved or latent scores: (a) a latent intercept or initial level (f_i), (b) a latent slope (f_s) representing the change over time, and (c) a time-specific independent state (z[t]). To indicate the average changes over time, we define a set of group coefficients or basis weights (e.g., slope loadings α[t]) based on time since some event (e.g., time since surgery, time since birth). These model parameters are used to form the shape of the trajectory over occasions.

Figure 6

Alternative multiple-occasion structural equation models based on latent-growth concepts. (a) Latent-curve model for one common factor and (b) bivariate latent-curve model for two common factors.

In path diagrams such as Figure 6a, the level and slope are often assumed to be random variables with fixed means (θ_i, θ_s) and random variances (φ_i², φ_s²) and covariance (φ_is). We assume there is a within-time-state variance (ω²) common to all observed measures (Horn 1972, McArdle & Woodcock 1997) and one unique variance (ψ_m²) specific to each measure. For simplicity, these variance terms are assumed to be invariant over time and uncorrelated with all other components, but we recognize these restrictions may not be appropriate for real data. This type of path diagram is a direct translation of the average cross-products matrix algebra used to estimate these models (Grimm & McArdle 2005). This inclusion of the basis coefficients (α[t]) in this way means these parameters are shared in all model restrictions. Specifically, this includes a proportional relationship over time for the means (i.e., μ[t] = θ_i + θ_s α[t]) and the standard deviations (σ[t], including α[t]), and a more complex relationship with the over-time correlations (ρ[t,t+1], including α[t] and α[t+1]). Notice that the mean structure is formed as in an ANOVA, but the covariance structure lies in between the restrictions of ANOVA and the unrestricted MANOVA. This is important because the inclusion of appropriate restrictions on the covariance structure from the latent-curve model increases the statistical power of tests of mean differences (Muller & Stewart 2006, Muthén & Curran 1997).
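The shared role of the basis coefficients can be written out for the common-factor scores. The expressions below are a sketch derived from the assumptions stated above (a unit loading on the level, slope loadings α[t], and a state z[t] with variance ω² that is uncorrelated with the level, the slope, and the states at other times); the subscript n for a person and the covariance notation σ[t,t′] are labels introduced here.

```latex
\begin{aligned}
f[t]_n &= f_{i,n} + \alpha[t]\, f_{s,n} + z[t]_n, \\
\mu[t] &= \theta_i + \theta_s\,\alpha[t], \\
\sigma^2[t] &= \phi_i^2 + 2\,\alpha[t]\,\phi_{is} + \alpha[t]^2\,\phi_s^2 + \omega^2, \\
\sigma[t,t'] &= \phi_i^2 + \bigl(\alpha[t] + \alpha[t']\bigr)\,\phi_{is}
               + \alpha[t]\,\alpha[t']\,\phi_s^2 \qquad (t \neq t').
\end{aligned}
```

The same α[t] appears in the means, the variances, and the covariances, so the correlation ρ[t,t+1] depends on both α[t] and α[t+1] through the ratio of σ[t,t+1] to the product of σ[t] and σ[t+1].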

Fitting Latent-Curve Hypotheses

Different organizations of the basis parameters represent specific hypotheses to be tested. For example, if the basis is set to zero (α[t] = 0), this eliminates the slope impacts and produces a level-only model with equal means and a compound symmetry structure. If we fix the basis to be the specific time of measurement (α[t] = t−1), we can represent a straight-line (linear) growth curve with a more complex shared parameter structure. Other popular nonlinear models include polynomial models (quadratic, cubic) and exponential forms (e.g., Ghisletta & McArdle 2001, Grimm et al. 2007). Although not as popular, we can also estimate latent-basis coefficients as we do any other set of factor loadings; because these are essentially factors of time, this leads to an estimate of an optimal shape for the group curve and individual differences (i.e., McArdle 1986, 1989; McArdle & Bell 2000; Meredith & Tisak 1990).
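The effect of choosing a basis can be sketched by computing the implied means and covariances of the common-factor scores for two of the hypotheses just described. The function name and parameter values below are hypothetical, and the state variance omega_sq is assumed to be invariant over time and uncorrelated with the level and slope.

```python
import numpy as np

def latent_curve_moments(alpha, theta_i, theta_s,
                         phi_i_sq, phi_s_sq, phi_is, omega_sq):
    """Implied means and covariances of f[t] for a given basis alpha[t]."""
    alpha = np.asarray(alpha, dtype=float)
    # Loadings: a fixed 1 on the level and alpha[t] on the slope.
    A = np.column_stack([np.ones_like(alpha), alpha])
    theta = np.array([theta_i, theta_s])
    Phi = np.array([[phi_i_sq, phi_is],
                    [phi_is, phi_s_sq]])
    means = A @ theta
    cov = A @ Phi @ A.T + omega_sq * np.eye(alpha.size)
    return means, cov

# Illustrative values only.
T = 4
params = dict(theta_i=10.0, theta_s=2.0,
              phi_i_sq=4.0, phi_s_sq=1.0, phi_is=0.5, omega_sq=1.0)

# Level-only hypothesis: alpha[t] = 0 at every occasion.
level_means, level_cov = latent_curve_moments(np.zeros(T), **params)

# Linear hypothesis: alpha[t] = t - 1 for t = 1, ..., T.
linear_means, linear_cov = latent_curve_moments(np.arange(T), **params)

print(level_means, np.round(level_cov, 2))
print(linear_means, np.round(linear_cov, 2))
```

In this sketch the zero basis yields equal means with one common variance and one common covariance (compound symmetry), whereas the linear basis yields means that rise by theta_s per occasion and variances that change with alpha[t]. A latent-basis model would instead treat some elements of alpha as free parameters.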

This latent-curve model is often expanded into what is popularly known as a multilevel, hierarchical, or random coefficient form. From an SEM perspective, we simply add a group-regression model that follows our use of group coding described above (Figure 3a). Here the predictors are group codes (G) or covariates (X) and the outcomes are the latent levels (f_i) and latent slopes (f_s). Because these outcomes are latent levels and latent slopes, this is termed a “second-level equation.” There are some minor points of disagreement about exactly which random-coefficients models can and cannot be fit using standard SEM software (Cudeck & Harring 2007), but newer SEM software offers an effective way to deal with most practical problems (e.g., Ferrer et al. 2004).

We can compare the latent-curve model to standard ANOVA approaches. MANOVA makes no explicit provision for the structure of the covariances or even for uncorrelated residuals, so the otherwise comparable MANOVA tests (of linearity, etc.) require far more parameters and can be expected to yield far less power. This is not to suggest that SEM is best used with very small samples, but it does suggest that estimating a minimal number of parameters is a powerful idea. It is also well known that the standard MANOVA equations can be difficult to use with incomplete data (Bock 1975, Hedecker & Gibbons 2006). The additional requirement of homogeneity of the covariances over groups, a test often ignored in practice, may be more realistic if this test is formed as an SEM with latent-curve invariance over groups.

Considering Multiple Latent Curves

Assuming we have two common factors measured at multiple times, we can fit what appears to be an entirely different model, one based on multiple latent curves (McArdle 1989). One popular version of this model is displayed in Figure 6b. In this model, we assume that each series is based on its own latent curve, with unlabelled arrows but different shapes (α_f[t], α_g[t]) and with different parameters for the respective levels and slopes. However, the new information in this bivariate model comes from the cross-covariances of the levels and slopes. This model has recently become a popular way to represent a parallel-growth process (Bollen & Curran 2006, Duncan et al. 2006, Singer & Willett 2003). Of interest here is the correlation of the two latent slopes (φ_fs,gs), which is an error-free index of simultaneous changes across different variables (f_s and g_s). Given other restrictions (φ_fi,fs = 0, φ_gi,gs = 0), we can test the hypothesis of “no connection in changes” among the factor scores (φ_fs,gs = 0; Hertzog et al. 2006).

A problem of inference emerges when the direct test of correlated slopes is interpreted as the test of a dynamic impact (e.g., as in McArdle 1989; cf. MacCallum et al. 1997). The correlation of latent changes across different variables does not change over time and it does not represent a directional dynamic hypothesis. In the hope of obtaining time-dependent dynamic information, some researchers have tried to substitute a cross-lagged regression of the latent slopes on the latent levels (e.g., Bollen & Curran 2006, Singer & Willett 2003, Snyder et al. 2003). Although this model offers new latent-variable parameters, the only reasonable situation for this change regression comes when the latent levels are known to precede the latent slopes. That is, this levels → slopes model may be useful in experimental situations when there is a similar starting point for all subjects, but these same model parameters may be quite arbitrary with most observational data.

A related form of the latent-curve model with widespread usage in epidemiology and biostatistics is the time-varying covariate model adapted from work on survival analysis regression (Cox 1972). When applied in SEM (e.g., Bollen & Curran 2006), one of the variables (X[t]) is thought to be responsible for some part of the curvature of the other (Y[t]), so its influence is removed from the outcome scores within each time (or at lagged times). These time-varying covariate models are relatively easy to implement using existing computer software (e.g., Mplus, MIXED), so they are rapidly growing in popularity. Although covariate adjustment may be needed, this time-varying covariate approach is designed to remove all impacts, and this approach may not tell us much about the dynamic interplay among variables.

LATENT-CHANGE CONCEPTS

In any data analysis problem where multiple constructs have been measured at multiple occasions, we need to consider the importance of causal sequences and determinants of changes (Nesselroade & Baltes 1979). The goal of evaluating time-based sequences, especially when things are changing, is one of the main reasons for collecting longitudinal repeated-measure data in the first place. We have pointed out above the useful benefits of the classical models, but we have also seen that each is limited to specific forms of dynamic inference. Of course, the statistical evaluation of dynamic sequences is not an easy problem, and these problems have puzzled researchers for decades. We describe below how the prior SEMs lead directly to new SEMs that can provide a more flexible framework for causal-dynamic questions.

Mixing Models for Means and Covariances

The time-series and latent-curve models discussed in the previous two sections are not identical, but they can be fit to the same repeated measures data. The distinguishing feature of time-series factor models (Figure 5a) is the use of time of measurement as a guide to organize the predictive regressions, i.e., moving forward in time. In contrast, in the latent-curve model (Figure 6a), we use the data at any time to define group curves and individual differences around a trajectory, so time-to-time predictions are not essential. Typically, these models do not use the same parameters, so they cannot be directly compared using standard goodness-of-fit tests. This has led some researchers to use both types of models with the same data, and the use of a multiple-model strategy often seems sensible in practice (e.g., Cillessen & Mayeux 2004).

Other researchers have tried to combine aspects of these models. In recent statistical work, ANOVA researchers have recognized that when the standard models do not fit well enough, a variety of built-in covariance assumptions can be added that are not strictly connected to the hypotheses about the means (e.g., AR[t]; for details, see Muller & Stewart 2006). A similar strategy was initially suggested for SEM (Jöreskog 1973), and it became easy to follow this lead and simply paste the two diagrams (in Figure 5a and Figure 6b) together as a composite model in the hopes of achieving better fit (e.g., Curran & Bollen 2001, Horn & McArdle 1980). This composite strategy should end up with a better-fitting model, but the estimated parameters may still only be interpreted as separate parts.

A different way to approach this problem is to examine the specific theory generating the expectations. One common feature of contemporary repeated measures SEMs is that we are defining a trajectory over time (or an integral) in the scores, and the changes are implied using some difference (or differential) operator. However, if we look back to the classic literature on growth-curve analysis, the derivative (difference) was typically defined at the start, and this model of change then led to the expected integral (or trajectory) for the outcome of interest (Boker 2001, McArdle & Nesselroade 2003). Repeated measures SEMs can now be considered in this same way.

Latent Change Score Models

Figure 7a is a path diagram based on this concept of multiple latent change scores over time. Once again, we start with the separation of common from unique factors using invariant factor loadings (Λ). For single variables, this definition is similar to separating the latent or true score from the random error of measurement. We next follow the latent change score concept and consider each common-factor score (f[t]) to be the sum of the immediately previous factor score (f[t−1]) and some unobserved or latent change score (Δf[t]). If we then repeat this process for each time point, we add a layer of (t−1) new latent change scores to the model.

Figure 7

Alternative multiple-occasion structural equation models based on latent change concepts. (a) Latent change score model for one common factor and (b) bivariate latent change score model for two common factors.

This approach is a natural generalization of the previous models in which difference scores are included as unobserved variables (Figures 2b, 4b, 5b). Figure 7a includes latent change scores (Δf[t]) at each occasion (after Time 1), and we assume these latent variables are equidistant in time (Δt = 1) even if the observed scores are not. That is, the observed data may be unbalanced (Hamagami & McArdle 2000, 2007). This definition of an equal-interval latent time scale is nontrivial because it allows us to eliminate the time lag (Δt) from all equations. The use of many fixed unit coefficients, a deceptively simple algebraic device, allows us to start with a change equation and then define any trajectory equation. That is, we do not directly define auto-regressions (β[t]) or slope coefficients (α[t]). Instead, we directly define the model of change and indirectly create over-time expectations from the accumulation of latent changes among latent variables.

This latent change equation produces many unusual-looking path diagrams (Figure 7a; see Collins & Sayer 2001), but because we have included all model parameters, the standard path-tracing rules of expectations remain intact. In these tracings, any change that occurs earlier accumulates and is expressed in the later occasions. The first term in the accumulation process may be traced in the diagram by starting at the first change score (Δf[2]). This change does not affect the prior score (f[1]), but it does influence the second time directly (by the fixed 1), and it is an indirect part of all the other latent common factors through the sequence of fixed-unit values (from f[t] to f[t+1]). Similarly, the next change score (Δf[3]) does not affect prior times, but it is a part of all future times. This sequence is used to form a set of expectations for the means, variances, and covariances over time, but potentially complex expectations are automatically generated using any standard SEM software (Grimm & McArdle 2005; Hamagami & McArdle 2000, 2007; McArdle 2001).
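The accumulation traced above can be sketched with a lower-triangular matrix of fixed unit values, so that each factor score is the initial score plus all earlier changes. The example is a simplified illustration: the scores, means, and variances are invented, and the initial score and the change scores are assumed to be mutually uncorrelated, which the models in the text do not require.

```python
import numpy as np

T = 4
# Fixed unit coefficients: f[t] = f[1] + sum of the changes up to time t.
L = np.tril(np.ones((T, T)))

# One person's initial score and latent changes (hypothetical values):
# (f[1], change[2], change[3], change[4]).
d = np.array([50.0, 2.0, 1.5, -0.5])
f = L @ d  # identical to np.cumsum(d)

# The same matrix generates the expectations over time.
mean_d = np.array([50.0, 2.0, 2.0, 2.0])
cov_d = np.diag([9.0, 1.0, 1.0, 1.0])  # uncorrelated components (assumption)
mean_f = L @ mean_d
cov_f = L @ cov_d @ L.T

print(f)
print(mean_f)
print(cov_f)
```

An early change contributes to the variance of every later occasion in this sketch, which mirrors the path-tracing description: a change score does not affect prior times but is part of all future times.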

This approach to latent change scores can represent all difference and change concepts from the models discussed above. For example, the latent intercept term (f_i) has effects along the one-headed arrows (from f[t] to f[t+1]), so the intercept mean (θ_i) and variance (φ_i^2) are part of the expected value of every time point. Next we add a latent slope score (f_s) with loadings (α[t]) and with a mean (θ_s). The latent slope is not connected to the first factor score (f[1]), but it affects the changes (Δf[t]), and this influence is accumulated over subsequent time points. We can also include a prediction of the latent change score (Δf[t]) as a linear function (β) of the factor score at the previous time (f[t−1]), plus a state residual (z[t]), and these effects are multiplied over time. Because of the common-factor model separation, the error variances (ψ_m^2) are assumed to be constant over time and are not part of this accumulation.

The resulting model in Figure 7a is termed a dual-change score model. In this expression of change, we permit both a systematic constant change (α) from the linear slope and a systematic proportional change (β) over time. As a simple start, these change coefficients (α, β) can be considered invariant over time (e.g., ergodic). The invariance of dynamic parameters does not mean the expectations are constant (the latent scores can grow and change), but it does mean that the expectations accumulate in a systematic fashion. This simple linear-difference model with multiple control parameters leads to a nonlinear growth trajectory from the accumulation of latent changes and comes from a family of curves based on linear and exponential trajectories (Ghisletta & McArdle 2001, Grimm et al. 2007, Hamagami & McArdle 2007).
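To see how a linear change equation yields a nonlinear trajectory, the expected factor means can be accumulated numerically. The sketch below assumes the dual-change equation takes the form Δf[t] = α·f_s + β·f[t−1] + z[t], with z[t] having mean zero and with α and β invariant over time. The function names and the numeric values are hypothetical and chosen only for illustration; this is not the estimation procedure of any SEM program.

```python
def dual_change_means(theta_i, theta_s, beta, n_occasions, alpha=1.0):
    """Expected factor means from accumulating expected latent changes.

    theta_i: mean of the latent intercept (expected f[1])
    theta_s: mean of the latent slope score
    beta:    proportional-change coefficient (assumed invariant over time)
    alpha:   slope loading (assumed invariant over time; fixed at 1 by default)
    """
    means = [theta_i]
    for _ in range(1, n_occasions):
        prev = means[-1]
        expected_change = alpha * theta_s + beta * prev  # E(delta f[t]); E(z[t]) = 0
        means.append(prev + expected_change)             # fixed unit path: f[t] = f[t-1] + delta f[t]
    return means


def closed_form_mean(theta_i, theta_s, beta, t, alpha=1.0):
    """Closed-form solution of the same first-order difference equation."""
    if beta == 0:
        return theta_i + alpha * theta_s * (t - 1)       # purely linear growth
    c = alpha * theta_s / beta
    return (1 + beta) ** (t - 1) * (theta_i + c) - c     # exponential approach or departure


# Illustrative (made-up) values: positive constant change, negative proportional change.
trajectory = dual_change_means(theta_i=10.0, theta_s=5.0, beta=-0.2, n_occasions=6)
```

With β = 0 the recursion reduces to a straight line, and with −1 < β < 0 the expected means level off toward −α·θ_s/β, which is one way to see the mix of linear and exponential forms described above.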

Multiple Latent Change Score Models

Figure 7b is a latent change score model for two common factors. Here we draw the dual-change model for each set of observed variables (Y[t] and X[t]) in terms of their common factors (f[t] and g[t]). We assume the sets of observed scores are measured over a defined interval of time and the latent variables are defined over an equal interval of time (Δt = 1), and we add layers of latent difference scores (Δf[t] and Δg[t]). This model includes the use of fixed-unit values (unlabeled arrows) to define pairs of latent changes (Δf[t] and Δg[t]), and equality (invariance) constraints over time within a factor (for the α, β, and γ parameters) to simplify estimation and identification. Most critically, in this model a coupling parameter (γ_f) represents the time-dependent effect of one construct (g[t−1]) on the subsequent change in the other (Δf[t]). We can include both directions (γ_f and γ_g) and consider many different SEMs for multiple latent changes.
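Written as equations, the pair of change models described here might look as follows. This is a sketch that extends the univariate dual-change form with the coupling terms; the symbols f_s, g_s, z_f[t], and z_g[t] for the factor-specific slopes and state residuals are labels assumed here for illustration.

```latex
% Fixed unit coefficients define the latent changes for each factor:
f[t] = f[t-1] + \Delta f[t], \qquad g[t] = g[t-1] + \Delta g[t]
% Each change has a constant part (\alpha), a proportional part (\beta),
% and a coupling part (\gamma) from the other factor at the previous occasion:
\Delta f[t] = \alpha_f f_s + \beta_f f[t-1] + \gamma_f g[t-1] + z_f[t]
\Delta g[t] = \alpha_g g_s + \beta_g g[t-1] + \gamma_g f[t-1] + z_g[t]
```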

This new change model subsumes all aspects of the previous models as special cases to be tested. We can fit the standard cross-lagged factor model (Figure 5b) by eliminating the latent intercepts and the latent slopes. The standard cross-lagged models do not allow for systematic growth components, but this is now a testable feature of this change model. To obtain a bivariate latent curve (Figure 6b), we eliminate both the autoregressive (β_f = β_g = 0) and coupling parameters (γ_f = γ_g = 0). The bivariate latent growth models may represent parallel latent processes, even including regressions of slopes on levels, but they do not allow for cross-lagged dynamic coupling of the key factors over time, so such a simple model may not capture the systematic changes or fit the data.
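The nesting of these models can be illustrated numerically on the expected means alone. The sketch below assumes slope loadings fixed at 1, time-invariant β and γ, and residuals with mean zero; the function name and all numeric values are hypothetical. It shows only how the restrictions change the implied trajectories, not how the restricted models would be fitted or compared.

```python
def bivariate_dcs_means(theta_i, theta_s, beta, gamma, n_occasions):
    """Expected means of two coupled factors (f, g) under a bivariate dual-change model.

    Each argument except n_occasions is an (f, g) pair:
    theta_i: intercept means, theta_s: slope means,
    beta: proportional-change coefficients, gamma: coupling coefficients
    (gamma[0] carries g[t-1] into the change in f; gamma[1] carries f[t-1] into the change in g).
    """
    f, g = [theta_i[0]], [theta_i[1]]
    for _ in range(1, n_occasions):
        f_prev, g_prev = f[-1], g[-1]
        change_f = theta_s[0] + beta[0] * f_prev + gamma[0] * g_prev
        change_g = theta_s[1] + beta[1] * g_prev + gamma[1] * f_prev
        f.append(f_prev + change_f)  # accumulate through the fixed unit paths
        g.append(g_prev + change_g)
    return f, g


start, slopes = (10.0, 20.0), (2.0, -1.0)  # illustrative values only

# Bivariate latent curve: beta_f = beta_g = 0 and gamma_f = gamma_g = 0,
# so every expected change equals the slope mean and both trajectories are straight lines.
f_curve, g_curve = bivariate_dcs_means(start, slopes, (0.0, 0.0), (0.0, 0.0), 5)

# One-directional coupling: g[t-1] influences the change in f, but not the reverse.
f_coupled, g_coupled = bivariate_dcs_means(start, slopes, (-0.1, -0.1), (0.05, 0.0), 5)
```

Setting γ_f back to zero in the second call leaves the trajectory of g unchanged and alters only f, which is the sense in which a single coupling parameter is a separately testable feature.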

The inclusion of latent changes in a bivariate model allows a variety of dynamic models to be tested using the standard SEM statistical approach. These bivariate trajectories can be complex, but they are automatically created as a linear accumulation of first differences for each variable by standard SEM programs. These bivariate latent change score models