
CHAPTER 14

Random Effects Models for Longitudinal Survey Data

C. J. Skinner and D. J. Holmes

14.1. INTRODUCTION

Random effects models have a number of important uses in the analysis of longitudinal survey data. The main use, which we shall focus on in this chapter, is in the study of individual-level dynamics. Random effects models enable variation in individual responses to be decomposed into variation between the 'permanent' characteristics of individuals and temporal 'transitory' variation within individuals.

Another important use of random effects models in the analysis of longitudinal data is in allowing for the effects of time-constant unobserved covariates in regression models (e.g. Solon, 1989; Hsiao, 1986; Baltagi, 2001). Failure to allow for these unobserved covariates in regression analysis of cross-sectional survey data may lead to inconsistent estimation of regression coefficients. Consistent estimation may, however, be achievable with the use of random effects models and longitudinal data.

A 'typical' random effects model may be conceived of as follows. It is supposed that a response variable $Y$ is measured at each of a number of successive waves of the survey. The measurement for individual $i$ at wave $t$ is denoted $y_{it}$ and this value is assumed to be generated in two stages. First, 'permanent' random effects $\theta_i$ are generated from some distribution for each individual $i$. Then, at each wave, $y_{it}$ is generated from $\theta_i$. In the simplest case this generation follows the same process independently at each wave. For example, we may have

$$\theta_i \sim N(\theta, \sigma_1^2), \qquad y_{it} \mid \theta_i \sim N(\theta_i, \sigma_2^2). \tag{14.1}$$

Under this model, longitudinal data enable the 'cross-sectional' variance $\sigma_1^2 + \sigma_2^2$ of $y_{it}$ to be decomposed into the variance $\sigma_1^2$ of the 'permanent' component $\theta_i$ and the variance $\sigma_2^2$ of the 'transitory' component at each wave.

Analysis of Survey Data. Edited by R. L. Chambers and C. J. Skinner. Copyright © 2003 John Wiley & Sons, Ltd. ISBN: 0-471-89987-9

This may aid understanding of the mobility of individuals over time in terms of their place in the distribution of the response variable. An example, which we shall focus on, is where the response variable is earnings, subject to a log transformation, and a model of the form (14.1) enables us to study the degree of mobility of an individual's place in the earnings distribution (e.g. Lillard and Willis, 1978).
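As a numerical illustration of the decomposition under model (14.1), the following sketch simulates a balanced panel and recovers the two variance components by simple moment estimators. The function names, the parameter values and the moment estimators themselves are assumptions made for this example; they are not the estimation methods developed later in the chapter, and no survey weights or nonresponse are involved.

```python
import random
import statistics

def simulate_two_stage(n, T, theta, sigma1, sigma2, seed=0):
    """Draw y_it for n individuals and T waves under model (14.1)."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n):
        theta_i = rng.gauss(theta, sigma1)            # stage 1: permanent effect
        panel.append([rng.gauss(theta_i, sigma2)      # stage 2: one draw per wave
                      for _ in range(T)])
    return panel

def decompose_variance(panel):
    """Moment estimates of (sigma1^2, sigma2^2) from a balanced panel."""
    T = len(panel[0])
    # Average within-person variance estimates the transitory variance.
    within = statistics.mean(statistics.variance(row) for row in panel)
    person_means = [statistics.mean(row) for row in panel]
    # var(person mean) = sigma1^2 + sigma2^2 / T, so remove the second term.
    between = statistics.variance(person_means) - within / T
    return between, within

# Hypothetical values: sigma1^2 = 0.16 and sigma2^2 = 0.0625.
panel = simulate_two_stage(n=4000, T=5, theta=7.0, sigma1=0.4, sigma2=0.25)
between, within = decompose_variance(panel)
print(f"permanent variance ~ {between:.3f}, transitory variance ~ {within:.3f}")
```

A single cross-section (T = 1) would identify only the sum of the two variances, which is why repeated measurements on the same individuals are needed.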

Rich classes of random effects models for longitudinal data have been developed for the above purposes. A number of different terms have been used to describe these models including variance component models, error component models, mixed effects models, multilevel models and hierarchical models (Baltagi, 2001; Hsiao, 1986; Diggle, Liang and Zeger, 1994; Goldstein, 1995).

The general aim of this chapter is to consider how to take account of complex sampling designs in the fitting of such random effects models. We shall suppose that there is a known probability sampling scheme employed to select the sample of individuals followed over the waves of the survey. Two additional complications will be that there may be wave nonresponse, so that not all sampled individuals will respond at each wave, and that the target population of individuals may not be fixed over time.

To provide a specific focus, we will consider data on earnings of male employees over the first five waves of the British Household Panel Survey (BHPS), that is over the period 1991–5. As a basic model for the log earnings $y_{it}$ of individual $i$ at wave $t$ ($= 1, \ldots, T$) we shall suppose that

$$y_{it} = \beta_t + u_i + \nu_{it}, \qquad t = 1, \ldots, T, \tag{14.2}$$

where the random effect $u_i$ is the 'permanent' random effect, referred to earlier, and the $\nu_{it}$ are transitory random effects, whose effects on the response variable may last beyond the current wave $t$ via the first-order autoregressive (AR(1)) model:

$$\nu_{it} = \rho\,\nu_{i,t-1} + \varepsilon_{it}, \qquad t = 1, \ldots, T. \tag{14.3}$$

Both $u_i$ and $\nu_{it}$ may include the effects of measurement errors (Abowd and Card, 1989). The random variables $u_i$ and $\varepsilon_{it}$ are assumed to be mutually independent with

$$E(u_i) = E(\varepsilon_{it}) = 0, \qquad \mathrm{var}(u_i) = \sigma_u^2, \qquad \mathrm{var}(\varepsilon_{it}) = \sigma_\varepsilon^2.$$

The unknown fixed parameters $\beta_t$ ($t = 1, \ldots, T$) represent annual (inflation) effects. Lillard and Willis (1978) considered this model (amongst others) for log-earnings for seven years (1967–73) of data from the US Panel Study of Income Dynamics. Letting $\sigma_\nu^2 = \mathrm{var}(\nu_{it})$ and assuming the $\varepsilon_{it}$ and $\nu_{it}$ are mutually independent and stationary, we obtain

$$\sigma_\nu^2 = \sigma_\varepsilon^2 / (1 - \rho^2). \tag{14.4}$$

We refer to the above model as Model B and to the more restricted 'variance components' model in which $\rho = 0$ as Model A. See Goldstein, Healy and Rasbash (1994) for further discussion of such models.
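To make the structure of Model B concrete, the sketch below simulates (14.2) and (14.3) and compares the sample variance and the lag-one covariance with the values implied by (14.4). The parameter values, the normal distributions and the helper names are assumptions of this example only; the model itself specifies only means and variances.

```python
import random

def simulate_model_b(n, beta, sigma2_u, sigma2_eps, rho, seed=0):
    """Simulate y_it = beta_t + u_i + nu_it with stationary AR(1) nu_it."""
    rng = random.Random(seed)
    sd_u = sigma2_u ** 0.5
    sd_eps = sigma2_eps ** 0.5
    sd_nu = (sigma2_eps / (1 - rho ** 2)) ** 0.5   # stationary sd, from (14.4)
    panel = []
    for _ in range(n):
        u = rng.gauss(0.0, sd_u)
        nu = rng.gauss(0.0, sd_nu)                 # start in the stationary distribution
        row = []
        for t, beta_t in enumerate(beta):
            if t > 0:
                nu = rho * nu + rng.gauss(0.0, sd_eps)   # AR(1) update (14.3)
            row.append(beta_t + u + nu)
        panel.append(row)
    return panel

def sample_cov(panel, s, t):
    """Unweighted sample covariance between waves s and t (0-based)."""
    n = len(panel)
    mean_s = sum(row[s] for row in panel) / n
    mean_t = sum(row[t] for row in panel) / n
    return sum((row[s] - mean_s) * (row[t] - mean_t) for row in panel) / (n - 1)

# Hypothetical parameter values, chosen only for illustration.
sigma2_u, sigma2_eps, rho = 0.15, 0.06, 0.4
sigma2_nu = sigma2_eps / (1 - rho ** 2)
panel_b = simulate_model_b(20000, [7.0, 7.05, 7.1, 7.15, 7.2],
                           sigma2_u, sigma2_eps, rho)
var_wave1 = sample_cov(panel_b, 0, 0)
cov_lag1 = sample_cov(panel_b, 0, 1)
print("variance:", var_wave1, "implied:", sigma2_u + sigma2_nu)
print("lag-1 covariance:", cov_lag1, "implied:", sigma2_u + rho * sigma2_nu)
```

Setting `rho=0` in the call gives data from Model A, for which all covariances between different waves equal the permanent variance.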

We shall consider two broad approaches to fitting these models under a complex sample design. The first is a covariance structure approach, following Chamberlain (1982) and Skinner, Holt and Smith (1989, section 3.4.5, henceforth referred to as SHS), in which the observations on the $T$ waves are treated as a multivariate outcome with individuals as 'single-level' units. This approach is set out in Section 14.2. The second approach treats the data as two-level (Goldstein, 1995) with the level 1 units as the waves $t = 1, \ldots, T$ and the level 2 units as the individuals $i$. The aim is to apply the methods developed by Pfeffermann et al. (1998). This approach is set out in Section 14.3. A related approach is developed by Feder, Nathan and Pfeffermann (2000) for a model with time-varying random effects. The application of both our approaches to earnings data from the British Household Panel Survey will be considered in Section 14.4.

14.2. A COVARIANCE STRUCTURE APPROACH

Following the notation in Section 14.1, let $y_i = (y_{i1}, \ldots, y_{iT})'$ be the $T \times 1$ vector representing the profile of values of individual $i$ over the $T$ waves of the survey. Under the model defined by (14.2)–(14.4), these multivariate outcomes are independent with mean vector and covariance matrix given respectively by

$$E(y_i) = \beta = (\beta_1, \ldots, \beta_T)', \tag{14.5}$$

$$\mathrm{var}(y_i) = \sigma_u^2 J_T + \sigma_\nu^2 V_T(\rho), \tag{14.6}$$

where $J_T$ is the $T \times T$ matrix of ones and $V_T(\rho)$ is the $T \times T$ matrix with the $(t, t')$th element given by $\rho^{(t'-t)}$ ($1 \le t \le t' \le T$).

These equations define a 'covariance structure' model in which the mean vector is unconstrained but the $k = T(T+1)/2$ distinct elements of the covariance matrix are constrained to be functions of the parameter vector $\theta = (\sigma_u^2, \sigma_\varepsilon^2, \rho)'$. Inference about these parameters may follow the approach outlined in SHS (section 3.4.5).
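The mapping from the parameter vector to the constrained covariance elements can be written out directly. The sketch below builds the matrix in (14.6), using (14.4) to convert the innovation variance into the stationary variance, and stacks its distinct elements. The function names and the numerical parameter values are hypothetical, chosen only for illustration.

```python
def model_cov(sigma2_u, sigma2_eps, rho, T):
    """Covariance matrix (14.6) implied by Model B, as a list of lists."""
    sigma2_nu = sigma2_eps / (1 - rho ** 2)           # equation (14.4)
    return [[sigma2_u + sigma2_nu * rho ** abs(t - s) for s in range(T)]
            for t in range(T)]

def vech(M):
    """Stack the lower triangle of a symmetric matrix column by column."""
    T = len(M)
    return [M[r][c] for c in range(T) for r in range(c, T)]

# Hypothetical parameter values; T = 5 waves gives k = 15 distinct elements.
A_theta = vech(model_cov(sigma2_u=0.15, sigma2_eps=0.06, rho=0.4, T=5))
print(len(A_theta))  # 15
```

Calling `model_cov` with `rho=0` gives the Model A structure, in which every off-diagonal element equals the permanent variance.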

Assuming first no nonresponse, let the data consist of the values $y_i$ for units $i$ in a sample $s$. The usual survey estimator of the finite population covariance matrix $\Sigma$ is given by

$$\hat\Sigma = \sum_s w_i (y_i - \bar y)(y_i - \bar y)' \Big/ \sum_s w_i, \tag{14.7}$$

where

$$\bar y = \sum_s w_i y_i \Big/ \sum_s w_i,$$

and where $w_i$ is the survey weight for individual $i$. Let $\hat A = \mathrm{vech}(\hat\Sigma)$ denote the $k \times 1$ vector of distinct elements of $\hat\Sigma$ (the 'vector half' of $\hat\Sigma$: see Fuller, 1987, p. 382) and let $A(\theta) = \mathrm{vech}[\mathrm{var}(y_i)]$ denote the corresponding vector of elements of $\mathrm{var}(y_i)$ from (14.6).
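A minimal sketch of the weighted estimator (14.7) follows, assuming complete response and plain Python lists as input. The function name and the three-person toy data set are made up for this example.

```python
def weighted_cov(Y, w):
    """Weighted mean vector and covariance matrix as in (14.7).

    Y is a list of per-individual response vectors, w the survey weights.
    """
    total_w = sum(w)
    T = len(Y[0])
    ybar = [sum(wi * yi[t] for wi, yi in zip(w, Y)) / total_w for t in range(T)]
    sigma_hat = [[sum(wi * (yi[s] - ybar[s]) * (yi[t] - ybar[t])
                      for wi, yi in zip(w, Y)) / total_w
                  for t in range(T)]
                 for s in range(T)]
    return ybar, sigma_hat

# Toy data: three individuals, two waves, weights 1, 1 and 2.
ybar, sigma_hat = weighted_cov([[1.0, 2.0], [3.0, 6.0], [5.0, 4.0]],
                               [1.0, 1.0, 2.0])
print(ybar)       # [3.5, 4.0]
print(sigma_hat)  # [[2.75, 1.0], [1.0, 2.0]]
```

Because the weights enter only through ratios, rescaling all weights by a common factor leaves both estimates unchanged.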


Following Chamberlain (1982), Fuller (1984) and SHS (section 3.4.5), a general class of estimators of $\theta$ is obtained by minimising

$$[\hat A - A(\theta)]' V^{-1} [\hat A - A(\theta)], \tag{14.8}$$

where $V$ is a given $k \times k$ non-singular matrix. A generalised least squares (GLS) estimator $\hat\theta_{GLS}$ is obtained by taking $V$ to be a consistent estimator of the covariance matrix of $\hat A$. One choice, $V_c$, is obtained from the linearisation method (Wolter, 1985) by approximating the covariance matrix of the elements of $\hat\Sigma$ by the covariance matrix of the corresponding elements of the linear statistic

$$\sum_s z_i, \tag{14.9}$$

where $z_i = w_i [(y_i - \bar y)(y_i - \bar y)' - \hat\Sigma] / \sum_s w_i$ is treated as a fixed variable. The estimator $V_c$ may allow in the usual way for the complex design (Wolter, 1985).

Since $A(\theta)$ is a non-linear function of $\theta$, iterative minimisation of (14.8) is required. It may be noted that, for a given value of $\rho$ under Model B, $A(\theta)$ is linear in $(\sigma_u^2, \sigma_\nu^2)$ and so closed form expressions may be determined for the values $\hat\sigma_u^2(\rho)$ and $\hat\sigma_\nu^2(\rho)$, which minimise (14.8) for given $\rho$. The iterative minimisation may thus be reduced to a scalar problem. A consistent estimator of the covariance matrix of $\hat\theta_{GLS}$ is given by (Fuller, 1984)

$$V_L(\hat\theta_{GLS}) = [\dot A(\hat\theta_{GLS})' V_c^{-1} \dot A(\hat\theta_{GLS})]^{-1}, \tag{14.10}$$

where $\dot A(\theta) = \partial A(\theta) / \partial\theta$.
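The reduction to a scalar problem can be sketched as follows. For simplicity this example takes $V$ to be the identity matrix (an assumption of the sketch, not the GLS choice $V_c$), solves the two-parameter least squares problem in closed form for each trial value of $\rho$, and searches over a grid of $\rho$ values. The grid search, the function name and the noise-free test input are all hypothetical; no positivity constraint is imposed on the variance estimates.

```python
def fit_profile_ols(a_hat, T):
    """Minimise (14.8) with V = I by profiling over rho on a grid."""
    # Lag |t - t'| of each element of vech, in column-by-column order.
    lags = [r - c for c in range(T) for r in range(c, T)]
    k = len(lags)
    best = None
    for step in range(-990, 991):
        rho = step / 1000
        v = [rho ** lag for lag in lags]              # vech of V_T(rho)
        # A(theta) = sigma2_u * 1 + sigma2_nu * v is linear in the variances,
        # so the 2x2 normal equations can be solved in closed form.
        s11, s12, s22 = k, sum(v), sum(x * x for x in v)
        b1, b2 = sum(a_hat), sum(x * y for x, y in zip(v, a_hat))
        det = s11 * s22 - s12 * s12
        if abs(det) < 1e-12:
            continue
        sigma2_u = (s22 * b1 - s12 * b2) / det
        sigma2_nu = (s11 * b2 - s12 * b1) / det
        crit = sum((y - sigma2_u - sigma2_nu * x) ** 2 for x, y in zip(v, a_hat))
        if best is None or crit < best[0]:
            best = (crit, rho, sigma2_u, sigma2_nu)
    crit, rho, sigma2_u, sigma2_nu = best
    return {"rho": rho, "sigma2_u": sigma2_u,
            "sigma2_eps": sigma2_nu * (1 - rho ** 2),   # invert (14.4)
            "criterion": crit}

# Noise-free input built from hypothetical values sigma2_u = 0.15,
# sigma2_nu = 0.07 and rho = 0.4, so the fit should recover them.
T = 5
true_lags = [r - c for c in range(T) for r in range(c, T)]
a_hat = [0.15 + 0.07 * 0.4 ** lag for lag in true_lags]
fit = fit_profile_ols(a_hat, T)
print(fit)
```

With a general weight matrix the same idea applies, except that the inner step becomes a weighted least squares solve and a one-dimensional optimiser could replace the grid.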

An advantage of the GLS approach is that it provides a ready-made goodness-of-fit test as the minimised value of the criterion in (14.8), namely the Wald statistic:

$$X_W^2 = [\hat A - A(\hat\theta_{GLS})]' V_c^{-1} [\hat A - A(\hat\theta_{GLS})]. \tag{14.11}$$

If the model is correct and if the sample is large enough for $V_c$ to be a good approximation to the covariance matrix of $\hat A$, then $X_W^2$ should be distributed approximately as chi-squared with $k - q$ degrees of freedom, where $q = 2$ and 3 for Models A and B respectively.
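The degrees of freedom follow from simple counting. As a small worked example, the sketch below evaluates $k = T(T+1)/2$ and $k - q$ for $T = 5$, the number of BHPS waves analysed in this chapter; the function name is made up for the illustration.

```python
def covariance_structure_df(T, q):
    """Return (k, k - q): distinct covariance elements and test df."""
    k = T * (T + 1) // 2
    return k, k - q

for name, q in [("Model A", 2), ("Model B", 3)]:
    k, df = covariance_structure_df(T=5, q=q)
    print(name, "k =", k, "df =", df)
# Model A k = 15 df = 13
# Model B k = 15 df = 12
```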

One potential problem with the GLS estimator is that the covariance matrix estimator may be unstable if it is based on a relatively small number of degrees of freedom. This may lead to departures from the null distribution of $X_W^2$ assumed above. In this case, it may be preferable to consider alternative choices of $V$. One approach is to let $V$ be an estimator of the covariance matrix of $\hat A$ based upon the (false) assumption that observations are independent and identically distributed. Thus, if we write

$$\hat A = \sum_s a_i, \tag{14.12}$$

where $a_i = \mathrm{vech}(z_i)$ denotes the $k \times 1$ vector of distinct elements of $z_i$, then we may set $V$ equal to

$$V_{iid} = n \sum_s (a_i - \bar a)(a_i - \bar a)' / (n - 1), \tag{14.13}$$

where $\bar a = \sum_s a_i / n$ and $n$ denotes the sample size. Although $V_{iid}$ may be more stable than a variance estimator which allows for the complex design, this choice of $V$ is still correlated with $\hat A$ and, as discussed by Altonji and Segal (1996), may lead to serious bias in the estimation of $\theta$. To avoid this problem, an even simpler approach is to set $V$ equal to the identity matrix, when the estimator of $\theta$ obtained by minimising (14.8) may be viewed as an ordinary least squares (OLS) estimator. In both the cases when $V = V_{iid}$ and when $V$ is the identity matrix, the resulting estimator $\hat\theta$ will still be consistent for $\theta$ but the Wald statistic $X_W^2$ will no longer follow a chi-squared distribution if the model is true. The large-sample distribution will instead be a mixture of chi-squared distributions and this may be approximated by a chi-squared distribution using one- or two-moment Rao–Scott approximations (SHS, Ch. 4). It is also no longer appropriate to use expression (14.10) to obtain standard errors for the elements of $\hat\theta$. Instead, as noted in SHS (Ch. 3), a consistent estimator of the covariance matrix of $\hat\theta$ is

$$V(\hat\theta; V = V_0) = [\dot A(\hat\theta)' V_0^{-1} \dot A(\hat\theta)]^{-1} [\dot A(\hat\theta)' V_0^{-1} V_c V_0^{-1} \dot A(\hat\theta)] [\dot A(\hat\theta)' V_0^{-1} \dot A(\hat\theta)]^{-1},$$

where $V_0$ is the specified choice of $V$ ($V_{iid}$ or the identity matrix) used to determine $\hat\theta$ and $V_c$ is a consistent estimator of the covariance matrix of $\hat A$ under the complex design. Note that this expression reduces to (14.10) when $V_0 = V_c$.
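The 'sandwich' form of this covariance estimator, and its reduction to (14.10), can be checked numerically. The sketch below assumes NumPy is available and uses randomly generated stand-ins for the Jacobian $\dot A(\hat\theta)$ and for $V_c$; these inputs are made up purely to exercise the algebra and carry no substantive meaning.

```python
import numpy as np

def sandwich_cov(A_dot, V0, Vc):
    """Covariance of theta-hat when (14.8) is minimised with V = V0."""
    V0_inv = np.linalg.inv(V0)
    bread = np.linalg.inv(A_dot.T @ V0_inv @ A_dot)
    meat = A_dot.T @ V0_inv @ Vc @ V0_inv @ A_dot
    return bread @ meat @ bread

rng = np.random.default_rng(0)
k, q = 15, 3
A_dot = rng.normal(size=(k, q))          # made-up k x q Jacobian
B = rng.normal(size=(k, k))
Vc = B @ B.T + k * np.eye(k)             # made-up positive definite matrix

V_ols = sandwich_cov(A_dot, np.eye(k), Vc)   # V0 = identity (OLS choice)
V_gls = sandwich_cov(A_dot, Vc, Vc)          # V0 = Vc (GLS choice)
V_1410 = np.linalg.inv(A_dot.T @ np.linalg.inv(Vc) @ A_dot)
print(np.allclose(V_gls, V_1410))  # True
```

When $V_c$ is taken as the true covariance matrix of $\hat A$, the diagonal of `V_ols` is never smaller than that of `V_gls`, which is the usual efficiency argument for GLS; the instability and bias issues discussed above concern estimated weight matrices in finite samples.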

The approach considered so far in this section is based on the estimated covariance matrix $\hat\Sigma$ in (14.7) and assumes no nonresponse. This is an unrealistic assumption. The simplest way of handling nonresponse is to consider only those individuals who respond on all $T$ waves, the so-called 'attrition sample', $s_T$, at wave $T$. For longitudinal surveys, designed for multipurpose longitudinal analyses, it is common to construct longitudinal weights $w_{it}$ at each wave $t$, which are appropriate for longitudinal analysis based upon data for the attrition sample $s_t$ of individuals who respond up to wave $t$ (Lepkowski, 1989). Thus, the simplest approach is to use only data from attrition sample $s_T$ and to replace the weights $w_i$, e.g. in (14.7), by the weights $w_{iT}$.

A more sophisticated approach, aimed at producing more efficient estimates, uses data from all attrition samples $s_1, \ldots, s_T$. A recursive approach to the estimation of the covariance matrix of $y_i$ may then be developed. Let $y_i^{(t)} = (y_{i1}, \ldots, y_{it})'$ and let $\hat\Sigma^{(t)}$ denote the estimated $t \times t$ covariance matrix of $y_i^{(t)}$. Begin the recursion by setting

$$\hat\Sigma^{(1)} = \sum_{s_1} w_{i1} (y_{i1} - \bar y_1)^2 \Big/ \sum_{s_1} w_{i1},$$

where

$$\bar y_1 = \sum_{s_1} w_{i1} y_{i1} \Big/ \sum_{s_1} w_{i1}$$

as in (14.7). At the $t$th step of the recursion ($t = 2, \ldots, T$) set the $(t-1) \times (t-1)$ submatrix of $\hat\Sigma^{(t)}$ corresponding to $y_i^{(t-1)}$ equal to $\hat\Sigma^{(t-1)}$. Let $b^{(t)}$ be the vector of weighted regression coefficients of $y_{it}$ on $y_i^{(t-1)}$ given by

$$b^{(t)} = \Big[ \sum_{s_t} w_{it} (y_i^{(t-1)} - \bar y^{(t-1)})(y_i^{(t-1)} - \bar y^{(t-1)})' \Big]^{-1} \sum_{s_t} w_{it} (y_i^{(t-1)} - \bar y^{(t-1)}) y_{it},$$

where

$$\bar y^{(t-1)} = \sum_{s_t} w_{it} y_i^{(t-1)} \Big/ \sum_{s_t} w_{it}.$$

Then set the $(t,t)$th element of $\hat\Sigma^{(t)}$, corresponding to the variance of $y_{it}$, equal to

$$\hat\sigma_{et}^2 + b^{(t)\prime} \hat\Sigma^{(t-1)} b^{(t)},$$

where

$$\hat\sigma_{et}^2 = \sum_{s_t} w_{it} (e_{it} - \bar e_t)^2 \Big/ \sum_{s_t} w_{it}, \qquad e_{it} = y_{it} - y_i^{(t-1)\prime} b^{(t)}, \qquad \bar e_t = \sum_{s_t} w_{it} e_{it} \Big/ \sum_{s_t} w_{it}.$$

Finally, let $\hat\Sigma^{(t)}_{t,t-1}$ denote the $1 \times (t-1)$ vector of remaining elements of $\hat\Sigma^{(t)}$ corresponding to the covariances between $y_{it}$ and $y_i^{(t-1)}$ and let

$$\hat\Sigma^{(t)}_{t,t-1} = b^{(t)\prime} \hat\Sigma^{(t-1)}.$$

The recursive process is repeated for $t = 2, \ldots, T$. If $y_i$ is multivariate normal and there are no weights the resulting $\hat\Sigma^{(t)}$ is a maximum likelihood estimator (Holt, Smith and Winter, 1980) for data from the set of attrition samples. In general, the estimator may be viewed as a form of pseudo-likelihood estimator (see Chapter 2). If the weights do not vary greatly, if $y_i$ is approximately multivariate normal and the observations for most individuals fall into one of the attrition samples, the estimator $\hat\Sigma^{(t)}$ may be expected to be fairly efficient.
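The recursion can be sketched in a few lines, assuming NumPy is available. The data layout is an assumption of this example: responses are held in an $n \times T$ array with `np.nan` from the wave after an individual's last response onwards (monotone attrition), and the longitudinal weights $w_{it}$ in an array of the same shape. The function names and the toy data are hypothetical.

```python
import numpy as np

def weighted_mean(X, w):
    """Weighted column means of X with weights w."""
    return (w[:, None] * X).sum(axis=0) / w.sum()

def recursive_cov(Y, W):
    """Recursive estimate of the T x T covariance matrix from attrition samples.

    Y: (n, T) responses, np.nan after an individual's last responding wave.
    W: (n, T) longitudinal weights w_it (entries where Y is nan are ignored).
    """
    n, T = Y.shape
    in_s1 = ~np.isnan(Y[:, 0])
    y1, w1 = Y[in_s1, 0], W[in_s1, 0]
    m1 = np.sum(w1 * y1) / np.sum(w1)
    Sigma = np.array([[np.sum(w1 * (y1 - m1) ** 2) / np.sum(w1)]])
    for t in range(1, T):                       # 0-based index of the new wave
        in_st = ~np.isnan(Y[:, :t + 1]).any(axis=1)   # attrition sample
        X, y, w = Y[in_st, :t], Y[in_st, t], W[in_st, t]
        Xc = X - weighted_mean(X, w)
        # Weighted regression coefficients of the new wave on earlier waves.
        b = np.linalg.solve((w[:, None] * Xc).T @ Xc, (w[:, None] * Xc).T @ y)
        e = y - X @ b
        e_bar = np.sum(w * e) / np.sum(w)
        s2_e = np.sum(w * (e - e_bar) ** 2) / np.sum(w)
        cov_row = b @ Sigma                     # covariances with earlier waves
        var_t = s2_e + b @ Sigma @ b            # variance of the new wave
        Sigma = np.block([[Sigma, cov_row[:, None]],
                          [cov_row[None, :], np.array([[var_t]])]])
    return Sigma

rng = np.random.default_rng(1)
n, T = 500, 4
Y = rng.normal(size=(n, 1)) + rng.normal(size=(n, T))   # toy panel
W = np.ones((n, T))
Sigma_full = recursive_cov(Y, W)

Y_attr = Y.copy()
Y_attr[400:, 2:] = np.nan        # last 100 individuals drop out after wave 2
Sigma_attr = recursive_cov(Y_attr, W)
print(np.round(Sigma_attr, 3))
```

With complete response and equal weights the recursion reproduces the ordinary covariance matrix with divisor $n$, which gives a convenient check on an implementation.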

Weighting can become unwieldy if it is attempted to adjust for all possible wave nonresponse patterns in addition to the attrition samples. See, for example, Lepkowski (1989) for further discussion. For a more general discussion of inference in the presence of nonresponse see Chapter 18. We return in Section 14.4 to the application of the methods discussed in this section.

14.3. A MULTILEVEL MODELLING APPROACH

A second approach to handling complex survey designs in the fitting of the models defined in Section 14.1 is by adapting standard approaches, such as iterative generalised least squares (IGLS), used for fitting random effects models (Goldstein, 1995). Pfeffermann et al. (1998) have considered modifying IGLS estimation using an approach analogous to the pseudo-likelihood method (see Chapter 2) for a model of the form (14.2), where the $\nu_{it}$ are not serially correlated. Here we consider the extension of their approach to a longitudinal context, allowing for serial correlation. A potential advantage of this approach is that covariates may be handled more directly in the model. A potential disadvantage is that goodness-of-fit tests are not generated so directly.

In multilevel modelling terminology (Goldstein, 1995), the individuals are the level 2 units and the repeated measurements at the different waves represent level 1 units. Pfeffermann et al. (1998) allow for a two-stage sampling scheme, whereby the level 2 units $i$ are selected with inclusion probabilities $\pi_i$ and the level 1 units $t$ with inclusion probabilities $\pi_{t|i}$ conditional on level 2 unit $i$ being selected. Weights $w_i$ and $w_{t|i}$ are then constructed equal to the reciprocals of these respective probabilities, which are assumed known. To adapt this approach to our context of longitudinal surveys subject to wave nonresponse, it seems natural to let $\pi_i$ denote the probability that individual $i$ is sampled and $\pi_{t|i}$ the probability that this individual responds at wave $t$. While we may reasonably suppose that the $\pi_i$ are known, it is not straightforward to estimate the $\pi_{t|i}$ for general patterns of wave nonresponse (as noted in the covariance structure approach of Section 14.2). We therefore restrict attention to estimation using only the data derived from the attrition samples $s_t$. As noted in Section 14.2, it is common for longitudinal weights $w_{it}$ to be available for use with these attrition samples and we shall suppose here that these approximate $(\pi_i \pi_{t|i})^{-1}$. We may then set $w_i$ equal to the design weight $\pi_i^{-1}$ and $w_{t|i}$ equal to $w_{it}/w_i$. Alternatively, given $w_{i1}, \ldots, w_{iT}$, we may set $w_i = w_{i1}$ and $w_{t|i} = w_{it}/w_{i1}$ ($t = 1, \ldots, T$). Note, in particular, that in this case $w_{1|i} = 1$ for all $i$. This approach treats the sample selection and the response process at the first wave as a common selection process. In the approach of Pfeffermann et al. (1998), correction for bias by weighting tends to be more difficult at level 1 than at level 2, because there tends to be more non-linearity in the IGLS estimator as a function of level 1 sums than of level 2 sums. Hence setting $w_i = w_{i1}$ may be preferable to setting $w_i = \pi_i^{-1}$ because the resulting $w_{t|i}$ may be less variable and closer to one.

Having then constructed the weights $w_i$ and $w_{t|i}$, the approach of Pfeffermann et al. (1998) may be applied to fit a model of form (14.2) where the $\nu_{it}$ are not serially correlated. This is Model A. The basic approach is to modify the IGLS estimation procedure by weighting all sums over $i$ by the weights $w_i$ and weighting all sums over $t$ by the weights $w_{t|i}$.

Often survey weights are only available in a scaled form; for example, so that they sum to the sample size. For inference about many regression-type models, as in Parts B and C of this book, estimation procedures for the model parameters are invariant to such scaling. Although this is also true for multilevel modelling if the $w_i$ are scaled, it is not true if the weights $w_{t|i}$ are scaled. Pfeffermann et al. (1998) took advantage of this fact to choose a scaling to minimise small-sample estimation bias. In our context we consider scaling the weights $w_{t|i}$ to construct the scaled weights $w^*_{t|i}$ as

$$w^*_{t|i} = t(i)\, w_{t|i} \Big/ \Big[ \sum_{t=1}^{t(i)} w_{t|i} \Big],$$

where $t(i)$ is the last wave at which individual $i$ responds ($1 \le t(i) \le T$). Hence the average weight $w^*_{t|i}$ for individual $i$ across waves $1, \ldots, t(i)$ is equal to one.
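The weight construction and scaling described above amount to a few lines of arithmetic per individual. The sketch below takes the second option in the text, setting the level 2 weight to the wave 1 longitudinal weight, and then rescales the level 1 weights to average one. The function names and the example weights are hypothetical.

```python
def split_longitudinal_weights(w_long):
    """Split w_i1, ..., w_i,t(i) into w_i = w_i1 and w_t|i = w_it / w_i1."""
    w_i = w_long[0]
    return w_i, [w_it / w_i for w_it in w_long]

def scale_level1_weights(w_cond):
    """Rescale the level 1 weights of one individual to average one."""
    t_i = len(w_cond)              # last wave at which the individual responds
    total = sum(w_cond)
    return [t_i * w / total for w in w_cond]

# An individual responding at three waves, with made-up longitudinal weights.
w_i, w_cond = split_longitudinal_weights([2.0, 2.2, 2.5])
w_scaled = scale_level1_weights(w_cond)
print(w_i, w_cond)
print(w_scaled, sum(w_scaled) / len(w_scaled))
```

The scaling changes only the overall level of an individual's level 1 weights; their ratios across waves are preserved.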

We now consider the question of how to adapt the approach of Pfeffermann et al. (1998) to allow for possible serial correlation of the $\nu_{it}$ in Model B. We follow an approach similar to that in Hsiao (1986, section 3.7), which is based on observing that if we know $\rho$ then Model B may be transformed to the form of Model A by

$$y_{it} - \rho y_{i,t-1} = (\beta_t - \rho\beta_{t-1}) + (1 - \rho) u_i + \varepsilon_{it}. \tag{14.14}$$

The estimation procedure involves two steps:

Step 1. Eliminate the random effect $u_i$ by differencing the responses $y_{it}$,

$$\Delta_{it} = y_{it} - y_{i,t-1}, \qquad i \in s_t, \quad t = 2, \ldots, T,$$

and estimate the linear regression model

$$\Delta_{it} = \delta_t + \gamma \Delta_{i,t-1} + \eta_{it}$$

by OLS weighted by the weights $w_{it}$ for observations $i$ in the attrition samples $s_t$ ($t = 2, \ldots, T$), where the parameters $\delta_t$ are unconstrained. Under Model B, the least squares estimator $\hat\gamma$ of $\gamma$ is consistent for

$$\gamma = \mathrm{cov}(\Delta_{it}, \Delta_{i,t-1}) / \mathrm{var}(\Delta_{i,t-1}) = [-(1 - \rho)\sigma_\varepsilon^2 / (1 + \rho)] / [2\sigma_\varepsilon^2 / (1 + \rho)] = -(1 - \rho)/2, \qquad t = 2, \ldots, T,$$

since $\mathrm{var}(\Delta_{i,t-1}) = 2\sigma_\nu^2 (1 - \rho) = 2\sigma_\varepsilon^2 / (1 + \rho)$ by (14.4). Set $\hat\rho = 1 + 2\hat\gamma$.

Step 2. Let $\tilde y_{it} = y_{it} - \hat\rho y_{i,t-1}$ and fit the model obtained from (14.14) for the transformed data:

$$\tilde y_{it} = \tilde\beta_t + \tilde u_i + \tilde\varepsilon_{it} \tag{14.15}$$

using the approach of Pfeffermann et al. (1998) with the assumptions of Model A applying to the model in (14.15). The estimated variance of $\tilde u_i$ is then divided by $(1 - \hat\rho)^2$ to obtain the estimate $\hat\sigma_u^2$.

This two-step approach produces consistent estimators of the parameters of Model B but the resulting standard errors of $\hat\sigma_u^2$ and $\hat\sigma_\varepsilon^2$ will not allow for uncertainty in the estimation of $\rho$.
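Step 1 can be illustrated on simulated data. The sketch below is a simplification under stated assumptions: it uses a complete panel (no attrition), a single weight per individual rather than wave-specific weights $w_{it}$, and removes the unconstrained intercepts $\delta_t$ by centring within each wave before pooling. The simulation helper, the function names and the parameter values are hypothetical; Step 2 (the weighted IGLS fit) is not shown.

```python
import random

def simulate_model_b(n, T, sigma2_u, sigma2_eps, rho, seed=0):
    """Simulate Model B with beta_t = 0 and a stationary AR(1) component."""
    rng = random.Random(seed)
    sd_nu = (sigma2_eps / (1 - rho ** 2)) ** 0.5
    panel = []
    for _ in range(n):
        u = rng.gauss(0.0, sigma2_u ** 0.5)
        nu = rng.gauss(0.0, sd_nu)
        row = [u + nu]
        for _ in range(T - 1):
            nu = rho * nu + rng.gauss(0.0, sigma2_eps ** 0.5)
            row.append(u + nu)
        panel.append(row)
    return panel

def estimate_rho(panel, weights):
    """Step 1: weighted regression of differences on lagged differences."""
    T = len(panel[0])
    total_w = sum(weights)
    num = den = 0.0
    for t in range(2, T):                          # 0-based wave index
        x = [row[t - 1] - row[t - 2] for row in panel]   # lagged difference
        y = [row[t] - row[t - 1] for row in panel]       # current difference
        x_bar = sum(w * a for w, a in zip(weights, x)) / total_w
        y_bar = sum(w * b for w, b in zip(weights, y)) / total_w
        # Centring within wave absorbs the wave-specific intercept delta_t.
        num += sum(w * (a - x_bar) * (b - y_bar)
                   for w, a, b in zip(weights, x, y))
        den += sum(w * (a - x_bar) ** 2 for w, a in zip(weights, x))
    gamma_hat = num / den
    return 1 + 2 * gamma_hat                       # rho-hat = 1 + 2 * gamma-hat

panel_b = simulate_model_b(n=10000, T=5, sigma2_u=0.15, sigma2_eps=0.06, rho=0.4)
rho_hat = estimate_rho(panel_b, [1.0] * len(panel_b))
print(round(rho_hat, 3))
```

Because differencing removes $u_i$, the estimate does not depend on the permanent variance; the transformed responses $y_{it} - \hat\rho\, y_{i,t-1}$ would then be passed to the Model A fitting procedure in Step 2.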

Finally, we note that Pfeffermann et al. (1998) only allowed for the sample to be clustered into level 2 units. In the application in Section 14.4 the sampling design will also lead to geographical clustering of the sample individuals into primary sampling units. The procedure for standard error estimation proposed by Pfeffermann et al. (1998) therefore needs to be extended to handle this case. We shall not, however, consider this extension here, presenting only point estimates for the multilevel modelling approach in the next section.

14.4. AN APPLICATION: EARNINGS OF MALE EMPLOYEES IN GREAT BRITAIN

In this section we apply the approaches set out earlier to fit random effects models to longitudinal data on the monthly earnings of male full-time employees in Great Britain for the period 1991–5, using data from the British Household Panel Survey (BHPS). The BHPS is a household panel survey, based on a sample of around 10 000 individuals. Data were first collected in 1991 and successive waves have taken place annually (Berthoud and Gershuny, 2000).

We base our analysis on the work of Ramos (1999). Like him, we consider only men over the first five waves of the BHPS and divide the men into four age cohorts in order to control for life cycle effects. These cohorts consist of men (i) born before 1941, (ii) born between 1941 and 1950, (iii) born between 1951 and 1960 and (iv) born after 1960.

The variable y is taken as the logarithm of earnings, with earnings being defined as the usual monthly earnings or salary payment before tax, for a reference period determined in the survey. We avoid the problem of zero earnings by defining the target population at wave t to consist of those men in the age cohorts who have positive earnings. It is thus possible for individuals to move in and out of the target population between waves. It is clearly plausible that the earnings behaviour of those moving in and out of the target population will differ systematically from those remaining in the target population. For simplicity, we shall, however, assume that the models defined in Section 14.1 apply to all individuals when they have positive earnings.

The panel sample was selected by stratified multistage sampling, with postal sectors as primary sampling units (PSUs). We use the standard linearisation approach to variance estimation for stratified multistage samples (e.g. SHS, p. 50). The BHPS involves a stratified sample of 250 PSUs. For the purpose of variance estimation, we approximate this stratified design as being defined by 75 strata, obtained by first breaking down each of 18 regional strata into 2 or 3 'major strata', defined according to proportion of 'head of households' in professional/managerial positions, and then by breaking down each of these major strata into 2 'minor strata', defined according to the proportion of the population of pensionable age.

We first assess the fit of Models A and B (defined in Section 14.1) for each of the four cohorts. The results are presented in Table 14.1. We use goodness-of-fit tests based on the covariance structure approach of Section 14.2, with three choices of the matrix $V$ in (14.8):

OLS: $V = I$, the identity matrix;

GLS (iid): $V = V_{iid}$, as defined in (14.13);

GLS (complex): $V = V_c$, the linearisation estimator of the covariance matrix of $\hat A$, based upon (14.9), allowing for the complex design.

Table 14.1 Goodness-of-fit test statistics for Models A and B for four cohorts and three estimation methods.

| Cohort (when born) | Model A: OLS | Model A: GLS (iid) | Model A: GLS (complex) | Model B: OLS | Model B: GLS (iid) | Model B: GLS (complex) |
|---|---|---|---|---|---|---|
| Before 1941 | 11.3 | 13.0 | 15.1 | 9.2 | 8.7 | 10.0 |
| 1941–50 | 41.2^b | 39.0^b | 39.9^b | 28.4^b | 27.0^b | 29.5^b |
| 1951–60 | 17.2 | 39.0^b | 43.3^b | 6.5 | 15.5 | 16.5 |
| After 1960 | 29.1^b | 37.4^b | 35.5^b | 15.8 | 16.7 | 17.7 |

Notes: 1. Test statistics are weighted and are referred to the chi-squared distribution with 13 df for Model A and 12 df for Model B.

2. ^a significant at 5 % level; ^b significant at 1 % level.

3. OLS and GLS (iid) test statistics involve Rao–Scott first-order correction.

For $V = V_c$, the test statistic is given by $X_W^2$ in (14.11) with the null distribution indicated in Section 14.2. For $V = I$ or $V_{iid}$, the values of $\hat\theta_{GLS}$ and $V_c$ in (14.11) are replaced by the corresponding values of $\hat\theta$ and $V$ and a first-order Rao–Scott adjustment is applied to the test statistic (SHS, Ch. 4). The same null distributions as for $V_c$ are used. Test statistics based upon second-order Rao–Scott approximations were also calculated and led to similar results. All of the test statistics are based on data from the attrition sample $s_5$ at wave 5, for individuals who gave full interviews at each of the five waves. Longitudinal weights $w_{i5}$ were used, which allow both for unequal sampling probabilities and for differential attrition from nonresponse over the five waves. To allow for the changing population, the expression for the estimated covariance matrix in (14.7) was modified by including only those who reported positive earnings at each wave in the estimation of the covariance between the log earnings at two waves.

The values of the test statistics in Table 14.1 are referred to a chi-squared null distribution with 13 degrees of freedom in the case of Model A and with 12 degrees of freedom in the case of Model B. The results suggest that Model A provides an adequate fit for the cohort born before 1941 but not for the other cohorts and that Model B provides an adequate fit for all cohorts, except the one consisting of those born between 1941 and 1950.

The values of the test statistics vary according to the three choices of $V$. The differences between the values of the test statistics for the GLS (iid) and GLS (complex) choices of $V$ are not large, reflecting the fact that there is a large number of degrees of freedom for estimating the covariance matrix of $\hat A$ (relative to the dimension of the matrix) and that the pairs of $V$ matrices tend not to be dramatically disproportionate. The value of the test statistic with $V$ as the identity matrix suggests a much better fit of both Models A and B for the 1951–60 cohort and a somewhat better fit for the cohort born after 1960. This may be because this test statistic tends to be sensitive to different deviations from the null hypothesis than the GLS test statistics. The 1951–60 cohort is distinctive in having less variation among the estimated variances of log earnings over the five waves and, more generally, displays the least evidence of non-stationarity. Because of the high positive correlation between the elements of $\hat A$, the test statistic with $V$ as the identity matrix may be expected to attach greater 'weight' to such departures from Model A than the GLS test statistics and this may lead to the noticeable difference in values for the 1951–60 cohort. Strong graphical evidence against Model A for this cohort is provided by Figure 14.1. This figure plots the elements $\hat\Sigma_{tt'}$ of $\hat\Sigma$ in (14.7) against $|t - t'|$ and there is a clear tendency for the covariances to decline as the number of years between waves increases. This suggests that the insignificant value of the test statistic for Model A, with $V$ as the identity matrix, reflects lack of power.

Estimates of the parameters in Model B are presented in Table 14.2 for the three cohorts for which Model B shows no significant lack of fit in Table 14.1. Estimates are presented for the same three choices of $V$ matrix as in Table 14.1. While the estimates based on the two GLS choices of $V$ are fairly similar, the OLS estimates, with $V$ as the identity matrix, can be noticeably different, especially for the 1951–60 cohort. The effect of the differences for the cohort born after 1960 is illustrated in Figure 14.2, in which the estimated variances and covariances from (14.7) are presented together with fitted lines, joining the variances and covariances under Model B, implied by the parameter estimates in Table 14.2. The lines for the GLS choices of $V$ are surprisingly low, unlike the OLS line, which passes through the middle of the points. Similar underfitting of the variances and covariances occurs for the other cohorts and this finding may reflect downward bias in such estimates employing sample-based $V$ matrices, as discussed, for example, by Altonji and Segal (1996) and Browne (1984). The inversion of $V$ implies that the lowest variances tend to receive most 'weight', leading to the fitted line following more the lower envelope of the points than the centre of them. The potential presence of non-negligible bias suggests that choosing $V$ as the identity matrix may be preferable here for the purpose of parameter estimation, as concluded by Altonji and Segal (1996).

[Figure 14.1 Estimated variances and covariances for cohort born 1951–60. Vertical axis: variances and covariances; horizontal axis: years apart.]

Table 14.2 Parameter estimates for Model B for three cohorts using covariance structure approach.

| Cohort (when born) | Estimator | $\rho$ | $\sigma_u^2$ | $\sigma_\varepsilon^2$ |
|---|---|---|---|---|
| Before 1941 | OLS | 0.37 (0.16) | 0.165 (0.028) | 0.049 (0.018) |
| | GLS (iid) | 0.35 (0.16) | 0.150 (0.024) | 0.034 (0.011) |
| | GLS (complex) | 0.32 (0.13) | 0.143 (0.022) | 0.034 (0.009) |
| 1951–60 | OLS | 0.56 (0.11) | 0.146 (0.021) | 0.048 (0.015) |
| | GLS (iid) | 0.85 (0.09) | 0.109 (0.047) | 0.026 (0.047) |
| | GLS (complex) | 0.85 (0.09) | 0.106 (0.044) | 0.026 (0.045) |
| After 1960 | OLS | 0.49 (0.08) | 0.155 (0.018) | 0.071 (0.014) |
| | GLS (iid) | 0.41 (0.07) | 0.154 (0.016) | 0.063 (0.010) |
| | GLS (complex) | 0.40 (0.07) | 0.150 (0.016) | 0.061 (0.009) |

Notes: 1. Standard errors in parentheses.

2. Estimates are weighted and based only on data for attrition sample at wave 5.

3. 1941–50 cohort is excluded because of lack of fit of Model B in Table 14.1.

[Figure 14.2 Estimated variances and covariances for cohort born after 1960 with values fitted under Model B. Vertical axis: variances and covariances; horizontal axis: years apart; fitted lines shown for OLS, GLS (iid) and GLS (complex).]

Table 14.3 shows for one cohort the effects of weighting, of the use of data from all attrition samples and of the use of the multilevel modelling approach of Section 14.3.

For the covariance structure approach, the impact of weighting is similar for all three choices of the matrix V. The fairly modest impact of weighting is expected here, since the BHPS weights do not vary greatly and are not strongly related to earnings.

The impact of using data from all attrition samples $s_1, \ldots, s_5$, not just from $s_5$, appears to be a little more marked than the impact of weighting. This may reflect the fact that the earnings behaviour of those men who leave the sample before 1995 may be different from those who remain in the sample for all five waves. In particular, this behaviour may be less stable, leading to a reduction in the estimated correlation $\rho$. Control for possible informative attrition might be attempted by including covariates in the model.

Table 14.3 Parameter estimates for Model B for cohort born after 1960.

| Estimator | $\rho$ | $\sigma_u^2$ | $\sigma_\varepsilon^2$ |
|---|---|---|---|
| Covariance structure approach | | | |
| Using attrition sample at wave 5 only, weighted | | | |
| OLS | 0.49 | 0.155 | 0.071 |
| GLS (iid) | 0.41 | 0.154 | 0.063 |
| GLS (complex) | 0.40 | 0.150 | 0.061 |
| Using attrition sample at wave 5 only, unweighted | | | |
| OLS | 0.45 | 0.166 | 0.078 |
| GLS (iid) | 0.37 | 0.161 | 0.068 |
| GLS (complex) | 0.35 | 0.156 | 0.066 |
| Using all five attrition samples (weighted) | | | |
| OLS | 0.36 | 0.169 | 0.052 |
| GLS (iid) | 0.38 | 0.158 | 0.048 |
| GLS (complex) | 0.30 | 0.155 | 0.047 |
| Multilevel modelling approach | | | |
| Using attrition sample at wave 5 only | | | |
| Weighted unscaled | 0.41 | 0.169 | 0.041 |
| Weighted scaled | 0.41 | 0.167 | 0.042 |
| Unweighted | 0.41 | 0.170 | 0.045 |
| Using all five attrition samples | | | |
| Weighted unscaled | 0.43 | 0.167 | 0.043 |
| Weighted scaled | 0.43 | 0.163 | 0.045 |
| Unweighted | 0.43 | 0.165 | 0.047 |

The results for the multilevel modelling approach in Table 14.3 are based upon the two-step method described in Section 14.3. The estimated value of $\rho$ is first determined and then the estimates $\hat\sigma_u^2$ and $\hat\sigma_\varepsilon^2$ are obtained by the method of Pfeffermann et al. (1998) either with or without weights and, in the former case, the weights may be scaled or not.

The impact of weighting on the multilevel approach is again modest, indeed somewhat more modest than for the covariance structure approach. This may be because a common estimate of $\rho$ is used here. Scaling the weights also has little effect. This may be because all the weights $w_{t|i}$ are fairly close to one in this application and thus scaling has less of an impact than in the two-stage sampling application in Pfeffermann et al. (1998).

The differences between the estimates from the covariance structure approach and the corresponding multilevel modelling approaches are not especially large in Table 14.3 relative to the standard errors in Table 14.2.

Nevertheless, across all four cohorts and both models, the main differences in the estimates between methods were between the three choices of V matrix for the covariance structure approach and between the covariance structure and the multilevel approaches. The impact of weighting and the scaling of weights tended to be less important.

14.5. CONCLUDING REMARKS

It is often useful to include random effects in the specification of models for longitudinal survey data. In this chapter we have considered two approaches to allowing for complex survey designs and sample attrition when fitting such models. The covariance structure approach is particularly natural with survey data. The complex survey design and attrition are allowed for when making inference about the covariance matrix of the longitudinal responses. Modelling of the structure of this matrix may then proceed in a standard way. The second approach is to adapt standard multilevel modelling procedures, extending the approach of Pfeffermann et al. (1998).

The two approaches may be compared in a number of ways:

• The multilevel approach incorporates the different attrition samples more directly, although the possible creation of bias with unequal $w_{t|i}$ (for given i) with small numbers of level 1 units (i.e. small T), as discussed by Pfeffermann et al. (1998), may be a problem.

• The multilevel approach incorporates covariates more naturally, although the extension of the covariance structure approach to include covariates using LISREL models is well established.

• The covariance structure approach handles serial correlation more easily.

• The covariance structure approach generates goodness-of-fit tests and residuals at the level of variances and covariances. The multilevel approach generates unit level residuals.


Finally, our application of the covariance structure approach to the BHPS data showed evidence of bias in the estimation of the variance components when using GLS with a covariance matrix V estimated from the data. This accords with the findings of Altonji and Segal (1996). This evidence suggests that it is safer to specify V as the identity matrix and use Rao-Scott adjustments for testing.
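Taking V as the identity matrix amounts to an unweighted least-squares fit of the model covariance matrix to the estimated covariance matrix of the responses. The following is a minimal sketch of such a fit, not the chapter's implementation. It assumes a covariance structure of the form Sigma[t][s] = var_u + var_v * rho**|t - s| (a permanent component plus a stationary first-order autoregressive transitory component), which may be parameterized differently from Model B; the names `fit_ols`, `var_u` and `var_v` are hypothetical, and the search over rho uses a simple grid.

```python
def fit_ols(S, rho_grid=None):
    """Unweighted least-squares (V = identity) fit of the assumed structure
    Sigma[t][s] = var_u + var_v * rho**|t - s|
    to a T x T covariance matrix S. Returns (rho, var_u, var_v)."""
    T = len(S)
    if rho_grid is None:
        rho_grid = [k / 100 for k in range(100)]  # 0.00, 0.01, ..., 0.99
    # Distinct elements of S (upper triangle with diagonal) and their lags.
    cells = [(abs(t - s), S[t][s]) for t in range(T) for s in range(t, T)]
    y = [value for _, value in cells]
    n = len(cells)
    best = None
    for rho in rho_grid:
        # For fixed rho the structure is linear in (var_u, var_v), so the
        # least-squares solution follows from the 2 x 2 normal equations.
        x = [rho ** lag for lag, _ in cells]
        sx, sy = sum(x), sum(y)
        sxx = sum(a * a for a in x)
        sxy = sum(a * b for a, b in zip(x, y))
        det = n * sxx - sx * sx
        if abs(det) < 1e-12:
            continue  # var_u and var_v not separately identified
        var_v = (n * sxy - sx * sy) / det
        var_u = (sy - var_v * sx) / n
        rss = sum((b - var_u - var_v * a) ** 2 for a, b in zip(x, y))
        if best is None or rss < best[0]:
            best = (rss, rho, var_u, var_v)
    return best[1:]


# Check on an exact covariance matrix built from known (hypothetical) values.
true_rho, true_var_u, true_var_v = 0.4, 0.15, 0.06
S = [[true_var_u + true_var_v * true_rho ** abs(t - s) for s in range(5)]
     for t in range(5)]
rho_hat, var_u_hat, var_v_hat = fit_ols(S)
print(rho_hat, var_u_hat, var_v_hat)
```

A GLS fit would instead weight the discrepancies between S and Sigma by the inverse of an estimated V; the sketch leaves every discrepancy equally weighted, so the estimates do not depend on an estimated V.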
