4.6 STATISTICAL MODELING AND ESTIMATION

4.6.1 Model Building

In general, multi-level models are composed of two parts: a “structural” part and a “stochastic” part (Singer and Willett, 2003). The structural part consists of fixed effects that do not vary across sampling units. Fixed effects are, in essence, population-level estimates: they define the population means and can therefore be regarded as “pooled” or mean effects, which is why the structural part of a multi-level model can be construed as the mean model. The stochastic part contains random effects that vary across sampling units and thus yields subject-specific estimates. Specifically, it estimates two types of random terms: between-subject random effects and within-subject random errors (Kwok et al., 2008). Between-subject random effects account for variance heterogeneity between responses from distinct sampling units, and thus for varying intercepts and slopes; within-subject random errors account for covariance patterns among responses from the same sampling unit, and thus for time-related dependence of observations (Cheng et al., 2010). Hence, for each model I select the covariance structure that provides the best fit to the data in order to obtain valid inferences for the fixed effects and the subject-specific estimates (Singer and Willett, 2003). I do not build the fixed part of the models empirically because this part is guided by the theory that I develop.

To build the stochastic component of each of the three multi-level models and select the one that provides the best “fit” to the data at hand, I follow the top-down method proposed by Bliese and Ployhart (2002) and Hox (2010). I first compare alternative covariance structures for between-subject random effects to select the “best” one. I start with the simplest covariance structure and add one random effect at a time, cumulatively including the theoretically grounded variables as random effects. For example, to investigate Hypotheses 1a and 1b, I examine the goodness-of-fit of four models with alternative covariance structures for between-subject random effects. Each covariance structure is nested in the next because the next one includes an additional random effect that its predecessor lacks. The first model allows the intercept to vary across sampling units; the second lets the intercept and the effect of MMC vary across dyads; the third lets the intercept and the effects of MMC and norms vary across dyads; the fourth lets the intercept and the effects of MMC, norms and performance failure vary across dyads. Because the covariance structures that I specify become successively more complex, I do not end up selecting an over-identified model. I model between-subject random effects through the unstructured (UN) covariance structure, which models both the variances and the covariances of between-subject random effects, and the banded main diagonal covariance structure (UN(1)), which models only the variances and constrains the off-diagonal elements to zero.
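The nesting of the four specifications, and the difference between the UN and UN(1) structures, can be illustrated with a small sketch. It only counts the free covariance parameters that each structure implies; it does not fit any model. The variable labels and the helper names `n_cov_params` and `is_nested` are hypothetical and chosen for illustration.

```python
# Illustrative sketch: the four random-effect sets mirror the specifications
# described in the text; the labels are placeholders, not dataset columns.
RANDOM_EFFECT_SETS = [
    ["intercept"],
    ["intercept", "MMC"],
    ["intercept", "MMC", "norms"],
    ["intercept", "MMC", "norms", "performance_failure"],
]


def n_cov_params(q, structure):
    """Number of free parameters in a q x q random-effects covariance matrix."""
    if structure == "UN":       # all variances and all covariances
        return q * (q + 1) // 2
    if structure == "UN(1)":    # variances only; off-diagonals fixed at zero
        return q
    raise ValueError(f"unknown structure: {structure}")


def is_nested(smaller, larger):
    """True when every random effect of the smaller set is in the larger set."""
    return set(smaller) <= set(larger)


for effects in RANDOM_EFFECT_SETS:
    q = len(effects)
    print(effects, "UN:", n_cov_params(q, "UN"), "UN(1):", n_cov_params(q, "UN(1)"))
```

With four random effects, UN estimates ten covariance parameters and UN(1) estimates four, which is why UN(1) is the more parsimonious alternative at every step.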

Following the selection of the “best” covariance structure for between-subject random effects for a given model, I compare competing covariance structures for within-subject random errors and select the one that provides the “best” fit given the theoretically defined fixed effects and the empirically determined between-subject random effects (Cheng et al., 2010; Bliese and Ployhart, 2002). Specifically, I investigate the variance components (VC), compound symmetry (CS) and first-order autoregressive (AR(1)) covariance structures and compare their goodness-of-fit to select the structure that provides the best fit to the data (Bliese and Ployhart, 2002). I build the error covariance structure after determining the “best” covariance structure for between-subject random effects because the error term is what remains after removing the fixed and random effects in each model (Singer, 1998).
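To make the three candidate error structures concrete, the sketch below builds the within-subject covariance matrix that each one implies for a sampling unit observed at `n_times` occasions. The parameterization (a common variance `sigma2` and a correlation `rho`) is an assumption made for illustration; statistical packages may parameterize CS differently, for example through a separate common covariance term.

```python
def vc_cov(n_times, sigma2):
    """Variance components: equal variances, no covariance between occasions."""
    return [[sigma2 if i == j else 0.0 for j in range(n_times)]
            for i in range(n_times)]


def cs_cov(n_times, sigma2, rho):
    """Compound symmetry: equal variances and one common covariance."""
    return [[sigma2 if i == j else sigma2 * rho for j in range(n_times)]
            for i in range(n_times)]


def ar1_cov(n_times, sigma2, rho):
    """AR(1): covariance decays geometrically with the lag between occasions."""
    return [[sigma2 * rho ** abs(i - j) for j in range(n_times)]
            for i in range(n_times)]


# Hypothetical values, chosen only to show the pattern of each matrix.
for name, matrix in [
    ("VC", vc_cov(3, 2.0)),
    ("CS", cs_cov(3, 2.0, 0.5)),
    ("AR(1)", ar1_cov(3, 2.0, 0.5)),
]:
    print(name)
    for row in matrix:
        print("  ", row)
```

VC assumes no residual dependence over time, CS assumes the same dependence at every lag, and AR(1) lets dependence weaken as observations grow further apart.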

I compare alternative models with identical fixed effects but different covariance structures through either log-likelihood ratio tests or information criteria. When models are nested within one another, I use the log-likelihood ratio (LR) test to select the winning model. “With $L_j$, the log likelihood for model $j$, the LR test statistic $T = -2(L_1 - L_2)$ asymptotically follows and therefore is referred to as $\chi^2_d$ distribution, where $d$ is the difference in the number of parameters between two models” (Cheng et al., 2010, p. 511). When models are not nested, I use information criteria to select the covariance structure that provides the better fit. Specifically, I compare the AIC and BIC statistics of models that have identical fixed effects but different covariance structures, and I prefer the model with the smaller values. I consider the AIC and BIC statistics of a given model to be “small enough” when they are at least two units lower than the corresponding statistics of the alternative model (Singer and Willett, 2003).
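The two selection rules can be sketched in a few lines. The example below computes the LR statistic $T = -2(L_1 - L_2)$ and its upper-tail chi-square probability for integer degrees of freedom, along with the standard definitions of AIC and BIC and the two-unit rule described above. All function names and numeric inputs are hypothetical; the log likelihoods are made-up values, not results from this study.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite series in odd powers of sqrt(x).
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total


def lr_test(loglik_simple, loglik_complex, k_simple, k_complex):
    """Return (T, d, p) for two nested models; d is the parameter difference."""
    stat = -2.0 * (loglik_simple - loglik_complex)
    d = k_complex - k_simple
    return stat, d, chi2_sf(stat, d)


def aic(loglik, k):
    return -2.0 * loglik + 2.0 * k


def bic(loglik, k, n):
    # n is the sample size used in the penalty; which n to use is an assumption.
    return -2.0 * loglik + k * math.log(n)


def clearly_smaller(ic_candidate, ic_alternative, margin=2.0):
    """Two-unit rule: the candidate must be lower by at least `margin`."""
    return ic_alternative - ic_candidate >= margin


# Hypothetical log likelihoods for a simpler and a more complex nested model.
stat, d, p = lr_test(-520.3, -515.1, k_simple=4, k_complex=6)
print(f"T = {stat:.2f}, d = {d}, p = {p:.4f}")
```

In practice the log likelihoods, parameter counts and information criteria would be read from the fitted models' output rather than typed in by hand.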

After selecting the model with the stochastic component that provides the “best” fit, I check whether it meets the assumptions of multi-level modeling, since a model’s estimates will be biased and its inferences erroneous if it violates them. A multi-level model rests on several assumptions. It presumes that level-1 and level-2 predictors are independent of the level-1 and level-2 residuals, respectively; that level-1 and level-2 errors are independent of each other; that level-1 residuals are independent and normally distributed with a mean of zero and variance $\sigma^2_{\varepsilon}$; that level-2 random effects are multivariate normal, each with a mean of zero, a variance of $\sigma^2_{qq}$ and a covariance of $\sigma^2_{qq'}$; and that the model is linear in its parameters (Singer and Willett, 2003). When I find that the selected model violates these assumptions, I carry out the necessary fixes to ensure that the violations do not impair inferences.

In addition to finding and estimating the model that provides the “best” fit, I also estimate an “Unconditional Means” model for the first, second and third model and a “Cross-Classified Unconditional Means” model for the second and third model. I do so in order to partition the outcome variance into its components, calculate the intra-class correlation coefficient (ICC) and justify the use of cross-classified random effect models to investigate Hypotheses two through seven. An “Unconditional Means” model, or a “Cross-Classified Unconditional Means” model, can be considered a one-way random effects ANOVA model (Singer, 1998). It does not contain any predictors and thus forces all of the variance in the dependent variable to reside in the composite error term, which is composed of between-subject random effects and within-subject random errors. Hence it partitions the outcome variation into its within-subject and between-subject variance components and thus estimates level-1 and level-2 residuals, enabling me not only to understand the level and source of variance but also to calculate the ICC (Singer and Willett, 2003). Calculating the ICC enables me to test whether there is significant variation over time in cooperation within a sampling unit and whether sampling units significantly differ from one another with respect to their level of cooperation.
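As a minimal sketch of the variance partitioning described above, the ICC is the share of total outcome variance that lies between sampling units. The functions below assume the variance components have already been estimated from an unconditional means model; the function names, the classification labels and the numeric values are hypothetical.

```python
def icc(between_var, within_var):
    """Share of total variance attributable to differences between units."""
    total = between_var + within_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_var / total


def cross_classified_shares(var_factor_a, var_factor_b, residual_var):
    """Variance shares when units are cross-classified by two factors.

    The two factor labels are placeholders for the two crossed classifications.
    """
    total = var_factor_a + var_factor_b + residual_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {
        "factor_a": var_factor_a / total,
        "factor_b": var_factor_b / total,
        "residual": residual_var / total,
    }


# Hypothetical variance components, for illustration only.
print(icc(between_var=3.0, within_var=1.0))
print(cross_classified_shares(2.0, 1.0, 1.0))
```

An ICC near zero would indicate that almost all variation is within sampling units over time, while an ICC near one would indicate that sampling units differ from one another but change little over time.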

In addition to these models, I estimate a “control” model that contains only the control variables of a given multi-level model. I use the results of this model to understand the behavior of control variables when theoretical predictors are not included.
