Introduction to Data Analysis in Hierarchical Linear Models
April 20, 2007
Noah Shamosh & Frank Farach
Social Sciences StatLab
Yale University
Scope & Prerequisites
Strong applied emphasis
Focus on HLM software
Has special functionality
Other options: SPSS, SAS, MLwiN, R
Familiarity with regression assumed
Road to HLM Happiness
Conceptualize model hierarchically
Prepare data
Import data into HLM
Build statistical models
Estimate and interpret models
Graph models
What is HLM?
Hierarchical Linear Model
A multilevel statistical model
Software program used for such models
Deconstructing the name (in reverse)
Model: It’s a statistical model
Linear: The model must be linear in the parameters
Hierarchical: Nested data structures are explicitly modeled
When are data hierarchical?
When lower-level units are grouped within higher-level units of analysis
Such data are said to be nested within higher levels (i.e., units) of analysis
Nesting can occur between subjects…
Children nested within classrooms
Classrooms nested within schools
…and/or within subjects
Repeated observations on the same individuals over time (observations nested within individuals)
Why not use regular regression on nested data?
Increased Type I error
Model misspecification
Miss opportunity to examine potentially interesting contextual questions
These problems increase as observations become less independent
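One common way to quantify how much non-independence costs is the Kish design effect, 1 + (m - 1) x ICC, where m is the average group size. The sketch below is illustrative and not part of the original workshop: the function names and the sample figures (2,000 children, 25 per group) are assumptions, and .23 is the ICC reported for the language achievement example in these slides.

```python
def design_effect(avg_group_size, icc):
    """Kish design effect: how much sampling variance is inflated by clustering."""
    return 1.0 + (avg_group_size - 1.0) * icc


def effective_sample_size(n_total, avg_group_size, icc):
    """Number of independent observations the clustered sample is 'worth'."""
    return n_total / design_effect(avg_group_size, icc)


# Hypothetical sample: 2,000 children in groups of 25 (assumed figures).
deff = design_effect(avg_group_size=25, icc=0.23)
n_eff = effective_sample_size(n_total=2000, avg_group_size=25, icc=0.23)
print(f"design effect = {deff:.2f}, effective n = {n_eff:.0f}")
```

An ordinary regression would treat all 2,000 cases as independent, so its standard errors would be too small; that is the source of the increased Type I error.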
Hierarchical Model Conceptualization
What kind of hierarchical relations might be present?
What factors could I incorporate in my model to reflect this organization?
HLM Caveats
Adding levels of nesting increases the complexity of the model exponentially
HLM can handle up to three levels
Must have several times more lower-level observations than upper-level observations
Parameter estimation uses maximum likelihood instead of least squares
Prep, prep, prep!
This is the most labor-intensive part of the workflow and the source of many of the problems brought to us at the StatLab
Two obstacles
HLM doesn’t do data manipulation or basic data description
HLM requires a special data structure
Solutions
Plan ahead. Do all data screening, variable transformations, exploratory analyses, and assumption-checking beforehand
Data prep: SPSS example (1)
Data set: IQv & language achievement
Two files
Level 1: dependent variable (language achievement) and other child characteristics (e.g., IQv)
Level 2: school characteristics (e.g., SES)
Children are nested within schools
(1) Extensively adapted from Bryk & Raudenbush (2002) and Bauer (2005)
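Before importing, it can help to verify that the two files line up. The following is a minimal sketch, not the StatLab's procedure: it assumes both files share a school ID variable and should be sorted by it, which is the usual requirement but should be confirmed against the HLM manual for the version in use. The column names (`school_id`, `lang`, `iqv`, `ses`) and the data values are hypothetical.

```python
def check_hlm_files(level1_rows, level2_rows, id_key="school_id"):
    """Return a list of problems; an empty list means the files look consistent."""
    problems = []
    l1_ids = [row[id_key] for row in level1_rows]
    l2_ids = [row[id_key] for row in level2_rows]

    # Each school should appear exactly once in the Level 2 file.
    if len(set(l2_ids)) != len(l2_ids):
        problems.append("duplicate IDs in Level 2 file")

    # Every child must belong to a school that has a Level 2 record.
    missing = sorted(set(l1_ids) - set(l2_ids))
    if missing:
        problems.append(f"Level 1 IDs missing from Level 2 file: {missing}")

    # Assumed requirement: both files sorted by the linking ID.
    if l1_ids != sorted(l1_ids):
        problems.append("Level 1 file is not sorted by ID")
    if l2_ids != sorted(l2_ids):
        problems.append("Level 2 file is not sorted by ID")
    return problems


# Hypothetical records standing in for the two SPSS files.
level1 = [
    {"school_id": 1, "lang": 46, "iqv": 3.0},
    {"school_id": 1, "lang": 41, "iqv": -1.5},
    {"school_id": 2, "lang": 38, "iqv": 0.5},
]
level2 = [{"school_id": 1, "ses": 23}, {"school_id": 2, "ses": 10}]
print(check_hlm_files(level1, level2))
```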
Creating the Multivariate Data Matrix (MDM)
Making an MDM file
A caveat…
The procedure…
Check your summary statistics before building any models (cross-reference)
Main window: are all of your variables there?
Build statistical models
Basic model: random-effects ANOVA
Test for mean group differences in population
Between-group vs. total variance
Key assumption check of HLM
Random-effects ANOVA
Choose outcome variable
Terms…
Toggle Level 2 error term
Level 1 (r) vs. Level 2 (u) error terms
The “Mixed” window
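For reference, the random-effects ANOVA can be written out as follows. The slides name only the error terms r and u; the remaining symbols (the gammas, sigma squared, tau) follow the notation commonly used with HLM and are assumed here rather than taken from the slides.

```latex
\begin{align*}
\text{Level 1:}\quad Y_{ij} &= \beta_{0j} + r_{ij}, & r_{ij} &\sim N(0, \sigma^2) \\
\text{Level 2:}\quad \beta_{0j} &= \gamma_{00} + u_{0j}, & u_{0j} &\sim N(0, \tau_{00}) \\
\text{Mixed:}\quad Y_{ij} &= \gamma_{00} + u_{0j} + r_{ij}
\end{align*}
```

Here $Y_{ij}$ is the language achievement of child $i$ in school $j$, $\gamma_{00}$ is the grand mean, $u_{0j}$ is school $j$'s deviation from the grand mean, and $r_{ij}$ is the child's deviation from the school mean. Toggling the Level 2 error term decides whether $u_{0j}$ is included.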
Random effects ANOVA
[Figure: group means (M1, M2, M3) and grand mean (GM) of language achievement]
Random effects ANOVA
Results
Fixed effects: the intercept
Is the grand mean significantly different from zero?
Variance components (random effects)
Level 2 (U0): significant variability between groups?
Level 1 (R): significant variability within groups?
Random effects ANOVA
Intraclass correlation (ICC)
Proportion of total variance accounted for by between-group differences
Level 2 variance component divided by sum of Level 1 and Level 2 variance components
Ours is .23; HLM is warranted
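The ICC calculation is simple enough to script once the two variance components have been read off the output. A minimal sketch follows; the function name and the example values are hypothetical and are not the estimates from the workshop data.

```python
def intraclass_correlation(tau00, sigma2):
    """ICC = Level 2 variance / (Level 1 variance + Level 2 variance)."""
    if tau00 < 0 or sigma2 < 0:
        raise ValueError("variance components must be non-negative")
    total = tau00 + sigma2
    if total == 0:
        raise ValueError("total variance is zero; ICC is undefined")
    return tau00 / total


# Hypothetical variance components, chosen only for illustration.
example_icc = intraclass_correlation(tau00=20.0, sigma2=60.0)
print(f"ICC = {example_icc:.2f}")
```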
Random effects regression
Test for relationship between a Level 1 IV and the DV
Test whether an IV explains any between-group variance
Terms…
We are assuming a fixed slope
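Written out, the fixed-slope model looks like the following. The symbols are assumed (standard HLM-style notation), with $X_{ij}$ standing for the Level 1 IV, which is IQ in the running example.

```latex
\begin{align*}
\text{Level 1:}\quad Y_{ij} &= \beta_{0j} + \beta_{1j} X_{ij} + r_{ij} \\
\text{Level 2:}\quad \beta_{0j} &= \gamma_{00} + u_{0j} \\
\beta_{1j} &= \gamma_{10} \\
\text{Mixed:}\quad Y_{ij} &= \gamma_{00} + \gamma_{10} X_{ij} + u_{0j} + r_{ij}
\end{align*}
```

The slope equation has no error term, so every group shares the same slope $\gamma_{10}$ while intercepts still vary through $u_{0j}$.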
Random effects regression
[Figure: language achievement plotted against IQ]
Random effects regression
Results
Fixed effects
Level 1 intercept: Mean of DV where IV is zero
Level 1 slope: Change in DV with one unit of change in IV (just like OLS regression)
Random effects
Intercept: Between-group variance that is not explained by IV
Residual variance: Within-group variance that is not explained by IV
Random effects regression
Variance accounted for by IV
Level 1: Compare residual variance component to random effects ANOVA model
(8.0 - 6.5) / 8.0 = .19
Level 2: Do the same for the random intercept variance component
(19.6 - 9.6) / 19.6 = .51
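The two calculations above follow the same pattern, so they can be wrapped in one helper. This is a sketch with a hypothetical function name; the numbers are the ones shown on the slide.

```python
def proportion_variance_explained(var_baseline, var_conditional):
    """Proportional reduction in a variance component after adding a predictor.

    var_baseline:    component from the random-effects ANOVA (no predictors)
    var_conditional: same component from the model that includes the IV
    """
    if var_baseline <= 0:
        raise ValueError("baseline variance must be positive")
    return (var_baseline - var_conditional) / var_baseline


# Values taken from the slide.
level1_explained = proportion_variance_explained(8.0, 6.5)    # about .19
level2_explained = proportion_variance_explained(19.6, 9.6)   # about .51
print(level1_explained, level2_explained)
```

Unlike R-squared in ordinary regression, this quantity is not guaranteed to be positive in multilevel models, so a small negative value is not necessarily a sign of a mistake.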
Fixed slopes
[Figure: language achievement plotted against IQ]
Random slopes
[Figure: language achievement plotted against IQ]
Random slopes
Goal: test whether the IV - DV relationship varies between groups
Add only if supported by theory
Toggle Level 2b error term
In output, look at slope variance component
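In equation form, allowing the slope to vary adds an error term to the Level 2 slope equation. The notation is assumed (standard HLM-style symbols), with $X_{ij}$ as the Level 1 IV.

```latex
\begin{align*}
\text{Level 1:}\quad Y_{ij} &= \beta_{0j} + \beta_{1j} X_{ij} + r_{ij} \\
\text{Level 2:}\quad \beta_{0j} &= \gamma_{00} + u_{0j} \\
\beta_{1j} &= \gamma_{10} + u_{1j} \\
\operatorname{Var}(u_{0j}) &= \tau_{00}, \quad \operatorname{Var}(u_{1j}) = \tau_{11}, \quad \operatorname{Cov}(u_{0j}, u_{1j}) = \tau_{01}
\end{align*}
```

The slope variance component in the output corresponds to $\tau_{11}$; if it is not distinguishable from zero, a fixed slope may be adequate.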
Slopes as outcomes
Goal: test cross level interactions
Does the IV - DV relation vary between groups as a function of a systematic (Level 2) factor?
Add Level 2 predictor
Terms…
Slopes as outcomes
Fixed effects
For Level 1 intercept
Intercept: predicted score on DV at mean value of Level 1 IV
Slope: Influence of Level 2 IV on DV
For Level 1 slope
Intercept: Influence of Level 1 IV on DV
Slope: Influence of Level 2 IV on Level 1 IV - DV relation
Random effects (same as before)
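The four fixed effects listed above map onto a slopes-as-outcomes model as sketched below. The notation is assumed (standard HLM-style symbols): $X_{ij}$ is the Level 1 IV and $W_j$ is the Level 2 predictor.

```latex
\begin{align*}
\text{Level 1:}\quad Y_{ij} &= \beta_{0j} + \beta_{1j} X_{ij} + r_{ij} \\
\text{Level 2:}\quad \beta_{0j} &= \gamma_{00} + \gamma_{01} W_j + u_{0j} \\
\beta_{1j} &= \gamma_{10} + \gamma_{11} W_j + u_{1j} \\
\text{Mixed:}\quad Y_{ij} &= \gamma_{00} + \gamma_{01} W_j + \gamma_{10} X_{ij} + \gamma_{11} W_j X_{ij} + u_{0j} + u_{1j} X_{ij} + r_{ij}
\end{align*}
```

For the Level 1 intercept, $\gamma_{00}$ is the intercept and $\gamma_{01}$ the slope; for the Level 1 slope, $\gamma_{10}$ is the intercept and $\gamma_{11}$ the slope. The product term $\gamma_{11} W_j X_{ij}$ in the mixed equation is the cross-level interaction.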
Graph: Simple slopes
Useful for visualizing cross-level interactions
Just like simple slope plots in regression
Graph Equations > Model graphs
Useful for categorical or continuous data
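The lines in a simple slopes plot can also be computed by hand from the fixed effects. The sketch below assumes the usual HLM-style coefficient names (gamma10 is the average Level 1 slope, gamma11 the cross-level interaction, gamma00 and gamma01 the intercept terms); the function names and every coefficient value are hypothetical.

```python
def simple_slope(gamma10, gamma11, w):
    """Slope of the Level 1 IV on the DV at a given value w of the Level 2 IV."""
    return gamma10 + gamma11 * w


def predicted_dv(gamma00, gamma01, gamma10, gamma11, x, w):
    """Model-implied DV for Level 1 IV value x and Level 2 IV value w."""
    return gamma00 + gamma01 * w + simple_slope(gamma10, gamma11, w) * x


# Hypothetical fixed effects; w is evaluated at low, middle, and high values.
coefs = {"gamma00": 40.0, "gamma01": 1.0, "gamma10": 2.0, "gamma11": 0.5}
for w in (-1.0, 0.0, 1.0):
    slope = simple_slope(coefs["gamma10"], coefs["gamma11"], w)
    line = [predicted_dv(x=x, w=w, **coefs) for x in (-2.0, 0.0, 2.0)]
    print(f"W = {w:+.1f}: slope = {slope:.2f}, predicted DV = {line}")
```

Each printed row is one line in the plot: the slope changes with W whenever gamma11 is non-zero.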
Graph: Level-1 equations
Useful for:
Visualizing variability in intercepts and slopes
Identifying moderators
Graph Equations > Level 1 equation graphing
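A rough way to preview this variability outside HLM is to fit a separate least-squares line to each group. This is a stand-in technique, not what HLM does: separate OLS fits per group will generally differ from the model-based lines HLM draws. The function names and data below are hypothetical.

```python
from collections import defaultdict


def ols_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the IV does not vary within this group")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def group_lines(records):
    """records: iterable of (group_id, iv, dv); returns {group_id: (intercept, slope)}."""
    by_group = defaultdict(lambda: ([], []))
    for group_id, x, y in records:
        by_group[group_id][0].append(x)
        by_group[group_id][1].append(y)
    return {g: ols_line(xs, ys) for g, (xs, ys) in by_group.items()}


# Hypothetical data: two groups with different intercepts and slopes.
records = [
    ("A", 0, 1.0), ("A", 1, 3.0), ("A", 2, 5.0),
    ("B", 0, 2.0), ("B", 1, 2.5), ("B", 2, 3.0),
]
lines = group_lines(records)
for group_id, (intercept, slope) in sorted(lines.items()):
    print(f"group {group_id}: intercept = {intercept:.2f}, slope = {slope:.2f}")
```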
Recommended Reading
Bickel, R. (2007). Multilevel analysis for applied research: It's just regression! New York: Guilford Press.
Bryk, A. & Raudenbush, S. (2002). Hierarchical Linear Models: Applications and data analysis methods (2nd ed.). Thousand Oaks, CA: Sage.
Luke, D. (2004). Multilevel modeling. Thousand Oaks, CA: Sage.
Heck, R. H., & Thomas, S. L. (2000). An introduction to multilevel modeling techniques. Lawrence Erlbaum Associates.
Kreft, I. & de Leeuw, J. (1998). Introducing multilevel modeling. Sage.
Singer, J. D., & Willett, J. B. (2003). Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. Oxford University Press. (Longitudinal focus)
HLM Resources on the Web
UCLA’s HLM portal
http://statcomp.ats.ucla.edu/mlm
Excellent example of analysis
http://www.ats.ucla.edu/stat/hlm/seminars/hlm_mlm/mlm_hlm_seminar.htm