**Analyzing Data from Small N Designs Using Multilevel Models:**

**A Procedural Handbook**

Eden Nagler, M.Phil., The Graduate Center, CUNY

David Rindskopf, PhD, Co-Principal Investigator, The Graduate Center, CUNY

William Shadish, PhD, Co-Principal Investigator, University of California, Merced

Granting Agency: U.S. Department of Education

Grant Title: Meta-Analysis of Single-Subject Designs

Grant No. 75588-00-01


**SECTION I. Introduction **

The purpose of this handbook is to clarify the steps in analyzing data from small-*n* designs using

multilevel models. Within the manual, we illustrate the procedures for analyzing single-subject (small-*n*) studies with various single- and multiple-phase designs.

Although we attempt to discuss our work in detail, readers should have some acquaintance with multilevel models (also called hierarchical models, or mixed effects models).

The conceptual basis of these analyses is:

• Write a statistical model to summarize the behavior of each person

• Test whether there are differences among people in various aspects of their behavior; and if so

• Test whether those differences are predicted by subject characteristics.
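These three conceptual steps can be previewed in miniature. The sketch below (plain numpy; the subjects, sample sizes, and parameter values are all hypothetical) simulates several subjects, fits a line to each one, and then inspects whether the fitted slopes differ across people:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_obs = 8, 13
months = np.arange(n_obs)

# Step 1: summarize each person's behavior with a statistical model --
# here, an ordinary least-squares line fit to that person's series.
slopes, intercepts = [], []
for _ in range(n_subjects):
    true_slope = rng.normal(-3.0, 0.6)            # people differ in rate of change
    y = 180 + true_slope * months + rng.normal(0, 2.5, n_obs)
    b, a = np.polyfit(months, y, 1)               # fitted slope and intercept
    slopes.append(b)
    intercepts.append(a)

# Step 2: test whether people differ -- a first look is the spread
# of the fitted slopes around their mean.
slope_sd = np.std(slopes, ddof=1)

# Step 3: with real data, regress the fitted slopes on subject
# characteristics (e.g., age) to see whether they predict the differences.
```

Multilevel models carry out all three steps in a single fit rather than in stages, but the logic is the same.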

While searching through the literature for appropriate single-subject design studies to serve as pilots for this handbook, we looked to identify studies that adhered to several guidelines:

• Studies should include full graphs for at least 4 or 5 subjects

• Counts and/or measures displayed (as the dependent measure) should not be aggregated

• Data provided are as close to raw as possible

These demonstration datasets lead from simple to more complex designs and from simpler to more complex models. We begin with one-phase, treatment-only designs and continue through to four-phase ABAB reversal design studies.

We demonstrate how to scan in graphed data and how to extract raw data from those graphs using computer software. We talk about how to deal with different types of dependent variables which require different statistical models (e.g., continuous, count or rate, proportion).

Additionally, this type of data often contains autocorrelation. We also discuss this problem and one way of dealing with it.
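As a concrete illustration of what autocorrelation looks like, the sketch below (numpy only; the AR(1) process and its parameters are invented for the demonstration) simulates correlated residuals and estimates their lag-1 autocorrelation:

```python
import numpy as np

rng = np.random.default_rng(1)
n, rho = 200, 0.6

# Simulate AR(1) residuals: each value is rho times the previous one
# plus fresh noise, so adjacent observations are correlated.
e = np.zeros(n)
for t in range(1, n):
    e[t] = rho * e[t - 1] + rng.normal(0, 1.0)

def lag1_autocorr(x):
    """Correlation between the series and itself shifted by one step."""
    x = x - x.mean()
    return float(np.sum(x[:-1] * x[1:]) / np.sum(x * x))

r1 = lag1_autocorr(e)   # should land near rho = 0.6
```

A value of r1 near zero would suggest independent residuals; a value this large signals that standard errors from a model assuming independence would be misleading.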

In *Section II*, we introduce procedures via demonstration with a dataset from a one-phase,

treatment-only study of a weight loss intervention where the outcome variable is a continuous variable. Here, we cover the following:

o Scanning graphs into Ungraph

o Using Ungraph to extract raw data from graphs into spreadsheet (*line graph*) and then

export data into SPSS

o Using SPSS to refine and set up data for HLM

o Using HLM to set up a summary (mdm) file, specify and run models with a continuous outcome

o Interpreting output

In *Section III*, we expand the demonstration with a dataset from a two-phase (AB) design

study of a prompt-and-praise intervention for toddlers where the outcome variable is a count or rate. New material covered in this section includes the following:

o Introduction of Poisson distribution (prediction on a log scale), including a discussion of

technical issues associated with a count as the DV (Poisson distribution, many zeros, using a log scale, etc.)

o Using Ungraph to read in a scatterplot

o Using HLM to set up and run a model to accommodate a rate as a DV

o Interpreting HLM output with prediction on a log scale

o Technical discussions of the following:

Considering the contribution of subject characteristics (L2 predictors)

Exploring whether one subject stands out (when baseline for that subject is always zero; comparing across alternative models)

Constraining random effects (restricting between-subject variation to 0) and comparing across models

Exploring heterogeneity of Level-1 variance across phases (within-subject variation) and comparing across models

In *Section IV*, we further expand the demonstration with a dataset from a two-phase (AB)

design study of a collaborative teaming intervention for students with special needs where the outcome variable is a proportion (i.e., successes/trials). New material covered in this section includes the following:

o Introduction of Binomial distribution (prediction on a log odds scale)

o Using SPSS to set up data for use of Binomial distribution model

o Using HLM to set up model for a variable distributed as Binomial; set option for

Overdispersion, and run models as Binomial

o Interpreting HLM output with prediction on a log odds scale

o Technical discussion and demonstration on Overdispersion (including comparing across

models)

Finally, in *Section V*, we demonstrate the steps with a dataset from a four-phase (ABAB)

reversal design study of a response card intervention for students exhibiting disruptive behavior where the outcome variable is again a proportion. New material covered in this section includes the following:

o Introduction of analyses of four-phase designs, including consideration of phase order

o Using SPSS to set up data for a four-phase model, to test order effects and various interactions

o Using HLM to set up a model for the four-phase design, using the Binomial distribution,

testing order effects and various interactions

o Special coding possibilities for a four-phase design

o Interpreting HLM output from this type of design

**SECTION II. One-Phase Designs **

One published study was selected to serve as an example to be used throughout this section:
**Stuart, R. B. (1967). Behavioral control of overeating. Behaviour Research and Therapy, 5, 357-365.**

In this study, eight obese females were trained in self-control techniques to overcome overeating behaviors. Patients were weighed monthly throughout the 12-month program and these data were graphed individually. Data and graphs from this study will be used to illustrate various steps in the analysis discussed in this manual.

The following pages will illustrate the steps necessary to get the data from print into HLM, to do an analysis, and interpret the output. These procedures utilize three computer packages:

Ungraph, SPSS, and HLM. Screen shots are pasted within the instructions.

Outline of steps to be covered:

1. Scan graphs into Ungraph
2. Define graph space in Ungraph
3. Read data from image to data file in Ungraph
4. Export data file into SPSS
5. Refine data as necessary in SPSS (recodes, transformations, merging Level-1 files, etc.)
6. Set up data in HLM (including setting up MDM file)
7. Run multilevel models in HLM
8. Create graphs of data and models in HLM
9. Interpret HLM output

**Getting Data from Print into Ungraph **

**I. Scanning data to be read into Ungraph (via flatbed scanner): **

1. Scan graphs into jpeg (or .emf, .wmf, .bmp, .dib, .png, .pbm, .pgm, .ppm, .pcc, .pcx, .dcx, .tiff, .afi, .vst, or .tga) format through any desired scanning software.

2. Save image of graph (e.g., to Desktop, My Documents folder, CD, flash drive, etc.) and label for later retrieval.

*Example:* Scanned Stuart (1967) graphs for each patient; all are in one jpeg file.

**II. Defining graph space in Ungraph: **

1. Start the Ungraph program.

*(Note: If Ungraph was originally registered while connected to the Internet, then it *
*will only open [with that same password] while connected to the Internet each time. *
*It does not have to be connected at the same Internet port, just any live connection.) *

2. Open the scanned image(s) in Ungraph: Select File → Open

Browse to the intended image (scanned graph) and click Open so that the graph(s) is displayed in the workspace.

Scroll left/right, up/down to get the first subject’s graph fully visible in the workspace.

Use View → Zoom In/Out as needed to optimize the view.

3. Define measures: Select Edit → Units

Label X and Y accordingly (using information from the scanned graphs or the study
documentation) and click *OK*.

In our example, X is *months* and Y is *lbs*.

**Example: Stuart (1967) – Patient 1 – defining units.**

4. Define the Coordinate System:

Select Edit → Define Coordinate System

The program requires that you define 3 points for each graph. These do not have to be points on the data line. In fact, you can be more precise if you choose points on the axes. Choose points that are relatively easily definable.

1. First scaling point – click on the labeled point farthest to the right on the X axis (X=max, Y=min).

**Example: Stuart (1967) – Patient 1 – First Scaling Point defined (1)**

2. Deskewing point – this point must have the same Y value as above, so click on the intersection of the axes (which may or may not be the origin) (X=min, Y=min).

**Example: Stuart (1967) – Patient 1 – Deskewing Point defined (2)**

3. Second scaling point – click on the labeled point closest to the upper-left corner of the graph (X=min, Y=max).

**Example: Stuart (1967) – Patient 1 – Second Scaling Point defined (3) **


**III. Reading in & Working with data in Ungraph: **
1. Reading data from graph:

If working with a line graph:

Select Digitize → New Functional Line

Carefully click on left-most point on the graph line (on the Y axis) and watch Ungraph trace the line to the end.

If the digitized line runs off beyond the actual line, you can click ALT + left-arrow (←) to back up the digitization little by little

You may need to try this step a few times before Ungraph follows the
line precisely. Click *Undo* (at bottom of screen) to erase any

incorrectly-digitized line and start again.

** Example: Stuart (1967) – Patient 1 – Digitize Functional Line **

If the data are in a scatterplot: Select Digitize → New Scatter

Carefully click on each data point in the graph to read in data

2. Working with extracted data:

Data values are computed as if they were collected continuously. For instance, even if data were actually collected once per month, Ungraph may still show points for non-integer X values (e.g., 1.13 months), falsely assuming continuity.

If the line was digitized as a functional line, then you can correct this in Ungraph. (Otherwise you may have to use rounding in SPSS, etc.)

On the right side of the screen, under *Data*, click the *Show* drop-down

menu and choose *Cubic Spline*.

Select points from X = 0, in increments of 1.0 (in order to get measurements at whole-number X values).

Click *Apply*

** Example: Stuart (1967) – Patient 1 – Refine read data **
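If you only have the raw exported points rather than Ungraph's spline option, the same resampling can be done afterward. The sketch below uses plain linear interpolation with numpy (Ungraph fits a cubic spline, so its results will differ slightly); the (months, lbs) values are invented for illustration:

```python
import numpy as np

# Digitized (months, lbs) pairs as Ungraph might export them,
# at irregular, non-integer X values.
x_raw = np.array([0.00, 1.13, 1.97, 3.08, 4.02, 5.11, 6.00])
y_raw = np.array([180.0, 176.5, 174.2, 170.9, 168.0, 164.8, 162.1])

# Resample at whole-number months to match the actual measurement schedule.
x_whole = np.arange(0, 7)
y_whole = np.interp(x_whole, x_raw, y_raw)
```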

3. Exporting Data:

Select Data → Export

Decide how to format points (tab separated, comma separated, etc.)

Click on Export and save .txt file where you will be able to find it later. Make
sure to label file clearly (including source and case name or ID number).
(ex: *stuart1967 patient1.txt*)

4. Repeat EACH of these steps in sections II and III (from defining graph space to reading in and exporting data) for each Level-1 (subject) graph available.

Save each of the Level-1 files as separate .txt files labeled by case name or ID number.

**Getting Data from Ungraph into SPSS **

**IV. Importing and Setting Up Level-1 Data in SPSS:**

1. Open SPSS program.

2. Read text (.txt) file into SPSS: Select File → Read Text Data

Browse to first Level-1 text (.txt) file (Patient 1)

Click *Next* 3 times (or until you get to the screen below)

At the screen that asks *“Which delimiters appear between variables?” *
*(Delimited Step 4 of 6)*, check off whichever delimiters you specified when

exporting data from Ungraph (tab, comma, etc).

**Example: Stuart (1967) – Patient 1 – Reading text file into SPSS **

Click *Next* to advance to the next screen

Title variables:

**Example: Stuart (1967) – Patient 1 – Reading text file into SPSS **

Click *Next* to advance to the next screen

Finally, click *Finish* to complete set-up of text data.

3. Dataset should now be displayed in Data View screen. Title/label variables as necessary in Variable View.

4. Compute subject’s ID for data:

For this study, we computed Patient ID for each subject by running the following syntax (where the value “1” is changed for each subject respectively):

COMPUTE patient=1. EXECUTE.

5. Save individual subject SPSS data files:

Save the SPSS data file for the first “patient” (in this study). Make sure to
include the subject’s ID in the file name so that you will be able to identify it
later. (ex: *stuart1967 patient1.sav*)

6. * Repeat steps 1 through 5 above for each subject in the study (for each of the text *
files created from each of the graphs scanned) creating separate Level-1 files for
each subject/patient/unit.

Be sure to compute appropriate subject IDs for each subject.

For example, in the study used in this manual, we ended up with 8 separate Level-1 files. In the first, we computed patient=1, in the second, we computed patient=2,… and so on… until the eighth, when we

computed patient=8.

As well, each file was saved with the same file name except for the corresponding patient ID.

7. Now that you have uniform SPSS files for each subject, you must merge them. Merge data files for each subject into one Level-1 file. (Select Data → Add Cases, etc.)

**Example: Stuart (1967) – Merged Level-1 SPSS file. **

8. In the merged file, you may wish to make additional modifications to the variables. For this dataset, we decided to make three such modifications/transformations:

First, we rounded “*lbs*” to the nearest whole number, with the following

syntax command:

COMPUTE pounds = rnd(lbs). EXECUTE.

Second, for more meaningful HLM interpretation, we decided to recode months so that 0 represented ending weight, instead of starting weight. We did this with the following syntax command:

COMPUTE months12 = months-12. EXECUTE.

Last, we computed a quadratic time term (months2) so that we may later test for a curvilinear trend when working in HLM. We ran the following syntax to compute this variable:

COMPUTE mon12sq = months12 ** 2. EXECUTE.

9. For some models, you will need to create indicator variables. See HLM 6 Manual – Chapter 8.

10. After making all modifications and sorting by ID, re-save the complete Level-1 file.
**Example: Stuart (1967) – Complete Merged Level-1 SPSS file. **
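For readers who prefer to prepare the Level-1 file outside SPSS, the three COMPUTE transformations above can be sketched in pandas. Column names mirror the SPSS variables; the sample rows are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({
    "patient": [1, 1, 1],
    "months":  [0, 1, 2],
    "lbs":     [181.4, 178.6, 175.2],
})

# COMPUTE pounds = rnd(lbs).
# (Note: pandas rounds exact halves to even, while SPSS RND rounds
# halves away from zero; the values here avoid that edge case.)
df["pounds"] = df["lbs"].round().astype(int)

# COMPUTE months12 = months-12.  (0 now marks the END of treatment.)
df["months12"] = df["months"] - 12

# COMPUTE mon12sq = months12 ** 2.  (Quadratic time term.)
df["mon12sq"] = df["months12"] ** 2
```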

**Next: Setting up Level-2 data in SPSS **** **

**V.** **Entering and Setting Up Level-2 Data in SPSS **

1. Create SPSS file including any Level-2 data (subject characteristics) available: Make sure to use corresponding subject IDs to those set up in Level-1 file. There should be one row for each subject.

There should be one column for subject ID. Remember to use IDs corresponding to those in the Level-1 file. Also, the variable name, type, etc. should match the Level-1 set-up of the ID variable.

Other columns should include data revealed in the study about each subject. For example, in this study, we had data on Age, Marital Status, Total

Sessions attended, and Total Weight Loss.

2. You may decide later to go back and recenter or redefine these variables for more meaningful HLM interpretation.

For example, in this dataset, the average age of a subject was just above
30. In order to allow for simpler interpretation, we computed Age30 =
age-30, so that Age30=0 would represent a person of about average age.
**Example: Stuart (1967) – Level-2 Dataset in SPSS **

**Getting Data from SPSS into HLM **

In this section, we discuss the simplest models that do not use indicator variables. In a later section, we will consider other models for the covariance structure.

**VI. Setting up MDM file: (***Note: For HLM versions 5 and below, create an SSM file; *
*for versions 6 and higher, create an MDM file.) *

1. Open HLM program. (Make sure all related SPSS files are saved and closed.)

2. Select File → Make new MDM file → Stat package input

**Example: Stuart (1967) – Setting up new MDM. **

3. On next window, leave HLM2 bubble selected and click OK.
**Example: Stuart (1967) – Setting up new MDM. **

4. Label MDM file:

At top right of Make MDM screen, enter MDM file name, making sure to end
in *.mdm*. Example: “*stuart1967.mdm*”

Make sure that Input File Type indicates SPSS/Windows.

5. Specify structure of data:

In this case, our data were nested within patients, so under *Nesting of input data*

we selected *measures within persons.*

6. Specify Level-1 data:

Under Level-1 Specification, click on *Browse* and browse to saved Level-1

SPSS file (the merged one). Click *Open*.

Once your Level-1 file has been identified, click on *Choose Variables.*

Check off your subject ID variable as *ID*.

Check off all other wanted variables as *in MDM.*

Click OK.

7. Specify Level-2 data:

Under Level-2 Specification, click on *Browse* and browse to saved Level-2

SPSS file. Click Open.

Once your Level-2 file has been identified, click on *Choose Variables.*

Check off your subject ID variable as *ID*.

Check off all other wanted variables as *in MDM.*

Click OK.

**Example: Stuart (1967) – Choosing variables for Level-2 data. **

8. Save Response File:

On top left of Make MDM screen, click *Save mdmt file*.

Name file and click *Save*.

9. Make MDM:

On bottom of screen, click on *Make MDM*. A black screen will appear and then close.

10. Check Stats:

On bottom of screen, click *Check Stats.* Examine descriptive statistics as a preliminary check on data.

11. Done:

Click on *Done*.

**Running Multilevel Models in HLM (Linear and Quadratic) **

**VII. Setting up the model:**

As evident from the graphs, each person lost weight at a fairly steady rate. We first fit a straight line for each person, allowing the slopes and intercepts to vary across people. Later, we test whether a curve would better describe the loss of weight over time.

**LINEAR MODEL –**

With MDM file (just created) open in HLM,

1. Choose outcome variable:

With Level-1 menu selected, click on *POUNDS* and then *Outcome variable* to

specify weight as outcome measure.

** Example: Stuart (1967) – Setting up models in HLM **

2. Choose predictor variables:

Identify which Level-1 predictor variables you want in the model. (Often, the only such predictor variable will be a time-related variable.)

Click on *MONTHS12* (or whichever variables you want in the Level-1 model) and add the variable uncentered.

** Example: Stuart (1967) – Setting up models in HLM **

3. Activate Error terms:

Make sure to activate relevant error terms (depending on model) in each Level-2 equation by clicking on the error terms individually (r0 is included by default; others must be selected).

In this case, we activated all Level-2 error terms.
** Example: Stuart (1967) – Setting up models in HLM **

4. Title output and graphing files:
Click on *Outcome*

Fill in *Title* (this is the title that will appear printed at the top of the output text

file).

Fill in *Output File Name* and location (this is the name and location where the

output file will be saved); and *Graph File Name* and location (this is the name

and location where the graph file will be saved).

Click *OK* to save these specifications and exit this screen.

** Example: Stuart (1967) – Setting up models in HLM **

5. Exploratory Analysis:

Select *Other Settings* → *Exploratory Analysis (Level-2)*

Click on each Level-2 variable that you want to include in the exploratory
analysis and click *add*. (In this case, we selected age30, marital status, and

total sessions.):

** Example: Stuart (1967) – Setting up Exploratory Analysis. **

Click on *Return to Model Mode* at top right of screen.

6. Run the analysis

At the top of the screen, click on *Run Analysis.*

On the pop-up screen, click on *Run model shown.*

A black screen will appear, and then close.

7. View Output:

Select *File* → *View Output*

Output text file will open in Notepad.

*Note: You may also open the output file directly by browsing to its saved *
*location (specified in Outcome menu) from outside HLM. *

**QUADRATIC MODEL**

The quadratic model was set up just like the linear model EXCEPT for the following:
When defining the variables in the model, we also included *MON12SQ* (the quadratic

term) in the Level-1 equation.

In the Exploratory Analysis, we requested the same Level-2 variables to be explored in each of the equations, now also including the quadratic term equation.

File names and titles were changed to identify this as the quadratic model.


**Creating Graphs of the Data and Models in HLM **

**VIII. Line Plots of the Data:**

1. After creating MDM file, click *File* → *Graph Data* → *line plots, scatter plots*

**Example: Stuart (1967) - Creating Line Graph of Data in HLM **

2. Choose X and Y variables:

From the drop-down menu, choose the X variable for your data graph. This should be the time-related variable. In this example, our X variable is

*MONTHS12*.

From the drop-down menu, choose the Y variable for your data graph. This
should be the dependent variable. In this example, our Y variable is *POUNDS*.

3. Select number of groups to display in graph:

From the drop-down menu at the top-right of the window, select the number
of groups to display. In this example, we are actually selecting the number of
individuals for whom the graph will display nested measurements. Choose *All *
*groups (n=8)*.

**Example: Stuart (1967) - Creating Line Graph of Data in HLM **

4. Select Type of Plot:

Under *Type of plot*, select *Line plot* and *Straight line*.

Under *Pagination* at bottom-right of screen, select *All groups on same graph.*

**Example: Stuart (1967) - Creating Line Graph of Data in HLM **

5. Click *OK* to make line plot of data.

**Example: Stuart (1967) - Creating Line Graph of Data in HLM **

**IX. Line Plots of the Level-1 Model(s): **
**LINEAR MODEL GRAPHING**

1. After running the linear model in HLM 6, Click *File* → *Graph Equations* → *Level-1 equation graphing*

**Example: Stuart (1967) – Creating Line Graph of Linear Model **

2. Select X focus variable:

From the drop-down menu, select the X focus variable for linear model graph. In this example, we chose MONTHS12.

**Example: Stuart (1967) – Creating Line Graph of Linear Model **

3. Select number of groups to display in graph:

From the drop-down menu, select the number of groups to display. Choose

*All groups (n=8)*.

4. Click *OK* to get line graph of the linear prediction model. If the linear model is

right, this describes the weight loss trajectory for each of the eight subjects.
**Example: Stuart (1967) – Creating Line Graph of Linear Model **

**QUADRATIC MODEL GRAPHING**

1. After running the quadratic model, Click *File* → *Graph Equations* → *Level-1 equation graphing*

2. Select X focus variable:

From the drop-down menu, select the original X variable. (This will be further
defined in a later step.) In this example, we chose *MONTHS12*.

**Example: Stuart (1967) – Creating Line Graph of Quadratic Model **

3. Select number of groups to display in graph:

From the drop-down menu, select the number of groups to display. Choose

*All groups (n=8)*.

**Example: Stuart (1967) – Creating Line Graph of Quadratic Model **

4. Specify relationship between original time variable (*MONTHS12*) and

transformed/ quadratic time variable (*MON12SQ*).

Under *Categories/transforms/interaction*, click *1* and *power of x/z* to define

quadratic relationship.

Choose transformed variable (in this case, *MON12SQ*) and define in terms of

original variable (here, *MONTHS12* to the power of 2).

Click *OK.*

**Example: Stuart (1967) – Creating Line Graph of Quadratic Model **

5. Click *OK* to get line graph of the quadratic prediction model. If the quadratic

model is right, this describes the weight loss trajectory for each of the eight subjects.

**Interpreting HLM Output**

Note on typographic conventions
Different fonts indicate different sources of information presented:

Where we present our own interpretation and discussion, we use the Times New Roman font, as seen here.

Where we present output from HLM, we use the Lucida Console font, as used in the HLM Output text files opened in Notepad, and as seen here.

The Stuart (1967) study included data on eight subjects undergoing a weight loss program. Patients were weighed each month, and weight in pounds was recorded. Additional data were available on a few patient characteristics (e.g., age, marital status, total sessions attended). These variables had not been explored in the original study as potential explanatory factors in weight-loss variation.

Hierarchical linear modeling (HLM) was utilized to: (1) model the change in weight for each person, and (2) combine results of all women in the study so that we may examine trends across the study and between patients. Multiple observations on each individual (n=13 observations throughout the one-year treatment) were treated as nested within the patient.

(We focus on statistical analysis here, but note that any inference about causal effect in this study requires strong assumptions. All patients received the same treatment, and there was no period to collect baseline data. Presumably these patients had stable weight for some long period of time before beginning treatment. Another implicit assumption is that most or all of the weight loss observed was due to treatment, and not to a “placebo” or Hawthorne effect, nor to natural changes in body chemistry.)

A line graph, produced in SPSS, plotting weight in pounds by month of treatment for each patient is presented below. Each line represents the weight loss trend of one patient in the study over the 12-month treatment. The graph suggests that weight loss trends may not be uniform across patients (i.e., lines are not quite parallel). Hierarchical linear modeling (HLM) allows us to examine the significance of patient characteristics that may account for variations in weight loss slopes. As well, the line graph suggests that the line of best fit may not simply be linear but rather include a quadratic term to account for a slight curvature in the data.

**Figure 1. Stuart (1967) – Line graph of weight loss by patient. **

[Line graph: “Weight Loss by Patient (Stuart, 1967)” – POUNDS (120 to 240) plotted against MONTHS12 (-12 to 0), one line per PATIENT (1 through 8).]
**LINEAR MODEL – **

We initially chose the Stuart (1967) study to serve as a simple example of how to use HLM to analyze single-subject studies. Though we later realized these data would not produce such a simple HLM interpretation (e.g., the need to include a quadratic term), we decided to discuss the simpler linear model as an introduction to the more complex model to follow.

After setting up the MDM file, we identified POUNDS as the outcome variable and directed HLM to include MONTHS12 (computed previously in SPSS) in the model (uncentered). This resulted in a test of the model(s) displayed below. (These equations are from the HLM output and omit subscripts for observations and individuals.)

Summary of the model specified (in equation format)

Level-1 Model

    Y = P0 + P1*(MONTHS12) + E

Level-2 Model

    P0 = B00 + R0
    P1 = B10 + R1

The Level-1 equation above states that POUNDS (the weight for a patient at a particular time) is the sum of 3 terms: weight at the intercept (in this case, when MONTHS12=0, this is the ending weight), plus a term accounting for the rate of change in weight with time (MONTHS12), plus an error term.

This simple linear model does not include any Level-2 predictors (patient characteristics). The Level-2 equations model the intercept and slope as:

P0 = The average ending weight for all patients (B00), plus an error term to allow each

patient to vary from this grand mean (R0). P0 is the intercept of the regression line

predicting weight from time.

P1 = The average rate of change in weight per month (MONTHS12) for the 8 participants

(B10), plus an error term to allow each patient to vary from this grand mean effect
(R1). *Note: Remember that MONTHS12 was recoded so that 0=ending weight and *
*-12=starting weight. *

The following estimates were produced by HLM for this model:

Final estimation of fixed effects:

                                     Standard             Approx.
Fixed Effect          Coefficient    Error      T-ratio   d.f.    P-value

For INTRCPT1, P0
  INTRCPT2, B00       156.439560     5.053645    30.956      7     0.000
For MONTHS12 slope, P1
  INTRCPT2, B10        -3.078984     0.233772   -13.171      7     0.000

The outcome variable is POUNDS

When MONTHS12=0 (end of treatment), the overall average weight for all patients is 156.4396 (B00). This is the average ending weight for all patients.

The average rate of change in weight per month (MONTHS12) is -3.0790 (B10), meaning that for each month in treatment (1-unit increase in MONTHS12), weight decreases, on average, by just over 3 pounds. This decrease is statistically significant as the p-value for B10 is less than .05.

Next, we must examine the variances of R0 and R1 (called taus in the HLM model) to determine if this average model fits suitably for all patients in the study.

Final estimation of variance components:

Random Effect            Standard     Variance      df   Chi-square   p-value
                         Deviation    Component

INTRCPT1,       R0       14.23444     202.61939      7    843.67874     0.000
MONTHS12 slope, R1        0.63505       0.40329      7     90.26605     0.000
Level-1,        E         2.48405       6.17052

The between-patient variance on intercepts (again, in this case, the intercept is ending weight since MONTHS12=0 is end of treatment) is estimated to be 202.6194 (tau00), which corresponds to a standard deviation of 14.2344. The p-value shown tests the null hypothesis that ending weights for all patients are similar. The significant p-value (p<.001) indicates there is a significant amount of variation between patients on their ending weights. In other words, the variance is too big to assume it may be due only to sampling error. We should continue to investigate factors that may account for this large between-patient variation in intercepts.

The between-patient variance in slopes (the effect of time, or MONTHS12, on weight) is estimated to be 0.4033 (tau10), which corresponds to a standard deviation of 0.6351. The p-value shown for this variance component tests the null hypothesis that the effect of time on weight is similar for all patients. The significant p-value here (p<.001) indicates there is a significant amount of variation between patients on this time effect. Significant variance in slopes denotes that the differences among patients in the effect of time on weight may also be further accounted for by additional factors.

Approximately 95 percent of patients from this population will have slopes in the range: -3.08 ± 2(0.64) ≈ -3.08 ± 1.28 ≈ (-4.36, -1.80)

That is, the rate of weight loss per month will vary between a little less than 2 pounds to a little over 4 pounds.
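That range is simply the fixed slope plus or minus roughly two between-patient standard deviations. With the estimates printed above (B10 = -3.078984, tau11 = 0.40329), the arithmetic is:

```python
import math

b10 = -3.078984           # average monthly change (fixed effect B10)
tau11 = 0.40329           # between-patient variance in slopes (tau11)
sd = math.sqrt(tau11)     # about 0.635

low, high = b10 - 2 * sd, b10 + 2 * sd   # roughly (-4.35, -1.81)
```

Rounding the standard deviation to 0.64 first, as the text does, gives the slightly wider (-4.36, -1.80).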

Within-patient variation, sigma (the standard deviation of E), is 2.484 (variance 6.171), showing little variation in weight around the “growth” line for each person.

In order to explore the possibility that certain patient characteristics might account for some of the between-patient variation in intercepts and slopes, we conducted an exploratory analysis of the Level-2 variables. The output below displays the results of this exploratory analysis.

Exploratory Analysis: estimated Level-2 coefficients and their standard errors obtained by regressing EB residuals on Level-2 predictors selected for possible inclusion in subsequent HLM runs

Level-1 Coefficient      Potential Level-2 Predictors

                         AGE30     MARITEND    TOTSESS
INTRCPT1, B0
   Coefficient           1.063       6.866      0.156
   Standard Error        0.592      10.830      0.728
   t value               1.796       0.634      0.214

                         AGE30     MARITEND    TOTSESS
MONTHS12, B1
   Coefficient          -0.030      -0.060     -0.039
   Standard Error        0.029       0.482      0.027
   t value              -1.025      -0.124     -1.447

The t-values displayed do not offer much encouragement that Level-2 patient characteristics will account for the between-patient variation in intercepts or slopes.

In fact, further attempts at finding a better fitting model by including various patient characteristics (Level-2 variables) were not successful. In other words, no Level-2 variables in the data set could account for significant variation among patients in either the slopes or intercepts.

Because we could not find a better fit of the linear model, and we had suspected that weight loss might have followed a curvilinear trend, we repeated the HLM analysis this time including a quadratic term for time (MON12SQ) in the Level-1 equation.

**QUADRATIC MODEL **

In order to explore the fit of a curvilinear trend in the data, we started with the same model as the simple linear equations discussed above but included an additional variable in the Level-1 model. We included MON12SQ (previously computed in SPSS), the squared time term, uncentered as well. This resulted in a test of the model displayed below.

Summary of the model specified (in equation format) --- Level-1 Model Y = P0 + P1*(MONTHS12) + P2*(MON12SQ) + E Level-2 Model P0 = B00 + R0 P1 = B10 + R1 P2 = B20 + R2

The Level-1 equation states that a patient’s weight (*POUNDS*) is the sum of 4 quantities:

weight at the end of treatment, the rate of weight loss toward the end of treatment
(*MONTHS12*), the rate of change in this slope (*MON12SQ*), and an error term.

The Level-2 equations model the intercepts and slopes (without any patient characteristics) as:

P0 = The average ending weight for all patients (B00), plus an error term to allow each

patient to vary from this grand mean (R0). P0 is the intercept of the regression line

predicting weight from time.

P1= The average rate of change in weight per month (MONTHS12) for all patients (B10)

at the end of the study (near MONTHS12=0), plus an error term to allow each patient to vary from this grand mean effect (R1).

P2 = The average rate of change in slope for all patients (B20), plus an error term to allow

for variation (R2).

Final estimation of fixed effects:

--- Standard Approx.

Fixed Effect Coefficient Error T-ratio d.f. P-value --- For INTRCPT1, P0

INTRCPT2, B00 158.833791 5.321806 29.846 7 0.000 For MONTHS12 slope, P1

INTRCPT2, B10 -1.773039 0.358651 -4.944 7 0.001 For MON12SQ slope, P2

INTRCPT2, B20 0.108829 0.021467 5.070 7 0.001 --- The outcome variable is POUNDS

When we include the quadratic term, at MONTHS12=0 (end of treatment), the overall average weight for all patients is 158.8338 (B00).

The average rate of change in pounds per month near the end of the study (MONTHS12) is -1.7730 (B10); meaning that toward the end of treatment, for each month in the

program (1-unit increase in MONTHS12), weight decreases, on average, just less than 2 pounds. This decrease (effect) is statistically significant, as the p-value for B10 is less than .05.

The average rate of change in this slope is 0.1088 (B20). In other words, the slope (or
effect of time on weight loss) gets about 0.11 *less* steep per month. Patients lose the most

weight per month towards the beginning of treatment, but this effect flattens out as

treatment continues. The significant p-value (.001) indicates that this quadratic term adds an important piece to the prediction: There is, in fact, a curvilinear trend to be accounted for.

Next, we must examine the taus to determine if this average model fits suitably for all patients in the study.

Final estimation of variance components:

--- Random Effect Standard Variance df Chi-square P-value Deviation Component --- INTRCPT1,R0 14.98754 224.62629 7 814.54023 0.000 MONTHS12 slope,R1 0.85863 0.73725 7 23.99148 0.001 MON12SQ slope, R2 0.04247 0.00180 7 12.85100 0.075 Level-1, E 1.94136 3.76889 --- The between-patient variance on intercepts (ending weight) is estimated to be 224.6263 (tau00), which corresponds to a standard deviation of 14.9875.

The between-patient variance on slopes (the effect of time, or MONTHS12, on weight) is estimated to be 0.7373 (tau11), which corresponds to a standard deviation of 0.8586. The significant p-value here (p=.001) indicates there is a statistically significant amount of

variation between patients on this time effect. In other words, at the end of 12 months, some are losing weight faster than others: approximately 95% have slopes between -1.77±1.96*(.86) 1.77±1.72 (-3.49, -.05). Some are losing weight as fast as almost 3.5 pounds per month and others are losing almost nothing.

≈ ≈

The between-patient variance on change in slopes (how much the rate of change of slopes varies, MON12SQ) is estimated to be 0.0018 (tau22). This is NOT statistically

significant (p>.05), indicating we may not reject the null hypothesis that this curvilinear trend is the same across patients.

Within-patient standard deviation, or sigma (σ), is 1.9413, slightly smaller than before, showing that we have accounted for slightly more variation in weight.

In order to explore the possibility that certain patient characteristics might account for some of the significant between-patient variation in intercept (P0) and slope (P1), we

conducted an exploratory analysis of the potential contributions of Level-2 variables. The output below displays the results of this exploratory analysis.

Exploratory Analysis: estimated Level-2 coefficients and their standard errors obtained by regressing EB residuals on Level-2 predictors selected for possible inclusion in subsequent HLM runs --- Level-1 Coefficient Potential Level-2 Predictors

--- AGE30 MARITEND TOTSESS

INTRCPT1,B0

Coefficient 1.121 7.338 0.111 Standard Error 0.623 11.400 0.769 t value 1.799 0.644 0.144 AGE30 MARITEND TOTSESS MONTHS12,B1

Coefficient -0.001 0.197 -0.060 Standard Error 0.042 0.642 0.035 t value -0.019 0.307 -1.731

---Once again, the t-values displayed do not offer much encouragement that Level-2 patient characteristics will contribute anything significant to the prediction model.

And again, further attempts at finding a better fitting quadratic model by including

various patient characteristics (Level-2 variables) were not successful. In other words, no Level-2 variables added anything significant to the prediction.

**SUMMARY **

In the end, the best fitting model we could find included a quadratic term at Level-1 but no Level-2 predictors. This model is expressed by the following:

Level-1 Model Y = P0 + P1*(MONTHS12) + P2*(MON12SQ) Level-2 Model P0 = B00 + R0 P1 = B10 + R1 P2 = B20 + R2

with parameters estimated as:

Coefficient (and p-value) Variance component (and p-value)

Intercept B00 = 158.8338 Tau00 = 224.6465, p<.001** Slope (MONTHS12) B10 = -1.7730, p = .001** Tau11 = 0.7373, p = .001** Quad. Term

(MON12SQ)

B20 = 0.1088, p = .001** Tau22 = 0.0018, p = *ns*

The model for the average person (i.e., without error terms) is: Yij = 158.8338 – 1.7730*(MONTHS12) + 0.1088*(MON12SQ)

Theoretically, it makes sense that patients in this study would lose weight at a faster rate at the beginning of treatment (when they were heavier) and at a slower (or flatter) rate towards the end of the one-year treatment (when they were lighter).

Had we not explored the possibility of a quadratic term in the model, we would have instead used the average prediction equation, Yij = 156.4396 – 3.0790*(MONTHS12),

which assumes that weight loss (slope) was constant throughout treatment.

We can further verify the fit of the quadratic model over that of the linear model by visually examining the plots below.

Figure 2A. Actual data.
134.8
157.7
180.5
203.3
226.2
**P**
**O**
**U**
**NDS**
-12.60 -9.30 -6.00 -2.70 0.60
**MONTHS12**

Figure 2C. Quadratic Model Prediction

Figure 2B. Linear Model Prediction.

-12.00 -9.00 -6.00 -3.00 0
137.8
157.2
176.7
196.2
215.7
**MONTHS12**
**P**
**O**
**UNDS**
Yij = 156.4396 – 3.0790*(MONTHS12)
Yij = 158.8338 – 1.7730*(MONTHS12) + 0.1088*(MON12SQ)
-12.00 -9.00 -6.00 -3.00 0
138.7
158.8
179.0
199.1
219.2
**MONTHS12**
**P**
**O**
**UND**
**S**

We can see from these plots that the data (Figure 2A in the upper left-hand corner) seem slightly better fit by the quadratic model’s prediction beneath it (Figure 2C) than by the linear model’s prediction beside it (Figure 2B). This visual comparison gives us additional corroboration on selecting the quadratic model as the best fitting model.

**SECTION III: Two-Phase Designs, Outcome = Rate **

An additional published study was selected to serve as an example throughout this section of the handbook. This example is used to: (1) show how to analyze data from a two-phase (AB) study; and, (2) illustrate ways of dealing with a count as a dependent variable and related issues that may arise during analysis and interpretation.

**Dicarlo, C.F. & Reid, D.H. (2004). Increasing pretend toy play of **
**toddlers with disabilities in an inclusive setting. Journal of Applied ****Behavior Analysis, 37(2), 197-207. **

In this study, researchers observed the play behavior of five toddlers with disabilities. Observations took place in an inclusive classroom, over approximately forty 10-minute sessions, where the count of independent pretend toy play was tallied as the target behavior. The dependent variable in this dataset is a count, which must be

accommodated in the analyses. Such accommodations will be discussed below. There were two phases in this study. For the first 16 to 28 sessions (depending on the subject), children were observed without intervention (baseline phase). For the remaining sessions, children were prompted and praised for independent pretend-play actions. This was the responsive teaching program, or treatment phase. Data must be coded so that count variations within and across the two phases can be examined. Phase coding will be discussed below.

A series of line graphs, scanned and pasted from the original publication, plotting the count of play actions (Y) by session (X) for each subject, are presented below. For each, the data points to the left of the vertical line indicate observations made during the baseline phase and the points to the right of the vertical line indicate observations made during the treatment phase.

**Figure III.1: DiCarlo & Reid (2004). Count of play actions by session and phase for **
subjects 1-5.

Because the dependent variable is a count (how many independent pretend toy play actions were observed in each interval), we used a Poisson distribution when analyzing the data (instead of a normal distribution). The Poisson is often used to model the

number of events in a specific time period. We can see in the graphs above that across all subjects, in many sessions no pretend play actions were observed. In using a Poisson model, HLM will estimate the rate of behavior on a log scale; the log of 0 is negative infinity. This dependent variable zero trend, especially evident in the baseline phase, may then become a problem during analysis. More specific information about this problem and some potential ways of resolving it are discussed below.

The graphs also suggest that changes or trends in count over sessions may not be uniform across students and that the treatment effect (or the change in intercept from baseline to treatment phase) may vary across students. Particularly, it looks like subject 3 (“Kirk”, the graph in the lower left-hand corner of the image) may not follow the same pattern as the other children. We will examine this inconsistency via HLM analyses by creating a dummy variable for this subject, and entering that dummy variable into the equation for treatment effect. We will examine whether or not such exploration is warranted and how it might be performed.

Additional (Level-2) data were available on subjects’ chronological age and level of cognitive functioning. These variables had not been explored as potential explanatory factors in play action variation.

We aimed to use hierarchical linear modeling (HLM) to: (1) model the change in the count of play actions for each child, (2) combine results of all students in the study so that we may examine trends across the study and between students, and, (3) model the change

in play action counts between phases. Multiple observations on each individual were treated as nested within the subject.

Additionally, hierarchical linear modeling (HLM) will allow us to examine the

significance of student characteristics (including a dummy variable indicating whether the child was Kirk or not) that may account for variations in intercepts and slopes.

In order to perform such analyses and to simplify interpretation, several variables had to be recoded and/or created anew. Level-1 variable recodes and calculations include:

Phase was coded as *0* for baseline and *1* for treatment. (PHASE)

Session was recentered so that *0* represented the session right before the phase

change. (SESSIONC)

A variable for the session-by-phase interaction was computed by multiplying the 2 previous variables. (SESSxPHA = SESSIONC * PHASE)

Level-2 variables also needed to be recoded and/or created:

Cognitive age was centered around its approximate mean, so that a cognitive age
of *0* indicated a child of about average cognitive functioning for the sample.

(COGAGE15)

Chronological age had to be extracted from the text of the study, as it was not overtly offered as data. (CHRONAGE)

A dummy variable for Kirk (subject 3) was created so that subject 3 had a *1* on

this variable, and the remaining subjects had a *0*. (KIRKDUM)

Therefore, a *0* on all Level-1 variables (session, phase, session-by-phase interaction)

denotes the final baseline session. Intercepts for the computed models are then the predicted counts at the phase change. The full model (without any Level-2 predictors) is then:

Level-1:

Log (FREQRNDij) = P0 + P1*(SESSIONC) + P2*(PHASE) + P3*(SESSxPHA)

Level-2:

P0 = B00 + R0 P1 = B10 + R1 P2 = B20 + R2 P3 = B30 + R3

Details about how to create and/or transform the Level-1 and Level-2 variables are described below.

**Getting Data from Print into Ungraph **

**I. Scanning data to be read into Ungraph (via flatbed scanner):**

Graphs from Dicarlo & Reid (2004) were scanned and saved in the same manner as previously explained.

The diagram below displays the Dicarlo & Reid (2004) graphs as published.
* Example: *Scanned Dicarlo & Reid (2004) graphs for each student.

*Note: For each graph, you must decide if using Ungraph is worth your trouble. In this *
*case, reading and entering data manually would probably have taken us the same amount *
*of time as it did to use Ungraph to read in data (clicking on individual points, creating *
*individual data files, and then merging data files back together). Regardless, we review *
*this procedure, now with a scatterplot, for your consideration. *

**II. Defining graph space in Ungraph: **
Graph space was defined just as before:

Open the scanned image(s) in Ungraph as before (File Æ Open, scroll and zoom, etc.).

Define the Coordinate System as before (Edit Æ Define Coordinate System, etc.). Note: For this dataset, we defined the Y-axis scale as Count of Pretend Play Actions

per Session, *not* average actions per minute, as is used in the original graph.

Therefore, we multiplied the original Y-scale by 10 (as there were 10 minutes in each session) as we defined the coordinate system.

For example, when we clicked on the original Y=1.4 average actions/minute, we told Ungraph that it was actually 14 actions/session.

**III. Reading in & exporting data: **

Data from Dicarlo & Reid (2004) were read in differently than the Stuart (1967) data.

Instead of digitizing it as a line graph, we digitized it as a scatterplot.

(As well, in this case, data must be modified (e.g., rounded) in SPSS later on instead of immediately in Ungraph.)

Reading data from graph:

Select Digitize Æ New Scatter

**Carefully click on each data point in the graph to read in data **
Export Data just as before. (Select Data Æ Export) saving each subject’s data
separately.

Repeat EACH of these steps in sections II and III (from Defining Graph Space to Reading in and Exporting Data) for each Level-1 (subject) graph available. Save each of the Level-1 files as separate .txt files labeled by case name or ID

**Importing and Setting Up Data in SPSS **

**IV. Importing and Setting Up Level-1 Data in SPSS:**

Data is imported and set up in SPSS just as before EXCEPT where variable names/types differ:

Open SPSS program.

Read each text (.txt) file, one at a time, into SPSS as before, modifying variable titles as necessary.

Dataset should now be displayed in Data View screen. (Title/label variables as necessary in Variable View.)

Compute subject’s ID for data (COMPUTE subject=1, etc. in Syntax file). Save individual subject SPSS data files.

*Repeat steps 1 through 5* above for each subject in that study (for each of the text files

created from each of the 5 graphs scanned) creating separate Level-1 files for each subject.

Now that you have uniform SPSS files for each subject, you must merge them. Merge data files for each subject into one Level-1 file. (Select Data Æ Add cases, etc.)

Sort by subject ID.

In the merged file, you may wish to make additional modifications to the variables. As discussed above, for this dataset, we decided to make

modifications/transformations to the Level-1 data file with the syntax commands below:

First, we rounded SESSION (the X or time variable) to the nearest whole number, with the following syntax command:

**COMPUTE sessrnd = rnd(session). **
**EXECUTE. **

Then, for more meaningful HLM interpretation, we decided to transform SESSRND so that 0 represented the final session of the baseline phase. We did this by looking at the original graphs and noting when treatment started for each individual subject.

We then wrote and ran the following syntax command (the value subtracted from each subject’s SESSRND is the last session before the vertical line in the graph, indicating the phase change):

**If (subject=2) sessionc = sessrnd-19. **
**If (subject=3) sessionc = sessrnd-24. **
**If (subject=4) sessionc = sessrnd-27. **
**If (subject=5) sessionc = sessrnd-27. **
**EXECUTE. **

Next, we rounded FREQPLAY (the Y or dependent variable) to the nearest whole number, with the following syntax command:

**COMPUTE freqrnd = rnd(freqplay). **
**EXECUTE. **

*Note: We later checked these rounded counts against the data points on the *
*original graph to ensure accuracy. *

Fourth, we created a variable to indicate in which PHASE (baseline or
treatment) a measurement was taken by running the following syntax:
**COMPUTE phase = 0. **

**If (sessionc>0) phase = 1. **
**EXECUTE.**

Finally, we created an interaction variable so that we could later examine whether or not there were any significant interactions between session (SESSIONC) and phase (i.e., did slopes differ by phase):

**COMPUTE sessXpha = sessionc*phase. **
**EXECUTE. **

After making all modifications and making sure to sort by subject ID, re-save complete Level-1 file and close.

**V. Entering and Setting Up Level-2 Data in SPSS **

Create an SPSS file including any Level-2 data (subject characteristics) available, as before.

In this study, we had data on chronological age (CHRONAGE) and cognitive functioning age (COGAGE).

As well, once we began to examine the data more rigorously, we realized that subject #3 (Kirk) seemed to differ from all other subjects (see graphs for visual depiction). So, we decided to create a dummy variable to test this hypothesis. KIRKDUM was calculated as a 1 for Kirk and a 0 for all other subjects.

Transform any Level-2 variables as needed.

As discussed above, we centered or redefined COGAGE for more meaningful HLM interpretation. The average cognitive functioning age for this sample was around 15 months. In order to allow for simpler

interpretation, we computed cogage15 = cogage-15, so that cogage15=0 would represent a child of about average cognitive age.

Our Level-2 data file then was left with 4 working variables: SUBJECT, CHRONAGE, COGAGE15, and KIRKDUM.

**Setting Up and Running Models in HLM (Poisson) **

**VI. Setting up MDM file in HLM6:**The MDM file for the Dicarlo & Reid (2004) data was set up just as the Stuart (1967) MDM.

Open HLM program. (making sure all related SPSS files are saved and closed)
Select *File* Æ*Make new MDM file* Æ*Stat package input*

On the next window, leave HLM2 bubble selected and click *OK*.

Label MDM file (entering file name ending in .mdm, and indicate Input File Type as SPSS/Windows)

Specify structure of data (again, this data is nested within subjects so under *Nesting of *
*input data* we selected *measures within persons*)

Specify Level-1 data (browsing and opening Level-1 file and indicating subject ID and
all other relevant Level-1 variables – FREQRND, SESSIONC, PHASE, SESSxPHA)
Specify Level-2 data (browsing and opening Level-2 file and indicating subject ID and
all other relevant Level-2 variables – CHRONAGE, COGAGE15, and KIRKDUM)
Save Response File (clicking on *Save mdmt file, *naming and saving file*)*

Make MDM (clicking on *Make MDM*)

Check Stats (clicking *Check Stats*)

Click on *Done*.

**VII. Setting up the model: **

Because the dependent variable in this dataset is a count variable, there are several differences in how the HLM analyses were set up (e.g., estimation settings,

distribution of outcome variable, etc) in comparison to the Stuart (1967) data. With MDM file (just created) open in HLM,

Choose outcome variable:

With Level-1 menu selected, click on FREQRND and then *Outcome *
*variable* to specify the rounded count as the outcome measure.

Identify which Level-1 predictor variables you want in the model.
Click on SESSIONC and then *add variable uncentered*.

Repeat for PHASE and SESSxPHA.

Activate Error terms: Make sure to activate relevant error terms (depending on

model) in each Level-2 equation by clicking on the error terms individually (*r*_{1},*r*_{2},...).
In this case, we activated all Level-2 error terms.

**Modify model set up to accommodate dependent variable type: **

Because the dependent variable in this study is a count variable, we must indicate to HLM that this variable has a Poisson distribution with constant exposure1.

Select *Basic Settings *and choose *Poisson distribution (constant exposure)* under

Distribution of Outcome Variable

Title output and graphing files, while you are in *Basic Settings*:

Fill in *Title* (this is the title that will appear printed at the top of the output text

file).

Fill in *Output File Name* and location (this is the name and location where the

output file will be saved); and *Graph File Name* and location (this is the name and

location where the graph file will be saved).
Click *OK* to exit this screen.

1

*Exposure* is a term from survival analysis. In this context, it is the amount of time for each observed

session. Each was the same (10 minutes), so no special technique is needed; each count corresponds to the same rate for each subject.

**Example: Dicarlo & Reid (2004) – Setting up models in HLM **

**Estimation Settings: **

Select *Other Settings *Æ*Estimation Settings*

Under Type of Likelihood, select *Full maximum likelihood*

Click *OK* to exit this screen*.*

*Note: For a small number of subjects (as here, with n=5), Restricted maximum *
*likelihood (RML) is more accurate than Full maximum likelihood (FML) *
*(Raudenbush & Bryk, 2002, p. 53). But in order to compare model fits using *
*likelihoods, we must use FML (Raudenbush & Bryk, 2002, p. 60). *

Exploratory Analysis:

Select *Other Settings* Æ*Exploratory Analysis (Level-2)*

Click on each Level-2 variable that you want to include in the exploratory
analysis and click *add*. (In this case, we selected CHRONAGE, COGAGE15,

and KIRKDUM.).

Click on *Return to Model Mode* at top right of screen.

Run the analysis

At the top of the screen, click on *Run Analysis.*

On the pop-up screen, click on *Run model shown.*

View Output:

**Interpreting HLM Output **

Note on typographic conventions
Different fonts indicate different sources of information presented:

Where we present our own interpretation and discussion, we use the Times New Roman font, as seen here.

Where we present output from HLM, we use the Lucinda

Console font, as used in the HLM Output text files opened in Notepad, and as seen here.

We first look at estimates produced from the analysis of the simple model presented in the preceding pages, including all session slope terms but excluding all Level-2

predictors. We focus on the estimates in the output section labeled “Unit Specific

Model” to examine how changes in subject characteristics can affect a subject’s expected outcome.

**1. Simple Non-Linear Model with Slopes **

After setting up the MDM file, we identified FREQRND as the outcome variable and directed HLM to include SESSIONC and PHASE (computed previously in SPSS) in the model. We also directed HLM to model the errors as a Poisson distribution, since the dependent variable is a count. In using a Poisson distribution, HLM estimates produced are on a log scale. (See Introduction for more information about this decision.) This resulted in a test of the model(s) displayed below. These equations are from the HLM output and omit subscripts for observations and individuals.

Summary of the model specified (in equation format) --- Level-1 Model

E(Y|B) = L

V(Y|B) = L

log[L] = P0 + P1*(SESSIONC) + P2*(PHASE) + P3*(SESSXPHA) Level-2 Model

P0 = B00 + R0 P1 = B10 + R1 P2 = B20 + R2 P3 = B30 + R3

The Level-1 equation above states that the logarithm of FREQRND (or the rounded expected count of independent play actions) is the sum of 4 parts: the count at the intercept (in this case, the final baseline session), plus a term accounting for the rate of change in count with time (SESSIONC), plus a term accounting for the rate of change in count with phase change (PHASE), plus an interaction term allowing the rate of change in count with time to differ across phases (SESSxPHA).

This simple model does not include any Level-2 predictors (student characteristics). The Level-2 equations model the level 1 parameters as:

P0 = The average log count at final baseline session for all subjects (B00), plus an error

term to allow each student to vary from this grand mean (R0).

P1 = The average rate of change in log count per session (SESSIONC) during the baseline

phase, for the 5 participants (B10), plus an error term to allow each student to vary
from this grand mean effect (R1). *Note: Remember that SESSIONC was recoded so *
*that 0=last baseline session *

P2 = The average rate of change in log count as a subject switches from baseline to

treatment phase (PHASE) for all students (B20), plus an error term to allow each student to vary from this grand mean (R2).

P3 = The average change in session effect (i.e., time slope) as a subject switches from

baseline to treatment phase for all students (B30), plus an error term to allow each student to vary from this grand mean (R3).

The following estimates were produced by HLM for this model:

Final estimation of fixed effects: (Unit-specific model)

--- Standard Approx.

Fixed Effect Coefficient Error T-ratio d.f. P-value --- For INTRCPT1,P0

INTRCPT2,B00 -1.383793 0.694566 -1.992 4 0.114 For SESSIONC slope,P1

INTRCPT2,B10 -0.028647 0.036734 -0.780 4 0.479 For PHASE slope,P2

INTRCPT2,B20 2.668051 0.439396 6.072 4 0.000 For SESSXPHA slope,P3

INTRCPT2,B30 0.060713 0.029853 2.034 4 0.109 ---When SESSIONC=0 and PHASE=0 and SESSXPHA=0 (i.e., the final baseline phase), the overall average log count of independent play actions for all students is -1.3839 (B00). [exp(-1.3838) = 0.2506] The average number of observed independent pretend play actions during the final session in phase 1 (baseline) is 0.252.

*Remember that we redefined the scale of the dependent variable when we defined the *
*Y-axis in Ungraph to represent actual counts, which is appropriate for use with a Poisson *
*distribution. Our dependent measure then indicates the number of play actions per *
*10-minute session, not per 10-minute*.

2_{ They p-value for this coefficient indicates whether the estimate of B}

00 (the baseline intercept) is significantly different from 0. This is not a hypothesis test of interest for this study.

The average rate of change in log count per session is -0.0286 (B10). This increase is not significant as the p-value for B10 is greater than .05. Conclusion: Baseline is flat, not changing over time (sessions).

The average rate of change in log count as a student switches from baseline (PHASE=0) to treatment phase (PHASE=1) is 2.6680 (B20). This phase effect is significant as the p-value for B20 is less than .05 (or even .01). [exp(-1.3838 + 2.6680) = exp(1.2842) = 3.6118] The average number of observed independent pretend play actions per session during phase 2 (treatment) is 3.61.

Lastly, the average interaction effect, or change in session effect between phases (B30), is exp(0.0607). This interaction effect is not significant. Therefore, the treatment phase is predicted to be flat (not changing over time) as well.

Because neither of the session slope terms contributed anything significant to prediction, we decided to further simplify the model, re-running it without these terms. This round of analysis is presented next.

**2. Simple Non-Linear Model without Slopes **

Procedures for setting up this model are congruent to the last, except for our deletion of SESSIONC and SESSxPHA from the equation. Output from this analysis is displayed below.

Summary of the model specified (in equation format) --- Level-1 Model E(Y|B) = L V(Y|B) = L log[L] = P0 + P1*(PHASE) Level-2 Model P0 = B00 + R0 P1 = B10 + R1

Without the session slope terms in the equation, interpretation is simplified.

The Level-1 equation above states that the logarithm of FREQRND (or the rounded count of independent play actions) is the sum of 2 parts: the count at the intercept (in this case, the final baseline session), plus a term accounting for the rate of change in count with a phase change.

The Level-2 equations model the level 1 effects as:

P0 = The average log count at final baseline session for all subjects (B00), plus an error

term to allow each student to vary from this grand mean (R0). (Note: Since the slope in the baseline phase was found to be flat, this intercept estimate is the average expected log count for the baseline phase.)

P1 = The average rate of change in log count as a subject switches from baseline

(PHASE=0) to treatment phase (PHASE=1) for all students (B10), plus an error term to allow each student to vary from this grand mean (R1).

The following estimates were produced by HLM for this model:

Final estimation of fixed effects: (Unit-specific model)

--- Standard Approx.

Fixed Effect Coefficient Error T-ratio d.f. P-value --- For INTRCPT1,P0

INTRCPT2,B00 -0.800345 0.604807 -1.323 4 0.256 For PHASE slope,P1

INTRCPT2,B10 2.555173 0.289583 8.824 4 0.000 --- The resulting equation for predicting the average log count of independent play actions is then:

Log(Y) = -0.80 + 2.56*(PHASE)

When PHASE=0 (Baseline), then the PHASE term drops out and the expected log count is: -0.80. Translated to a count scale, the expected average count of play actions per session during the baseline phase is exp(-0.8003) = 0.45.

When PHASE=1 (Treatment), then the expected log count is: (-0.80 + 2.56) or 1.76. Translated to a count scale, the expected average count of play actions per session during the treatment phase is exp(1.76) =5.78.

Estimates of the variance components for this model (tau00 and tau11, the variances of R0 and R1) indicate that there may be significant between-subject variation in estimates of the intercept:

Final estimation of variance components:

--- Random Effect Standard Variance df Chi-square P-value Deviation Component

--- INTRCPT1, R0 1.27206 1.61814 4 27.14321 0.000 PHASE slope, R1 0.44660 0.19945 4 3.87652 >.500 ---