Moderator and Mediator Analysis
Marijtje van Duijn
October 18, 2011 Seminar General Statistics
Overview
• What is moderation and mediation?
• What is their relation to statistical concepts?
• Example(s)
October 18, 2011 Moderator and mediator analysis 3
Mediation and Moderation
[Diagram: the three variables X1, X2 and Y]
Examples
• Y : test score
• X1 : sex, SES, etc.
• X2 : ability (IQ score)
• Y : test score
• X1 : brain volume, SES parents
• X2 : ability
Multiple regression
• Goal: to explain variation in Y using Xs
• Assumptions
– Independent observations
– Normality (of residuals) and constant variance
– Linearity (of relationship Y and Xs)
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + E_i$
Regression Model
[Path diagram: X1 → Y with coefficient β1; X2 → Y with coefficient β2; E is the error term of Y]
Explained variance in regression
[Venn diagram: circles for Y, X1 and X2]
Circles represent variances
X1 and X2 explain different parts of Y and are independent
Multicollinearity
[Venn diagram: circles for Y, X2 and X1]
‘competition’ between variables for explaining Y
Degree depends on correlation between X’s
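The degree of this competition is what the tolerance and VIF columns in SPSS output summarize. A minimal sketch, assuming exactly two predictors (so that tolerance is simply one minus their squared correlation); the function name and the example correlations are illustrative, not from the slides:

```python
def collinearity_stats(r_x1_x2):
    """Tolerance and VIF for a regression with exactly two predictors.

    With two predictors, the R-squared of one predictor regressed on the
    other equals their squared correlation.
    """
    tolerance = 1.0 - r_x1_x2 ** 2
    vif = 1.0 / tolerance
    return tolerance, vif


# Illustrative correlations between X1 and X2 (assumed values).
for r in (0.0, 0.5, 0.9):
    tol, vif = collinearity_stats(r)
    print(f"r = {r:.1f}: tolerance = {tol:.3f}, VIF = {vif:.3f}")
```

The stronger the correlation between the X's, the lower the tolerance and the higher the VIF.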
Association between all (3) variables
• Several correlations to describe association
– Partial correlation: correlation between two variables with the third variable fixed (i.e. corrected for the third variable) (partial in SPSS) ‘unique portion of variance’
– Semi-partial correlation: correlation between two variables with one variable’s association corrected for the third variable (part in SPSS output) ‘extra variance explained’
• Important in regression when variables are entered in a certain order
• Usually the partial correlation is smaller than the uncorrected (‘zero-order’) correlation, and larger than the semi-partial correlation
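For three variables, both corrected correlations follow directly from the three zero-order correlations. The sketch below uses the standard formulas; the function names are made up for illustration, and the input values are the correlations among read, buy and enjoy reported in the reading example in this deck:

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y with z partialled out of both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def semipartial_corr(r_xy, r_xz, r_yz):
    """Correlation of y with the part of x that is not shared with z."""
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_xz ** 2)


# x = buy, y = read, z = enjoy (zero-order correlations from the example).
r_buy_read, r_buy_enjoy, r_read_enjoy = 0.747, 0.644, 0.732

partial = partial_corr(r_buy_read, r_buy_enjoy, r_read_enjoy)
part = semipartial_corr(r_buy_read, r_buy_enjoy, r_read_enjoy)
print(f"zero-order = {r_buy_read:.3f}, partial = {partial:.3f}, part = {part:.3f}")
```

Up to rounding, the two results should agree with the Partial and Part columns for buy in the Step 3 coefficient table, and they show the usual ordering: zero-order > partial > semi-partial.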
Multicollinearity/Mediation
[Diagram: X1, X2 and Y]
Mediation as a special case of multicollinearity and/or model selection
• Causal model defines direction of arrows
• X2 is the mediator (M)
– Also: intervening or process variable
– Or: indirect causal relationship
• Relations between all variables are assumed to be positive
• Question is whether the direct effect between X1 and Y disappears when M is added to the regression equation
Mediation
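(Baron & Kenny, 1986; http://davidakenny.net/cm/mediate.htm)
[Path diagram: X → Y with total effect c; X → M with coefficient a; M → Y with coefficient b; c’ is the direct effect of X on Y once M is in the model]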
Mediation as ‘prescribed’ by Baron and Kenny (1986)
• Estimate regression of Y on only X1
– Estimated parameter c
• Estimate regression of M on X1
– Estimated parameter a
• Estimate effect of M on Y, together with X1
– Estimated parameters b and c’
• Complete mediation: c’=0
• Partial mediation: c’< c (can be tested)
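The three regressions can be sketched in plain Python. This is only an illustration under stated assumptions: the data are simulated (the generating coefficients 0.6, 0.3 and 0.5 are arbitrary), and the helper functions are hypothetical closed-form OLS solutions for one and two predictors, not SPSS output:

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def slope(x, y):
    """OLS slope of y on a single predictor x (intercept included)."""
    return cov(x, y) / cov(x, x)


def slopes2(x, m, y):
    """OLS slopes of y on two predictors x and m (intercept included)."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    b_x = (sxy * smm - smy * sxm) / det
    b_m = (smy * sxx - sxy * sxm) / det
    return b_x, b_m


# Simulated data with a partially mediated effect (assumed for illustration).
random.seed(1)
n = 200
x = [random.gauss(0, 1) for _ in range(n)]
m = [0.6 * xi + random.gauss(0, 1) for xi in x]
y = [0.3 * xi + 0.5 * mi + random.gauss(0, 1) for xi, mi in zip(x, m)]

c = slope(x, y)                 # regression of Y on X only
a = slope(x, m)                 # regression of M on X
c_prime, b = slopes2(x, m, y)   # regression of Y on X and M together

print(f"c = {c:.3f}, a = {a:.3f}, b = {b:.3f}, c' = {c_prime:.3f}")
print(f"c - c' = {c - c_prime:.3f}, a*b = {a * b:.3f}")
```

With ordinary least squares on the same cases, c - c' and a*b coincide, which is why the amount of mediation can be read from either.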
Testing of change in c
• ‘Amount of mediation’ c-c’
• Theoretically equal to ab (indirect path)
• Standard error of ab is approximately $\sqrt{b^2 s_a^2 + a^2 s_b^2}$ (Sobel test)
see (or use) http://quantpsy.org/sobel/sobel.htm
Note: neither c nor c’ is needed!
• Some eye-balling also possible
• Or: nonparametric tests (based on bootstrapping)
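The Sobel test is easy to compute by hand. A minimal sketch, assuming the first-order standard error given above and a two-sided normal p-value; the function name and the input values are hypothetical, chosen only for illustration:

```python
import math


def sobel_test(a, s_a, b, s_b):
    """Sobel z-test for the indirect effect a*b.

    a, s_a: coefficient and standard error of X in the regression of M on X.
    b, s_b: coefficient and standard error of M in the regression of Y on X and M.
    """
    ab = a * b
    se_ab = math.sqrt(b ** 2 * s_a ** 2 + a ** 2 * s_b ** 2)
    z = ab / se_ab
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value, standard normal
    return ab, se_ab, z, p


# Hypothetical estimates, not taken from the slides.
ab, se_ab, z, p = sobel_test(a=0.5, s_a=0.1, b=0.4, s_b=0.1)
print(f"ab = {ab:.3f}, se = {se_ab:.4f}, z = {z:.2f}, p = {p:.4f}")
```

Note that b and its standard error come from the regression that also contains X, not from a regression of Y on M alone.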
Example (Miles and Shevlin)
• Y: read, a measure of the number of books that people have read.
• X: enjoy, scale score to measure how much people enjoy reading books
• M: buy, a measure of how many books people have bought in the previous 12 months
Idea: how much people enjoy reading books -> the number of books bought -> the number of books read
But: is the number of books bought a complete mediator?
(People could go to the library or borrow books from friends.)
Descriptive Statistics
        Mean    Std. Deviation   N
read    8.85    3.563            40
buy     15.73   8.165            40
enjoy   9.28    5.354            40
Correlations (Pearson)
        read    buy     enjoy
read    1.000   .747    .732
buy     .747    1.000   .644
enjoy   .732    .644    1.000
Step 1
Coefficients (dependent variable: read)
             B       Std. Error   Beta    t       Sig.
(Constant)   4.331   .785                 5.517   .000
enjoy        .487    .074         .732    6.625   .000
(B and Std. Error: unstandardized coefficients; Beta: standardized coefficient)
Step 2
Coefficients (dependent variable: buy)
             B       Std. Error   Beta    t       Sig.
(Constant)   6.616   2.020                3.274   .002
enjoy        .982    .189         .644    5.190   .000
Step 3
Coefficients (dependent variable: read)
             B       Std. Error   Beta    t       Sig.    Zero-order   Partial   Part
(Constant)   2.973   .765                 3.887   .000
buy          .205    .054         .471    3.786   .001    .747         .528      .360
enjoy        .286    .083         .429    3.452   .001    .732         .494      .328
Examples
• Baron/Kenny + Sobel
– a=.982, sa=.189 (Step 2)
– b=.205, sb=.054 (Step 3)
– ab=.201, sab=√(.205²*.189²+.982²*.054²)=.066
– z-test 3.07; p=.002
• Eye-balling (estimate ± 2 standard errors)
– c=.487 (.074), then roughly .34 < c < .64
– c’=.286 (.083): partial mediation (because .12 < c’ < .45, an interval that excludes 0)
Model choice is important
• Many other patterns of association are possible
– Arrows between X1 and X2 may be reversed: it is not always clear which variable ‘mediates’
– No causal relation, just association
• Explicit assumption of positive associations and ordering of (semi-)partial correlations. Not guaranteed…
Moderation
[Diagram: X1, X2 and Y]
Moderation is ‘interaction’
• The effect of X2 depends on the value of X1 (or vice versa)
– ‘different slopes for different folks’
– can be a way to tackle non-linearity in regression
• Interpretation depends on type of variable
• Interaction usually (but not necessarily) product of X1 and X2
• X1 and X2 may be correlated
• Importance of model formulation
Regression model
• Centering is (usually) advised
– Facilitates interpretation
– Reduces the inevitable multicollinearity incurred with interaction terms
Uncentered: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{1i} \cdot X_{2i} + E_i$
Centered (subscript c): $Y_i = \beta_0 + \beta_1 X_{1c,i} + \beta_2 X_{2c,i} + \beta_3 X_{1c,i} \cdot X_{2c,i} + E_i$
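The effect of centering on collinearity can be checked with a small simulation. A sketch under stated assumptions: two independent, normally distributed predictors with means far from zero (the means, standard deviations and seed are arbitrary choices for illustration), and a hand-written Pearson correlation:

```python
import math
import random


def pearson(u, v):
    """Pearson correlation between two equally long lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def centered(v):
    """Subtract the sample mean from every value."""
    m = sum(v) / len(v)
    return [a - m for a in v]


random.seed(42)
n = 500
x1 = [random.gauss(50, 10) for _ in range(n)]
x2 = [random.gauss(100, 15) for _ in range(n)]

# Product term built from raw and from centered predictors.
raw_product = [a * b for a, b in zip(x1, x2)]
x1c, x2c = centered(x1), centered(x2)
centered_product = [a * b for a, b in zip(x1c, x2c)]

r_raw = pearson(x1, raw_product)
r_centered = pearson(x1c, centered_product)
print(f"corr(X1, X1*X2)     = {r_raw:.3f}")
print(f"corr(X1c, X1c*X2c)  = {r_centered:.3f}")
```

The product of raw scores is strongly correlated with its components, while the product of centered scores is much less so. The interaction coefficient itself is unaffected by centering.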
X1 dichotomous and X2 continuous
• X1 = 0 or X1 = 1
– (e.g. man/woman; control/experimental group)
• Y = β0 + β1X1 + β2X2 + β3X1X2
– X1 = 0: Y = β0 + β2X2
– X1 = 1: Y = (β0 + β1) + (β2 + β3)X2
• so: intercept and regression effect of X2 change
– interaction effect represents the change in the effect of X2, or the difference in the effect of X2 between the two groups
– interpretation of β1: difference between the two groups when X2 = 0
– General formula: Y = (β0 + β1X1) + (β2 + β3X1)X2
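The general formula translates directly into a group-specific intercept and slope. A minimal sketch; the function name and the coefficient values are hypothetical:

```python
def group_line(b0, b1, b2, b3, x1):
    """Intercept and slope of Y on X2 for a given value of X1,
    from Y = (b0 + b1*X1) + (b2 + b3*X1)*X2."""
    intercept = b0 + b1 * x1
    slope = b2 + b3 * x1
    return intercept, slope


# Hypothetical coefficients, for illustration only.
b0, b1, b2, b3 = 10.0, 2.0, 0.5, 0.3

for x1 in (0, 1):
    intercept, slope = group_line(b0, b1, b2, b3, x1)
    print(f"X1 = {x1}: Y = {intercept:.1f} + {slope:.1f} * X2")
```

The difference between the two slopes is exactly b3, the interaction coefficient.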
X1 and X2 dichotomous
• X1 = 0 or X1 = 1 (e.g. man/woman)
• X2 = 0 or X2 = 1 (e.g. control/experimental group)
• Y = β0 + β1X1 + β2X2 + β3X1X2
– X1 = 0, X2 = 0: Y = β0
– X1 = 1, X2 = 0: Y = β0 + β1
– X1 = 0, X2 = 1: Y = β0 + β2
– X1 = 1, X2 = 1: Y = β0 + β1 + β2 + β3
• so: defines four groups with their own mean
• Interaction defines ‘extra’ effect of X1 = 1 and X2 = 1
X1 nominal and X2 continuous
• X1 takes on more than 2 (c) values
– (e.g. age groups, control/exp1/exp2)
• Make dummies, one for each contrast with respect to 1 reference group (e.g. controls)
– Leads to c-1 dichotomous variables
– And also c-1 interaction terms
• Also possible: dummies (indicators) for each group, leaving out the intercept
c=3; group 1 reference group
group   D1   D2
1       0    0
2       1    0
3       0    1
• Y = β0 + β1dD1 + β2dD2 + β2X2 + β3D1X2 + β4D2X2
– group 1: Y = β0 + β2X2
– group 2: Y = (β0 + β1d) + (β2 + β3)X2
– group 3: Y = (β0 + β2d) + (β2 + β4)X2
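The dummy coding and the resulting group-specific lines can be sketched as follows. The function names and the coefficient values in the usage loop are hypothetical; the parameter names mirror the slide's β1d, β2d (dummy effects), β2 (effect of X2) and β3, β4 (interactions):

```python
def dummy_code(group, n_groups=3, reference=1):
    """Indicator variables for every group except the reference group."""
    return [1 if group == g else 0
            for g in range(1, n_groups + 1) if g != reference]


def group_line(group, b0, b1d, b2d, b2, b3, b4):
    """Intercept and slope of Y on X2 within one of the three groups."""
    d1, d2 = dummy_code(group)
    intercept = b0 + b1d * d1 + b2d * d2
    slope = b2 + b3 * d1 + b4 * d2
    return intercept, slope


# Hypothetical coefficient values, for illustration only.
for g in (1, 2, 3):
    intercept, slope = group_line(g, b0=1.0, b1d=2.0, b2d=3.0, b2=4.0, b3=5.0, b4=6.0)
    print(f"group {g}: dummies = {dummy_code(g)}, Y = {intercept} + {slope} * X2")
```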
X1 and X2 continuous
Y = β0 + β1X1 + β2X2 + β3X1X2 ⇔
Y = (β0 + β1X1) + (β2 + β3X1)X2 ⇔
Y = (β0 + β2X2) + (β1 + β3X2)X1
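Centered variables: $X_{1c} = X_1 - \bar{X}_1$, $X_{2c} = X_2 - \bar{X}_2$
$Y = \beta_0^* + \beta_1^* X_{1c} + \beta_2^* X_{2c} + \beta_3 X_{1c} X_{2c}$
$\beta_1 = \beta_1^* - \beta_3 \bar{X}_2$
$\beta_2 = \beta_2^* - \beta_3 \bar{X}_1$
$\beta_0 = \beta_0^* - \beta_1^* \bar{X}_1 - \beta_2^* \bar{X}_2 + \beta_3 \bar{X}_1 \bar{X}_2$
$X_2 = \bar{X}_2$: $Y = \beta_0^* + \beta_1^* X_{1c}$
$X_1 = \bar{X}_1$: $Y = \beta_0^* + \beta_2^* X_{2c}$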
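Centering changes the lower-order coefficients but not the model itself. Writing the centered model as Y = β0* + β1*(X1 - X̄1) + β2*(X2 - X̄2) + β3(X1 - X̄1)(X2 - X̄2) and expanding the products gives the uncentered coefficients β1 = β1* - β3X̄2, β2 = β2* - β3X̄1 and β0 = β0* - β1*X̄1 - β2*X̄2 + β3X̄1X̄2. The sketch below checks this numerically; the function names, coefficients and means are hypothetical:

```python
def uncentered_coefficients(b0s, b1s, b2s, b3, mean1, mean2):
    """Coefficients of the uncentered model implied by the centered one."""
    b1 = b1s - b3 * mean2
    b2 = b2s - b3 * mean1
    b0 = b0s - b1s * mean1 - b2s * mean2 + b3 * mean1 * mean2
    return b0, b1, b2, b3


def predict_centered(b0s, b1s, b2s, b3, mean1, mean2, x1, x2):
    x1c, x2c = x1 - mean1, x2 - mean2
    return b0s + b1s * x1c + b2s * x2c + b3 * x1c * x2c


def predict_uncentered(b0, b1, b2, b3, x1, x2):
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


# Hypothetical centered coefficients and predictor means.
b0s, b1s, b2s, b3 = 5.0, 2.0, 1.0, 0.5
mean1, mean2 = 3.0, 4.0

b0, b1, b2, _ = uncentered_coefficients(b0s, b1s, b2s, b3, mean1, mean2)
print(f"uncentered: b0 = {b0}, b1 = {b1}, b2 = {b2}, b3 = {b3}")

# Both parameterizations give the same prediction for any (x1, x2).
y_c = predict_centered(b0s, b1s, b2s, b3, mean1, mean2, 7.0, 9.0)
y_u = predict_uncentered(b0, b1, b2, b3, 7.0, 9.0)
print(f"centered prediction = {y_c}, uncentered prediction = {y_u}")
```

Only the interaction coefficient β3 is the same in both versions; β1* is the effect of X1 at the mean of X2, whereas β1 is the effect of X1 at X2 = 0.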
Example (Miles and Shevlin)
• Y: grade in statistics course
• X1: number of books read (0-4)
• X2: number of classes attended (0-20)
Descriptive Statistics
         Mean      Std. Deviation   N
grade    63.5500   16.70552         40
books    2.00      1.432            40
attend   14.10     4.278            40
Coefficients (dependent variable: grade)
Model   Term             B        Std. Error   Beta   t        Sig.   Zero-order   Partial   Part   Tolerance   VIF
1       (Constant)       63.422   2.223               28.534   .000
1       booksc           4.037    1.753        .346   2.303    .027   .492         .354      .310   .803        1.245
1       attendc          1.283    .587         .329   2.187    .035   .482         .338      .295   .803        1.245
2       (Constant)       61.469   2.320               26.494   .000
2       booksc           4.081    1.677        .350   2.433    .020   .492         .376      .314   .803        1.245
2       attendc          1.333    .562         .341   2.372    .023   .482         .368      .306   .802        1.247
2       bookscxattendc   .735     .349         .271   2.104    .042   .241         .331      .271   .997        1.003
(booksc, attendc: centered versions of books and attend)
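To read the interaction in Model 2, it helps to compute the effect of attendance at different numbers of books read. A rough sketch using only the point estimates reported above for attendc and the product term, and the mean of books (2.00); the function name is made up, and uncertainty in the estimates is ignored:

```python
B_ATTEND = 1.333       # coefficient of attendc in Model 2
B_INTERACTION = 0.735  # coefficient of booksc x attendc in Model 2
MEAN_BOOKS = 2.00      # mean of books, used for centering


def attendance_slope(books):
    """Estimated change in grade per extra class attended,
    for a student who has read the given number of books."""
    booksc = books - MEAN_BOOKS
    return B_ATTEND + B_INTERACTION * booksc


for books in (0, 2, 4):
    print(f"books = {books}: slope of attendance = {attendance_slope(books):.3f}")
```

Under these estimates, attending classes is associated with higher grades mainly for students who also read more books.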
Other example
(missed) interaction: non-linearity?
3 groups, each with a distinct linear relation
Model Summary (predictors: (Constant), x; dependent variable: y)
R      R Square   Adjusted R Square   Std. Error of the Estimate
.405   .164       .134                2.57780
Coefficients (dependent variable: y)
             B       Std. Error   Beta   t       Sig.   Zero-order   Partial   Part   Tolerance   VIF
(Constant)   5.724   1.029               5.560   .000
x            .163    .070         .405   2.341   .027   .405         .405      .405   1.000       1.000
Residuals Statistics (dependent variable: y)
                       Minimum    Maximum   Mean     Std. Deviation   N
Predicted Value        5.8864     9.7927    7.8667   1.12042          30
Residual               -4.88636   3.16046   .00000   2.53297          30
Std. Predicted Value   -1.767     1.719     .000     1.000            30
Std. Residual          -1.896     1.226     .000     .983             30
[Figure: histogram of the regression standardized residual (Mean = -2.78E-17, Std. Dev. = 0.983, N = 30)]
[Figure: normal P-P plot of the regression standardized residual (observed vs. expected cumulative probability)]
Model Summary (predictors: (Constant), xkwadraat, x; dependent variable: y)
R      R Square   Adjusted R Square   Std. Error of the Estimate
.980   .960       .957                .57477
Coefficients (dependent variable: y)
             B       Std. Error   Beta    t         Sig.   Zero-order   Partial   Part    Tolerance   VIF
(Constant)   8.512   .259                 32.839    .000
x            .140    .016         .349    9.042     .000   .405         .867      .348    .996        1.004
xkwadraat    -.054   .002         -.894   -23.156   .000   -.916        -.976     -.892   .996        1.004
(xkwadraat: x squared)
Residuals Statistics (dependent variable: y)
                       Minimum   Maximum   Mean     Std. Deviation   N
Predicted Value        .5888     10.4412   7.8667   2.71361          30
Residual               -.82773   1.16910   .00000   .55460           30
Std. Predicted Value   -2.682    .949      .000     1.000            30
Std. Residual          -1.440    2.034     .000     .965             30
[Figure: histogram of the regression standardized residual (Mean = -3.61E-16, Std. Dev. = 0.965, N = 30)]
[Figure: partial regression plot of y against x]
Model Summary (predictors: (Constant), groep, x; dependent variable: y)
R      R Square   Adjusted R Square   Std. Error of the Estimate
.410   .168       .107                2.61796
Coefficients (dependent variable: y)
             B       Std. Error   Beta    t       Sig.   Zero-order   Partial   Part    Tolerance   VIF
(Constant)   6.021   1.301                4.629   .000
x            .220    .166         .548    1.329   .195   .405         .248      .233    .181        5.515
groep        -.528   1.375        -.158   -.384   .704   .337         -.074     -.067   .181        5.515
(groep: group)
Residuals Statistics (dependent variable: y)
                       Minimum    Maximum   Mean     Std. Deviation   N
Predicted Value        5.7131     9.9467    7.8667   1.13588          30
Residual               -4.71313   3.17007   .00000   2.52607          30
Std. Predicted Value   -1.896     1.831     .000     1.000            30
Std. Residual          -1.800     1.211     .000     .965             30
[Figure: histogram of the regression standardized residual]
[Figure: partial regression plot, dependent variable y]
Model Summary (predictors: (Constant), groep3, groep2, x; dependent variable: y)
R      R Square   Adjusted R Square   Std. Error of the Estimate
.750   .562       .512                1.93502
Coefficients (dependent variable: y)
             B       Std. Error   Beta    t       Sig.   Zero-order   Partial   Part    Tolerance   VIF
(Constant)   4.556   .912                 4.994   .000
x            .172    .123         .427    1.396   .174   .405         .264      .181    .180        5.552
groep2       3.476   1.310        .602    2.653   .013   .645         .462      .344    .327        3.057
groep3       -.326   2.038        -.056   -.160   .874   -.030        -.031     -.021   .135        7.394
Residuals Statistics (dependent variable: y)
                       Minimum    Maximum   Mean     Std. Deviation   N
Predicted Value        4.7273     11.1227   7.8667   2.07709          30
Residual               -3.72727   3.72727   .00000   1.83220          30
Std. Predicted Value   -1.511     1.568     .000     1.000            30
Std. Residual          -1.926     1.926     .000     .947             30
[Figure: histogram of the regression standardized residual (Mean = 3.47E-16, Std. Dev. = 0.947, N = 30)]
[Figure: partial regression plot of y against x]
Model Summary (predictors: (Constant), intxgr3, intxgr2, x, groep2, groep3; dependent variable: y)
R      R Square   Adjusted R Square   Std. Error of the Estimate
.997   .993       .992                .25050
Coefficients (dependent variable: y)
             B           Std. Error   Beta     t         Sig.    Zero-order   Partial   Part    Tolerance   VIF
(Constant)   4.97E-014   .171                  .000      1.000
x            1.000       .028         2.485    36.259    .000    .405         .991      .609    .060        16.657
groep2       10.145      .417         1.756    24.309    .000    .645         .980      .408    .054        18.505
groep3       18.000      .596         3.116    30.201    .000    -.030        .987      .507    .026        37.737
intxgr2      -.985       .039         -2.378   -25.250   .000    .626         -.982     -.424   .032        31.455
intxgr3      -1.500      .039         -5.401   -38.458   .000    -.081        -.992     -.646   .014        69.919
Residuals Statistics (dependent variable: y)
                       Minimum   Maximum   Mean     Std. Deviation   N
Predicted Value        1.0000    10.4182   7.8667   2.76031          30
Residual               -.41818   .65758    .00000   .22789           30
Std. Predicted Value   -2.488    .924      .000     1.000            30
Std. Residual          -1.669    2.625     .000     .910             30
[Figure: histogram of the regression standardized residual (Mean = 2.81E-15, Std. Dev. = 0.91, N = 30)]
[Figure: partial regression plot of y against x]
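The coefficients of this last model can be turned into one regression line per group. A sketch under assumptions that the output itself does not confirm: groep2 and groep3 are taken to be 0/1 dummies with group 1 as the reference, and intxgr2 and intxgr3 the products of x with those dummies. The constant (4.97E-014) is treated as 0:

```python
# Unstandardized coefficients from the output above.
COEF = {
    "const": 0.0,      # reported as 4.97E-014, effectively zero
    "x": 1.000,
    "groep2": 10.145,
    "groep3": 18.000,
    "intxgr2": -0.985,
    "intxgr3": -1.500,
}


def group_line(group):
    """Intercept and slope of y on x within group 1, 2 or 3
    (assumes dummy coding with group 1 as reference)."""
    d2 = 1 if group == 2 else 0
    d3 = 1 if group == 3 else 0
    intercept = COEF["const"] + COEF["groep2"] * d2 + COEF["groep3"] * d3
    slope = COEF["x"] + COEF["intxgr2"] * d2 + COEF["intxgr3"] * d3
    return intercept, slope


for g in (1, 2, 3):
    intercept, slope = group_line(g)
    print(f"group {g}: y = {intercept:.3f} + {slope:.3f} * x")
```

Under these assumptions the three groups have clearly different lines (slopes of about 1, 0 and -0.5), which a single pooled slope for x cannot capture.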
Model choice important
• Missing interaction may result in
– Violations of linearity assumption
– Non-constant variance (heterogeneity)
– Incorrect model choice and interpretation
Conclusion
• Model choice and selection crucial in detecting mediation and moderation
• Substantive / theoretical considerations should guide the model selection process!
Other methods/models
• Mediation sometimes (too) simple
– More refined path analysis
• Regression analysis sometimes (too) simple
• More elaborate models for causal modeling
– Structural Equation Models