Applied Panel Data Analysis
Using Stata
Prof. Dr. Josef Brüderl
LMU München
Contents I
I) Panel Data
II) The Basic Idea of Panel Data Analysis
III) An Intuitive Introduction to Linear Panel Regression
– A Didactic Example
– Within Estimation (Fixed-Effects Regression)
– Within Estimation With Control Group
IV) The Basics of Linear Panel Regression
– Linear Panel Models
– Fixed- or Random-Effects?
V) A Real Data Example: Marriage and Happiness
– Preparing Panel Data
– Describing Panel Data
– The Results
Contents II
VI) Modeling Individual Growth
– Growth Curve Models
– The Age-Period-Cohort Problem
– Group Specific Growth Curves
VII) Further Linear Panel Models
– Alternative Within Estimators
– The Fixed-Effects Individual-Slopes Model
– Mixed-Coefficients Panel Models
– The Hybrid Model
– Dynamic Panel Models
– Comparing Models by Simulations
– Panel Regression with Missing Data
Contents III
VIII) Non-Linear Panel Models: Fixed-Effects Logit
IX) Event History Analysis with Repeated Events
– Example: Duration of Unemployment
X) Limitations of the Within Methodology
– Limitations of Scope
– Violations of Strict Exogeneity
– Consequences of Panel Attrition
– Consequences of Panel Conditioning
– Direction of Causality
XI) Final Remarks on Panel Analysis
– Panel Regression and Causal Inference
– The FE Revolution in Social Research
– Some Critical Remarks on the Literature
What This Lecture Aims For
• Introducing basic methods of panel data analysis (PDA)
– Emphasis on fixed-effects and growth curve methods
– Complex methods are de-emphasized
• Practical implementation of PDA with Stata
– Important Stata commands are on the slides
– The lecture is accompanied by Stata do-files (and data), with which all computations can be reproduced
• Presenting and interpreting results
– The graphical display of regression results is emphasized
- The era of the regression table is over!
• Teaching materials can be found here
– Stata Do-Files, Stata Commands for Longitudinal Analysis
– www.ls3.soziologie.uni-muenchen.de/teach-materials/index.html
Chapter I: Panel Data
Josef Brüderl, Panel Analysis, April 2015
Hierarchy of Data Structures
[Figure: The life courses of three individuals; horizontal axis: time; states: employed, unemployed]
• Cross-sectional data
– “Snapshot” at one time point
• Panel data
– Repeated measurement
• Event history data
– Information on the complete life course
Panel Data
Repeated measurements of the same variables on the same units
• Macroeconomics, Political Science: cross-sectional time series (xt)
- Unit of analysis: countries
- N small, T large
• Microeconomics, Sociology: (micro) panel data
- Unit of analysis: persons
- N large, T small
This lecture emphasizes micro panel data analysis
Notation: $i = 1, \dots, N$ units; $t = 1, \dots, T$ time points
Example: panel data with $T = 2$
   id   time   Y          X
    1      1   $y_{11}$   $x_{11}$
    1      2   $y_{12}$   $x_{12}$
    2      1   $y_{21}$   $x_{21}$
    2      2   $y_{22}$   $x_{22}$
    ⋮      ⋮   ⋮          ⋮
    N      1   $y_{N1}$   $x_{N1}$
    N      2   $y_{N2}$   $x_{N2}$
These are:
- balanced panel data
- in long format (pooled)
If units are persons and time is years:
The Two Major Advantages of Panel Data
• Panel data allow us to identify causal effects under weaker
assumptions (compared to cross-sectional data)
– With panel data we know the time-ordering of events
– Thus we can investigate how an event changes the outcome
• Panel data allow us to study individual trajectories
– Individual growth curves (e.g. wage, materialism, intelligence)
- One can distinguish cohort and age effects
– Transitions into and out of states (e.g. poverty)
Usage of Panel Data is on the Rise
Number of publications using the SOEP
Source: Schupp, J. (2009) 25 Jahre Sozio-oekonomisches Panel. ZfS 38: 350-357.
According to Young/Johnson (2015), 61% of all empirical articles published in JMF 2010-2014 used panel data. Methods used:
- 19% event history methods
- 19% fixed effects models
- 16% linear regression
- 15% logistic regression
A Few Remarks on Collecting Panel Data
• Cross-sectional survey: retrospective questions
– Problems with recall
– Often done for collecting event history data
• Prospective panel survey
– Panel data
- Ask for the current status/value
– Event history data
- Ask what happened since last interview: between wave retrospective questions (electronic life-history calendar)
- Ideally using dependent interviewing (preloads) to avoid the seam effect
• The advantages of panel data are threatened by two
methodological problems (see Chapter X)
- Panel conditioning (panel effect)
- Panel mortality (attrition)
• More on panel methodology can be found in Lynn (2009)
Important Panel Surveys
• Household panels
– Panel Study of Income Dynamics (PSID) [since 1968]
- The role model for all household panels
– German Socio-Economic Panel (SOEP) [since 1984]
– Understanding Society (UKHLS) [since 1991]
• Cohort panels
– British Cohort Studies: children born 1958, 1970, 2000
– National Longitudinal Survey of Youth (NLSY79): U.S. cohort born around 1960
• Panels on special populations in Germany recently started
– German Family Panel (pairfam), National Educational Panel Study (NEPS), Survey of Health, Ageing and Retirement in Europe (SHARE),
Panel "Arbeitsmarkt und soziale Sicherung" (PASS), TwinLife, Children of Immigrants (CILS4EU), Nationale Kohorte
• Online panel surveys
– LISS panel: A Dutch online panel survey
– German internet panel (GIP)
– GESIS Panel
• Links on German studies you can find here:
SOEP
Household Panel Study
• Sample of households in Germany
• Every person aged 17 or older is interviewed
• For persons under 17 proxies are interviewed
• When a person moves out of the household, he or she is
followed
• Persons, households and original households can be
identified across waves
• First wave 1984 (subsamples A and B)
• Annual interviews (PAPI questionnaire)
• Several refreshment subsamples
– By now, about 60,000 persons have participated in the SOEP
More information:
http://www.diw.de/soep
The German Family Panel (pairfam)
• Target population
– All German residents who are able to do an interview in German
– Cohort-sequence design: 1971-73, 1981-83, 1991-93
• Sample
– Random sample from population registers
- 343 “Gemeinden” were sampled
- 42,000 addresses were drawn randomly from the population registers
– Response rate in wave 1: 37 %
• Interview mode
– 60 minute CAPI (some parts CASI)
• Multi-actor design
– Anchor person (AP) and partner, parents, children
• First wave in 2008
– Waves annually, currently wave 7 is in the field
– Non-monotonic design: respondents can drop out for one wave
• Data: currently version 5.0 is available
Response Rate Anchor – Panel Stability
Chapter II:
The Basic Idea of Panel Data Analysis
Paul Lazarsfeld on Panel Data Analysis
Princeton “radio project” (1937-1939)
– Research question
Effect of radio ownership on political attitudes: Will the Americans become communist?
– Inference from cross-sectional (control group) or panel data?
“Most of the control groups available for social research are ‘self-selected’.”
“If we give radios to a number of farmers and then notice considerable differences without any great external changes occurring at the same time, it is safer to assume that these differences are caused by radio than it would be, if we were to compare radio owners with non-owners.”
Lazarsfeld/Fiske (1938) The “panel” as a new tool for measuring opinion. Public Opinion Quarterly 2: 596-612.
The Basic Approach
• According to the counterfactual approach to causality (Rubin's
model) an individual causal effect is defined as
$\Delta_i = Y^T_{i,t_0} - Y^C_{i,t_0}$, where $T$: treatment, $C$: control
– However, this is not estimable (fundamental problem of causal inference)
• Estimation with cross-sectional data
$\hat{\Delta} = Y^T_{i,t_0} - Y^C_{j,t_0}$
– We compare different persons $i$ and $j$ (between estimation)
– Assumption: unit homogeneity (no unobserved heterogeneity)
• Estimation with panel data I
$\hat{\Delta} = Y^T_{i,t_1} - Y^C_{i,t_0}$
– We compare the same person over time $t_0$ and $t_1$ (within estimation)
– Assumption: temporal homogeneity (no period effects, no maturation)
• Estimation with panel data II
$\hat{\Delta} = (Y^T_{i,t_1} - Y^C_{i,t_0}) - (Y^C_{j,t_1} - Y^C_{j,t_0})$
– Within estimation with control group
– Assumption: parallel trends
The Basic Approach
• Between estimation works well with experimental data
– Due to randomization units will differ only in the treatment
• However, with observational data between estimation
generally will not work, because the strong assumption of
unit homogeneity will not hold
– Due to self-selection into treatment
– Unobserved unit heterogeneity will bias between estimation results
• Within estimation with control group, however, will often
work, because the parallel trends assumption is much
weaker
– Unobserved unit heterogeneity will not bias within estimation results
– Only differing time-trends in treatment and control group will bias within estimation results
Chapter III:
An Intuitive Introduction to Linear Panel
Regression
Is There a Marital Wage Premium for Men?
• Fabricated data: long-format
. list id time wage marr, separator(6)
     +-------------------------+        +-------------------------+
     | id   time   wage   marr |        | id   time   wage   marr |
     |-------------------------|        |-------------------------|
  1. |  1      1   1000      0 |   13.  |  3      1   2900      0 |
  2. |  1      2   1050      0 |   14.  |  3      2   3000      0 |
  3. |  1      3    950      0 |   15.  |  3      3   3100      0 |
  4. |  1      4   1000      0 |   16.  |  3      4   3500      1 |
  5. |  1      5   1100      0 |   17.  |  3      5   3450      1 |
  6. |  1      6    900      0 |   18.  |  3      6   3550      1 |
     |-------------------------|        |-------------------------|
  7. |  2      1   2000      0 |   19.  |  4      1   3950      0 |
  8. |  2      2   1950      0 |   20.  |  4      2   4050      0 |
  9. |  2      3   2050      0 |   21.  |  4      3   4000      0 |
 10. |  2      4   2000      0 |   22.  |  4      4   4500      1 |
 11. |  2      5   1950      0 |   23.  |  4      5   4600      1 |
 12. |  2      6   2050      0 |   24.  |  4      6   4400      1 |
     +-------------------------+        +-------------------------+
Data: Wage Premium.dta Do-File: Wage Premium.do
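For readers who do not have the accompanying data file at hand, the listing can be typed in directly. The following is a minimal sketch that recreates the 24 person-periods from the listing; the variable names follow the listing, and nothing beyond these four variables is assumed.

```stata
* Recreate the fabricated wage data from the listing (4 men, 6 time points)
clear
input id time wage marr
1 1 1000 0
1 2 1050 0
1 3  950 0
1 4 1000 0
1 5 1100 0
1 6  900 0
2 1 2000 0
2 2 1950 0
2 3 2050 0
2 4 2000 0
2 5 1950 0
2 6 2050 0
3 1 2900 0
3 2 3000 0
3 3 3100 0
3 4 3500 1
3 5 3450 1
3 6 3550 1
4 1 3950 0
4 2 4050 0
4 3 4000 0
4 4 4500 1
4 5 4600 1
4 6 4400 1
end

* Declare the panel structure (unit id, time variable)
xtset id time
list id time wage marr, separator(6)
```

The data are balanced and in long format: one row per person and time point.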
Is There a Marriage-Premium for Men?
[Figure: wage (EURO per month, 0 to 5000) by time (1 to 6) for the four men; observations before marriage and after marriage are marked]
Treatment between $t = 3$ and $t = 4$ (only for the two high-wage earners)
There is a causal effect: a marriage-premium
And there is selectivity: only high-wage men marry
Data: Wage Premium.dta Do-File: Wage Premium.do
In these data we have a problem with self-selection:
Married and unmarried men differ in characteristics other than marriage (in these data the assumption of unit homogeneity is invalid)
How High is the Marriage-Premium?
• These are observational (non-experimental) data
– Treatment assignment is not under control of the researcher (no randomization)
- Instead, men can self-select into treatment (marriage)
- Therefore, a between approach will be strongly biased (see below)
• A within approach to compute the marriage-premium
– We have before ($t = 1, 2, 3$) and after ($t = 4, 5, 6$) measurements
– This allows for a within approach.
- This “compensates” for the missing randomization (unit heterogeneity will not bias estimation)
- Thus, we can identify the causal effect despite self-selection
– Because we also have a control group we can use
within estimation with control group (estimation with panel data II)
How High is the Marriage-Premium?
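• Difference-in-differences (DiD) is an after-before comparison with control group
– After-before changes ($\Delta_i$) in the treatment group
- $\Delta_4 = 4500 - 4000 = 500$
- $\Delta_3 = 3500 - 3000 = 500$
– After-before changes ($\Delta_i$) in the control group
- $\Delta_2 = 2000 - 2000 = 0$
- $\Delta_1 = 1000 - 1000 = 0$
– Here $\Delta_i$ is the difference between person $i$'s mean wage after and before: $\Delta_i = \frac{1}{3}\sum_{t=4}^{6} y_{it} - \frac{1}{3}\sum_{t=1}^{3} y_{it}$
– To get the average treatment effect (ATE) we take the difference of the averages in treatment and control group
$\hat{\Delta} = \frac{1}{2}\sum_{i \in T} \Delta_i - \frac{1}{2}\sum_{i \in C} \Delta_i = \frac{500 + 500}{2} - \frac{0 + 0}{2} = 500$
– The marriage-premium in our data is +500 €
• In the following we will investigate whether different
statistical regression models can recover this causal effect!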
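The hand computation of the DiD can be checked in Stata. The sketch below assumes that "Wage Premium.dta" (with the variables id, time, wage, marr) is in the working directory; `ever_marr` and `after` are hypothetical helper variables introduced only for this illustration.

```stata
* Assumption: "Wage Premium.dta" is in the current working directory
use "Wage Premium.dta", clear

* Hypothetical helper variables (not part of the original data)
bysort id: egen ever_marr = max(marr)   // 1 = treatment group (marries at some point)
generate after = (time >= 4)            // 1 = time points 4 to 6

* Mean wage in each of the four group-by-period cells
tabulate ever_marr after, summarize(wage) nostandard nofreq

* In a saturated regression the interaction coefficient is the DiD
regress wage i.ever_marr##i.after
display "DiD estimate: " _b[1.ever_marr#1.after]
```

The interaction coefficient equals (4000 - 3500) - (1500 - 1500), i.e. the difference of the after-before changes in the two groups.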
Cross-Sectional Regression
• Result of a cross-sectional regression at $t = 4$:
– This is a between-comparison at $t = 4$: essentially this compares average wages of married and unmarried men at $t = 4$
$\hat{\beta} = \frac{4500 + 3500}{2} - \frac{2000 + 1000}{2} = 2500$
– We get a very large marital wage premium
– Obviously this is a massively biased result!
– The graph shows the information used by the cross-sectional regression
[Figure: wage (EURO per month) by time (1 to 6); the observations at $t = 4$ are the ones used]
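The between-comparison at a single time point corresponds to an OLS regression restricted to that wave. A minimal sketch, assuming "Wage Premium.dta" is in the working directory:

```stata
* Assumption: "Wage Premium.dta" is in the current working directory
use "Wage Premium.dta", clear

* Cross-sectional regression using only the observations at time 4
regress wage marr if time == 4

* The coefficient equals the difference of the two group means at time 4
tabulate marr if time == 4, summarize(wage) nostandard nofreq
```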
What Is the Problem With Cross-Sectional Regression?
• The most critical assumption of a linear regression
is the exogeneity assumption: $E(\varepsilon \mid x) = 0$
– I.e., the error term must be mean-independent of the regressor
– The exogeneity assumption implies:
- $E(\varepsilon) = 0$: the (unconditional) mean of the error term is 0
- $\mathrm{Cov}(x, \varepsilon) = 0$: the error term does not correlate with $x$
– The exogeneity assumption guarantees unbiasedness [$E(\hat{\beta}) = \beta$] and consistency [$\mathrm{plim}\, \hat{\beta} = \beta$] of the OLS estimator
• Unfortunately, in many non-experimental social science research
settings the exogeneity assumption will be violated
– The error term and the regressor are dependent
$E(\varepsilon \mid x) \neq 0$
– Then it is said: the regressor is endogenous (endogeneity)
- The variation that is used to identify the causal effect is endogenous
– $\hat{\beta}$ will be biased (and inconsistent)
What Is the Problem?
• Where does endogeneity come from?
– There are unobserved confounders
(unobservables that affect both $X$ and $Y$)
- Then $X$ and the error term are correlated
- This is called “unobserved heterogeneity” or “omitted variable bias”
– $Y$ also affects $X$ (reverse causality)
- E.g., high-wage men are selected into marriage,
because the higher wage makes them more attractive marriage partners
• The underlying mechanism: self-selection
– Treatment and control groups are not built by randomization
– Instead, human beings decide according to unobservables or
even the value of $Y$ whether they go into treatment or not
• Endogeneity is ubiquitous in non-experimental research
Many (most?) cross-sectional regression results are biased!
Be critical with cross-sectional results. Always ask, whether
[Figure: causal diagram with the nodes marriage, wage, intelligence, attractiveness, and cosmetic surgery]
[Figure: wage (EURO per month) by time (1 to 6), before marriage and after marriage]
No Solution: Pooled-OLS
• Pool the data and estimate an OLS regression (POLS)
– The result is $\hat{\beta}_{POLS} = 1833$
– This is the mean of the red points (married observations) minus the mean of the green points (unmarried observations)
– The bias is still heavy
– The reason is that POLS also relies on a between comparison
– Panel data per se do not help to identify a causal effect!
– One has to use appropriate methods of analysis to take full advantage of panel data
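The POLS result can be reproduced by ignoring the panel structure altogether. A minimal sketch, assuming "Wage Premium.dta" is in the working directory:

```stata
* Assumption: "Wage Premium.dta" is in the current working directory
use "Wage Premium.dta", clear

* Pooled OLS: all 24 person-periods are treated as independent observations
regress wage marr

* The coefficient is the mean wage of the 6 married person-periods (4000)
* minus the mean wage of the 18 unmarried person-periods (about 2167)
tabulate marr, summarize(wage) nostandard nofreq
```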
The Error Decomposition
• To take full advantage of panel data use within estimation
– Within estimators implement an “after-before comparison”
• Starting point: error decomposition
$v_{it} = \alpha_i + \varepsilon_{it}$, where $v_{it}$ is the error term of the pooled regression
– $\alpha_i$: person-specific time-constant error term
- Assumption: person-specific random variable
– $\varepsilon_{it}$: time-varying error term (idiosyncratic error term)
- Assumptions: zero mean, homoscedasticity, no autocorrelation
[Figure: causal diagram with the nodes marriage, wage, cosmetic surgery, intelligence, and attractiveness]
The Error Components Model
• This yields the error components model
$y_{it} = \beta x_{it} + \alpha_i + \varepsilon_{it}$
– Note that the overall constant has been dropped due to collinearity
– Generally, one assumes that the error components are
independent from each other: $E(\varepsilon_{it} \mid x_{it}, \alpha_i) = 0$
- We will neglect this subtlety in the following
• POLS is consistent only if the regressor $x_{it}$ is independent from both error components
- $E(\alpha_i \mid x_{it}) = 0$: random-effects assumption, “no (person-specific) time-constant unobserved heterogeneity”
- $E(\varepsilon_{it} \mid x_{it}) = 0$: contemporaneous exogeneity assumption, “no time-varying unobserved heterogeneity”
[Figure: causal diagram with the nodes marriage, wage, cosmetic surgery, intelligence, and attractiveness]
First-Differences Estimator (FD)
• The random-effects assumption is strong
– How can we get rid of it?
• By a differencing transformation we can wipe out the $\alpha_i$
$y_{it} = \beta x_{it} + \alpha_i + \varepsilon_{it}$
$y_{i,t-1} = \beta x_{i,t-1} + \alpha_i + \varepsilon_{i,t-1}$
Subtracting the second equation from the first gives:
$\Delta y_{it} = \beta \Delta x_{it} + \Delta \varepsilon_{it}$
where “Δ” denotes the change from $t-1$ to $t$.
– Person-specific errors have been “differenced out”. Time-constant unobserved heterogeneity has been wiped out!
– Pooled-OLS applied to these transformed data provides the first-differences estimator.
• In the psychological literature this model is also called
the “change score” model
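The differencing transformation can be carried out by hand to see what the FD estimator does. The sketch below assumes "Wage Premium.dta" is in the working directory; `dwage` and `dmarr` are hypothetical names for the differenced variables.

```stata
* Assumption: "Wage Premium.dta" is in the current working directory
use "Wage Premium.dta", clear
xtset id time

* Built-in route: the time-series operator D. takes first differences
regress D.(wage marr), noconstant

* Manual route: difference each variable within person
* (the first observation of every person becomes missing, leaving 20 rows)
bysort id (time): generate dwage = wage - wage[_n-1]
bysort id (time): generate dmarr = marr - marr[_n-1]

* Pooled OLS on the differenced data gives the same FD estimate
regress dwage dmarr, noconstant
```

Only the two observations with `dmarr == 1` (wage changes of 400 and 500) determine the slope, so the estimate is their average.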
Assumptions of FD Estimation
• The FD-estimator is consistent if
$E(\varepsilon_{it} \mid x_{is}) = 0$ for $s = t, t-1$ (sequential exogeneity assumption)
- Intuition: otherwise $\Delta x_{it}$ and $\Delta \varepsilon_{it}$ would be correlated
– However, because $\alpha_i$ is not in the differenced equation,
$E(\alpha_i \mid x_{it}) = 0$ is no longer required for consistency
- FD identifies the causal effect under weaker assumptions
- Time-constant unobserved heterogeneity is allowed
- Only time-varying unobserved heterogeneity must not be present
[Figure: causal diagram before and after the differencing transformation; the time-constant nodes intelligence and attractiveness drop out, leaving Δ wage, Δ marriage, and Δ cosmetic surgery]
Example: FD-Regression
. regress D.(wage marr), noconstant
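      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  1,    19) =   39.97
       Model |      405000     1      405000           Prob > F      =  0.0000
    Residual |      192500    19  10131.5789           R-squared     =  0.6778
-------------+------------------------------           Adj R-squared =  0.6609
       Total |      597500    20       29875           Root MSE      =  100.66
------------------------------------------------------------------------------
      D.wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        marr |
         D1. |        450   71.17436     6.32   0.000     301.0304    598.9696
------------------------------------------------------------------------------
Data: Wage Premium.dta Do-File: Wage Premium.do
• The FD-estimator is 450, which is very close to the true causal effect
• The reason for the small bias is that FD compares the wage after marriage only with the wage in the period immediately before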
„Mechanics“ of a FD-Regression
• Because there is no constant, the regression line passes through the point (0, 0)
• The slope is based on only 2 observations
– $\hat{\beta}_{FD}$ is simply the average wage change before-after marriage
• With $T > 2$ FD-estimation is obviously inefficient
[Figure: scatterplot of delta(wage) against delta(marr) with the FD regression line]
Data: Wage Premium.dta Do-File: Wage Premium.do
FD-Regression
• This is the information used by a FD-regression
[Figure: wage (EURO per month) by time (1 to 6), before marriage and after marriage]
Fixed-Effects Regression (FE)
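• Fixed-effects estimation
– Error components model:
$y_{it} = \beta x_{it} + \alpha_i + \varepsilon_{it}$ (1)
– Person-specific means over $t$:
$\bar{y}_i = \beta \bar{x}_i + \alpha_i + \bar{\varepsilon}_i$ (2)
– “Demeaning” the data (within transformation): (1) – (2)
$y_{it} - \bar{y}_i = \beta (x_{it} - \bar{x}_i) + \varepsilon_{it} - \bar{\varepsilon}_i$ (3)
– Demeaning wipes out person-specific time-constant unobserved heterogeneity! Only within variation is left.
– Pooled OLS applied to demeaned data provides the fixed-effects estimator
• Note: (2) is called “between regression” (BE)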
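The within transformation in (3) can also be done by hand. The sketch below assumes "Wage Premium.dta" is in the working directory; `mwage`, `mmarr`, `wwage`, and `wmarr` are hypothetical variable names used only here.

```stata
* Assumption: "Wage Premium.dta" is in the current working directory
use "Wage Premium.dta", clear

* Person-specific means, as in equation (2)
bysort id: egen mwage = mean(wage)
bysort id: egen mmarr = mean(marr)

* Demeaned variables, as in equation (3)
generate wwage = wage - mwage
generate wmarr = marr - mmarr

* Pooled OLS on the demeaned data reproduces the FE point estimate.
* The standard errors are too small here, because regress does not
* account for the 4 person means that were estimated beforehand.
regress wwage wmarr, noconstant
```

In practice one would use `xtreg, fe`, which applies the same transformation and uses the correct degrees of freedom.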
Assumptions of FE Estimation
• The FE-estimator is consistent if
$E(\varepsilon_{it} \mid x_{is}) = 0$ for all $s$ and $t$ (strict exogeneity assumption)
- Intuition: otherwise the demeaned regressor $x_{it} - \bar{x}_i$ and the demeaned error $\varepsilon_{it} - \bar{\varepsilon}_i$ would be correlated
– However, because $\alpha_i$ is not in the demeaned equation,
$E(\alpha_i \mid x_{it}) = 0$ is no longer required for consistency
- FE identifies the causal effect under weaker assumptions
- Time-constant unobserved heterogeneity is allowed
- Only time-varying unobserved heterogeneity must not be present
[Figure: causal diagram before and after the within transformation; the time-constant nodes intelligence and attractiveness drop out, leaving demeaned wage, marriage, and cosmetic surgery]
Example: Fixed-Effects Regression
. xtset id time
       panel variable:  id (strongly balanced)
        time variable:  time, 1 to 6
                delta:  1 unit
. xtreg wage marr, fe
Fixed-effects (within) regression               Number of obs      =        24
Group variable: id                              Number of groups   =         4
R-sq:  within  = 0.8982                         Obs per group: min =         6
       between = 0.8351                                        avg =       6.0
       overall = 0.4065                                        max =         6
                                                F(1,19)            =    167.65
corr(u_i, Xb)  = 0.5164                         Prob > F           =    0.0000
------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        marr |        500   38.61642    12.95   0.000     419.1749    580.8251
       _cons |       2500    16.7214   149.51   0.000     2465.002    2534.998
-------------+----------------------------------------------------------------
     sigma_u |  1290.9944
     sigma_e |  66.885605
         rho |  .99732298   (fraction of variance due to u_i)
------------------------------------------------------------------------------
Data: Wage Premium.dta Do-File: Wage Premium.do
Interpreting the FE Output
• The FE model succeeds in identifying the true causal effect!
– Marriage increases the wage by 500 €
– The effect is significant (judged by the t-value or the p-value)
– A constant is reported, since Stata adds back the overall wage mean evaluated at $x = 0$:
$\bar{y} - \hat{\beta}\bar{x} = 2625 - 500 \cdot 0.25 = 2500$
– Model fit can be judged by the within $R^2$ as usual (referring to (3))
- 90% of the within wage variation is explained by marital status change
- The between and overall $R^2$ refer to different models and are not useful here
– Variance of the error components
- sigma_u is the estimated standard deviation of $\alpha_i$
- sigma_e is the estimated standard deviation of $\hat{\varepsilon}_{it}$
• Further details: Andreß et al. (2013: 4.1.2.1)
“Mechanics” of a FE-Regression
[Figure: scatterplot of demeaned(wage) against demeaned(marr) with the FE regression line]
• Those never marrying are at demeaned(marr) = 0. They contribute nothing to the regression.
• The slope is determined only by the wages of those marrying:
It is the difference in the mean wage before and after marriage.
Data: Wage Premium.dta Do-File: Wage Premium.do
FE-Regression
• This is the information used by a FE-regression
[Figure: wage (EURO per month) by time (1 to 6), before marriage and after marriage]
Between- and Within-Variation
• To identify the causal effect of a marriage …
– a between regression (BE) uses the between variation
- This is heavily affected by self-selection of the high-wage men into treatment
- The BE marriage premium is estimated to be 4500 €!
– a within regression (FE) uses only within variation (of the treated only)
- The causal effect is identified by the deviations from the person-specific means
- The “contaminated” (Allison 2009) between variation is ignored completely
- Therefore, self-selection into treatment does not bias results
[Figure: wage (EURO per month) by time (1 to 6), contrasting the between variation and the within variation]
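The split into between and within variation can be inspected directly. A minimal sketch, assuming "Wage Premium.dta" is in the working directory:

```stata
* Assumption: "Wage Premium.dta" is in the current working directory
use "Wage Premium.dta", clear
xtset id time

* Overall, between, and within standard deviations of both variables
xtsum wage marr

* Between regression: OLS on the four person-specific means
xtreg wage marr, be
```

The person means of wage are 1000, 2000, 3250, and 4250, and the person means of marr are 0, 0, 0.5, and 0.5, so the between slope is (3750 - 1500) / 0.5 = 4500.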
Equivalent FE-estimator I: LSDV
• Least-squares-dummy-variables-estimator (LSDV)
• Practical only when $N$ is small
• We get estimates for the $\alpha_i$
. regress wage marr ibn.id , noconstant
      Source |       SS       df       MS              Number of obs =      24
-------------+------------------------------           F(  5,    19) = 9052.94
       Model |   202500000     5    40500000           Prob > F      =  0.0000
    Residual |       85000    19  4473.68421           R-squared     =  0.9996
-------------+------------------------------           Adj R-squared =  0.9995
       Total |   202585000    24  8441041.67           Root MSE      =  66.886
------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        marr |        500   38.61642    12.95   0.000     419.1749    580.8251
             |
          id |
          1  |       1000   27.30593    36.62   0.000      942.848    1057.152
          2  |       2000   27.30593    73.24   0.000     1942.848    2057.152
          3  |       3000    33.4428    89.71   0.000     2930.003    3069.997
          4  |       4000    33.4428   119.61   0.000     3930.003    4069.997
------------------------------------------------------------------------------
Data: Wage Premium.dta Do-File: Wage Premium.do
[Figure: wage (EURO per month) by marriage (0/1) with individual regression lines]
Equivalent FE-estimator II: Individual Slope Regression
• Estimate a separate regression for every man marrying
(blue)
– In our data the slopes are equal: +500 for both men
• The FE estimator is the (weighted) mean of the individual slopes
– On average this gives +500, identical to the FE estimator from above
Data: Wage Premium.dta Do-File: Wage Premium.do
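The individual slopes can be obtained with one regression per marrying man. The sketch below assumes "Wage Premium.dta" is in the working directory and that, as in the listing, the two men who marry have id 3 and id 4; the scalar names `b3` and `b4` are hypothetical.

```stata
* Assumption: "Wage Premium.dta" is in the current working directory
use "Wage Premium.dta", clear

* Separate regression for each man who marries
regress wage marr if id == 3
scalar b3 = _b[marr]        // mean wage after minus mean wage before for id 3

regress wage marr if id == 4
scalar b4 = _b[marr]        // same for id 4

* With equal group sizes the weighted mean is the simple mean
display "Mean of individual slopes: " (b3 + b4)/2
```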
Problem: No Control Group
• So far we used estimation strategies without control group
- This works only if there is temporal homogeneity (as is the case with our data)
• Modified data: “wage3” is a new wage variable
- Now, there is a period effect: general wage increase at $t = 4$
- In addition, there is no causal effect of a marriage!
- But there is still selectivity.
[Figure: wage3 (EURO per month) by time (1 to 6), before marriage and after marriage]
Data: Wage Premium.dta Do-File: Wage Premium.do
Problem: No Control Group
• FE-regression yields the wrong answer
– Reason is that FE does not use the control group information
– This is generally true: groups where $X$ does not change contribute nothing to the FE-estimator
- Note that Stata reports the $N$ in the data, not the $N$ used for FE-estimation!
. xtreg wage3 marr, fe
Fixed-effects (within) regression               Number of obs      =        24
Group variable: id                              Number of groups   =         4
R-sq:  within  = 0.4732                         Obs per group: min =         6
       between = 0.8000                                        avg =       6.0
       overall = 0.3958                                        max =         6
                                                F(1,19)            =     17.07
                                                Prob > F           =    0.0006
------------------------------------------------------------------------------
       wage3 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        marr |        500   121.0336     4.13   0.001     246.6738    753.3262
       _cons |       2625   52.40907    50.09   0.000     2515.307    2734.693
-------------+----------------------------------------------------------------
Data: Wage Premium.dta
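That the never-married men contribute nothing to this FE estimate can be checked by dropping them. The sketch below assumes "Wage Premium.dta" (including the variable wage3) is in the working directory; `ever_marr`, `b_all`, and `b_treated` are hypothetical names.

```stata
* Assumption: "Wage Premium.dta" is in the current working directory
use "Wage Premium.dta", clear

* Hypothetical helper: 1 for men who marry at some point
bysort id: egen ever_marr = max(marr)
xtset id time

* FE regression on all four men
xtreg wage3 marr, fe
scalar b_all = _b[marr]

* FE regression on the two marrying men only
xtreg wage3 marr if ever_marr == 1, fe
scalar b_treated = _b[marr]

* Both point estimates coincide: the control group is not used
display "all: " b_all "   treated only: " b_treated
```

The equality holds for this one-regressor model without time dummies; once period dummies are added, the control group does enter the estimation.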
Solution: Two-way FE-Regression
• Including time fixed-effects ($T - 1$ period dummies)
– Now also the control group information is used for estimating the period effects
. xtreg wage3 marr i.time , fe
------------------------------------------------------------------------------
       wage3 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        marr |  -4.59e-13   58.24824    -0.00   1.000      -124.93      124.93
             |
        time |
          2  |         50   50.44445     0.99   0.338    -58.19259    158.1926
          3  |       62.5   50.44445     1.24   0.236    -45.69259    170.6926
          4  |      537.5   58.24824     9.23   0.000       412.57      662.43
          5  |      562.5   58.24824     9.66   0.000       437.57      687.43
          6  |      512.5   58.24824     8.80   0.000       387.57      637.43
             |
       _cons |     2462.5   35.66961    69.04   0.000     2385.996    2539.004
-------------+----------------------------------------------------------------
One should always model time in a FE-regression! (via age and/or period effects, see chap. VI)
Data: Wage Premium.dta Do-File: Wage Premium.do
Chapter IV:
The Basics of Linear Panel Regression
Section: Linear Panel Models
More details on the statistics can be found in Brüderl/Ludwig (2015)
A Note on the Exogeneity Assumptions
• In this section we generalize to a multivariate regression
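– Now we have $K$ regressors $x_1, \dots, x_K$. $\mathbf{x}_{it}$ is the $1 \times K$ vector (row vector) of the observed covariates of a person.
– $\boldsymbol{\beta}$ is the corresponding $K \times 1$ vector (column vector) of parameters to be estimated (regression coefficients)
– Now the exogeneity assumptions in conditional mean formulation are
$E(\alpha_i \mid \mathbf{x}_{it}) = 0$
$E(\varepsilon_{it} \mid \mathbf{x}_{it}, \alpha_i) = 0$
- The error terms must be independent from all regressors
- For statistical inference, as well as efficiency properties of estimators these (strong) exogeneity assumptions must hold (Wooldridge, 2010: 288)
– For consistency, however, the weaker assumptions of zero correlation suffice
$\mathrm{Cov}(\mathbf{x}_{it}, \alpha_i) = E(\mathbf{x}_{it}' \alpha_i) = 0$
$\mathrm{Cov}(\mathbf{x}_{it}, \varepsilon_{it}) = E(\mathbf{x}_{it}' \varepsilon_{it}) = 0$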
POLS Estimation
• We start from a multivariate error components model
$y_{it} = \mathbf{x}_{it}\boldsymbol{\beta} + \alpha_i + \varepsilon_{it}$
– From the perspective of non-experimental research, the most critical assumptions are exogeneity assumptions on the error terms:
both error terms have to be uncorrelated with the regressors
- $E(\mathbf{x}_{it}' \alpha_i) = 0$: random-effects assumption
- $E(\mathbf{x}_{it}' \varepsilon_{it}) = 0$: contemporaneous exogeneity assumption
– Contemporaneous exogeneity requires that idiosyncratic errors are not systematically related to the regressors. This assumption is often reasonable.
– The random-effects assumption, however, often will be violated because slow-to-change, hard-to-measure traits that are correlated with the regressors are ubiquitous (Firebaugh et al. 2013)
- E.g., cognitive and non-cognitive ability, genetic disposition, personality, social milieu, peer group characteristics
– If the random-effects assumption fails, estimates of $\boldsymbol{\beta}$ will be biased (omitted variable bias, unobserved heterogeneity bias)
FE Estimation
• FE wipes out person-specific time-constant unobservables
(fixed-effects) by applying the within transformation
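– The model (including time-constant variables $\mathbf{z}_i$)
$y_{it} = \mathbf{x}_{it}\boldsymbol{\beta} + \mathbf{z}_i\boldsymbol{\gamma} + \alpha_i + \varepsilon_{it}$
– “Demeaning” the data (within transformation):
$y_{it} - \bar{y}_i = (\mathbf{x}_{it} - \bar{\mathbf{x}}_i)\boldsymbol{\beta} + \varepsilon_{it} - \bar{\varepsilon}_i$
– Now the $\alpha_i$ have gone from the equation, and POLS will provide consistent estimates of $\boldsymbol{\beta}$, if the assumptions on the next slide hold
– Note 1: By the within transformation also all time-constant variables $\mathbf{z}_i$
have been eliminated. With FE it is not possible to estimate their effects $\boldsymbol{\gamma}$.
– Note 2: The term “fixed-effects” comes from the older literature, where the $\alpha_i$ are seen as unit-specific parameters to be estimated for each unit (LSDV). The newer literature sees the $\alpha_i$ also as random variables.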
FE Assumptions
• No assumption on person-specific time-constant unobserved
heterogeneity is needed
– We no longer need the random-effects assumption
– Instead, the “fixed-effects assumption” allows for arbitrary correlation between $\alpha_i$ and $\mathbf{x}_{it}$
– The FE estimator is consistent even if $E(\mathbf{x}_{it}' \alpha_i) \neq 0$
• For consistent estimates we need
$E(\mathbf{x}_{is}' \varepsilon_{it}) = 0$, for all $s, t = 1, \dots, T$ (strict exogeneity assumption)
– Covariates in each time period are uncorrelated with the idiosyncratic error in each time period
– For consistency of estimates strict exogeneity is essential. Therefore, it is very important to discuss the plausibility of this assumption in every FE application [see Chapter X]
• Further FE assumptions (Wooldridge, 2010: 300 ff)
– Full rank of the matrix of the demeaned regressors (no multicollinearity)
– Idiosyncratic errors have constant variance across $t$ (homoskedasticity)
– Idiosyncratic errors are serially uncorrelated (no autocorrelation)
Other Within Estimators
• Least-squares-dummy-variables (LSDV)
– POLS model, including a dummy for each person – Equivalent with FE
– However, computationally impractical with large $N$
• First-differences (FD)
– FE and FD are equivalent for $T = 2$. However, for longer panels they will differ. FD is less efficient than FE.
– However, instead of “strict exogeneity” FD requires only “sequential exogeneity”!
– Wooldridge (2010: 321ff) gives an extended discussion of the pros
and cons of FD and FE. He finally favors FE (as does the literature).
• Difference-in-differences (DiD)
– DiD implements a before-after comparison with control group
- DiD is intuitively appealing (and for $T = 2$ equivalent to FD and FE)
- For $T > 2$ and with controls DiD differs, however, from FD and FE
Statistical Inference With Panel Data
• With panel data the idiosyncratic errors are potentially
– Heteroskedastic (i.e., nonconstant variance)
– Autocorrelated (i.e., serial correlation in $\varepsilon_{it}$: $t = 1, \dots, T$)
• Ignoring this leads to under-estimated S.E.s
– POLS ignores the panel structure completely
– FE assumes equi-correlated errors over $t$, which is a quite unrealistic error structure
• Solution I: Assume a more realistic error structure
– In Stata with
- xtgee: generalized linear models with unit-specific correlation structure
- xtregar: panel regression with first-order autoregressive error term
– Drawback: results heavily depend on the assumptions made
– And: xtgee estimates pooled models!
- Thus, the S.E.s might be improved. However, the effect estimates are probably severely biased!
Panel-Robust S.E.s
• Solution II: panel-robust S.E.s
– An extension of the Huber-White sandwich estimator
- They correct for arbitrary serial correlation and heteroskedasticity
- For formulas see Brüderl/Ludwig (2015: 334)
- In Stata via vce(cluster id)
– However, panel-robust S.E.s are also biased in finite samples
- Sometimes they are even smaller than conventional S.E.s
- But: “Your standard errors probably won’t be quite right, but they rarely are. Avoid embarrassment by being your own best skeptic, and especially, DON’T PANIC!” (Angrist/Pischke, 2009: 327)
- The major task with non-experimental data is to get the (causal) effect estimates right, a minor task is to get the S.E.s right!
• Solution III: (panel) bootstrap S.E.s
– In Stata via vce(bootstrap)
- Draw many samples over $i$ (with replacement) (size $N$)
- Calculate the coefficient estimate
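The three kinds of standard errors can be requested through the vce() option of xtreg. The sketch below only illustrates the syntax: it assumes "Wage Premium.dta" is in the working directory, and with just four men none of these standard errors is meaningful. The number of replications and the seed are arbitrary choices made for this illustration.

```stata
* Assumption: "Wage Premium.dta" is in the current working directory
use "Wage Premium.dta", clear
xtset id time

* Conventional S.E.s
xtreg wage marr, fe

* Solution III: panel bootstrap S.E.s (persons are resampled with replacement);
* reps() and seed() are arbitrary values chosen for this sketch
xtreg wage marr, fe vce(bootstrap, reps(200) seed(12345))

* Solution II: panel-robust S.E.s, clustered on the person identifier
xtreg wage marr, fe vce(cluster id)
```

The point estimate is the same in all three runs; only the standard errors, test statistics, and confidence intervals change.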
Random-Effects Estimation
• There is another popular estimation strategy
– Random-effects (RE) estimation
• A multivariate error components model (incl. constant)
$y_{it} = \beta_0 + \mathbf{x}_{it}\boldsymbol{\beta} + \mathbf{z}_i\boldsymbol{\gamma} + \alpha_i + \varepsilon_{it}$
– We assume that the $\alpha_i$ are i.i.d. random-effects
- Usually normal distribution is assumed
– For this model we need both exogeneity assumptions
- No time-constant unobserved heterogeneity
$E(\mathbf{x}_{it}' \alpha_i) = 0$ (random-effects assumption)
- No time-varying unobserved heterogeneity
$E(\mathbf{x}_{is}' \varepsilon_{it}) = 0$, for all $s, t = 1, \dots, T$ (strict exogeneity assumption)
– Then the pooled feasible generalized least squares (FGLS) estimator is consistent and efficient
- POLS is also consistent but not efficient due to autocorrelated error terms (induced by $\alpha_i$)
• If the random-effects assumption is violated,
the RE estimates will be biased!
Random-Effects Estimation
• Alternatively RE can be obtained via transformed data
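– RE is obtained by applying POLS to the quasi-demeaned data
$y_{it} - \theta\bar{y}_i = (1 - \theta)\beta_0 + (\mathbf{x}_{it} - \theta\bar{\mathbf{x}}_i)\boldsymbol{\beta} + (1 - \theta)\mathbf{z}_i\boldsymbol{\gamma} + (1 - \theta)\alpha_i + (\varepsilon_{it} - \theta\bar{\varepsilon}_i)$ (4)
– Where $\theta = 1 - \sqrt{\dfrac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + T\sigma_\alpha^2}}$
– This shows that the RE estimator mixes between and within estimators. The two extreme cases are
- $\theta = 1$: FE estimator (e.g., $T \to \infty$, $\sigma_\alpha^2$ large)
- $\theta = 0$: POLS
– RE estimates are between POLS and FE (in the bivariate case)
• As can be seen, with RE we also get estimates of the
effects of time-constant regressors ($\boldsymbol{\gamma}$)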
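The quasi-demeaning in (4) can be replicated by hand from the variance components that xtreg stores. The sketch below assumes "Wage Premium.dta" is in the working directory and uses $T = 6$ as in these balanced data; the scalar and variable names (`theta`, `b_re`, `qwage`, `qmarr`, `qcons`, `mwage`, `mmarr`) are hypothetical.

```stata
* Assumption: "Wage Premium.dta" is in the current working directory
use "Wage Premium.dta", clear
xtset id time

* RE estimation; sigma_u and sigma_e are stored in e()
xtreg wage marr, re theta
scalar b_re  = _b[marr]
scalar theta = 1 - sqrt(e(sigma_e)^2 / (e(sigma_e)^2 + 6*e(sigma_u)^2))
display "theta computed by hand: " theta

* Quasi-demeaning: subtract theta times the person mean
bysort id: egen double mwage = mean(wage)
bysort id: egen double mmarr = mean(marr)
generate double qwage = wage - scalar(theta)*mwage
generate double qmarr = marr - scalar(theta)*mmarr
generate double qcons = 1 - scalar(theta)     // transformed constant

* POLS on the quasi-demeaned data reproduces the RE coefficients
regress qwage qmarr qcons, noconstant
```

With the variance components from these data, theta is about 0.96, which is why the RE estimate lies so close to the FE estimate.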
Example: Random-Effects Regression
. xtreg wage marr, re theta
Random-effects GLS regression                   Number of obs      =        24
Group variable: id                              Number of groups   =         4
R-sq:  within  = 0.8982                         Obs per group: min =         6
       between = 0.8351                                        avg =       6.0
       overall = 0.4065                                        max =         6
Random effects u_i ~ Gaussian                   Wald chi2(1)       =    128.82
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000
theta              = .96138358
------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        marr |   502.9802   44.31525    11.35   0.000     416.1239    589.8365
       _cons |   2499.255   406.0315     6.16   0.000     1703.448    3295.062
-------------+----------------------------------------------------------------
     sigma_u |  706.57936
     sigma_e |  66.885605
         rho |  .99111885   (fraction of variance due to u_i)
------------------------------------------------------------------------------
Data: Wage Premium.dta
Interpreting the RE Output
• The RE model does not succeed in identifying the true
causal effect!
– The RE estimate implies that marriage increases the wage by 503 €
- The reason for the bias is that in our data the random-effects assumption is violated
- However, the bias is low, because $\hat\theta = 0.96$
- This is because $\sigma_\alpha^2$ is so large in our data
– The effect is significant (judged by the z-value or the p-value)
– The “relevant” $R^2$ would refer to (4) and is not estimable
- Model fit can “approximately” be judged by the within $R^2$, where the RE estimates are plugged into (3)
- Approximately 90% of the within wage variation is explained by marital status change
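Why theta is so close to 1 can be checked directly from the variance components. The sketch below only copies sigma_u, sigma_e and T = 6 from the RE output of the wage example; it needs no data in memory, and the scalar names are chosen for illustration.

```stata
* Values copied from the xtreg, re output of the wage example
scalar sigma_u = 706.57936
scalar sigma_e = 66.885605
scalar nwaves  = 6

* theta = 1 - sqrt( sigma_e^2 / (T*sigma_u^2 + sigma_e^2) )
display "theta = " %9.7f 1 - sqrt(scalar(sigma_e)^2 / (scalar(nwaves)*scalar(sigma_u)^2 + scalar(sigma_e)^2))

* rho = share of the error variance that is due to the person-specific component
display "rho   = " %9.7f scalar(sigma_u)^2 / (scalar(sigma_u)^2 + scalar(sigma_e)^2)
```

Up to rounding this should reproduce the theta of about 0.961 and the rho of about 0.991 reported by xtreg: almost all error variance is between persons, so RE is pushed very close to FE.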
Chapter IV:
The Basics of Linear Panel Regression
Josef Brüderl
Applied Panel Data Analysis
When Does a Within Approach Apply?
• Within estimators cannot estimate
the effects of time-constant variables
– E.g., sex, nationality, social origin, birth cohort, etc.
– This reflects the fact that panel data do not help to identify the causal effect of time-constant variables
- One can even discuss, whether the concept of “causality” makes sense with time-constant variables!
• The "within logic" applies only with time-varying variables
– Something has to “happen”: only then is a before-after comparison possible
- That is: analyzing the effects of events
– Such questions are the main strength of panel data and within methodology
The Primary Advantage of Panel Data
• Panel data and within estimation allow us to identify causal effects under weaker assumptions: time-constant unobserved heterogeneity does not bias estimates
– “In many applications the whole point of using panel data is to allow for $\alpha_i$ to be arbitrarily correlated with the $x_{it}$. A fixed effects analysis achieves this purpose explicitly.” (Wooldridge, 2010: 300)
– “The DiD, fixed effects, and first difference estimators (within estimators) offer researchers the capacity to dispense with the random effects assumption and still obtain unbiased and consistent estimates when unit effects [$\alpha_i$] are arbitrarily correlated with measured explanatory variables. This is widely regarded as the primary advantage of panel data.” (Halaby, 2004: 516)
Sociologists Often Use RE Models
• RE is biased by time-constant unobserved heterogeneity
– Since time-constant unobserved heterogeneity is ubiquitous in non-experimental social research, RE estimates generally will be biased
– So why would anybody want to use RE models?
• Unfortunately, sociologists often use RE models
– Halaby (2004) identifies 31 papers appearing in ASR and AJS between 1990 and 2003 that use panel data for causal analysis. 15 out of these used RE only
– Giesselmann/Windzio (2014) identify 10 papers appearing in ZfS and KZfSS between 2000 and 2009 that use panel data for causal analysis. 3 out of these used RE only
• Two arguments in favor of RE are often brought forward
– RE allows for estimating effects of time-constant regressors
– RE is more efficient than FE, if the random-effects assumption holds
Effects of Time-Constant Regressors
• Often it is argued that wiping out all time-constant regressors is a shortcoming of FE estimation
– However, wiping out all time-constant regressors is not a shortcoming of FE. In fact it is a major strength, because along with them all time-constant unobservables are eliminated as well
• The shortcoming is with a style of data analysis that by default throws all kinds of controls into a regression (“kitchen-sink approach”)
– Using the RE estimator only to report effects of sex, race, etc. risks throwing away the big advantage of panel data
– Instead of a thoughtless kitchen-sink approach we should carefully think about the identification of a single causal effect (X centering)
• If one has substantive interest in the effect of a time-constant regressor, one should use group specific growth curves, instead of a simple RE model (see Chapter VI)
Bias versus Efficiency
• Sometimes it is argued that a biased but efficient estimate (RE) might be preferable to an unbiased but less efficient estimate (FE)
– This argument is only sound, if the bias is small
– However, RE is more efficient because it also uses the endogenous between variation. Generally, this is not a good idea, because this will produce a large bias.
A thought experiment to illustrate the point
[Figure: sampling densities of the FE estimate and the RE estimate; x-axis from -4 to 4, y-axis: density; “true” value: β = 0]
Testing whether RE or FE: Hausman Test
• Use RE models only, if a Hausman test says “it is ok”
– The intuition: the FE estimates ($\hat\beta_{FE}$) are consistent; if the RE estimates ($\hat\beta_{RE}$) do not differ too much, one can use RE regression
$H_0: \hat\beta_{FE} \approx \hat\beta_{RE}$
– If you are not able to reject $H_0$, then you can use RE

             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |       FE           RE         Difference          S.E.
-------------+----------------------------------------------------------------
        marr |      500        502.9802      -2.980234        1.210069
------------------------------------------------------------------------------
b = consistent under Ho and Ha; obtained from xtreg
B = inconsistent under Ha, efficient under Ho; obtained from xtreg
Test: Ho: difference in coefficients not systematic
chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B) = 6.07
Prob>chi2 = 0.0138
Data: Wage Premium.dta
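With a single coefficient the Hausman statistic reduces to a squared ratio, which makes the logic of the test easy to check by hand. The sketch below only copies the three numbers from the Hausman output of the wage example; the scalar names are chosen for illustration.

```stata
* Values copied from the Hausman output (single coefficient: marr)
scalar b_fe  = 500
scalar b_re  = 502.9802
scalar se_d  = 1.210069                      // sqrt(V_b - V_B)

* chi2(1) = ((b - B) / S.E.)^2, p-value from the upper tail of the chi2 distribution
scalar hstat = ((scalar(b_fe) - scalar(b_re)) / scalar(se_d))^2
display "chi2(1)   = " %5.2f scalar(hstat)
display "Prob>chi2 = " %6.4f chi2tail(1, scalar(hstat))
```

This should reproduce the reported test (6.07, p = 0.0138): although FE and RE differ by only 3 €, the difference is estimated so precisely that the null hypothesis is rejected.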
Chapter V:
A Real Data Example:
Marriage and Happiness
Example: Does Marriage Make Happy?
• In the following we will use a real data example
• The goal is to estimate the causal effect of (first) marriage
on happiness
– More exactly: life satisfaction (or: subjective well-being)
• Data: SOEP 1984-2009 (v26)
– The data set contains repeated measures for the same persons on the following variables:
- Life satisfaction, marriage, years married, household income, age, sex, year of interview
The Problem again is Self-Selection: Happy People are More Likely to Get Married
[Figure, two panels (Cross-Section, Panel): happiness (1–7) by year (1–3), with the marriage marked]
See also: Stutzer/Frey (2005)
Preparing the Data for Panel Analysis
• Retrieving the data:
“Happiness 1 Retrieval.do”
- First “happy” is retrieved from $P, GPOST, and $PAGE17
- The covariates needed are retrieved from $PEQUIV
• Preparing the data:
“Happiness 2 DataPrep.do”
- Variables are recoded and time-varying covariate “marry” is built - The estimation sample is selected
• Analyzing the data:
“Happiness 3/4/5 Regressions.do”
- Data file: “Happiness2.dta”
- The lecture-package includes an anonymized version of these data (50% sample). Therefore, results are very similar, but not identical to the ones reported in the lecture.
marry: marriage dummy
age: age in years
loghhinc: natural logarithm of annual post-government household income
yrsmarried: years married
woman: sex (dummy for women)
Defining the Estimation Sample
• How to define the estimation sample?
– Practically very important, but (almost) nothing can be found in the literature
– One should include only those, who potentially can change from the state of not-treated to treated (Sobel 2012)
- Only those persons, who experience treatment during the observation period provide within information and identify the treatment effect
- Include also the never-treated persons as a control group
- The already-treated might bias the estimation of the treatment effect
- The already-treated might improve the precision of the estimated age effect. However, if the treatment effect varies over time, then the age effect of the already-treated might be distorted.
• For our example, we restrict the estimation sample accordingly
– Only persons are included, who were single when first observed (persons married when first observed, are discarded altogether!)
– Person-years after marital separation are excluded
– Persons with only one person-year are excluded
• Deleting so many observations differs markedly from what one is used to from cross-sectional data analysis!
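Two of these restrictions can be sketched on a small made-up person-year file. This is only an illustration under assumptions: the toy data and the helper variables married_at_start and n_pyrs are invented here, and the exclusion of person-years after separation is not covered, because it needs a separation indicator.

```stata
* Toy person-year data (made up): id, wave, marriage dummy
clear
input id year marry
1 1 0
1 2 0
1 3 1
2 1 1
2 2 1
3 1 0
4 1 0
4 2 0
end

* Discard persons who are already married when first observed
bysort id (year): generate byte married_at_start = marry[1]
drop if married_at_start == 1

* Discard persons with only one person-year (no within information)
bysort id: generate n_pyrs = _N
drop if n_pyrs < 2

list id year marry, sepby(id) noobs
```

In the toy data person 2 (already treated) and person 3 (one person-year) are dropped; person 1 (marries during observation) and person 4 (never-treated control) remain.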
Time-Ordering of Events
• Causality requires that the cause precedes the effect
– Panel data help to identify the time-ordering of treatment and outcome
– This has to be taken into account when preparing the data
• Example: binary treatment (absorbing)
– An event happens between wave $t-1$ and $t$
$x_{i1}, \dots, x_{i,t-1} = 0$; $x_{it}, \dots, x_{iT} = 1$
– Outcomes have to be measured accordingly
$y_{i1}, \dots, y_{i,t-1}$ measured before event
$y_{it}, \dots, y_{iT}$ measured after event
• Happiness example
– All variables measured at time of interview. Therefore, no problem
‐ $x_{i,t-1} = 0$, $x_{it} = 1$ if there was a marriage between $t-1$ and $t$
‐ $y_{i,t-1}$ is then measured before, and $y_{it}$ after marriage
How to Model a Causal Effect?
• With panel data we can investigate the time path of a causal effect
• Termed “impact function” (IF) by Andreß et al. (2013)
• Different impact functions can be modeled
Step impact function
Immediate and permanent impact
● event dummy (0,0,0,0,1,1,1)
Continuous impact function
Immediate, but transitory impact
● event dummy (0,0,0,0,1,1,1)
● linear event time (0,0,0,0,0,1,2)
● quadratic event time (0,0,0,0,0,1,4)
Dummy impact function
Arbitrary impact (including anticipation effect)
● dummy event time
− -1 dummy (0,0,0,1,0,0,0)
− 0 dummy (0,0,0,0,1,0,0)
− 1 dummy (0,0,0,0,0,1,0)
− 2 dummy (0,0,0,0,0,0,1)
[Figure, one sketch per impact function: Y against T, with the event marked]
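The regressors behind the three impact functions can be built from the event dummy alone. The following is a minimal sketch on one made-up person with seven waves; it assumes an absorbing event and yearly data without gaps, and the variable names evtime, evtime2, ym and d_m1 are chosen for illustration.

```stata
* Toy data (made up): one person, seven waves, event between wave 4 and 5
clear
input id year marry
1 1 0
1 2 0
1 3 0
1 4 0
1 5 1
1 6 1
1 7 1
end

* Continuous IF: linear and quadratic event time
bysort id (year): generate evtime = max(sum(marry) - 1, 0)   // 0,0,0,0,0,1,2
generate evtime2 = evtime^2                                   // 0,0,0,0,0,1,4

* Dummy IF: event-centered time scale (-1 = all waves before the event)
bysort id (year): generate ym = sum(marry) - 1                // -1,-1,-1,-1,0,1,2

* Anticipation: dummy for the last wave before the event
bysort id (year): generate byte d_m1 = (marry == 0 & marry[_n+1] == 1)   // 0,0,0,1,0,0,0

list year marry evtime evtime2 ym d_m1, noobs
```

The step impact function needs nothing beyond marry itself; the continuous one adds evtime and evtime2; the dummy one uses one indicator per value of ym.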
Presenting Regression Results Graphically
• Instead of regression tables with many numbers, it is helpful
to present the important results graphically (Bauer, 2015)
• Basically there are three types of graphs available
– Plotting average marginal effects (AME) (effect plot)
- How does a one-unit change in X affect Y? (marginal effect for continuous variables, discrete change for categorical variables)
- Computed for each observation in the data given their respective values on other variables. Then averaged over all observations.
- For linear models these are simply the regression coefficients
– Plotting the predicted values (profile plot)
- What are the predicted values for Y given the values of X?
- Computed for each observation in the data and then averaged (predictive margins)
– Plotting AMEs of X conditional on values of Z (conditional effect plot)
- How does the AME of X change over the values of Z?
- Helpful for interaction effects and for impact functions
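In Stata all three graph types can be produced with margins and marginsplot. The sketch below assumes that "Happiness2.dta" from the lecture package is in the working directory; the interaction model is purely hypothetical and serves only to have an effect that varies over a second variable.

```stata
use "Happiness2.dta", clear
xtset id year

* Hypothetical model: marriage effect allowed to vary with log household income
xtreg happy i.marry##c.loghhinc age, fe vce(cluster id)

* (1) Average marginal effects (effect plot)
margins, dydx(marry age)
marginsplot

* (2) Predictive margins over age (profile plot)
margins, at(age=(20(10)60))
marginsplot

* (3) AME of marriage conditional on income (conditional effect plot)
margins, dydx(marry) at(loghhinc=(9(0.5)11.5))
marginsplot, yline(0)
```

Each marginsplot graphs the results of the margins command that immediately precedes it.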
Describing Panel Data I
. xtset id year
       panel variable:  id (unbalanced)
        time variable:  year, 1984 to 2009, but with gaps
                delta:  1 unit

. xtdes, pattern(20)

      id:  103, 202, ..., 8276802                          n =      14634
    year:  1984, 1985, ..., 2009                           T =         26

Distribution of T_i:   min      5%     25%     50%     75%     95%     max
                         2       2       4       7      11      22      26

     Freq.  Percent    Cum. |  Pattern
----------------------------+--------------------------------------------------
      808     5.52     5.52 |  ...1111111111                 Sample F (2000)
      551     3.77     9.29 |  ...1111                       Sample H (2006)
      390     2.67    11.95 |  ...111
      348     2.38    14.33 |  11111111111111111111111111    Sample A/B (1984)
      347     2.37    16.70 |  ...11111111                   Sample G (2002)
      337     2.30    19.00 |  ...11
      302     2.06    21.07 |  ...11111
      271     1.85    22.92 |  ...111111
      250     1.71    24.63 |  ...111111111111111111
      235     1.61    29.61 |  11...                         Attrition sample A/B
      222     1.52    31.13 |  111...                        Attrition sample A/B
      141     0.96    36.96 |  1111...                       Attrition sample A/B
     8981    61.37   100.00 |  (other patterns)
----------------------------+--------------------------------------------------
    14634   100.00          |  XXXXXXXXXXXXXXXXXXXXXXXXXX
Data: Happiness2.dta
Describing Panel Data II
. xttrans marry, freq

     marry |      marry (t+1)
       (t) |         0          1 |     Total
-----------+----------------------+----------
         0 |    79,209      3,793 |    83,002
           |     95.43       4.57 |    100.00
-----------+----------------------+----------
         1 |         0     24,283 |    24,283
           |      0.00     100.00 |    100.00
-----------+----------------------+----------
     Total |    79,209     28,076 |   107,285

. xtsum marry age loghhinc woman

Variable          |      Mean   Std. Dev.       Min        Max |    Observations
------------------+--------------------------------------------+----------------
marry    overall  |   .230284   .4210163          0          1 |     N =  121919
         between  |             .2694525          0   .9615385 |     n =   14634
         within   |             .2671083  -.7312544   1.191823 | T-bar = 8.33121
                  |                                            |
age      overall  |  29.27678   11.12508         16         97 |     N =  121919
         between  |             10.47358       16.5       96.5 |     n =   14634
         within   |             4.310836   13.90178   45.19345 | T-bar = 8.33121
                  |                                            |
loghhinc overall  |  10.21187   .6562057          0   14.16139 |     N =  121919
         between  |             .5664285   6.034016    13.2728 |     n =   14634
         within   |             .4252418   2.134135    12.9504 | T-bar = 8.33121
                  |                                            |
woman    overall  |   .467179   .4989237          0          1 |     N =  121919
         between  |             .4992656          0          1 |     n =   14634
Is there enough within variation?
Data: Happiness2.dta
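Whether there is enough within variation can also be checked by counting the persons whose treatment status actually changes. This is a minimal sketch, assuming "Happiness2.dta" is in the working directory; the helper variables sd_marry and onerow are made up for illustration.

```stata
use "Happiness2.dta", clear

* Within-person standard deviation of the treatment dummy
bysort id: egen double sd_marry = sd(marry)

* Flag exactly one row per person, so that persons (not person-years) are counted
egen byte onerow = tag(id)

count if onerow == 1                                          // all persons
count if onerow == 1 & sd_marry > 0 & !missing(sd_marry)      // persons with a change in marry
```

Only the persons in the second count identify the marriage effect in a within estimation. Because marriage is absorbing in this sample, that count should correspond to the number of 0 to 1 transitions shown by xttrans.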
Chapter V
Section: The Results
- Panel-robust S.E.s
- Step impact function
- Continuous impact function
- Dummy impact function
Panel-Robust S.E.s
-----------------------------------------------------------
          |      FE              FE              FE
     S.E. | conventional    panel-robust     bootstrap
----------+------------------------------------------------
    marry |     0.1668          0.1668          0.1668
          |     0.0168          0.0226          0.0189
          |     9.9503          7.3692          8.8254
      age |    -0.0413         -0.0413         -0.0413
          |     0.0010          0.0017          0.0017
          |   -39.3197        -23.9552        -23.6374
 loghhinc |     0.1245          0.1245          0.1245
          |     0.0093          0.0123          0.0128
          |    13.4180         10.1600          9.7087
-----------------------------------------------------------
legend: b/se/t
Conventional S.E.s are too small.
Panel-robust S.E.s are close to the bootstrap S.E.s.
Obviously, with over 14,000 clusters asymptotics works well.
In the following we will always use panel-robust S.E.s!
Data: Happiness2.dta
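A comparison of this kind can be sketched as follows. The sketch assumes "Happiness2.dta" is in the working directory; the number of bootstrap replications, the seed, and the names of the stored estimates are arbitrary choices made here, so the bootstrap column will not match the table exactly.

```stata
use "Happiness2.dta", clear
xtset id year

* Conventional S.E.s
xtreg happy marry age loghhinc, fe
estimates store se_conv

* Panel-robust (cluster) S.E.s
xtreg happy marry age loghhinc, fe vce(cluster id)
estimates store se_rob

* Panel bootstrap: whole persons are resampled, not single person-years
xtreg happy marry age loghhinc, fe vce(bootstrap, reps(200) seed(12345))
estimates store se_boot

* Coefficients, S.E.s and t-values side by side
estimates table se_conv se_rob se_boot, b(%9.4f) se(%9.4f) t(%9.4f)
```

The coefficients are identical across the three columns; only the S.E.s and therefore the t-values differ.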
Comparing Results From Step IF
--------------------------------------------------------------------
Variable |     BE        POLS        RE         FE         FD
---------+----------------------------------------------------------
   marry |   0.52***    0.34***    0.20***    0.17***    0.14***
loghhinc |   0.50***    0.38***    0.20***    0.12***    0.05**
   woman |   0.03       0.03       0.03      (omitted)  (omitted)
---------+----------------------------------------------------------
       N |  14634     121919     121919     121919     104671
 N_clust |             14634      14634      14634      14511
--------------------------------------------------------------------
legend: * p<0.05; ** p<0.01; *** p<0.001
Marriage effect:
- Heavily biased upwards by BE and POLS
- RE is still biased upwards (median theta is 0.58)
- FD too low due to anticipation (see below)
Income effect: Heavily biased upwards by BE, POLS, and RE; unobservables affecting income and happiness: happy people earn more money
Sex effect: Time-constant variable not estimable with FE and FD
FD: Due to gaps in the data we lose some groups (clusters) and observations.
All models control for age and cohort (see Chapter VI for details)
Data: Happiness2.dta
Do-File: Happiness 3 Regressions.do
Comparing Estimation Results Graphically
• Comparing regression coefficients across models is much more effective, if done graphically
– The “coefplot” package (Jann 2014) is very helpful in this respect
ssc install coefplot, replace // Install "coefplot" package (Jann 2014)
coefplot POLS RE FE, keep(marry loghhinc woman) xline(0)
[Figure: regression coefficients with 95% CIs for marriage, ln(HHincome), and woman; x-axis: effect on happiness (-.1 to .4)]
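The coefplot call expects stored estimates named POLS, RE and FE. One way to create them is sketched below, assuming "Happiness2.dta" is in the working directory and coefplot is installed. The exact specification (age and cohort entered as controls in this form) is an assumption made here, not taken from the do-file.

```stata
use "Happiness2.dta", clear
xtset id year

* Pooled OLS with panel-robust S.E.s
regress happy marry loghhinc woman age i.cohort, vce(cluster id)
estimates store POLS

* Random effects
xtreg happy marry loghhinc woman age i.cohort, re vce(cluster id)
estimates store RE

* Fixed effects (woman and cohort are time-constant and are omitted)
xtreg happy marry loghhinc woman age i.cohort, fe vce(cluster id)
estimates store FE

* Coefficients of the three models side by side
coefplot POLS RE FE, keep(marry loghhinc woman) xline(0)
```

The keep() option restricts the plot to the coefficients of interest, so the age and cohort controls do not clutter the graph.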
Why Do We Need a Control Group?
• Estimation sample: (1) only those who married, (2) plus control group
– Marriage effect is not affected, because the control group contributes nothing to the FE estimate
– S.E.s differ a little, because the d.f.s are different
• (3) and (4) include age as control
– The marriage effect in (1) and (2) is obviously heavily biased
- The reason is that with increasing age there is a happiness decline (more details in Chap. VI)
- Lesson 1: It is important to control for time-varying confounders in FE models
– In (3) the age effect is estimated only with those who married. It is too low, as can be seen in the full sample (4) including the control group
- This also affects the marriage effect, which is too low in (3)
- Lesson 2: We see that it is important to include a control group to get the estimates of the control variables right

--------------------------------------------------------
FE estimates |    (1)       (2)       (3)       (4)
-------------+------------------------------------------
       marry |  -0.139    -0.139     0.136     0.186
             |  0.0139    0.0144    0.0186    0.0167
         age |                      -0.033    -0.039
             |                      0.0015    0.0010
-------------+------------------------------------------
           N |   49235    121919     49235    121919
     N_clust |    3793     14634      3793     14634
--------------------------------------------------------
legend: b/se
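The four columns can be sketched as follows, assuming "Happiness2.dta" is in the working directory. The indicator evermarried and the names m1 to m4 are made up for illustration, and the S.E. type used for the table is not stated here (add vce(cluster id) for panel-robust S.E.s).

```stata
use "Happiness2.dta", clear
xtset id year

* Persons who marry at some point during the observation period
bysort id: egen byte evermarried = max(marry)

xtreg happy marry if evermarried == 1, fe        // (1) only those who married
estimates store m1
xtreg happy marry, fe                            // (2) plus control group
estimates store m2
xtreg happy marry age if evermarried == 1, fe    // (3) as (1), controlling for age
estimates store m3
xtreg happy marry age, fe                        // (4) as (2), controlling for age
estimates store m4

estimates table m1 m2 m3 m4, b(%6.3f) se(%6.4f) stats(N N_g)
```

Comparing m3 with m4 shows what the never-married contribute: they carry no information on the marriage effect itself, but they help to estimate the age effect.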
Deciding Between FE and RE: Hausman Test
. xtreg happy marry age loghhinc i.cohort, re
. est store RE
. xtreg happy marry age loghhinc, fe
. est store FE
. hausman FE RE, sigmamore

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |       FE           RE         Difference          S.E.
-------------+----------------------------------------------------------------
       marry |    .1667519     .1962653       -.0295134        .0068511
         age |   -.0412592     -.040121       -.0011383        .0004667
    loghhinc |    .1245142     .1956244       -.0711102        .0039792
------------------------------------------------------------------------------
b = consistent under Ho and Ha; obtained from xtreg
B = inconsistent under Ha, efficient under Ho; obtained from xtreg
Test: Ho: difference in coefficients not systematic
chi2(3) = (b-B)'[(V_b-V_B)^(-1)](b-B) = 544.31
Prob>chi2 = 0.0000
Conclusion: use the FE model
Data: Happiness2.dta
Results of a FE Model with Continuous IF
. xtreg happy i.marry c.yrsmarried##c.yrsmarried age loghhinc, fe vce(cluster id)

Fixed-effects (within) regression               Number of obs      =    121919
Group variable: id                              Number of groups   =     14634

R-sq:  within  = 0.0162                         Obs per group: min =         2
       between = 0.0221                                        avg =       8.3
       overall = 0.0152                                        max =        26

                                                F(5,14633)         =    144.05
corr(u_i, Xb)  = -0.1808                        Prob > F           =    0.0000

                                   (Std. Err. adjusted for 14634 clusters in id)
-------------------------------------------------------------------------------------------
                          |               Robust
                    happy |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------------------+----------------------------------------------------------------
                    marry |     0.2413     0.0231    10.47   0.000       0.1961      0.2865
               yrsmarried |    -0.0411     0.0065    -6.29   0.000      -0.0539     -0.0283
c.yrsmarried#c.yrsmarried |     0.0017     0.0004     4.71   0.000       0.0010      0.0024
                      age |    -0.0372     0.0021   -17.33   0.000      -0.0415     -0.0330
                 loghhinc |     0.1306     0.0124    10.57   0.000       0.1064      0.1549
                    _cons |     6.8787     0.1360    50.57   0.000       6.6121      7.1453
--------------------------+----------------------------------------------------------------
                  sigma_u |  1.2861966
                  sigma_e |  1.3325279
                      rho |  .48231322   (fraction of variance due to u_i)
-------------------------------------------------------------------------------------------
Data: Happiness2.dta
Coefficient Plot
coefplot, drop(_cons) xline(0)
[Figure: regression coefficients with 95% CIs for marriage, yrsmarried, yrsmarried # yrsmarried, age, and ln(HHincome)]
Impact Function of Marriage (Conditional Effect Plot)
[Figure: change in happiness (y-axis, -0.3 to 0.5) by years since marriage (x-axis, 0 to 15), with 95%-CI]
• Of central interest: time path of the marginal marriage effect
– Change in happiness due to a marriage ($marry$) over yrsmarried ($yrs$)
$$\hat\beta_{marry} + \hat\beta_{yrs} \ast yrs + \hat\beta_{yrs^2} \ast yrs^2$$
What is the reference point?
• Average happiness of all pyrs before marriage
• Important point: only of those, who eventually marry. Not of the always singles
• After all this is a within estimator!
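The impact function is a linear combination of three coefficients, so single points on it are easy to compute. The first part of the sketch only plugs in the rounded FE coefficients reported for the continuous IF model (0.2413, -0.0411, 0.0017); the second part is an assumption-laden variant that needs "Happiness2.dta" in the working directory and uses lincom to obtain confidence intervals as well.

```stata
* Part 1: point estimates from the reported (rounded) coefficients
forvalues k = 0(5)15 {
    display "years since marriage = `k': " %7.4f 0.2413 - 0.0411*`k' + 0.0017*`k'^2
}

* Part 2: the same combination with S.E.s and CIs, here for 5 years after marriage
use "Happiness2.dta", clear
xtset id year
xtreg happy i.marry c.yrsmarried##c.yrsmarried age loghhinc, fe vce(cluster id)
lincom _b[1.marry] + 5*_b[yrsmarried] + 25*_b[c.yrsmarried#c.yrsmarried]
```

With the rounded coefficients the effect is about 0.24 in the year of marriage, about 0.08 five years later, and practically zero after ten years.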
[Figure: conditional effect of marriage estimated by POLS, RE, and FE; change in happiness (y-axis, -0.3 to 0.5) by years since marriage (x-axis, 0 to 15)]
Comparing Models
• Comparing the conditional marriage effect over models
– We see that POLS heavily over-estimates the marriage effect
- Due to self-selection
– RE slightly over-estimates the marriage effect
Dummy Impact Function
• A flexible way to model the causal effect is by event time dummies
– For this, we have to construct an “event centered” time scale (ym)
-1: all years before marriage (ref. group)
0: the year of marriage
1: first year after marriage
…
15: 15th+ year after marriage
- Be careful in 0 year: event must have happened before outcome is measured!
- We collapse the dummies 15 - max due to low case numbers
– The event time dummies are easily included in a regression model via factor notation (i.ym) [0-dummy, 1-dummy, …, 15-dummy]
– Interpretation: the within estimator compares average happiness in a particular year with average happiness in all (!) years before marriage
– This model is known in the literature as distributed fixed-effects (Dougherty 2006)
- It can also be estimated by including lags and leads of the 0-dummy (this is often done by economists, e.g. Wooldridge 2010: chap. 10)
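A dummy impact function of this kind can be sketched as follows. The sketch assumes that "Happiness2.dta" is in the working directory and that yrsmarried is an integer counting 0 in the year of marriage; ymcat is a made-up helper that shifts the scale by one, because Stata's factor notation does not accept negative values.

```stata
use "Happiness2.dta", clear
xtset id year

* Event-centered time: -1 before marriage, 0 to 14 afterwards, 15 = 15th+ year
generate ym = cond(marry == 1, min(yrsmarried, 15), -1)

* Shift by one so that the reference group (all years before marriage) is category 0
generate ymcat = ym + 1

* FE regression with a full set of event time dummies (base: years before marriage)
xtreg happy ib0.ymcat age loghhinc, fe vce(cluster id)

* Plot the event time dummies (requires the coefplot package)
coefplot, keep(*.ymcat) vertical yline(0)
```

Category 1 of ymcat is the year of marriage, category 2 the first year after marriage, and so on; each coefficient is the difference to average happiness in all years before marriage.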
Dummy Impact Function
[Figure: dummy impact function; change in happiness (y-axis, -0.3 to 0.5) by years since marriage (x-axis, 0 to 15)]
• The dummy modelling in general supports the results from the parametric modelling above
– However, we see more details: Compared to the years before marriage
- Happiness increases by 0.31 in the year of marriage
- In the first year after marriage, happiness is only higher by 0.17
- Beginning with the fifth year, happiness is no longer significantly higher
Data: Happiness2.dta
How to Model a Causal Effect?
• Above I argued that one should use
– Either a step event dummy: 0 before, 1 after the event
– Or a full set of event time dummies
• Some argue that one should use the 0-dummy only
– The intuitive idea is that this variable captures the change event
– However, this modeling strategy would make sense only if the effect is immediate and very short-lived
– Example: With our fabricated data the within estimator compares red circled minus green circled points
- The result is +300 €
- Obviously, this is biased, because the causal effect persists and is therefore also contained in the reference group
[Figure: wage (EURO per month, 0 to 5000) by time (1 to 6) for the fabricated data, with the compared points circled in red and green]
Glücksfaktoren (“happiness factors”)
Source: Deutsche Post,
• FE models
• SOEP, 1992-2010
• The most important factor is good/bad health
• Life-events (marriage, widowhood, unemployment, divorce) are next
• Here modeled as permanent effects
• Social contacts/isolation comes next
• Age is here mis-specified (see next chapter)
• Money has relatively small effects
Chapter V
Section: Interpreting Results from Panel Regressions
– Interpreting panel regression estimates
– Interpreting results from impact functions
– Interpreting effects of continuous variables
– Interpreting interaction effects
Interpreting Panel Regression Estimates
• There is much confusion in the literature on how to
interpret results from panel regressions
• Regression estimates can be interpreted in two ways
– I) Descriptive interpretation
- “People who differ in X by one unit, differ in Y by $\hat\beta$”
– II) Causal interpretation
- “A one unit change in X, changes Y by $\hat\beta$”
- Sometimes called “change interpretation”
• Cross-sectional (between) regression
– It would be natural to choose interpretation I). After all that is the information provided by the data!
– However, often interpretation II) is chosen. But this is only ok, if the exogeneity assumptions hold
• Within regression
– It is natural to choose interpretation II). Because within estimates are obtained by a before-after comparison.
Interpreting Panel Regression Estimates
[Figure, shown once for each of the three regressions: wage (EURO per month, 0 to 5000) by time (1 to 6), before marriage and after marriage]
Cross-sectional regression:
- Married men earn 2500 € more than unmarried men (descriptive interpretation)
- A marriage increases men’s wage by 2500 € (causal interpretation)
POLS regression:
- Men living in marriage earn 1833 € more than men living unmarried (descriptive interpretation)
- A marriage increases men’s wage by 1833 € (causal interpretation)
Fixed-effects regression:
- After marriage men earn 500 € more than before marriage (descriptive interpretation)
- A marriage increases men’s wage by 500 € (causal interpretation)
The descriptive interpretation is always correct. The causal interpretation is only correct,
Confusion about Interpretation
• Some authors provide confused arguments
– Andreß et al. (2013) argue that
- I) is appropriate for POLS and FE (modeling the level) and that
- II) only works for FD (modeling the change)
- However, all models can be interpreted in levels (descriptive) or change (causal). But only FD and FE (!) use the change information contained in panel data to answer the causal question.
– Giesselmann/Windzio (2012) term
- I) “cross-sectional questions”
- II) “longitudinal questions”
- However, both descriptive and causal questions can be answered with either cross-secti