Econometrics
Fall 2012
Final Exam
Statement of Academic Honesty:
This exam entirely reflects my own work. I have not given assistance to anyone, nor have I received assistance from anyone. I am not aware that any other students have done so.
Signature: __________________________________________________
Name: __________________________________________________
Problem 1 (20 points)
Answer the following short-answer questions about research design.
a. You intend to study whether participation in a new exercise program has an effect on future cardiovascular health. You conduct an experiment among a group of people who volunteered for the study, randomly assigning half of them to the new exercise program and half of them to a conventional aerobics class. You are interested in whether the new program produces better outcomes than a conventional class. The problem is that the volunteers for your study are all younger and relatively healthy. Describe the impact that this will have on the internal validity and on the external validity of your study.
c. You find a large number of volunteers and randomly assign half of them to a new job training program. Letting 𝑋 = 1 if the person was enrolled in the training program and letting 𝑌 designate the wage, you estimate the regression 𝑌 = 𝛽0 + 𝛽1𝑋 + 𝑢 to determine whether the job training program has any impact on wage. A friend claims that marital status is known to impact wage and so it needs to be included in the regression. How would you respond to your friend’s criticism?
Problem 2 (10 points)
Each member of the population belongs to exactly one of three categories. Define the following dummy variables.
𝑋1 = 1 if a member of the first group, zero otherwise
𝑋2 = 1 if a member of the second group, zero otherwise
𝑋3 = 1 if a member of the third group, zero otherwise
𝑌 measures some outcome variable. You run an OLS regression, taking 𝑋3 as the omitted dummy, and you obtain the following.
𝑌̂ = 1 + 3𝑋1 + 11𝑋2
a. Suppose you had instead taken 𝑋1 to be the omitted dummy. Fill in the estimated regression coefficients. The coefficients could be negative or positive.
𝑌̂ = _____ + _____𝑋2 + _____𝑋3
b. Suppose you had left in all three dummies but omitted the intercept. Fill in the estimated regression coefficients. The coefficients could be negative or positive.
𝑌̂ = _____𝑋1 + _____𝑋2 + _____𝑋3
c. You have a member of the first group and a member of the second group. Which do you expect has a higher value of 𝑌? What is the expected difference?
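The reparameterizations in parts (a) and (b) can be checked numerically. The following is a minimal sketch in plain Python. The data are made up, with group means of 5, 9 and 2 chosen so that they do not match the regression above, and the helpers `ols` and `dummy` are illustrative names rather than anything supplied with the exam.

```python
# Illustrative only: synthetic data with hypothetical group means (5, 9, 2),
# chosen so that they do not reproduce the regression printed in the problem.

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def dummy(g, j):
    """1.0 if an observation in group g belongs to group j, else 0.0."""
    return 1.0 if g == j else 0.0

# Two observations per group; the group means are 5, 9 and 2.
groups = [1, 1, 2, 2, 3, 3]
y = [4.0, 6.0, 8.0, 10.0, 1.0, 3.0]

# (i) intercept, X1, X2 (X3 omitted)
omit3 = ols([[1.0, dummy(g, 1), dummy(g, 2)] for g in groups], y)
# (ii) intercept, X2, X3 (X1 omitted)
omit1 = ols([[1.0, dummy(g, 2), dummy(g, 3)] for g in groups], y)
# (iii) X1, X2, X3 with no intercept
no_const = ols([[dummy(g, 1), dummy(g, 2), dummy(g, 3)] for g in groups], y)

print(omit3)     # intercept = mean of group 3; slopes = differences from group 3
print(omit1)     # intercept = mean of group 1; slopes = differences from group 1
print(no_const)  # each coefficient = that group's own mean
```

All three specifications imply the same fitted value for each group. Only the baseline against which the coefficients are measured changes.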
Problem 3 (10 points)
You have data on 𝑌1, 𝑌2 and 𝑋 from a random sample of individuals. Consider the two regressions:
𝑌1 = 𝛼0 + 𝛼1𝑋 + 𝑢1
𝑌2 = 𝛽0 + 𝛽1𝑋 + 𝑢2
Both are valid regressions in the sense that 𝐸(𝑢1|𝑋) = 0 and 𝐸(𝑢2|𝑋) = 0.
Problem 4 (15 points)
A researcher is interested in the determinants of crime. He collects data for a large number of cities on the following variables.
Variable    Description
PROPCRIM    Rate of property crimes
VIOLCRIM    Rate of violent crimes
POLICE      Expenditure on police
MEDH        Median home price
POPD        Population density
UNEM        Unemployment rate
DEATH       Dummy, =1 if state uses death penalty for violent crimes
MEDAGE      Median age
Assume that the following is the true model that describes the relationship:
𝑃𝑅𝑂𝑃𝐶𝑅𝐼𝑀 = 𝛼0 + 𝛼1 ⋅ 𝑃𝑂𝐿𝐼𝐶𝐸 + 𝛼2 ⋅ 𝑀𝐸𝐷𝐻 + 𝛼3 ⋅ 𝑃𝑂𝑃𝐷 + 𝛼4 ⋅ 𝑈𝑁𝐸𝑀 + 𝑢 (1)
𝑉𝐼𝑂𝐿𝐶𝑅𝐼𝑀 = 𝛽0 + 𝛽1 ⋅ 𝑃𝑂𝐿𝐼𝐶𝐸 + 𝛽2 ⋅ 𝐷𝐸𝐴𝑇𝐻 + 𝛽3 ⋅ 𝑀𝐸𝐷𝐴𝐺𝐸 + 𝑣 (2)
𝑃𝑂𝐿𝐼𝐶𝐸 = 𝛾0 + 𝛾1 ⋅ 𝑉𝐼𝑂𝐿𝐶𝑅𝐼𝑀 + 𝛾2 ⋅ 𝑃𝑅𝑂𝑃𝐶𝑅𝐼𝑀 + 𝑤 (3)
a. If the coefficients are estimated with proper instrumental variables, what signs do you expect for each coefficient? Give a brief explanation.
Coefficient    Expected Sign    Reasoning
𝛼1
𝛼2
𝛼3
𝛼4
𝛽1
𝛽2
𝛽3
𝛾1
𝛾2
b. Consider using TSLS to estimate each equation. For each of the three equations, state which independent variable(s) is/are endogenous and which instrument(s) you would use for each endogenous variable. You do not have any data beyond what is listed in the table; you can complete the estimation with these variables only.
EQUATION (1):
EQUATION (2):
EQUATION (3):
Problem 5 (25 points)
Some jobs pay a flat salary and some jobs pay employees based on performance. An economist is interested in how wages differ depending on the compensation structure. The economist collects data on a panel of more than 26,000 workers and estimates a regression of log(𝑤𝑎𝑔𝑒). Part of the regression output is given below. The researcher estimates the regression twice: once with no fixed effects and once with fixed effects for each individual.
                  No Fixed Effects      Fixed Effects
PERFORM           -0.4526 (0.1019)      -0.2061 (0.0723)
EDUC               0.0637 (0.0040)       0.0167 (0.0091)
EDUC*PERFORM       0.0365 (0.0071)       0.0169 (0.0048)
EXPER              0.3010 (0.0294)       0.4545 (0.1258)
EXPER*PERFORM      0.1162 (0.0584)       0.0149 (0.0501)
TENURE             0.2262 (0.0154)       0.1158 (0.0129)
TENURE*PERFORM    -0.0666 (0.0301)       0.0278 (0.0237)
The variables are defined as follows:
PERFORM is a dummy equal to 1 if the job is a performance-pay job
EDUC is years of education
EXPER is a dummy equal to 1 if the individual has more than 20 years of job experience
TENURE is a dummy equal to 1 if the individual has more than 10 years tenure at his current job
a. For the regression with no fixed effects, which coefficients are significantly different from zero? Explain.
b. Using the regression with no fixed effects, are the returns to education higher for a job with performance pay or for a job without performance pay? State and interpret the difference precisely (i.e., give a specific number and interpret it).
c. Using the regression with no fixed effects, what is the expected wage increase that results from a performance-pay structure for a worker with 16 years of education, in addition to more than 20 years of job experience and more than 10 years tenure at his current job?
d. The effect in (c) is quite large. How would economics explain this effect?
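For part (c), one way to organize the arithmetic is to add the PERFORM coefficient to each PERFORM interaction evaluated at the worker's characteristics. The sketch below assumes that the interactions shown in the partial output are the only terms involving PERFORM. The function name `perform_effect` is illustrative.

```python
import math

# Coefficients copied from the "No Fixed Effects" column of the output above.
coef = {
    "PERFORM": -0.4526,
    "EDUC*PERFORM": 0.0365,
    "EXPER*PERFORM": 0.1162,
    "TENURE*PERFORM": -0.0666,
}

def perform_effect(educ, exper, tenure):
    """Change in predicted log(wage) when PERFORM switches from 0 to 1.

    Assumes no PERFORM terms other than those listed in `coef`.
    """
    return (coef["PERFORM"]
            + coef["EDUC*PERFORM"] * educ
            + coef["EXPER*PERFORM"] * exper
            + coef["TENURE*PERFORM"] * tenure)

# Worker with 16 years of education, EXPER = 1 and TENURE = 1.
effect = perform_effect(educ=16, exper=1, tenure=1)
pct = math.exp(effect) - 1  # exact percentage change implied by the log difference

print(round(effect, 4))  # about 0.181 log points
print(round(pct, 4))     # roughly a 20 percent higher wage
```

Because the dependent variable is log(𝑤𝑎𝑔𝑒), the sum is a difference in logs. Exponentiating converts it to a proportional change, which is slightly larger than the log difference itself.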
Problem 6 (10 points)
You collect data on a large number of towns in the United States. In this problem, you will work with the two regressions given below.
𝐺 = 𝛼0 + 𝛼1𝑃 + 𝑢
𝐺 = 𝛽0 + 𝛽1𝑃 + 𝛽2𝐼 + 𝑢
In these equations:
G is the per capita gasoline consumption of town residents
P is the average price of gasoline in the town
I is average annual income in the town
You can assume that price changes in this model come from supply changes, i.e. the above equations can be interpreted as demand functions.
a. How do you think 𝛼1 and 𝛽1 compare (in absolute value)? Explain your reasoning.
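The link between the two price coefficients can be illustrated with the in-sample omitted-variable identity: the short-regression slope equals the long-regression slope plus 𝛽2 times the slope from regressing 𝐼 on 𝑃. The sketch below uses made-up numbers for six hypothetical towns. Both the data and the sign of the price-income relationship in them are assumptions for illustration only, not a statement about actual towns.

```python
# Hypothetical town-level data (not from the exam).
P = [3.0, 3.2, 3.5, 3.1, 3.8, 3.4]              # average gasoline price
I = [40.0, 52.0, 45.0, 60.0, 43.0, 50.0]        # average annual income
G = [520.0, 540.0, 500.0, 560.0, 480.0, 515.0]  # per capita gasoline consumption

def cross(a, b):
    """Sum of cross-products of deviations from the sample means."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b))

s_pp, s_ii, s_pi = cross(P, P), cross(I, I), cross(P, I)
s_pg, s_ig = cross(P, G), cross(I, G)

# Short regression: G on P only.
alpha1 = s_pg / s_pp

# Long regression: G on P and I (closed form for two regressors).
det = s_pp * s_ii - s_pi ** 2
beta1 = (s_pg * s_ii - s_ig * s_pi) / det
beta2 = (s_ig * s_pp - s_pg * s_pi) / det

# Auxiliary regression: I on P.
delta = s_pi / s_pp

# The two printed numbers agree up to floating-point rounding.
print(alpha1, beta1 + beta2 * delta)
```

The identity holds in any sample, so the comparison between 𝛼1 and 𝛽1 comes down to the signs of 𝛽2 and of the relationship between 𝐼 and 𝑃.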
Problem 7 (10 points)
A firm is curious about the price that it should charge in order to maximize its profit. It collects a large number of observations on price charged 𝑃 and profit Π, in addition to other relevant control variables. Consider two alternative specifications:
Quadratic: Π = 𝛽0 + 𝛽1𝑃 + 𝛽2𝑃² + ⋯ + 𝑢
Log-log: ln(Π) = 𝛽0 + 𝛽1 ln(𝑃) + ⋯ + 𝑢
a. How would you go about computing the price that maximizes profit for each of the two specifications above? Be specific.
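As a rough sketch of the calculus involved, the functions below take estimated coefficients and report what each specification implies about the profit-maximizing price. They assume that 𝑃 enters only through the terms shown, so the omitted controls do not interact with price. The function names and the example coefficients are hypothetical.

```python
def quadratic_argmax(b1, b2):
    """Price that maximizes b0 + b1*P + b2*P**2, holding other terms fixed.

    Setting the derivative b1 + 2*b2*P to zero gives P = -b1 / (2*b2).
    This is a maximum only if b2 < 0; otherwise None is returned.
    """
    if b2 >= 0:
        return None
    return -b1 / (2.0 * b2)

def loglog_profit_direction(b1):
    """With ln(profit) = b0 + b1*ln(P), profit is monotone in P for P > 0.

    b1 is a constant elasticity, so this form has no interior maximum.
    """
    if b1 > 0:
        return "increasing in P"
    if b1 < 0:
        return "decreasing in P"
    return "flat in P"

# Hypothetical estimates, for illustration only.
print(quadratic_argmax(40.0, -2.0))   # 10.0
print(loglog_profit_direction(0.3))   # increasing in P
```

In practice the estimated coefficients would be plugged in, and the sign of the squared term checked before the stationary point is read as a maximum.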
Bonus Problem (+5 extra credit possible)