VII. Examples of Multiple Regression
a. Omitted Variables and Fixed Effects: price elasticity regressions
Let’s return to the log-log price elasticity regressions.
In the above regression, we are “pooling” across all of the 86 stores and dumping them into the same regression with 14745 observations. This sounds great. We obtain dramatically greater precision.
However, the stores may differ from one another. In particular, “large” stores are pooled with “small” stores. We measure store size with ACV (All Commodity Volume), the total dollar sales across all products.
Does omitting store size bias the price coefficient? Only if price and store size are correlated, e.g. price is lower in larger stores or higher in smaller stores.
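The condition above is the standard omitted-variable-bias result. As a sketch, in notation that is assumed here rather than taken from the regression output (s stands for a store-size measure such as ACV or its log), compare the long and short regressions:

```latex
\begin{align*}
\text{long:}\quad  & \ln q = \alpha + \beta \ln p + \gamma s + \varepsilon \\
\text{short:}\quad & \ln q = a + b \ln p + e \\
\text{then:}\quad  & \hat{b} \;\to\; \beta + \gamma\,\delta,
  \qquad \delta = \frac{\operatorname{Cov}(\ln p,\, s)}{\operatorname{Var}(\ln p)}
\end{align*}
```

The bias term \gamma\delta vanishes when \delta = 0, that is, when price and store size are uncorrelated, even if store size itself matters for sales (\gamma \neq 0).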
Note: acv matters (it should!) but does not necessarily have to be included in the equation.
Some would argue that price promotions create a response that is different from the response to a change in the “regular” or “shelf” price. That is, consumers “stock up” on sale, and much of the increase in sales is simply future purchases made at the time of the sale (forward buying).
But size is not the only characteristic of stores. It could be that prices are systematically higher or lower in stores with greater total demand for Tide 128oz. For example, stores in the suburbs often face stiff competition from Target/Walmart, and they also have high demand for the larger size of Tide.
Dominick’s uses a set of price zones and in those zones with warehouse format competition, the prices of many items are lower.
We could add dummy variables for each store (well, for 85 of the 86 stores, since one store serves as the baseline). As we saw before, this can be done automatically for any categorical variable as long as it is what R calls a “factor.”
This is called a “fixed” effects regression approach. The idea is that anything that is store specific would be captured in these store coefficients including a relationship between prices and demand.
Note: if you want to make a standard numerical variable into a factor, use as.factor() (we don’t have to do this here because store is already a factor).
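The mechanics described above can be sketched on simulated data. Everything in this block (the data frame d, the five stores, the coefficient values) is a made-up illustration, not the Dominick's data; only lm() and as.factor() are the R functions the notes refer to.

```r
# Simulated data: all names and numbers are illustrative assumptions.
set.seed(1)
n_store <- 5
alpha <- rnorm(n_store)                      # store-specific intercepts
d <- data.frame(
  store = rep(1:n_store, each = 20),         # numeric store id
  lnp   = runif(n_store * 20, 1.2, 2.2)
)
d$lnq <- 4 - 2 * d$lnp + alpha[d$store] + rnorm(nrow(d), sd = 0.2)

# Numeric store id: R fits a single slope on 'store' (usually not what we want)
fit_num <- lm(lnq ~ lnp + store, data = d)

# Factor store id: R creates one dummy per store except the baseline store
d$store_f <- as.factor(d$store)
fit_fe <- lm(lnq ~ lnp + store_f, data = d)

n_coef_num <- length(coef(fit_num))  # intercept + lnp + store
n_coef_fe  <- length(coef(fit_fe))   # intercept + lnp + (n_store - 1) dummies
```

With a factor, the number of store coefficients is one less than the number of stores, which matches the "85 stores" remark.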
[Figure: scatter plot of lnq and lnp.]
Two groups of stores:
1. High price, low demand
2. Low price, high demand
What might you expect? Higher prices with higher demand! This would tend to bias the estimated price elasticity toward zero, i.e. toward numbers that are too small in magnitude.
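A small simulation can illustrate the direction of this bias. It is only a sketch under assumed values: the true elasticity (-2), the number of stores, and the strength of the link between store demand and price are all invented for the example.

```r
# Simulated stores where high-demand stores also charge higher prices.
# All parameter values are assumptions chosen for illustration.
set.seed(42)
n_store <- 50
n_week  <- 40
store   <- rep(seq_len(n_store), each = n_week)
demand  <- rnorm(n_store)                    # unobserved store-level demand
beta    <- -2                                # true price elasticity

lnp <- 1.7 + 0.1 * demand[store] + rnorm(n_store * n_week, sd = 0.15)
lnq <- 3 + beta * lnp + 0.3 * demand[store] + rnorm(n_store * n_week, sd = 0.3)
d   <- data.frame(lnq = lnq, lnp = lnp, store = as.factor(store))

# Pooled regression ignores store differences
pooled <- unname(coef(lm(lnq ~ lnp, data = d))["lnp"])

# Fixed effects regression: one dummy per store (minus the baseline)
fe <- unname(coef(lm(lnq ~ lnp + store, data = d))["lnp"])

c(pooled = pooled, fixed_effects = fe)
```

In this setup the pooled coefficient sits closer to zero than the fixed effects coefficient, because the store dummies absorb the store-level demand that is correlated with price.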
b. Model Selection: Equity Premium Puzzle
There is a very large literature in finance that addresses the following question:
Why is there such a huge premium on equities versus so-called “risk-free” assets like Treasury Bills?
Let’s look at some data from Welch and Goyal, “A Comprehensive Look at The Empirical Performance of Equity Premium Prediction,” RFS (2008). Data updated to 2009.
Let’s look only at data from 1927 to 2009.
Puzzle: Is the premium of 7.63% per year worth the
Wow! Look at how risky the market is!
[Figure: histograms of CRSP_SPvw and Rfree.]
[Figure: histograms of termvw - termRfree, for the full sample and trimmed to values below 20.]
An investment in the market often outperforms T-bills, but not always. What is the probability that the T-bill portfolio will outperform?
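One way to put a number on this question is to resample months and compare terminal wealth. The sketch below is an assumption about how such a calculation could be done, not the method used for the figures. The vectors reuse the column names CRSP_SPvw and Rfree, but they are filled with simulated monthly returns whose mean and spread are invented.

```r
# Simulated monthly returns: NOT the Welch-Goyal data, values are made up.
set.seed(7)
n <- 996                                   # months in 1927-2009
CRSP_SPvw <- rnorm(n, mean = 0.009, sd = 0.055)
Rfree     <- rep(0.003, n)

# Fraction of single months in which the T-bill beats the market
p_month <- mean(Rfree > CRSP_SPvw)

# Bootstrap: draw 120 months (10 years) with replacement and compare
# the terminal value of $1 invested in each asset
horizon <- 120
n_sim   <- 2000
tbill_wins <- replicate(n_sim, {
  idx <- sample.int(n, horizon, replace = TRUE)
  prod(1 + Rfree[idx]) > prod(1 + CRSP_SPvw[idx])
})
p_horizon <- mean(tbill_wins)

c(one_month = p_month, ten_years = p_horizon)
```

Drawing months independently ignores any serial dependence in returns, so this is a rough first pass rather than a careful estimate.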
The finance literature has tried to “explain” or predict the equity premium, using regressions of the form:
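A common specification in this literature, given here as an assumed sketch rather than the exact equation used in these notes, regresses the equity premium on a predictor observed one period earlier. The symbols r_t (market return), r_{f,t} (risk-free rate) and x_{t-1} (lagged predictor) are notation introduced for illustration:

```latex
\[
r_{t} - r_{f,t} = \alpha + \beta\, x_{t-1} + \varepsilon_{t}, \qquad t = 2, \dots, T
\]
```

Because the predictor is lagged, the first usable observation is t = 2.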
What sort of variables have been used to predict?
1. Book to Market ratio
2. Various combinations of dividends, earnings, price as ratios
3. Bond returns and various term premia
4. Stock Issuing activity
5. Various macro variables relating to investment and consumption (hold your nose!)
How should we compare models?
Another way of interpreting R² is as a measure of what is called “in-sample” prediction. MSE = mean squared error. “Mean” denotes using the average value of y to predict.

\[
R^2 = 1 - \frac{SSE}{SST}
    = 1 - \frac{\sum_{t=2}^{T} (y_t - \hat{y}_t)^2}{\sum_{t=2}^{T} (y_t - \bar{y})^2}
    = 1 - \frac{MSE_{\text{regression}}}{MSE_{\text{mean}}}
\]
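The MSE interpretation can be checked numerically. This is a minimal sketch on simulated data; the variables y and x_lag and all numeric values are assumptions made for the example.

```r
# Simulated predictive regression: values are illustrative only.
set.seed(3)
n_obs <- 200
x_lag <- rnorm(n_obs)                          # stands in for a lagged predictor
y     <- 0.005 + 0.01 * x_lag + rnorm(n_obs, sd = 0.05)

fit <- lm(y ~ x_lag)

# In-sample MSE of the regression forecast and of the "mean" forecast
mse_reg  <- mean(resid(fit)^2)
mse_mean <- mean((y - mean(y))^2)

r2_manual <- 1 - mse_reg / mse_mean
r2_lm     <- summary(fit)$r.squared

all.equal(r2_manual, r2_lm)
```

Both MSEs divide by the same number of observations, so the ratio equals SSE/SST and the two R² values agree.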
Now run the regressions and summarize R-sq and the F stat (with only one independent variable, the F test is equivalent to the t test, since F = t²):
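A sketch of how such a summary could be assembled, one single-predictor regression at a time. The predictors x1, x2, x3 and the helper summarize_fit are hypothetical names, and the data are simulated rather than taken from the Welch and Goyal file.

```r
# Simulated data: predictor names and values are placeholders.
set.seed(11)
n <- 300
preds <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- 0.005 + 0.01 * preds$x1 + rnorm(n, sd = 0.05)

# Fit y on one predictor and pull out R-sq, the F stat and the slope t stat
summarize_fit <- function(x) {
  s <- summary(lm(y ~ x))
  c(r.sq = s$r.squared,
    F    = unname(s$fstatistic["value"]),
    t    = s$coefficients["x", "t value"])
}

# One row per predictor
tab <- t(sapply(preds, summarize_fit))
round(tab, 3)
```

In every row the F column equals the square of the t column, which is why ranking predictors by F or by |t| gives the same order.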
But these variables could be inter-correlated:
Looks like the very predictors that look promising are pretty highly intercorrelated.
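Intercorrelation of this kind can be inspected with cor(). The sketch below uses three simulated predictors that share a common component; the names x1, x2, x3 and the strength of the common component are assumptions for the example.

```r
# Simulated predictors driven by one shared factor: illustrative values only.
set.seed(9)
n <- 300
common <- rnorm(n)
preds <- data.frame(
  x1 = common + 0.5 * rnorm(n),
  x2 = common + 0.5 * rnorm(n),
  x3 = common + 0.5 * rnorm(n)
)

# Pairwise correlation matrix of the predictors
cmat <- cor(preds)
round(cmat, 2)
```

Large off-diagonal entries signal that the predictors carry overlapping information, so their individual R-sq values cannot simply be added up.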
Now, let’s delete
Check ACF of residuals
[Figure: ACF plot of the residuals.]
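A residual ACF check could look like the sketch below. The regression is fitted to simulated data with assumed values, and plot = FALSE is used only so the numbers can be inspected directly; dropping it draws the usual ACF plot.

```r
# Simulated regression: values are illustrative only.
set.seed(5)
n <- 300
x <- rnorm(n)
y <- 0.005 + 0.01 * x + rnorm(n, sd = 0.05)
fit <- lm(y ~ x)

# Autocorrelations of the residuals at lags 0 through 12
res_acf  <- acf(resid(fit), lag.max = 12, plot = FALSE)
acf_vals <- as.numeric(res_acf$acf)

# Approximate 95% band under no autocorrelation
band <- 1.96 / sqrt(n)

# Lags (1 to 12) whose autocorrelation falls outside the band
flagged <- which(abs(acf_vals[-1]) > band)
```

The lag-0 autocorrelation is always 1, so only lags 1 and higher are informative about leftover time dependence in the residuals.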
Glossary of R Commands
• which(x==0): find the indices in x (rows) for which x takes on the value 0. “==” means “equal to.”
• as.factor(x): creates a factor from variable x.
• Name[-2,]: the second row is deleted from a matrix or