Chapter IV. Slide 1
IV. Prediction and Diagnostics
a. Prediction
b. Why Regression Diagnostics?
c. Residual Diagnostic Plots
a. Prediction
Model:
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad i = 1, \ldots, N, \qquad \varepsilon_i \sim \text{iid } N(0, \sigma^2)$$
The conditional forecasting problem can be succinctly stated as:
– Predict a “future” observation, $Y_f$
– Given $X_f$ and the sample data $\{X_i, Y_i\}$, $i = 1, \ldots, N$
The only practical solution to the prediction problem is to use estimated parameters:
$$\hat{Y}_f = b_0 + b_1 X_f$$
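As a minimal sketch of the plug-in predictor, the R code below simulates data from the model and computes the prediction from the estimated coefficients. The simulated data, the seed, the parameter values, and the name `Xf` are illustrative assumptions, not part of the course data.

```r
# Simulate from Y_i = beta0 + beta1 * X_i + eps_i (all values are illustrative)
set.seed(1)
N <- 50
X <- runif(N)
Y <- 1 + 2 * X + rnorm(N, sd = 0.5)

fit <- lm(Y ~ X)   # least squares estimates b0 and b1
b <- coef(fit)

Xf <- 0.7          # hypothetical future value of X
Yhat_f <- unname(b[1] + b[2] * Xf)   # plug-in predictor b0 + b1 * Xf

# predict() returns the same number
Yhat_pred <- unname(predict(fit, newdata = data.frame(X = Xf)))
```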
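a. Prediction
If we use this predictor, we will make a prediction error:
$$e_f = Y_f - \hat{Y}_f = Y_f - b_0 - b_1 X_f$$
Let’s draw this:
[Figure: the true regression line $E[Y_f \mid X_f] = \beta_0 + \beta_1 X$ and the fitted line $b_0 + b_1 X$, with $X_f$, $Y_f$, $\hat{Y}_f$, the prediction error $e_f$, and the sampling error $\hat{Y}_f - E[Y_f \mid X_f]$ marked]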
a. Prediction
Let’s write our prediction error in such a way that we can see the influence of two factors:
i. the model error term, or inherent randomness
ii. estimation error in the model parameters
$$e_f = Y_f - \hat{Y}_f = \left(Y_f - E[Y_f \mid X_f]\right) - \left(\hat{Y}_f - E[Y_f \mid X_f]\right)$$
$$\hat{Y}_f - E[Y_f \mid X_f] = (b_0 + b_1 X_f) - (\beta_0 + \beta_1 X_f)$$
$$e_f = \varepsilon_f - \left[(b_0 - \beta_0) + (b_1 - \beta_1) X_f\right]$$
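The two sources of prediction error can be checked numerically. In a simulation the true parameters are known, so both pieces can be computed directly; the sketch below assumes illustrative parameter values and a hypothetical future point `Xf`.

```r
# In a simulation the true beta0, beta1 and eps_f are known (values are illustrative)
set.seed(2)
beta0 <- 1; beta1 <- 2; sigma <- 0.5
N <- 50
X <- runif(N)
Y <- beta0 + beta1 * X + rnorm(N, sd = sigma)
b <- unname(coef(lm(Y ~ X)))

Xf <- 0.7                            # hypothetical future X
eps_f <- rnorm(1, sd = sigma)        # model error for the future observation
Yf <- beta0 + beta1 * Xf + eps_f

e_f <- Yf - (b[1] + b[2] * Xf)                     # prediction error
est_err <- (b[1] - beta0) + (b[2] - beta1) * Xf    # estimation error

# e_f equals eps_f minus the estimation error
c(e_f = e_f, decomposition = eps_f - est_err)
```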
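a. Prediction
Now let’s compute a prediction interval for $Y_f$. The variance of the prediction error is
$$\mathrm{Var}(e_f) = \mathrm{Var}(Y_f - \hat{Y}_f) = \mathrm{Var}(\varepsilon_f) + \mathrm{Var}(\hat{Y}_f) = \sigma^2 + \sigma^2\left[\frac{1}{N} + \frac{(X_f - \bar{X})^2}{(N-1)s_X^2}\right] = \sigma^2\left[1 + \frac{1}{N} + \frac{(X_f - \bar{X})^2}{(N-1)s_X^2}\right]$$
The predictive standard error, denoted $s_{pred}$, is then
$$s_{pred} = s\left[1 + \frac{1}{N} + \frac{(X_f - \bar{X})^2}{(N-1)s_X^2}\right]^{.5}$$
where $s$ is the standard error of the regression.
a. Prediction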
Let’s return to the printout and fill in the formula for the prediction interval:
$$b_0 + b_1 X_f \pm t^*_{N-2,\alpha/2}\, s\left[1 + \frac{1}{N} + \frac{(X_f - \bar{X})^2}{(N-1)s_X^2}\right]^{1/2} = b_0 + b_1 X_f \pm t^*_{N-2,\alpha/2}\, s_{pred}$$
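The interval formula can be sketched in R and compared with the built-in `predict()` method for `lm` objects. The data here are simulated and the values of `Xf` and `alpha` are illustrative assumptions; only the formula itself comes from the slides.

```r
# Simulated data (illustrative); Xf and alpha are hypothetical choices
set.seed(3)
N <- 40
X <- rnorm(N, mean = 10, sd = 2)
Y <- 5 + 0.8 * X + rnorm(N)
fit <- lm(Y ~ X)

Xf <- 12
alpha <- 0.05

s <- summary(fit)$sigma                      # standard error of the regression
# var(X) is the sample variance s_X^2, so (N - 1) * var(X) matches the formula
s_pred <- s * sqrt(1 + 1 / N + (Xf - mean(X))^2 / ((N - 1) * var(X)))
t_star <- qt(1 - alpha / 2, df = N - 2)      # t critical value

Yhat_f <- sum(coef(fit) * c(1, Xf))          # b0 + b1 * Xf
by_hand <- c(Yhat_f - t_star * s_pred, Yhat_f + t_star * s_pred)

# Built-in prediction interval for comparison
builtin <- predict(fit, newdata = data.frame(X = Xf),
                   interval = "prediction", level = 1 - alpha)
```

The two intervals should agree; note that the interval widens as `Xf` moves away from the sample mean of X.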
b. Why Regression Diagnostics?
Up to now, we have assumed that the data are generated by a linear regression model.
What are the basic assumptions of the model?
1. linear conditional mean
2. constant variance (homoskedasticity)
3. normal errors
So we should see:
– a pattern of constant variation around a line
– very few points more than 2 standard deviations away
b. Why Regression Diagnostics?
Why Should We Care?
If the model assumptions are violated:
– Predictions can be systematically biased
– Standard errors and t-tests are wrong
– Someone may be able to beat you with a different and better model
How can we detect violations of the model?
– We must use graphical methods
To drive this point home, let’s look at the “famous” Anscombe data.
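The Anscombe data ship with base R as the data frame `anscombe` (columns `x1` to `x4` and `y1` to `y4`). The sketch below fits the same regression to each of the four data sets and then plots them; the object names and plot layout are illustrative choices.

```r
data(anscombe)

# Fit y_k on x_k for each of the four data sets and collect the coefficients
coefs <- sapply(1:4, function(k) {
  x <- anscombe[[paste0("x", k)]]
  y <- anscombe[[paste0("y", k)]]
  unname(coef(lm(y ~ x)))
})
rownames(coefs) <- c("b0", "b1")
colnames(coefs) <- paste0("set", 1:4)
print(round(coefs, 2))   # the four fitted lines are nearly identical

# The scatter plots tell four very different stories
op <- par(mfrow = c(2, 2))
for (k in 1:4) {
  x <- anscombe[[paste0("x", k)]]
  y <- anscombe[[paste0("y", k)]]
  plot(x, y, main = paste("Data set", k))
  abline(lm(y ~ x))
}
par(op)
```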
c. Residual Diagnostic Plots
Two basic plots are very useful:
i. Plot of Residuals vs. Fitted Values
ii. A Normal Probability Plot
When Model Assumptions Hold
A First Cut: plot Y against X (works only when you have one X)
[Figure: scatter plot of y against x]
These data look great! Linear association with constant variance. Normal?
c. Residual Diagnostic Plots
i. Plot of Residuals vs. Fitted Values
What should this look like?
1. Residuals should be evenly distributed around the mean
2. No relationship between the mean of the residual and
c. Residual Diagnostic Plots
[Figure: scatter plot of Y against X]
A key assumption is that the regression model is a linear function.
This is not always true.
c. Residual Diagnostic Plots
[Figure: standardized residuals (sresids) plotted against X]
There should be no relationship between the average value of the
c. Residual Diagnostic Plots
A constant elasticity relationship implies a curved regression function.
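To illustrate, suppose the constant elasticity relationship takes the form $Y = A X^{\beta}$ with a multiplicative error (an assumed form for this sketch). It is curved in levels but linear in logs, since $\log Y = \log A + \beta \log X + \varepsilon$. The simulated data and parameter values below are illustrative.

```r
# Simulate a constant elasticity relationship (elasticity -1.5 is illustrative)
set.seed(4)
X <- runif(200, min = 1, max = 10)
Y <- 3 * X^(-1.5) * exp(rnorm(200, sd = 0.1))

fit_lin <- lm(Y ~ X)             # straight line fit in levels
fit_log <- lm(log(Y) ~ log(X))   # straight line fit in logs

# Residuals vs. fitted values: curved pattern in levels, no pattern in logs
op <- par(mfrow = c(1, 2))
plot(fitted(fit_lin), rstandard(fit_lin), main = "Levels",
     xlab = "Fitted values", ylab = "Standardized residuals")
abline(h = 0)
plot(fitted(fit_log), rstandard(fit_log), main = "Logs",
     xlab = "Fitted values", ylab = "Standardized residuals")
abline(h = 0)
par(op)

elasticity_hat <- unname(coef(fit_log)[2])   # slope of the log-log fit
```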
c. Residual Diagnostic Plots
ii. Normal Probability Plots
Used to check the normality of residuals. Non-normal residuals cause the following sorts of headaches:
– "t-tests" and other associated statistics may no longer be t distributed
– Least squares estimates are extremely sensitive to large $\varepsilon_i$
c. Residual Diagnostic Plots
Remember that the salient characteristics of the normal distribution are thin tails and symmetry.
How can we detect departures from normality?
The most basic analysis would be to graph the histogram of the standardized residuals.
[Figure: histograms of standardized residuals for n=30 and n=100; vertical axis: Frequency]
Neither of these plots looks particularly symmetric.
c. Residual Diagnostic Plots
Let’s compute a normal probability plot using normPlot()
c. Residual Diagnostic Plots
The normal probability plot is a plot of the sample CDF on a coordinate system in which the normal CDF appears as a straight line. The sample CDF will appear as a scatter of points around the normal CDF straight line.
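The construction can be sketched with base R. Since `normPlot()` is not part of base R, this example uses `qqnorm()` and `qqline()` instead, which draw the equivalent quantile version of the plot (normal quantiles on one axis, sorted residuals on the other). The simulated data are illustrative.

```r
# Simulated regression with normal errors (illustrative values)
set.seed(5)
X <- runif(100)
Y <- 1 + 2 * X + rnorm(100)
sres <- rstandard(lm(Y ~ X))   # standardized residuals

# Base R normal probability (Q-Q) plot with a reference line
qqnorm(sres)
qqline(sres)

# The same points built by hand: sorted residuals against normal quantiles
n <- length(sres)
theo <- qnorm(ppoints(n))      # normal quantiles at the plotting positions
samp <- unname(sort(sres))     # ordered standardized residuals

# qqnorm() returns the plotted coordinates when plot.it = FALSE
qq <- qqnorm(sres, plot.it = FALSE)
```

If the residuals are normal, the points `(theo, samp)` scatter around a straight line; heavy tails or skewness show up as systematic bends at the ends.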
d. Putting It All Together: The Shock Absorber Example
Suppliers for very large manufacturing firms are facing increasing pressure to assure their customers that the parts they produce meet high quality standards.
This supplier is supplying gas-filled shock absorbers.
The data are measurements on the rebound force of the shock absorber. Measurements can be taken both before and after the shock absorber is fully assembled. It is cheaper to take measurements of the shock absorber performance before, rather than after, assembly. See dataset shock.
Basic Model
We must formulate a statistical model to predict rebound force after assembly using the before-assembly measurement.
This is a classic example of a regression model!
$$\text{Rebound}_{after} = \beta_0 + \beta_1\,\text{Rebound}_{before} + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2)$$
Descriptive Statistics
Marginal Distribution of Y
Doesn’t look normal! Three clumps!
Joint or Bivariate Distribution
Let’s do a scatter plot. Which variable should be on the Y axis?
Regression Analysis
Residual Diagnostics
The residuals look much more normal than the marginal distribution of Y.
T-tests
Suppose before measurements were “perfect” predictors. What would this mean?
One School of Thought:
All you need is very accurate predictions
Another School of Thought (no adjustment):
$$H_0: \beta_1 = 1.0 \text{ and } \beta_0 = 0$$
T-tests continued
Let’s test a slight modification:
$$H_0: \beta_1 = 1.0 \qquad H_A: \text{otherwise}$$
Since N is relatively small, let’s test at the 10 percent significance level.
Step 1: Compute t critical value
> qt(.05,df=33)
[1] -1.692360
Step 2: Compute t statistic
> t=(.94946-1)/.0438
> t
[1] -1.153881
Step 3: Compute p-value
> pt(-1.153881,df=33)*2
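The three steps can be collected into one short script. The estimate, standard error, and degrees of freedom are the values used in the steps above; the variable names are illustrative. Since the t statistic is smaller in absolute value than the critical value, the test fails to reject the hypothesis that the slope equals 1.0 at the 10 percent level.

```r
# Values taken from the steps above; variable names are illustrative
b1    <- 0.94946   # estimated slope
se_b1 <- 0.0438    # standard error of the slope
df    <- 33        # degrees of freedom
alpha <- 0.10      # two-sided significance level

t_crit  <- qt(alpha / 2, df = df)            # lower-tail critical value
t_stat  <- (b1 - 1) / se_b1                  # tests H0: beta1 = 1.0
p_value <- 2 * pt(-abs(t_stat), df = df)     # two-sided p-value

reject <- abs(t_stat) > abs(t_crit)          # FALSE here: fail to reject H0
```

Naming the statistic `t_stat` rather than `t` avoids masking R's built-in transpose function `t()`.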
Prediction
Glossary of Symbols
$X_f$ - future value of X for forecasting
$Y_f$ - value of Y to be forecasted
Important Equations
$$e_f = Y_f - \hat{Y}_f = Y_f - b_0 - b_1 X_f$$
$$s_{pred} = s\left[1 + \frac{1}{N} + \frac{(X_f - \bar{X})^2}{(N-1)s_X^2}\right]^{.5}$$
Glossary of R Commands