2.7 Summary
3.1.7 The t test — two sided example
Every hypothesis test needs to specify the following elements:
1. The null hypothesis $H_0$.
2. The alternative hypothesis $H_1$.
3. A significance level $\alpha$.
4. A test statistic (in this case $t$, but we will see others soon).
5. A decision rule that states when $H_0$ is rejected.
6. The decision, and its interpretation.
Consider the CEO salary regression, which has PRF
$$E(\text{Salary}_i \mid \text{RoE}_i) = \beta_0 + \beta_1 \text{RoE}_i, \tag{42}$$
Figure 33: The $t_{n-2}$ distribution with $\alpha = 0.05$ critical value for testing $H_0: \beta_1 = 0$ against $H_1: \beta_1 < 0$.
Figure 34: The $t_{n-2}$ distribution with $\alpha = 0.05$ critical value for testing $H_0: \beta_1 = 0$ against $H_1: \beta_1 \neq 0$.
Figure 35: Choosing to use White standard errors that allow for heteroskedasticity
and the hypotheses $H_0: \beta_1 = 0$ and $H_1: \beta_1 \neq 0$, so that we are interested in either positive or negative deviations from the null hypothesis, i.e. any role for firm profitability in predicting CEO salaries, whether positively or negatively. We will choose $\alpha = 0.05$, which is the default choice unless specified otherwise.
The test statistic will be the $t$ statistic given in (41). This statistic can be computed in Eviews using either of (37) or (39), with the default choice being (39), which imposes the homoskedasticity assumption. This assumption is frequently violated in practice, and can be tested for, but we will play it safe for now and use the (37) version of $\hat{\omega}_{1,n}$, which allows for heteroskedasticity.
This requires an additional option to be changed in Eviews. When specifying the regression in Eviews in Figure 16, click on the “Options” tab to reveal the options shown in Figure 35, and select “White” for the coefficient covariance matrix as shown. The resulting regression is shown in Figure 36, with the selection of the appropriate White standard errors highlighted. We now have enough information to carry out the hypothesis test. The details are as follows.
1. $H_0: \beta_1 = 0$
2. $H_1: \beta_1 \neq 0$
3. Significance level: $\alpha = 0.05$
4. Test statistic: $t = 2.71$
5. Reject $H_0$ if $|t| > c_{0.025} = 1.980$
6. $H_0$ is rejected, so Return on Equity is a significant predictor for CEO Salary.
The critical value of $c_{0.025} = 1.980$ is found from the table of critical values on p.833 of Wooldridge, reproduced in Figure 37. For this regression with $n = 209$, the relevant $t$ distribution has $n - 2 = 207$ degrees of freedom. This number of degrees of freedom is not included in the table, so we choose the closest tabulated degrees of freedom below it, i.e. 120. The test is two-sided with significance level $\alpha = 0.05$, so the critical value can be read from the third column of critical values in the table.
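The six steps above amount to a small piece of arithmetic. The following is a minimal sketch, assuming the coefficient estimate 18.501 and White standard error 6.829 reported for this regression and the tabulated critical value 1.980; the function name `two_sided_t_test` is illustrative only and is not an Eviews command.

```python
def two_sided_t_test(estimate, std_error, critical_value, null_value=0.0):
    """Return the t statistic and whether H0: beta_1 = null_value is rejected."""
    t_stat = (estimate - null_value) / std_error
    reject = abs(t_stat) > critical_value
    return t_stat, reject


# Values from the CEO salary regression and the t table (120 degrees of freedom).
t_stat, reject = two_sided_t_test(estimate=18.501, std_error=6.829, critical_value=1.980)
print(f"t = {t_stat:.2f}, reject H0: {reject}")
```

The `null_value` argument defaults to zero, which is the null hypothesis used in this example.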
Figure 36: CEO salary regression with White standard errors
3.1.8 The t test — one sided example
The assessment for ETC2410/ETC3440 in semester two of 2013 consisted of 40% assignments during the semester and a 60% final exam. Descriptive statistics for these marks, both expressed as percentages, are shown in Figures 38 and 39. It may be of interest to investigate how well assignment marks earned during the semester predict final exam marks. In particular, we would expect that those students who do better on assignments during the semester will go on to also do better on their final exams. The scatter plot in Figure 40 shows that such a relationship potentially does exist in the data, so we will carry out a formal hypothesis test in a regression.
The PRF has the form
$$E(\text{exam}_i \mid \text{asgnmt}_i) = \beta_0 + \beta_1 \text{asgnmt}_i, \tag{43}$$
and we will test $H_0: \beta_1 = 0$ (that assignment marks have no predictive power for exam marks) against the one-sided alternative $H_1: \beta_1 > 0$ (that higher assignment marks predict higher exam marks). The estimates are given in Figure 41, in which the SRF is
$$\widehat{\text{exam}}_i = \underset{(5.360)}{23.763} + \underset{(0.095)}{0.548}\,\text{asgnmt}_i.$$
The numbers in parentheses below the coefficients are the standard errors. This is a common way of reporting an estimated regression equation, since it provides sufficient information for the reader to carry out some inference themselves if they wish. The hypothesis test of interest proceeds as follows.
1. $H_0: \beta_1 = 0$
2. $H_1: \beta_1 > 0$
3. Significance level: $\alpha = 0.05$
4. Test statistic: $t = 5.766$
5. Reject $H_0$ if $t > c_{0.05} = 1.662$
6. $H_0$ is rejected, so there is evidence that higher assignment marks predict significantly higher final exam marks.
Figure 37: Critical values from the t distribution from Wooldridge
Figure 38: Assignment marks for ETC2410 / ETC3440 in semester two of 2013.
Figure 39: Exam marks for ETC2410 / ETC3440 in semester two of 2013.
The critical value in this case is found from the table in Figure 37 using 90 degrees of freedom (the closest tabulated value below $n - 2 = 116$ in this case) and the column corresponding to the $\alpha = 0.05$ level of significance for a one-sided test.
3.1.9 p-values
A convenient alternative way to express a decision rule for a hypothesis test is to use p-values rather than critical values, where they are available.
First consider testing $H_0: \beta_1 = 0$ against $H_1: \beta_1 > 0$. The critical value for this $t$ test is $c_{0.05}$, as shown in Figure 32. Recall that $c_{0.05}$ is defined to satisfy $\Pr(t_{n-2} > c_{0.05}) = 0.05$, which means that the area under the $t_{n-2}$ distribution to the right of $c_{0.05}$ is 0.05. Any value of the test statistic $t$ that falls above $c_{0.05}$ leads to a rejection of the null hypothesis, and the area under the $t_{n-2}$ distribution to the right of such a value of $t$ must be less than 0.05. So instead of defining a decision rule in terms of $t > c_{0.05}$, we could equivalently define the decision rule in terms of $\Pr(t_{n-2} > t) < 0.05$. That is, the decision rules “reject $H_0$ if $t > c_{0.05}$” and “reject $H_0$ if $\Pr(t_{n-2} > t) < 0.05$” yield identical tests. Similarly, if we are testing $H_0: \beta_1 = 0$ against $H_1: \beta_1 < 0$, the decision rules “reject $H_0$ if $t < -c_{0.05}$” and “reject $H_0$ if $\Pr(t_{n-2} < t) < 0.05$” yield identical tests.
Figure 40: Scatter plot of exam marks (EXAM) against assignment marks (ASGNMT).
Figure 41: Regression of exam marks on assignment marks
For the two-sided problem $H_0: \beta_1 = 0$ against $H_1: \beta_1 \neq 0$, the decision rule is “reject $H_0$ if $|t| > c_{0.025}$”. Recall that $c_{0.025}$ is defined to satisfy $\Pr(t_{n-2} > c_{0.025}) = 0.025$; see Figure 34. The condition $|t| > c_{0.025}$ therefore implies that $\Pr(t_{n-2} > |t|) < 0.025$, because $|t|$ is further out into the tail of the $t_{n-2}$ distribution than $c_{0.025}$. Multiplying this inequality by 2 gives $2\Pr(t_{n-2} > |t|) < 0.05$, so the critical value decision rule “reject $H_0$ if $|t| > c_{0.025}$” is equivalent to “reject $H_0$ if $2\Pr(t_{n-2} > |t|) < 0.05$”.
It is conventional in econometrics and statistics (and in Eviews!) to define the p-value for a regression $t$ statistic as
$$p = 2\Pr(t_{n-2} > |t|). \tag{44}$$
Therefore the decision rule for testing $H_0: \beta_1 = 0$ against $H_1: \beta_1 \neq 0$ is
“reject $H_0$ if $p < 0.05$”,
where $p$ is the value printed out by Eviews under the “Prob.” column of the regression output.
The two-sided test of the significance of $\text{RoE}_i$ in the model (42) can be re-expressed in terms of p-values as follows.
1. $H_0: \beta_1 = 0$
2. $H_1: \beta_1 \neq 0$
3. Significance level: $\alpha = 0.05$
4. Test statistic: $p = 0.0073$
5. Reject $H_0$ if $p < 0.05$
6. $H_0$ is rejected, so Return on Equity is a significant predictor for CEO Salary.
The value in item 4 is read directly from the regression output in Figure 36. Clearly having a p-value available makes the hypothesis test more convenient to carry out because it is not necessary to look up or compute a critical value. The vast majority of hypothesis tests computed in modern econometrics and statistics software are accompanied by a p-value for easy testing.
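For readers who want to see where a number like 0.0073 comes from, the sketch below evaluates (44) directly. It is only an illustration: it approximates the tail area of the $t$ distribution by numerical integration (Simpson's rule) using the Python standard library, which is an assumption of this sketch and not how Eviews computes the value. The function names `t_upper_tail` and `two_sided_p_value` are hypothetical.

```python
import math


def t_upper_tail(t_stat, df, steps=2000):
    """Approximate Pr(T > t_stat) for a t distribution with df degrees of freedom.

    Integrates the density from 0 to |t_stat| with Simpson's rule (steps must
    be even) and then uses the symmetry of the distribution about zero.
    """
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))

    def density(x):
        return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t_stat)
    h = upper / steps
    total = density(0.0) + density(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * density(k * h)
    tail = 0.5 - total * h / 3          # area to the right of |t_stat|
    return tail if t_stat >= 0 else 1.0 - tail


def two_sided_p_value(t_stat, df):
    """p = 2 * Pr(t_df > |t|), as in equation (44)."""
    return 2.0 * t_upper_tail(abs(t_stat), df)


t_stat = 18.501 / 6.829                 # CEO salary regression, White standard error
p_value = two_sided_p_value(t_stat, df=207)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, reject at 5%: {p_value < 0.05}")
```

Using the exact 207 degrees of freedom here, rather than the tabulated 120, is one practical advantage of working with p-values.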
For testing against one-sided alternative hypotheses, a small modification is required. In the introductory discussion it was shown that the decision rule for testing $H_0: \beta_1 = 0$ against $H_1: \beta_1 > 0$ is to “reject $H_0$ if $\Pr(t_{n-2} > t) < 0.05$”. If $t \geq 0$ then $\Pr(t_{n-2} > t) = \Pr(t_{n-2} > |t|) = p/2$ using (44). On the other hand, if $t < 0$ then $\Pr(t_{n-2} > t) = 1 - \Pr(t_{n-2} > |t|) > 0.5$ (by the symmetry of the $t_{n-2}$ distribution), so $H_0$ will never be rejected if $t < 0$. This makes intuitive sense since $t < 0$ can only occur if $\hat{\beta}_1 < 0$, and an estimate $\hat{\beta}_1 < 0$ cannot provide evidence to reject $H_0: \beta_1 = 0$ in favour of $H_1: \beta_1 > 0$. So the decision rule for testing $H_0: \beta_1 = 0$ against $H_1: \beta_1 > 0$ is “reject $H_0$ if $t > 0$ and $p/2 < 0.05$”, or more simply
“reject $H_0$ if $t > 0$ and $p < 0.10$”.
That is, to carry out a one-sided test at the 5% level of significance, the comparison of the p-value is made with 0.10, not 0.05. The reason is that the p-value provided by Eviews is (44), which is for testing against two-sided alternatives.
The decision rule for testing $H_0: \beta_1 = 0$ against $H_1: \beta_1 < 0$ is a mirror image of the upper-tailed version, that is
“reject $H_0$ if $t < 0$ and $p < 0.10$”,
so that the null is rejected only for negative estimates of $\beta_1$ whose p-value is less than 0.10.
The one-sided test of the significance of assignment marks in (43) can therefore be re-expressed as follows.
1. $H_0: \beta_1 = 0$
2. $H_1: \beta_1 > 0$
3. Significance level: $\alpha = 0.05$
4. Test statistic: $t = 5.766$, $p = 0.0000$
5. Reject $H_0$ if $t > 0$ and $p < 0.10$
6. $H_0$ is rejected, so there is evidence that higher assignment marks predict significantly higher final exam marks.
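The one-sided rules can be written as a single small function. This is a sketch under the assumption that the p-value supplied is the two-sided value (44) as printed in regression output; the function name `one_sided_reject` and the `alternative` labels are hypothetical, and the second call uses made-up numbers purely to show the sign check.

```python
def one_sided_reject(t_stat, p_two_sided, alpha=0.05, alternative="greater"):
    """Decide a one-sided t test from a two-sided p-value such as (44).

    alternative="greater" tests H1: beta_1 > 0; "less" tests H1: beta_1 < 0.
    """
    if alternative == "greater":
        right_sign = t_stat > 0
    elif alternative == "less":
        right_sign = t_stat < 0
    else:
        raise ValueError("alternative must be 'greater' or 'less'")
    # p / 2 < alpha is the same condition as p < 2 * alpha.
    return right_sign and p_two_sided < 2 * alpha


# Exam regression: t = 5.766 with reported p = 0.0000 (to four decimals).
print(one_sided_reject(5.766, 0.0000, alternative="greater"))
# A hypothetical negative estimate can never support H1: beta_1 > 0.
print(one_sided_reject(-2.5, 0.0130, alternative="greater"))
```

Checking the sign first mirrors the argument in the text: a $t$ statistic on the wrong side of zero never leads to rejection, however small the two-sided p-value.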
The outcome of a hypothesis test carried out using critical values or p-values will always be the same, so the choice comes down to convenience. p-values are usually more convenient and are most often used in practice, so we will generally rely on them from now on.
3.1.10 Testing other null hypotheses
By far the most common hypothesis tests in regression models have a null of the form $H_0: \beta_1 = 0$. However, there are other null hypotheses that can also be of interest. For example, in the exam marks application we might want to test whether an extra 1% gained on assignment marks predicts an extra 1% gained on the final exam. In the regression model (43), this would translate to a null hypothesis of the form $H_0: \beta_1 = 1$.
In general, consider testing a null hypothesis of the form $H_0: \beta_1 = b_1$, where $b_1$ is a specified number (e.g. 0, 1, etc.). The $t$ statistic for testing this null hypothesis is
$$t = \frac{\hat{\beta}_1 - b_1}{\hat{\omega}_{1,n}}. \tag{45}$$
Obviously this reduces to (41) when $b_1 = 0$. The decision rules presented above remain unchanged, both for critical values and p-values.
To illustrate, consider testing $H_0: \beta_1 = 1$ against $H_1: \beta_1 \neq 1$ in the exam regression (43). From the results in Figure 41 we can calculate
$$t = \frac{0.548 - 1}{0.095} = -4.758. \tag{46}$$
The hypothesis test using critical values can then proceed as follows.
1. $H_0: \beta_1 = 1$
2. $H_1: \beta_1 \neq 1$
3. Significance level: $\alpha = 0.05$
4. Test statistic: $t = -4.758$
5. Reject $H_0$ if $|t| > c_{0.025} = 1.987$
6. $H_0$ is rejected, so the predicted change in final exam scores corresponding to a 1% higher assignment score is significantly different from 1%.
Note that a two-sided alternative was used in this case because there is no prior expectation before the analysis whether the coefficient should be greater than one or less than one. Having estimated the regression it appears the coefficient is less than one, but we must not use that information to formulate the alternative hypothesis.
In order to avoid re-calculating the $t$ statistic manually as we did above, and also in order to obtain convenient p-values, the regression model can be re-estimated in a form that makes testing a null hypothesis $H_0: \beta_1 = b_1$ very easy. Suppose in general we have a PRF of the form
$$E(y_i \mid x_i) = \beta_0 + \beta_1 x_i.$$
Subtracting $b_1 x_i$ from both sides gives
$$E(y_i - b_1 x_i \mid x_i) = \beta_0 + (\beta_1 - b_1) x_i = \beta_0 + \gamma_1 x_i,$$
where $\gamma_1 = \beta_1 - b_1$. The null hypothesis $H_0: \beta_1 = b_1$ in the original regression of $y_i$ on $x_i$ can therefore be equivalently re-expressed as $H_0: \gamma_1 = 0$ in the regression of $(y_i - b_1 x_i)$ on $x_i$.
For testing $H_0: \beta_1 = 1$ against $H_1: \beta_1 \neq 1$ in the exam regression (43), the PRF is re-written
$$E(\text{exam}_i - \text{asgnmt}_i \mid \text{asgnmt}_i) = \beta_0 + (\beta_1 - 1)\,\text{asgnmt}_i = \beta_0 + \gamma_1\,\text{asgnmt}_i, \quad \text{where } \gamma_1 = \beta_1 - 1.$$
The results from regressing $(\text{exam}_i - \text{asgnmt}_i)$ on $\text{asgnmt}_i$ are given in Figure 42, which shows that the estimated slope coefficient is $-0.452$ with $t = -4.746$. (This latter $t$ statistic differs from (46) only because of the rounding error induced by the calculation in (46) being carried out using three decimal places in numerator and denominator. Without this rounding error, the two would be identical.) The hypothesis test in terms of p-values then proceeds as follows.
1. $H_0: \beta_1 = 1$
2. $H_1: \beta_1 \neq 1$
3. Significance level: $\alpha = 0.05$
4. Test statistic: $p = 0.0000$
5. Reject $H_0$ if $p < 0.05$
6. $H_0$ is rejected, so the predicted change in final exam scores corresponding to a 1% higher assignment score is significantly different from 1%.
The same conclusions will always be found from this approach (re-specifying the regression) and the previous approach that manually computes the $t$ statistic and uses a critical value. It will usually be more convenient in practice to re-specify the regression and use the p-value that is then automatically provided.
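The equivalence of the two approaches can be checked numerically. The sketch below is an illustration only: it uses simulated (hypothetical) marks, not the course data, and a hand-written simple OLS routine with the homoskedastic standard error in place of Eviews output. The names `simple_ols`, `t_manual` and `t_respecified` are introduced here for the example.

```python
import math
import random


def simple_ols(x, y):
    """OLS of y on an intercept and x; returns (intercept, slope, slope_se).

    The standard error is the homoskedastic one, used only for illustration.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    slope_se = math.sqrt(ssr / (n - 2) / s_xx)
    return intercept, slope, slope_se


# Simulated (hypothetical) marks, not the course data.
random.seed(0)
x = [random.uniform(30, 95) for _ in range(118)]
y = [24 + 0.55 * xi + random.gauss(0, 12) for xi in x]

b1 = 1.0  # value of the slope under H0

# Approach 1: regress y on x and compute the t statistic (45) by hand.
_, slope, se = simple_ols(x, y)
t_manual = (slope - b1) / se

# Approach 2: regress (y - b1 * x) on x and read off the usual t statistic.
y_shifted = [yi - b1 * xi for xi, yi in zip(x, y)]
_, slope_shifted, se_shifted = simple_ols(x, y_shifted)
t_respecified = slope_shifted / se_shifted

print(f"manual t = {t_manual:.6f}, re-specified t = {t_respecified:.6f}")
```

The two regressions have identical residuals, so the standard errors agree and the re-specified slope is simply the original slope minus `b1`.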
Figure 42: Regression of $(\text{exam}_i - \text{asgnmt}_i)$ on $\text{asgnmt}_i$
3.2 Confidence intervals
Confidence intervals provide an alternative method for summarising the uncertainty due to sampling in coefficient estimates. A confidence interval is a pair of numbers that form an interval within which the true value of the parameter is contained with a pre-specified probability. This probability, called the confidence level, is typically chosen to be $1 - \alpha$, where $\alpha$ is the usual significance level used in hypothesis tests. So, for a regression coefficient $\beta_1$, the aim is to find numbers $\underline{\beta}_1$ and $\overline{\beta}_1$ such that
$$\Pr\left(\underline{\beta}_1 \leq \beta_1 \leq \overline{\beta}_1\right) = 1 - \alpha. \tag{47}$$
The derivation of the confidence interval follows from hypothesis tests of the form $H_0: \beta_1 = b_1$ against $H_1: \beta_1 \neq b_1$. If we imagine testing these hypotheses for all possible values of $b_1$, the confidence interval is formed by those values of $b_1$ for which $H_0: \beta_1 = b_1$ is not rejected using a two-sided $t$ test with significance level $\alpha$. To show where this leads, for any $b_1$ the null hypothesis $H_0: \beta_1 = b_1$ is not rejected if the $t$ statistic in (45) satisfies $|t| \leq c_{\alpha/2}$, which implies
$$-c_{\alpha/2} \leq \frac{\hat{\beta}_1 - b_1}{\hat{\omega}_{1,n}} \leq c_{\alpha/2}.$$
These two inequalities can be re-arranged to give
$$\hat{\beta}_1 - c_{\alpha/2}\hat{\omega}_{1,n} \leq b_1 \leq \hat{\beta}_1 + c_{\alpha/2}\hat{\omega}_{1,n}. \tag{48}$$
That is, $H_0: \beta_1 = b_1$ will not be rejected for all $b_1$ in the interval
$$\left[\underline{\beta}_1, \overline{\beta}_1\right] = \left[\hat{\beta}_1 - c_{\alpha/2}\hat{\omega}_{1,n},\ \hat{\beta}_1 + c_{\alpha/2}\hat{\omega}_{1,n}\right], \tag{49}$$
which is the desired confidence interval. It has the desired level because when $b_1$ is the true value of the parameter, the null hypothesis $H_0: \beta_1 = b_1$ is rejected with probability $\alpha$ (this is the definition of the significance level of the test), which implies that it is not rejected with probability $1 - \alpha$. Therefore the true value $\beta_1$ is included in the confidence interval (49) with probability $1 - \alpha$ as required.
To illustrate, consider a confidence interval for the slope coefficient in the salary PRF (42). From the results in Figure 36 we see that $\hat{\beta}_1 = 18.501$ and $\hat{\omega}_{1,n} = 6.829$. The critical value for a two-sided $t$ test with significance level $\alpha = 0.05$ is $c_{0.025} = 1.980$. The 95% confidence interval for $\beta_1$ is therefore
$$\left[\underline{\beta}_1, \overline{\beta}_1\right] = [18.501 - 1.980 \times 6.829,\ 18.501 + 1.980 \times 6.829] = [4.980,\ 32.022]. \tag{50}$$
The interpretation of this interval is that it contains the true value of $\beta_1$ with probability 95%. (In fact this probability of 95% is an approximation because the distribution of $t$ in (38) on which it is based is also approximate. In practice though we usually just talk about a “95% confidence interval”, rather than an approximate or asymptotic 95% confidence interval.) The 95% confidence interval (or “interval estimate”) of the coefficient implies that an increase of 1% in a firm’s Return on Equity predicts an increase in CEO salary of between \$4,980 and \$32,022.
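The interval calculation, and its link to two-sided tests, can be reproduced in a few lines. This is a minimal sketch using the estimate, standard error and critical value quoted in the text; the function names `confidence_interval` and `rejects` and the list of trial null values are illustrative assumptions.

```python
def confidence_interval(estimate, std_error, critical_value):
    """Interval of the form (49): estimate -/+ critical_value * std_error."""
    margin = critical_value * std_error
    return estimate - margin, estimate + margin


def rejects(estimate, std_error, critical_value, null_value):
    """Two-sided t test of H0: beta_1 = null_value using the statistic (45)."""
    return abs((estimate - null_value) / std_error) > critical_value


beta_hat, omega_hat, c = 18.501, 6.829, 1.980   # values quoted in the text
lower, upper = confidence_interval(beta_hat, omega_hat, c)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")

# Null values inside the interval are not rejected; those outside are.
for b1 in (0, 5, 10, 25, 40):
    inside = lower <= b1 <= upper
    print(b1, "inside CI:", inside, "| rejected:", rejects(beta_hat, omega_hat, c, b1))
```

The loop shows the same information from both directions: a trial value lies inside the interval exactly when the corresponding two-sided test does not reject it.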
A confidence interval provides a convenient and informative way to report the findings of a regression. The mid-point of the interval is the point estimate $\hat{\beta}_1$, while its width represents how much uncertainty there is about the estimate. A narrow confidence interval implies the sample has provided a precise estimate of the coefficient. The width of the confidence interval is determined by the standard error $\hat{\omega}_{1,n}$, so a small standard error implies a precise estimate and a narrow confidence interval.
From a hypothesis testing perspective, the confidence interval provides a nice summary of all the null hypotheses that would not be rejected by a two-sided $t$ test (those values within the interval) and all of the null hypotheses that would be rejected (those values outside the interval).
Clearly this is much more informative than simply reporting a coefficient estimate and whether or not it is significantly different from zero (which does happen sometimes...). A confidence interval that does not include zero, such as (50) above, immediately conveys the information that the coefficient estimate is significantly different from zero, but it contains much more information as well.
These ideas also emphasise why in a hypothesis test we never claim to “accept $H_0$”; we only say that we “do not reject $H_0$”. Consider the confidence interval $\left[\underline{\beta}_1, \overline{\beta}_1\right] = [4.980, 32.022]$ constructed above. This implies that $H_0: \beta_1 = b_1$ would not be rejected for all $b_1$ between 4.980 and 32.022. It would be illogical to say that we accept $H_0: \beta_1 = 5$ and $H_0: \beta_1 = 10$ and $H_0: \beta_1 = 25$ and so on; we cannot accept that $\beta_1$ is equal to several different values at once!
Instead we say that the sample does not provide sufficient evidence to reject those values at the specified level of significance.
3.3 Prediction intervals
Suppose we want to make a prediction for $y_i$ for a particular fixed value $x$ of $x_i$. For example, we might want to predict average CEO salary for a Return on Equity of $x = 15\%$, or final exam marks for an assignment mark of $x = 75\%$. The prediction is given by
$$\hat{y}(x) = \hat{\beta}_0 + \hat{\beta}_1 x, \tag{51}$$
and this can be taken as an estimator of the true value
$$y(x) = E(y_i \mid x_i = x) = \beta_0 + \beta_1 x.$$
Just like a confidence interval for the population parameter $\beta_1$, a prediction interval can be calculated for the population conditional mean $E(y_i \mid x_i = x)$, i.e. an interval $\left[\underline{y}(x), \overline{y}(x)\right]$ such that
$$\Pr\left(\underline{y}(x) \leq y(x) \leq \overline{y}(x)\right) = 1 - \alpha;$$
compare to (47) for $\beta_1$.
The distribution of ^y (x) as an estimator of y(x) can be derived to be
^ and an;i was given in (24). This leads to the prediction interval
h
Fortunately there is a convenient way to calculate the required standard error without dealing with the formula. If we take the usual SRF $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ and subtract the prediction formula at $x$ given by (51), we obtain
$$\hat{y}_i = \hat{y}(x) + \hat{\beta}_1 (x_i - x).$$
This shows that an OLS regression of $y_i$ on an intercept and $(x_i - x)$ will provide an intercept that corresponds to $\hat{y}(x)$, and the standard error required for the prediction interval is then simply the standard error on this intercept estimate.
As an example, consider making a prediction for the average final exam mark for an assignment mark of $x = 75\%$. A regression in Eviews specified as “exam c (asgnmt-75)” will produce an intercept corresponding to $\hat{y}(75)$. The Eviews output is shown in Figure 43. The prediction is $\hat{y}(75) = 64.90\%$, with standard error 2.34. The 95% prediction interval based on (52) is therefore
$$[64.90 - 1.987 \times 2.34,\ 64.90 + 1.987 \times 2.34] = [60.25,\ 69.55],$$
where $c_{0.025} = 1.987$ is obtained from the $t$ distribution table with 90 degrees of freedom (the closest to $n - 2 = 116$ in this example). The interpretation of this interval is that it contains the population conditional mean $y(75) = E(\text{exam}_i \mid \text{asgnmt}_i = 75)$ with probability of 95%.
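The re-centring device can be verified on any data set. The sketch below uses simulated (hypothetical) marks, not the course data, and a hand-written OLS routine with a homoskedastic standard error for the intercept; both are assumptions made to keep the example self-contained, and the names `ols_with_intercept_se` and `x_star` are introduced here for illustration.

```python
import math
import random


def ols_with_intercept_se(x, y):
    """OLS of y on an intercept and x.

    Returns (intercept, slope, intercept_se) with a homoskedastic standard
    error, which is an assumption made here to keep the sketch short.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = ssr / (n - 2)
    intercept_se = math.sqrt(sigma2 * (1 / n + x_bar ** 2 / s_xx))
    return intercept, slope, intercept_se


# Simulated (hypothetical) marks, not the course data.
random.seed(1)
asgnmt = [random.uniform(30, 95) for _ in range(118)]
exam = [24 + 0.55 * a + random.gauss(0, 12) for a in asgnmt]

x_star = 75.0  # assignment mark at which to predict

# Direct prediction from the usual regression of exam on asgnmt.
b0, b1, _ = ols_with_intercept_se(asgnmt, exam)
direct_prediction = b0 + b1 * x_star

# The same prediction appears as the intercept of a regression on (asgnmt - 75).
centred = [a - x_star for a in asgnmt]
prediction, _, prediction_se = ols_with_intercept_se(centred, exam)

c = 1.987  # tabulated two-sided 5% critical value for 90 df, as in the text
lower, upper = prediction - c * prediction_se, prediction + c * prediction_se
print(f"prediction = {prediction:.2f}, interval = [{lower:.2f}, {upper:.2f}]")
```

The intercept of the centred regression matches the direct prediction, and its standard error is the quantity needed to form the interval.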
3.3.1 Derivations
These derivations of the distribution of the prediction follow easily from the preceding derivations we did for $\bar{y}$ and $\hat{\beta}_1$, but this subsection is not required for the course.
First recall the representation
$$\hat{\beta}_1 = \sum_{i=1}^{n} a_{n,i}\, y_i,$$
Figure 43: Predicting …nal exam mark for an assignment mark of 75%
and substituting for ^0 and ^1 into (51) gives
^
The variance is
using LIEvar, which can be estimated by
^
where ^ui are the OLS residuals.
The approximate normality of $\hat{y}(x)$ follows from the Central Limit Theorem.
4 Multiple Regression
An extremely useful feature of regression modelling is that it easily allows for the inclusion of more than one explanatory variable. This is very useful for interpreting the roles of individual explanatory variables and potentially for improving predictions. The techniques for OLS estimation and inference that we have discussed for simple regression extend straightforwardly to the multiple regression setting. The models and methods will be discussed here, with formulae postponed until the section on matrix notation for regression.
4.1 Population Regression Function
A linear PRF with multiple explanatory variables $x_{1,i}, \ldots, x_{k,i}$ takes the form
$$E(y_i \mid x_{1,i}, \ldots, x_{k,i}) = \beta_0 + \beta_1 x_{1,i} + \ldots + \beta_k x_{k,i}. \tag{53}$$
That is, the population conditional mean of $y_i$ given $x_{1,i}, \ldots, x_{k,i}$ is specified as a weighted sum of $x_{1,i}, \ldots, x_{k,i}$.
The interpretation of the coefficients $\beta_1, \ldots, \beta_k$ is similar to that in a simple regression, with an important qualification. To interpret $\beta_1$, consider the predicted value of $y_i$ with $x_{1,i}$ increased by one unit and with $x_{2,i}, \ldots, x_{k,i}$ unchanged:
$$E(y_i \mid x_{1,i} + 1, \ldots, x_{k,i}) = \beta_0 + \beta_1 (x_{1,i} + 1) + \ldots + \beta_k x_{k,i}.$$
Then
$$E(y_i \mid x_{1,i} + 1, \ldots, x_{k,i}) - E(y_i \mid x_{1,i}, \ldots, x_{k,i}) = \beta_1,$$
so that we interpret $\beta_1$ as the change in the prediction of $y_i$ corresponding to a one unit increase in $x_{1,i}$, holding $x_{2,i}, \ldots, x_{k,i}$ constant. This aspect of holding all of the other explanatory variables constant leads to the regression coefficient being called a marginal effect or partial effect. In general, for any $j = 1, \ldots, k$, the parameter $\beta_j$ is the change in the predicted value of $y_i$ corresponding to a one unit increase in $x_{j,i}$, holding $x_{h,i}$ constant for all $h \neq j$.
The intercept $\beta_0$ only has a meaningful interpretation if it makes sense for all of $x_{1,i}, \ldots, x_{k,i}$ to take the value zero. In that case $\beta_0$ is the predicted value of $y_i$ when $x_{1,i} = \ldots = x_{k,i} = 0$.
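The "holding other variables constant" interpretation can be illustrated with a toy PRF. The coefficients below are hypothetical numbers chosen only for the example, and the function name `prf` is illustrative; the sketch uses two explanatory variables as a special case of (53).

```python
def prf(x1, x2, beta0=10.0, beta1=2.5, beta2=-0.8):
    """Hypothetical linear PRF with two explanatory variables, as in (53)."""
    return beta0 + beta1 * x1 + beta2 * x2


# Raise x1 by one unit while holding x2 fixed at several different values.
for x2 in (0.0, 3.0, 50.0):
    change = prf(x1=4.0 + 1.0, x2=x2) - prf(x1=4.0, x2=x2)
    print(f"x2 = {x2}: change in prediction = {change:.4f}")

# The intercept is the prediction when every explanatory variable is zero.
print("prediction at x1 = x2 = 0:", prf(0.0, 0.0))
```

Whatever value `x2` is held at, the change in the prediction equals the coefficient on `x1`, which is what makes it a partial effect.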
4.2 Sample Regression Function and OLS
The SRF that estimates the PRF in (53) is
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{1,i} + \ldots + \hat{\beta}_k x_{k,i}, \tag{54}$$
where $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$ are the values that minimise the sum of squared residuals, expressed in matrix notation later. The OLS residuals are denoted
$$\hat{u}_i = y_i - \hat{y}_i.$$
The $R^2$ for the regression is
$$R^2 = \frac{SSE}{SST},$$
which has the same derivation, properties and interpretation as the $R^2$ in a simple regression. That is, $R^2$ measures the proportion of the variance in $y_i$ explained by the regression.
4.3 Example: house price modelling
An example data set from Chapter 4 of Wooldridge contains the following data on house prices and explanatory variables.
price : selling price of the house ($’000)
assess : assessed value prior to sale ($’000)
lotsize : size of the block in square feet