56 CHAPTER 2 Review of Probability
2.5 Suppose that \(Y_1, \ldots, Y_n\) are i.i.d. random variables with a \(N(1, 4)\) distribution. Sketch the probability density of \(\bar{Y}\) when \(n = 2\). Repeat this for \(n = 10\) and \(n = 100\). In words, describe how the densities differ. What is the relationship between your answer and the law of large numbers?
2.6 Suppose that \(Y_1, \ldots, Y_n\) are i.i.d. random variables with the probability distribution given in Figure 2.10a. You want to calculate \(\Pr(\bar{Y} \le 0.1)\). Would it be reasonable to use the normal approximation if \(n = 5\)? What about \(n = 25\) or \(n = 100\)? Explain.
2.7 \(Y\) is a random variable with \(\mu_Y = 0\), \(\sigma_Y = 1\), skewness = 0, and kurtosis = 100. Sketch a hypothetical probability distribution of \(Y\). Explain why \(n\) random variables drawn from this distribution might have some large outliers.
Exercises

2.1 Let \(Y\) denote the number of "heads" that occur when two coins are tossed.
    a. Derive the probability distribution of \(Y\).
    b. Derive the cumulative probability distribution of \(Y\).
    c. Derive the mean and variance of \(Y\).
2.2 Use the probability distribution given in Table 2.2 to compute (a) \(E(Y)\) and \(E(X)\); (b) \(\sigma_X^2\) and \(\sigma_Y^2\); and (c) \(\sigma_{XY}\) and \(\mathrm{corr}(X, Y)\).
2.3 Using the random variables \(X\) and \(Y\) from Table 2.2, consider two new random variables \(W = 3 + 6X\) and \(V = 20 - 7Y\). Compute (a) \(E(W)\) and \(E(V)\); (b) \(\sigma_W^2\) and \(\sigma_V^2\); and (c) \(\sigma_{WV}\) and \(\mathrm{corr}(W, V)\).
2.4 Suppose \(X\) is a Bernoulli random variable with \(P(X = 1) = p\).
    a. Show \(E(X^3) = p\).
    b. Show \(E(X^k) = p\) for \(k > 0\).
    c. Suppose that \(p = 0.3\). Compute the mean, variance, skewness, and kurtosis of \(X\). (Hint: You might find it helpful to use the formulas given in Exercise 2.21.)
2.5 In September, Seattle's daily high temperature has a mean of 70°F and a standard deviation of 7°F. What are the mean, standard deviation, and variance in °C?
2.6 The following table gives the joint probability distribution between employment status and college graduation among those either employed or looking for work (unemployed) in the working age U.S. population for 2008.

                               Unemployed (Y = 0)   Employed (Y = 1)   Total
    Non-college grads (X = 0)        0.037                0.622        0.659
    College grads (X = 1)            0.009                0.332        0.341
    Total                            0.046                0.954        1.000

    a. Compute \(E(Y)\).
    b. The unemployment rate is the fraction of the labor force that is unemployed. Show that the unemployment rate is given by \(1 - E(Y)\).
    c. Calculate \(E(Y \mid X = 1)\) and \(E(Y \mid X = 0)\).
    d. Calculate the unemployment rate for (i) college graduates and (ii) non-college graduates.
    e. A randomly selected member of this population reports being unemployed. What is the probability that this worker is a college graduate? A non-college graduate?
    f. Are educational achievement and employment status independent? Explain.
2.7 In a given population of two-earner male/female couples, male earnings have a mean of $40,000 per year and a standard deviation of $12,000. Female earnings have a mean of $45,000 per year and a standard deviation of $18,000. The correlation between male and female earnings for a couple is 0.80. Let \(C\) denote the combined earnings for a randomly selected couple.
    a. What is the mean of \(C\)?
    b. What is the covariance between male and female earnings?
    c. What is the standard deviation of \(C\)?
    d. Convert the answers to (a) through (c) from U.S. dollars ($) to euros (€).
2.8 The random variable \(Y\) has a mean of 1 and a variance of 4. Let \(Z = \tfrac{1}{2}(Y - 1)\). Show that \(\mu_Z = 0\) and \(\sigma_Z^2 = 1\).
2.9 \(X\) and \(Y\) are discrete random variables with the following joint distribution:

                              Value of Y
                      14      22      30      40      65
    Value of X   1   0.02    0.05    0.10    0.03    0.01
                 5   0.17    0.15    0.05    0.02    0.01
                 8   0.02    0.03    0.15    0.10    0.09

    That is, \(\Pr(X = 1, Y = 14) = 0.02\), and so forth.
    a. Calculate the probability distribution, mean, and variance of \(Y\).
    b. Calculate the probability distribution, mean, and variance of \(Y\) given \(X = 8\).
    c. Calculate the covariance and correlation between \(X\) and \(Y\).
2.10 Compute the following probabilities:
    a. If \(Y\) is distributed \(N(1, 4)\), find \(\Pr(Y \le 3)\).
    b. If \(Y\) is distributed \(N(3, 9)\), find \(\Pr(Y > 0)\).
    c. If \(Y\) is distributed \(N(50, 25)\), find \(\Pr(40 < Y < 52)\).
    d. If \(Y\) is distributed \(N(5, 2)\), find \(\Pr(6 < Y < 8)\).
2.11 Compute the following probabilities:
    a. If \(Y\) is distributed \(\chi^2_4\), find \(\Pr(Y < 7.78)\).
    b. If \(Y\) is distributed \(\chi^2_{10}\), find \(\Pr(Y > 18.31)\).
    c. If \(Y\) is distributed \(F_{10,\infty}\), find \(\Pr(Y > 1.83)\).
    d. Why are the answers to (b) and (c) the same?
    e. If \(Y\) is distributed \(\chi^2_1\), find \(\Pr(Y < 1.0)\). (Hint: Use the definition of the \(\chi^2_1\) distribution.)
2.12 Compute the following probabilities:
    a. If \(Y\) is distributed \(t_{15}\), find \(\Pr(Y > 1.75)\).
    b. If \(Y\) is distributed \(t_{90}\), find \(\Pr(-1.99 < Y < 1.99)\).
    c. If \(Y\) is distributed \(N(0, 1)\), find \(\Pr(-1.99 < Y < 1.99)\).
    d. Why are the answers to (b) and (c) approximately the same?
    e. If \(Y\) is distributed \(F_{7,4}\), find \(\Pr(Y > 4.12)\).
    f. If \(Y\) is distributed \(F_{7,120}\), find \(\Pr(Y > 2.79)\).
2.13 \(X\) is a Bernoulli random variable with \(\Pr(X = 1) = 0.99\), \(Y\) is distributed \(N(0, 1)\), \(W\) is distributed \(N(0, 100)\), and \(X\), \(Y\), and \(W\) are independent. Let \(S = XY + (1 - X)W\). (That is, \(S = Y\) when \(X = 1\), and \(S = W\) when \(X = 0\).)
    a. Show that \(E(Y^2) = 1\) and \(E(W^2) = 100\).
    b. Show that \(E(Y^3) = 0\) and \(E(W^3) = 0\). (Hint: What is the skewness for a symmetric distribution?)
    c. Show that \(E(Y^4) = 3\) and \(E(W^4) = 3 \times 100^2\). (Hint: Use the fact that the kurtosis is 3 for a normal distribution.)
    d. Derive \(E(S)\), \(E(S^2)\), \(E(S^3)\), and \(E(S^4)\). (Hint: Use the law of iterated expectations conditioning on \(X = 0\) and \(X = 1\).)
    e. Derive the skewness and kurtosis for \(S\).
2.14 In a population \(\mu_Y = 100\) and \(\sigma_Y^2 = 43\). Use the central limit theorem to answer the following questions:
    a. In a random sample of size \(n = 100\), find \(\Pr(\bar{Y} < 101)\).
    b. In a random sample of size \(n = 165\), find \(\Pr(\bar{Y} > 98)\).
    c. In a random sample of size \(n = 64\), find \(\Pr(101 \le \bar{Y} \le 103)\).
2.15 Suppose \(Y_i\), \(i = 1, 2, \ldots, n\), are i.i.d. random variables, each distributed \(N(10, 4)\).
    a. Compute \(\Pr(9.6 \le \bar{Y} \le 10.4)\) when (i) \(n = 20\), (ii) \(n = 100\), and
    b. Suppose \(c\) is a positive number. Show that \(\Pr(10 - c \le \bar{Y} \le 10 + c)\) becomes close to 1.0 as \(n\) grows large.
    c. Use your answer in (b) to argue that \(\bar{Y}\) converges in probability to 10.
2.16 \(Y\) is distributed \(N(5, 100)\) and you want to calculate \(\Pr(Y < 3.6)\). Unfortunately, you do not have your textbook and do not have access to a normal probability table like Appendix Table 1. However, you do have your computer and a computer program that can generate i.i.d. draws from the \(N(5, 100)\) distribution. Explain how you can use your computer to compute an accurate approximation for \(\Pr(Y < 3.6)\).
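The simulation idea in Exercise 2.16 can be sketched in a few lines. The following is only an illustration under stated assumptions: the function name `simulate_prob_below`, the number of draws, and the seed are hypothetical choices, not part of the exercise. It draws from \(N(5, 100)\) and reports the fraction of draws below 3.6; by the law of large numbers, that fraction should settle near the true probability as the number of draws grows.

```python
import random
from statistics import NormalDist


def simulate_prob_below(threshold, mean, variance, n_draws, seed=0):
    """Estimate Pr(Y < threshold) for Y ~ N(mean, variance) by simulation."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    sd = variance ** 0.5               # random.gauss takes a standard deviation
    hits = sum(rng.gauss(mean, sd) < threshold for _ in range(n_draws))
    return hits / n_draws              # fraction of draws below the threshold


# Hypothetical settings: 200,000 draws from N(5, 100).
estimate = simulate_prob_below(3.6, mean=5.0, variance=100.0, n_draws=200_000)

# The exact normal CDF is used here only as a cross-check on the simulation.
exact = NormalDist(mu=5.0, sigma=10.0).cdf(3.6)
print(f"simulated: {estimate:.4f}   normal CDF: {exact:.4f}")
```

Because each draw is a Bernoulli indicator, the simulation error shrinks at rate \(1/\sqrt{n}\), so more draws buy accuracy only slowly.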
2.17 \(Y_i\), \(i = 1, \ldots, n\), are i.i.d. Bernoulli random variables with \(p = 0.4\). Let \(\bar{Y}\) denote the sample mean.
    a. Use the central limit theorem to compute approximations for
        i. \(\Pr(\bar{Y} \ge 0.43)\) when \(n = 100\).
    b. How large would \(n\) need to be to ensure that \(\Pr(0.39 \le \bar{Y} \le 0.41) \ge 0.95\)? (Use the central limit theorem to compute an approximate answer.)
2.18 In any year, the weather can inflict storm damage to a home. From year to year, the damage is random. Let \(Y\) denote the dollar value of damage in any given year. Suppose that in 95% of the years \(Y = \$0\), but in 5% of the years \(Y = \$20{,}000\).
    a. What are the mean and standard deviation of the damage in any year?
    b. Consider an "insurance pool" of 100 people whose homes are sufficiently dispersed so that, in any year, the damage to different homes can be viewed as independently distributed random variables. Let \(\bar{Y}\) denote the average damage to these 100 homes in a year. (i) What is the expected value of the average damage \(\bar{Y}\)? (ii) What is the probability that \(\bar{Y}\) exceeds $2000?
2.19 Consider two random variables \(X\) and \(Y\). Suppose that \(Y\) takes on \(k\) values \(y_1, \ldots, y_k\) and that \(X\) takes on \(l\) values \(x_1, \ldots, x_l\).
    a. Show that \(\Pr(Y = y_j) = \sum_{i=1}^{l} \Pr(Y = y_j \mid X = x_i)\Pr(X = x_i)\). [Hint: Use the definition of \(\Pr(Y = y_j \mid X = x_i)\).]
    b. Use your answer to (a) to verify Equation (2.19).
    c. Suppose that \(X\) and \(Y\) are independent. Show that \(\sigma_{XY} = 0\) and \(\mathrm{corr}(X, Y) = 0\).
2.20 Consider three random variables \(X\), \(Y\), and \(Z\). Suppose that \(Y\) takes on \(k\) values \(y_1, \ldots, y_k\), that \(X\) takes on \(l\) values \(x_1, \ldots, x_l\), and that \(Z\) takes on \(m\) values \(z_1, \ldots, z_m\). The joint probability distribution of \(X\), \(Y\), \(Z\) is \(\Pr(X = x, Y = y, Z = z)\), and the conditional probability distribution of \(Y\) given \(X\) and \(Z\) is
\[ \Pr(Y = y \mid X = x, Z = z) = \frac{\Pr(Y = y, X = x, Z = z)}{\Pr(X = x, Z = z)}. \]
    a. Explain how the marginal probability that \(Y = y\) can be calculated from the joint probability distribution. [Hint: This is a generalization of Equation (2.16).]
    b. Show that \(E(Y) = E[E(Y \mid X, Z)]\). [Hint: This is a generalization of Equations (2.19) and (2.20).]
2.21 \(X\) is a random variable with moments \(E(X)\), \(E(X^2)\), \(E(X^3)\), and so forth.
    a. Show \(E(X - \mu)^3 = E(X^3) - 3[E(X^2)][E(X)] + 2[E(X)]^3\).
    b. Show \(E(X - \mu)^4 = E(X^4) - 4[E(X)][E(X^3)] + 6[E(X)]^2[E(X^2)] - 3[E(X)]^4\).
2.22 Suppose you have some money to invest (for simplicity, $1) and you are planning to put a fraction \(w\) into a stock market mutual fund and the rest, \(1 - w\), into a bond mutual fund. Suppose that $1 invested in a stock fund yields \(R_s\) after 1 year and that $1 invested in a bond fund yields \(R_b\). Suppose that \(R_s\) is random with mean 0.08 (8%) and standard deviation 0.07, and suppose that \(R_b\) is random with mean 0.05 (5%) and standard deviation 0.04. The correlation between \(R_s\) and \(R_b\) is 0.25. If you place a fraction \(w\) of your money in the stock fund and the rest, \(1 - w\), in the bond fund, then the return on your investment is \(R = wR_s + (1 - w)R_b\).
    a. Suppose that \(w = 0.5\). Compute the mean and standard deviation of \(R\).
    b. Suppose that \(w = 0.75\). Compute the mean and standard deviation of \(R\).
    c. What value of \(w\) makes the mean of \(R\) as large as possible? What is the standard deviation of \(R\) for this value of \(w\)?
    d. (Harder) What is the value of \(w\) that minimizes the standard deviation of \(R\)? (Show using a graph, algebra, or calculus.)
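The return \(R = wR_s + (1 - w)R_b\) in Exercise 2.22 is a linear combination of two correlated random variables, so its mean and variance follow from the rules for sums of random variables (Equation (2.31) for the variance). The sketch below is one way to organize that arithmetic; the function name `portfolio_mean_sd` and its argument names are hypothetical, chosen for illustration only.

```python
import math


def portfolio_mean_sd(w, mean_s, sd_s, mean_b, sd_b, corr):
    """Mean and standard deviation of R = w*Rs + (1 - w)*Rb."""
    cov_sb = corr * sd_s * sd_b                    # covariance from correlation
    mean_r = w * mean_s + (1 - w) * mean_b         # linearity of expectation
    var_r = (w ** 2 * sd_s ** 2                    # a^2 * var(Rs)
             + (1 - w) ** 2 * sd_b ** 2            # b^2 * var(Rb)
             + 2 * w * (1 - w) * cov_sb)           # 2ab * cov(Rs, Rb)
    return mean_r, math.sqrt(var_r)


# Evaluate the two weights named in parts (a) and (b) of the exercise.
for w in (0.5, 0.75):
    mean_r, sd_r = portfolio_mean_sd(w, 0.08, 0.07, 0.05, 0.04, 0.25)
    print(f"w = {w}: mean = {mean_r:.4f}, sd = {sd_r:.4f}")
```

Evaluating the same function over a grid of \(w\) values is one simple way to approach the graphical version of part (d).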
2.23 This exercise provides an example of a pair of random variables \(X\) and \(Y\) for which the conditional mean of \(Y\) given \(X\) depends on \(X\) but \(\mathrm{corr}(X, Y) = 0\). Let \(X\) and \(Z\) be two independently distributed standard normal random variables, and let \(Y = X^2 + Z\).
    a. Show that \(E(Y \mid X) = X^2\).
    b. Show that \(\mu_Y = 1\).
    c. Show that \(E(XY) = 0\). (Hint: Use the fact that the odd moments of a standard normal random variable are all zero.)
    d. Show that \(\mathrm{cov}(X, Y) = 0\) and thus \(\mathrm{corr}(X, Y) = 0\).
2.24 Suppose \(Y_i\) is distributed i.i.d. \(N(0, \sigma^2)\) for \(i = 1, 2, \ldots, n\).
    a. Show that \(E(Y_i^2/\sigma^2) = 1\).
    b. Show that \(W = (1/\sigma^2)\sum_{i=1}^{n} Y_i^2\) is distributed \(\chi^2_n\).
    c. Show that \(E(W) = n\). [Hint: Use your answer to (a).]
    d. Show that \(V = Y_1 \Big/ \sqrt{\sum_{i=2}^{n} Y_i^2/(n - 1)}\) is distributed \(t_{n-1}\).
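2.25 (Review of summation notation.) Let \(x_1, \ldots, x_n\) denote a sequence of numbers, \(y_1, \ldots, y_n\) denote another sequence of numbers, and \(a\), \(b\), and \(c\) denote three constants. Show that
    a. \(\sum_{i=1}^{n} a x_i = a \sum_{i=1}^{n} x_i\)
    b. \(\sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i\)
    c. \(\sum_{i=1}^{n} a = na\)
    d. \(\sum_{i=1}^{n} (a + b x_i + c y_i)^2 = na^2 + b^2\sum_{i=1}^{n} x_i^2 + c^2\sum_{i=1}^{n} y_i^2 + 2ab\sum_{i=1}^{n} x_i + 2ac\sum_{i=1}^{n} y_i + 2bc\sum_{i=1}^{n} x_i y_i\)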
2.26 Suppose that \(Y_1, Y_2, \ldots, Y_n\) are random variables with a common mean \(\mu_Y\), a common variance \(\sigma_Y^2\), and the same correlation \(\rho\) (so that the correlation between \(Y_i\) and \(Y_j\) is equal to \(\rho\) for all pairs \(i\) and \(j\), where \(i \ne j\)).
    a. Show that \(\mathrm{cov}(Y_i, Y_j) = \rho\sigma_Y^2\) for \(i \ne j\).
    b. Suppose that \(n = 2\). Show that \(E(\bar{Y}) = \mu_Y\) and \(\mathrm{var}(\bar{Y}) = \tfrac{1}{2}\sigma_Y^2 + \tfrac{1}{2}\rho\sigma_Y^2\).
    c. For \(n \ge 2\), show that \(E(\bar{Y}) = \mu_Y\) and \(\mathrm{var}(\bar{Y}) = \sigma_Y^2/n + [(n - 1)/n]\rho\sigma_Y^2\).
    d. When \(n\) is very large, show that \(\mathrm{var}(\bar{Y}) \approx \rho\sigma_Y^2\).
2.27 \(X\) and \(Z\) are two jointly distributed random variables. Suppose you know the value of \(Z\), but not the value of \(X\). Let \(\tilde{X} = E(X \mid Z)\) denote a guess of the value of \(X\) using the information on \(Z\), and let \(W = X - \tilde{X}\) denote the error associated with this guess.
    a. Show that \(E(W) = 0\). (Hint: Use the law of iterated expectations.)
    b. Show that \(E(WZ) = 0\).
    c. Let \(\hat{X} = g(Z)\) denote another guess of \(X\) using \(Z\), and \(V = X - \hat{X}\) denote its error. Show that \(E(V^2) \ge E(W^2)\). [Hint: Let \(h(Z) = g(Z) - E(X \mid Z)\), so that \(V = [X - E(X \mid Z)] - h(Z)\). Derive \(E(V^2)\).]
APPENDIX 2.1 Derivation of Results in Key Concept 2.3

This appendix derives the equations in Key Concept 2.3.
Equation (2.29) follows from the definition of the expectation.
To derive Equation (2.30), use the definition of the variance to write
\[ \mathrm{var}(a + bY) = E\{[a + bY - E(a + bY)]^2\} = E\{[b(Y - \mu_Y)]^2\} = b^2 E[(Y - \mu_Y)^2] = b^2\sigma_Y^2. \]
To derive Equation (2.31), use the definition of the variance to write
\[
\begin{aligned}
\mathrm{var}(aX + bY) &= E\{[(aX + bY) - (a\mu_X + b\mu_Y)]^2\} \\
&= E\{[a(X - \mu_X) + b(Y - \mu_Y)]^2\} \\
&= E[a^2(X - \mu_X)^2] + 2E[ab(X - \mu_X)(Y - \mu_Y)] + E[b^2(Y - \mu_Y)^2] \\
&= a^2\,\mathrm{var}(X) + 2ab\,\mathrm{cov}(X, Y) + b^2\,\mathrm{var}(Y) \\
&= a^2\sigma_X^2 + 2ab\,\sigma_{XY} + b^2\sigma_Y^2,
\end{aligned}
\tag{2.49}
\]
where the second equality follows by collecting terms, the third equality follows by expanding the quadratic, and the fourth equality follows by the definition of the variance and covariance.
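The identity for the variance of \(aX + bY\) can be checked numerically on any small joint distribution. The sketch below does so exactly (no simulation) for a made-up joint probability table; the table, the constants \(a\) and \(b\), and the helper name `expect` are all illustrative assumptions rather than anything from the text.

```python
# Hypothetical joint distribution of (X, Y): {(x, y): probability}.
joint = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}
a, b = 2.0, -3.0  # arbitrary constants chosen for the check


def expect(g):
    """E[g(X, Y)] under the joint distribution above."""
    return sum(p * g(x, y) for (x, y), p in joint.items())


mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
cov_xy = expect(lambda x, y: (x - mu_x) * (y - mu_y))

# Left-hand side: variance of aX + bY computed directly from its definition.
mu_lin = a * mu_x + b * mu_y
direct = expect(lambda x, y: (a * x + b * y - mu_lin) ** 2)

# Right-hand side: a^2 var(X) + 2ab cov(X, Y) + b^2 var(Y).
formula = a ** 2 * var_x + 2 * a * b * cov_xy + b ** 2 * var_y

print(f"direct = {direct:.6f}, formula = {formula:.6f}")
```

The two numbers agree up to floating-point rounding, which mirrors the step in the derivation where the squared term is expanded and the expectation is taken term by term.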
To derive Equation (2.32), write \(E(Y^2) = E\{[(Y - \mu_Y) + \mu_Y]^2\} = E[(Y - \mu_Y)^2] + 2\mu_Y E(Y - \mu_Y) + \mu_Y^2 = \sigma_Y^2 + \mu_Y^2\) because \(E(Y - \mu_Y) = 0\).
To derive Equation (2.33), use the definition of the covariance to write
\[
\begin{aligned}
\mathrm{cov}(a + bX + cV, Y) &= E\{[a + bX + cV - E(a + bX + cV)][Y - \mu_Y]\} \\
&= E\{[b(X - \mu_X) + c(V - \mu_V)][Y - \mu_Y]\} \\
&= E\{[b(X - \mu_X)][Y - \mu_Y]\} + E\{[c(V - \mu_V)][Y - \mu_Y]\} \\
&= b\,\sigma_{XY} + c\,\sigma_{VY},
\end{aligned}
\tag{2.50}
\]
which is Equation (2.33).
To derive Equation (2.34), write \(E(XY) = E\{[(X - \mu_X) + \mu_X][(Y - \mu_Y) + \mu_Y]\} = E[(X - \mu_X)(Y - \mu_Y)] + \mu_X E(Y - \mu_Y) + \mu_Y E(X - \mu_X) + \mu_X\mu_Y = \sigma_{XY} + \mu_X\mu_Y\).
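We now prove the correlation inequality in Equation (2.35); that is, \(|\mathrm{corr}(X, Y)| \le 1\). Let \(a = -\sigma_{XY}/\sigma_X^2\) and \(b = 1\). Applying Equation (2.31), we have that
\[
\begin{aligned}
\mathrm{var}(aX + Y) &= a^2\sigma_X^2 + \sigma_Y^2 + 2a\,\sigma_{XY} \\
&= (-\sigma_{XY}/\sigma_X^2)^2\sigma_X^2 + \sigma_Y^2 + 2(-\sigma_{XY}/\sigma_X^2)\sigma_{XY} \\
&= \sigma_Y^2 - \sigma_{XY}^2/\sigma_X^2.
\end{aligned}
\tag{2.51}
\]
Because \(\mathrm{var}(aX + Y)\) is a variance, it cannot be negative, so from the final line of Equation (2.51) it must be that \(\sigma_Y^2 - \sigma_{XY}^2/\sigma_X^2 \ge 0\). Rearranging this inequality yields
\[ \sigma_{XY}^2 \le \sigma_X^2\sigma_Y^2 \quad \text{(covariance inequality)}. \tag{2.52} \]
The covariance inequality implies that \(\sigma_{XY}^2/(\sigma_X^2\sigma_Y^2) \le 1\) or, equivalently, \(|\sigma_{XY}/(\sigma_X\sigma_Y)| \le 1\), which (using the definition of the correlation) proves the correlation inequality, \(|\mathrm{corr}(X, Y)| \le 1\).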
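As a numerical illustration of the argument above, the sketch below takes a small made-up joint distribution (an assumption for illustration only), sets \(a = -\sigma_{XY}/\sigma_X^2\), and confirms that \(\mathrm{var}(aX + Y)\) equals \(\sigma_Y^2 - \sigma_{XY}^2/\sigma_X^2\), is nonnegative, and that the implied correlation lies in \([-1, 1]\).

```python
import math

# Hypothetical joint distribution of (X, Y): {(x, y): probability}.
joint = {(-1, 2): 0.25, (0, 0): 0.5, (2, 1): 0.25}


def expect(g):
    """E[g(X, Y)] under the joint distribution above."""
    return sum(p * g(x, y) for (x, y), p in joint.items())


mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
cov_xy = expect(lambda x, y: (x - mu_x) * (y - mu_y))

# The particular choice of a used in the proof of the correlation inequality.
a = -cov_xy / var_x

# var(aX + Y), computed directly as E[(a(X - mu_x) + (Y - mu_y))^2].
var_combo = expect(lambda x, y: (a * (x - mu_x) + (y - mu_y)) ** 2)

# Final line of the derivation: sigma_Y^2 - sigma_XY^2 / sigma_X^2.
bound = var_y - cov_xy ** 2 / var_x

corr = cov_xy / math.sqrt(var_x * var_y)
print(f"var(aX + Y) = {var_combo:.6f}, bound = {bound:.6f}, corr = {corr:.4f}")
```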
96 CHAPTER 3 Review of Statistics

causal effect (84)
treatment effect (84)
scatterplot (91)
sample covariance (91)
sample correlation coefficient (sample correlation) (92)
power of a test (77)
one-sided alternative hypothesis (79)
confidence set (79)
confidence level (79)
confidence interval (79)
coverage probability (81)
test for the difference between two means (81)

Review the Concepts
3.1 Explain the difference between the sample average \(\bar{Y}\) and the population mean.
3.2 Explain the difference between an estimator and an estimate. Provide an example of each.
3.3 A population distribution has a mean of 10 and a variance of 16. Determine the mean and variance of \(\bar{Y}\) from an i.i.d. sample from this population for (a) \(n = 10\); (b) \(n = 100\); and (c) \(n = 1000\). Relate your answers to the law of large numbers.
3.4 What role does the central limit theorem play in statistical hypothesis testing? In the construction of confidence intervals?
3.5 What is the difference between a null and alternative hypothesis? Among size, significance level, and power? Between a one-sided alternative hypothesis and a two-sided alternative hypothesis?
3.6 Why does a confidence interval contain more information than the result of a single hypothesis test?
3.7 Explain why the differences-of-means estimator, applied to data from a randomized controlled experiment, is an estimator of the treatment effect.
3.8 Sketch a hypothetical scatterplot for a sample of size 10 for two random variables with a population correlation of (a) 1.0; (b) -1.0; (c) 0.9; (d) 0.5; (e) 0.0.
Exercises
3.1 In a population, \(\mu_Y = 100\) and \(\sigma_Y^2 = 43\). Use the central limit theorem to answer the following questions:
    a. In a random sample of size \(n = 100\), find \(\Pr(\bar{Y} < 101)\).
    b. In a random sample of size \(n = 64\), find \(\Pr(101 < \bar{Y} < 103)\).
    c. In a random sample of size \(n = 165\), find \(\Pr(\bar{Y} > 98)\).
3.2 Let \(Y\) be a Bernoulli random variable with success probability \(\Pr(Y = 1) = p\), and let \(Y_1, \ldots, Y_n\) be i.i.d. draws from this distribution. Let \(\hat{p}\) be the fraction of successes (1s) in this sample.
    a. Show that \(\hat{p} = \bar{Y}\).
    b. Show that \(\hat{p}\) is an unbiased estimator of \(p\).
    c. Show that \(\mathrm{var}(\hat{p}) = p(1 - p)/n\).
3.3 In a survey of 400 likely voters, 215 responded that they would vote for the incumbent and 185 responded that they would vote for the challenger. Let \(p\) denote the fraction of all likely voters who preferred the incumbent at the time of the survey, and let \(\hat{p}\) be the fraction of survey respondents who preferred the incumbent.
    a. Use the survey results to estimate \(p\).
    b. Use the estimator of the variance of \(\hat{p}\), \(\hat{p}(1 - \hat{p})/n\), to calculate the standard error of your estimator.
    c. What is the p-value for the test \(H_0: p = 0.5\) vs. \(H_1: p \ne 0.5\)?
    d. What is the p-value for the test \(H_0: p = 0.5\) vs. \(H_1: p > 0.5\)?
    e. Why do the results from (c) and (d) differ?
    f. Did the survey contain statistically significant evidence that the incumbent was ahead of the challenger at the time of the survey? Explain.
3.4 Using the data in Exercise 3.3:
    a. Construct a 95% confidence interval for \(p\).
    b. Construct a 99% confidence interval for \(p\).
    c. Why is the interval in (b) wider than the interval in (a)?
    d. Without doing any additional calculations, test the hypothesis \(H_0: p = 0.50\) vs. \(H_1: p \ne 0.50\) at the 5% significance level.
3.5 A survey of 1055 registered voters is conducted, and the voters are asked to choose between candidate A and candidate B. Let \(p\) denote the fraction of voters in the population who prefer candidate A, and let \(\hat{p}\) denote the fraction of voters in the sample who prefer candidate A.
    a. You are interested in the competing hypotheses \(H_0: p = 0.5\) vs. \(H_1: p \ne 0.5\). Suppose that you decide to reject \(H_0\) if \(|\hat{p} - 0.5| > 0.02\).
        i. What is the size of this test?
        ii. Compute the power of this test if \(p = 0.53\).
    b. In the survey, \(\hat{p} = 0.54\).
        i. Test \(H_0: p = 0.5\) vs. \(H_1: p \ne 0.5\) using a 5% significance level.
        ii. Test \(H_0: p = 0.5\) vs. \(H_1: p > 0.5\) using a 5% significance level.
        iii. Construct a 95% confidence interval for \(p\).
        iv. Construct a 99% confidence interval for \(p\).
        v. Construct a 50% confidence interval for \(p\).
    c. Suppose that the survey is carried out 20 times, using independently selected voters in each survey. For each of these 20 surveys, a 95% confidence interval for \(p\) is constructed.
        i. What is the probability that the true value of \(p\) is contained in all 20 of these confidence intervals?
        ii. How many of these confidence intervals do you expect to contain the true value of \(p\)?
    d. In survey jargon, the "margin of error" is \(1.96 \times SE(\hat{p})\); that is, it is half the length of the 95% confidence interval. Suppose you wanted to design a survey that had a margin of error of at most 1%. That is, you wanted \(\Pr(|\hat{p} - p| > 0.01) \le 0.05\). How large should \(n\) be if the survey uses simple random sampling?
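The margin-of-error definition in part (d) can be turned into a sample-size rule by solving \(1.96\sqrt{p(1 - p)/n} \le m\) for \(n\). The sketch below does this under an assumption that is not stated in the exercise: since \(p\) is unknown when the survey is designed, it uses the worst case \(p = 0.5\), where \(p(1 - p)\) is largest. The function name `required_sample_size` is hypothetical.

```python
import math


def required_sample_size(margin, z=1.96, p=0.5):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= margin.

    p = 0.5 is an assumed worst case: p * (1 - p) is maximized there,
    so the resulting n is large enough for any true p.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


n = required_sample_size(margin=0.01)
achieved = 1.96 * math.sqrt(0.25 / n)   # margin of error actually delivered
print(f"n = {n}, achieved margin = {achieved:.5f}")
```

With a 1% margin this comes out at roughly 9,600 respondents; halving the margin quadruples the required sample, because \(n\) scales with \(1/m^2\).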
3.6 Let \(Y_1, \ldots, Y_n\) be i.i.d. draws from a distribution with mean \(\mu\). A test of \(H_0: \mu = 5\) versus \(H_1: \mu \ne 5\) using the usual t-statistic yields a p-value of 0.03.
    a. Does the 95% confidence interval contain \(\mu = 5\)? Explain.
    b. Can you determine if \(\mu = 6\) is contained in the 95% confidence interval? Explain.
3.7 In a given population, 11% of the likely voters are African American. A survey using a simple random sample of 600 landline telephone numbers finds 8% African Americans. Is there evidence that the survey is biased? Explain.
3.8 A new version of the SAT test is given to 1000 randomly selected high school seniors. The sample mean test score is 1110, and the sample standard deviation is 123. Construct a 95% confidence interval for the population mean test score for high school seniors.
3.9 Suppose that a lightbulb manufacturing plant produces bulbs with a mean life of 2000 hours and a standard deviation of 200 hours. An inventor claims to have developed an improved process that produces bulbs with a longer mean life and the same standard deviation. The plant manager randomly selects 100 bulbs produced by the process. She says that she will believe the inventor's claim if the sample mean life of the bulbs is greater than 2100 hours; otherwise, she will conclude that the new process is no better than the old process. Let \(\mu\) denote the mean of the new process. Consider the null and alternative hypotheses \(H_0: \mu = 2000\) vs. \(H_1: \mu > 2000\).
    a. What is the size of the plant manager's testing procedure?
    b. Suppose the new process is in fact better and has a mean bulb life of 2150 hours. What is the power of the plant manager's testing procedure?
    c. What testing procedure should the plant manager use if she wants the size of her test to be 5%?
3.10 Suppose a new standardized test is given to 100 randomly selected third-grade students in New Jersey. The sample average score \(\bar{Y}\) on the test is 58 points, and the sample standard deviation, \(s_Y\), is 8 points.
    a. The authors plan to administer the test to all third-grade students in New Jersey. Construct a 95% confidence interval for the mean score of all New Jersey third graders.
    b. Suppose the same test is given to 200 randomly selected third graders from Iowa, producing a sample average of 62 points and sample standard deviation of 11 points. Construct a 90% confidence interval for the difference in mean scores between Iowa and New Jersey.
    c. Can you conclude with a high degree of confidence that the population means for Iowa and New Jersey students are different? (What is the standard error of the difference in the two sample means? What is the p-value of the test of no difference in means versus some difference?)
3.11 Consider the estimator \(\tilde{Y}\), defined in Equation (3.1). Show that (a) \(E(\tilde{Y}) = \mu_Y\) and (b) \(\mathrm{var}(\tilde{Y}) = 1.25\sigma_Y^2/n\).
3.12 To investigate possible gender discrimination in a firm, a sample of 100 men and 64 women with similar job descriptions are selected at random. A summary of the resulting monthly salaries follows:

             Average Salary (\(\bar{Y}\))   Standard Deviation (\(s_Y\))    n
    Men               $3100                          $200                  100
    Women             $2900                          $320                   64

    a. What do these data suggest about wage differences in the firm? Do they represent statistically significant evidence that average wages of men and women are different? (To answer this question, first state the null and alternative hypotheses; second, compute the relevant t-statistic; third, compute the p-value associated with the t-statistic; and finally, use the p-value to answer the question.)
    b. Do these data suggest that the firm is guilty of gender discrimination in its compensation policies? Explain.
3.13 Data on fifth-grade test scores (reading and mathematics) for 420 school districts in California yield \(\bar{Y} = 646.2\) and standard deviation \(s_Y = 19.5\).
    a. Construct a 95% confidence interval for the mean test score in the population.
    b. When the districts were divided into districts with small classes (< 20 students per teacher) and large classes (≥ 20 students per teacher), the following results were found:

    Class Size   Average Score (\(\bar{Y}\))   Standard Deviation (\(s_Y\))    n
    Small                 657.4                          19.4                 238
    Large                 650.0                          17.9                 182

    Is there statistically significant evidence that the districts with smaller classes have higher average test scores? Explain.
3.14 Values of height in inches (\(X\)) and weight in pounds (\(Y\)) are recorded from a sample of 300 male college students. The resulting summary statistics are \(\bar{X} = 70.5\) in., \(\bar{Y} = 158\) lb, \(s_X = 1.8\) in., \(s_Y = 14.2\) lb, \(s_{XY} = 21.73\) in. × lb, and \(r_{XY} = 0.85\). Convert these statistics to the metric system (meters and kilograms).
3.15 Let \(Y_a\) and \(Y_b\) denote Bernoulli random variables from two different populations, denoted \(a\) and \(b\). Suppose that \(E(Y_a) = p_a\) and \(E(Y_b) = p_b\). A random sample of size \(n_a\) is chosen from population \(a\), with sample average denoted \(\hat{p}_a\), and a random sample of size \(n_b\) is chosen from population \(b\), with sample average denoted \(\hat{p}_b\). Suppose the sample from population \(a\) is independent of the sample from population \(b\).
    a. Show that \(E(\hat{p}_a) = p_a\) and \(\mathrm{var}(\hat{p}_a) = p_a(1 - p_a)/n_a\). Show that \(E(\hat{p}_b) = p_b\) and \(\mathrm{var}(\hat{p}_b) = p_b(1 - p_b)/n_b\).
    b. Show that \(\mathrm{var}(\hat{p}_a - \hat{p}_b) = \dfrac{p_a(1 - p_a)}{n_a} + \dfrac{p_b(1 - p_b)}{n_b}\). (Hint: Remember that the samples are independent.)
    c. Suppose that \(n_a\) and \(n_b\) are large. Show that a 95% confidence interval for \(p_a - p_b\) is given by
    \[ (\hat{p}_a - \hat{p}_b) \pm 1.96\sqrt{\frac{\hat{p}_a(1 - \hat{p}_a)}{n_a} + \frac{\hat{p}_b(1 - \hat{p}_b)}{n_b}}. \]
    How would you construct a 90% confidence interval for \(p_a - p_b\)?
    d. Read the box "A Novel Way to Boost Retirement Savings" in Section 3.5. Let population \(a\) denote the "opt-out" (treatment) group and population \(b\) denote the "opt-in" (control) group. Construct a 95% confidence interval for the treatment effect, \(p_a - p_b\).
3.16 Grades on a standardized test are known to have a mean of 1000 for students in the United States. The test is administered to 453 randomly selected students in Florida; in this sample, the mean is 1013 and the standard deviation (\(s\)) is 108.
    a. Construct a 95% confidence interval for the average test score for Florida students.
    b. Is there statistically significant evidence that Florida students perform differently than other students in the United States?
    c. Another 503 students are selected at random from Florida. They are given a 3-hour preparation course before the test is administered. Their average test score is 1019 with a standard deviation of 95.
        i. Construct a 95% confidence interval for the change in average test score associated with the prep course.
        ii. Is there statistically significant evidence that the prep course helped?
    d. The original 453 students are given the prep course and then are asked to take the test a second time. The average change in their test scores is 9 points, and the standard deviation of the change is 60 points.
        i. Construct a 95% confidence interval for the change in average test scores.
        ii. Is there statistically significant evidence that students will perform better on their second attempt after taking the prep course?
        iii. Students may have performed better in their second attempt because of the prep course or because they gained test-taking experience in their first attempt. Describe an experiment that would quantify these two effects.
3.17 Read the box "The Gender Gap of Earnings of College Graduates in the United States" in Section 3.5.
    a. Construct a 95% confidence interval for the change in men's average hourly earnings between 1992 and 2008.
    b. Construct a 95% confidence interval for the change in women's average hourly earnings between 1992 and 2008.
    c. Construct a 95% confidence interval for the change in the gender gap in average hourly earnings between 1992 and 2008. (Hint: \(\bar{Y}_{m,1992} - \bar{Y}_{w,1992}\) is independent of \(\bar{Y}_{m,2008} - \bar{Y}_{w,2008}\).)
3.18 This exercise shows that the sample variance is an unbiased estimator of the population variance when \(Y_1, \ldots, Y_n\) are i.i.d. with mean \(\mu_Y\) and variance \(\sigma_Y^2\).
    a. Use Equation (2.31) to show that \(E[(Y_i - \bar{Y})^2] = \mathrm{var}(Y_i) - 2\,\mathrm{cov}(Y_i, \bar{Y}) + \mathrm{var}(\bar{Y})\).
    b. Use Equation (2.33) to show that \(\mathrm{cov}(\bar{Y}, Y_i) = \sigma_Y^2/n\).
    c. Use the results in (a) and (b) to show that \(E(s_Y^2) = \sigma_Y^2\).
3.19 a. \(\bar{Y}\) is an unbiased estimator of \(\mu_Y\). Is \(\bar{Y}^2\) an unbiased estimator of \(\mu_Y^2\)?
    b. \(\bar{Y}\) is a consistent estimator of \(\mu_Y\). Is \(\bar{Y}^2\) a consistent estimator of \(\mu_Y^2\)?
3.20 Suppose that \((X_i, Y_i)\) are i.i.d. with finite fourth moments. Prove that the sample covariance is a consistent estimator of the population covariance, that is, \(s_{XY} \xrightarrow{p} \sigma_{XY}\), where \(s_{XY}\) is defined in Equation (3.24). (Hint: Use the strategy of Appendix 3.3 and the Cauchy-Schwarz inequality.)
3.21 Show that the pooled standard error \([SE_{pooled}(\bar{Y}_m - \bar{Y}_w)]\) given following Equation (3.23) equals the usual standard error for the difference in means in Equation (3.19) when the two group sizes are the same (\(n_m = n_w\)).
Empirical Exercise

E3.1 On the text Web site http://www.pearsonhighered.com/stock_watson/ you will find a data file CPS92_08 that contains an extended version of the dataset used in Table 3.1 of the text for the years 1992 and 2008. It contains data on full-time, full-year workers, age 25-34, with a high school diploma or B.A./B.S. as their highest degree. A detailed description is given in CPS92_08_Description, available on the Web site. Use these data to answer the following questions.
    a. Compute the sample mean for average hourly earnings (AHE) in 1992 and in 2008. Construct a 95% confidence interval for the population means of AHE in 1992 and 2008 and the change between 1992 and 2008.
    b. In 2008, the value of the Consumer Price Index (CPI) was 215.2. In 1992, the value of the CPI was 140.3. Repeat (a) but use AHE measured in real 2008 dollars ($2008); that is, adjust the 1992 data for the price inflation that occurred between 1992 and 2008.
    c. If you were interested in the change in workers' purchasing power from 1992 to 2008, would you use the results from (a) or from (b)? Explain.
    d. Use the 2008 data to construct a 95% confidence interval for the mean of AHE for high school graduates. Construct a 95% confidence interval for the mean of AHE for workers with a college degree. Construct a 95% confidence interval for the difference between the two means.
    e. Repeat (d) using the 1992 data expressed in $2008.
    f. Did real (inflation-adjusted) wages of high school graduates increase from 1992 to 2008? Explain. Did real wages of college graduates increase? Did the gap between earnings of college and high school graduates increase? Explain, using appropriate estimates, confidence intervals, and test statistics.
    g. Table 3.1 presents information on the gender gap for college graduates. Prepare a similar table for high school graduates using the 1992 and 2008 data. Are there any notable differences between the results for high school and college graduates?
132 CHAPTER A Linear Regression with One Regressor
2. The population regression line can be estimated using sample observations $(Y_i, X_i)$, $i = 1, \ldots, n$, by ordinary least squares (OLS). The OLS estimators of the regression intercept and slope are denoted $\hat{\beta}_0$ and $\hat{\beta}_1$.
3. The $R^2$ and standard error of the regression (SER) are measures of how close the values of $Y_i$ are to the estimated regression line. The $R^2$ is between 0 and 1, with a larger value indicating that the $Y_i$'s are closer to the line. The standard error of the regression is an estimator of the standard deviation of the regression error.
4. There are three key assumptions for the linear regression model: (1) the regression errors, $u_i$, have a mean of zero conditional on the regressors $X_i$; (2) the sample observations are i.i.d. random draws from the population; and (3) large outliers are unlikely. If these assumptions hold, the OLS estimators $\hat{\beta}_0$ and $\hat{\beta}_1$ are (1) unbiased, (2) consistent, and (3) normally distributed when the sample is large.
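The quantities named in this summary can be computed directly from their definitions. The sketch below is illustrative only: the function name `ols_fit` and the five data points are made up, and the formulas assumed are the usual ones for a single regressor (slope equal to the sample covariance over the sample variance of $X$, $R^2 = 1 - SSR/TSS$, and $SER = \sqrt{SSR/(n-2)}$).

```python
import math


def ols_fit(x, y):
    """Return (b0, b1, r2, ser) for a one-regressor OLS fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared X deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    # Intercept: forces the fitted line through (x_bar, y_bar).
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    ssr = sum(r ** 2 for r in residuals)        # sum of squared residuals
    tss = sum((yi - y_bar) ** 2 for yi in y)    # total sum of squares
    r2 = 1.0 - ssr / tss
    ser = math.sqrt(ssr / (n - 2))              # degrees-of-freedom correction
    return b0, b1, r2, ser


# Made-up data, used only to exercise the function.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1, r2, ser = ols_fit(x_demo, y_demo)
print(b0, b1, r2, ser)
```

Because the made-up points lie close to a line, the resulting $R^2$ is near 1 and the SER is small relative to the spread of `y_demo`.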
Key Terms
linear regression model with a single regressor (110)
dependent variable (110)
independent variable (110)
regressor (110)
population regression line (110)
population regression function (110)
population intercept (110)
population slope (110)
population coefficients (110)
parameters (110)
error term (110)
ordinary least squares (OLS) estimators (114)
OLS regression line (114)
sample regression line (114)
sample regression function (114)
predicted value (114)
residual (115)
regression $R^2$ (119)
explained sum of squares (ESS) (119)
total sum of squares (TSS) (119)
sum of squared residuals (SSR) (120)
standard error of the regression (SER) (120)
least squares assumptions (122)
Review the Concepts
4.1 Explain the difference between $\hat{\beta}_1$ and $\beta_1$; between the residual $\hat{u}_i$ and the regression error $u_i$; and between the OLS predicted value $\hat{Y}_i$ and $E(Y_i \mid X_i)$.
4.2 For each least squares assumption, provide an example in which the assumption is valid, then provide an example in which the assumption fails.
4.3 Sketch a hypothetical scatterplot of data for an estimated regression with $R^2 = 0.9$. Sketch a hypothetical scatterplot of data for a regression with $R^2 = 0.5$.
Exercises
4.1 Suppose that a researcher, using data on class size (CS) and average test scores from 100 third-grade classes, estimates the OLS regression
$\widehat{TestScore} = 520.4 - 5.82 \times CS, \quad R^2 = 0.08, \quad SER = 11.5.$
a. A classroom has 22 students. What is the regression's prediction for that classroom's average test score?
b. Last year a classroom had 19 students, and this year it has 23 students. What is the regression's prediction for the change in the classroom average test score?
c. The sample average class size across the 100 classrooms is 21.4. What is the sample average of the test scores across the 100 classrooms? (Hint: Review the formulas for the OLS estimators.)
d. What is the sample standard deviation of test scores across the 100 classrooms? (Hint: Review the formulas for the $R^2$ and SER.)
4.2 Suppose that a random sample of 200 twenty-year-old men is selected from a population and that these men's height and weight are recorded. A regression of weight on height yields
$\widehat{Weight} = -99.41 + 3.94 \times Height, \quad R^2 = 0.81, \quad SER = 10.2,$
where Weight is measured in pounds and Height is measured in inches.
a. What is the regression's weight prediction for someone who is 70 in. tall? 65 in. tall? 74 in. tall?
b. A man has a late growth spurt and grows 1.5 in. over the course of a year. What is the regression's prediction for the increase in this man's weight?
c. Suppose that instead of measuring weight and height in pounds and inches, these variables are measured in centimeters and kilograms. What are the regression estimates from this new centimeter-kilogram regression? (Give all results, estimated coefficients, $R^2$, and SER.)
4.3 A regression of average weekly earnings (AWE, measured in dollars) on age (measured in years) using a random sample of college-educated full-time workers aged 25-65 yields the following:
$\widehat{AWE} = 696.7 + 9.6 \times Age, \quad R^2 = 0.023, \quad SER = 624.1.$
a. Explain what the coefficient values 696.7 and 9.6 mean.
b. The standard error of the regression (SER) is 624.1. What are the units of measurement for the SER? (Dollars? Years? Or is SER unit-free?)
c. The regression $R^2$ is 0.023. What are the units of measurement for the $R^2$? (Dollars? Years? Or is $R^2$ unit-free?)
d. What is the regression's predicted earnings for a 25-year-old worker? A 45-year-old worker?
e. Will the regression give reliable predictions for a 99-year-old worker? Why or why not?
f. Given what you know about the distribution of earnings, do you think it is plausible that the distribution of errors in the regression is normal? (Hint: Do you think that the distribution is symmetric or skewed? What is the smallest value of earnings, and is it consistent with a normal distribution?)
g. The average age in this sample is 41.6 years. What is the average value of AWE in the sample? (Hint: Review Key Concept 4.2.)
4.4 Read the box "The 'Beta' of a Stock" in Section 4.2.
a. Suppose that the value of $\beta$ is greater than 1 for a particular stock. Show that the variance of $(R - R_f)$ for this stock is greater than the variance of $(R_m - R_f)$.
b. Suppose that the value of $\beta$ is less than 1 for a particular stock. Is it possible that the variance of $(R - R_f)$ for this stock is greater than the variance of $(R_m - R_f)$? (Hint: Don't forget the regression error.)
c. In a given year, the rate of return on 3-month Treasury bills is 3.5% and the rate of return on a large diversified portfolio of stocks (the S&P 500) is 7.3%. For each company listed in the table in the box, use the estimated value of $\beta$ to estimate the stock's expected rate of return.
4.5 A professor decides to run an experiment to measure the effect of time pressure on final exam scores. He gives each of the 400 students in his course the same final exam, but some students have 90 minutes to complete the exam while others have 120 minutes. Each student is randomly assigned one of the examination times based on the flip of a coin. Let $Y_i$ denote the number of points scored on the exam by the $i$th student ($0 \le Y_i \le 100$), let $X_i$ denote the amount of time that the student has to complete the exam ($X_i = 90$ or $120$), and consider the regression model $Y_i = \beta_0 + \beta_1 X_i + u_i$.
a. Explain what the term $u_i$ represents. Why will different students have different values of $u_i$?
b. Explain why $E(u_i \mid X_i) = 0$ for this regression model.
c. Are the other assumptions in Key Concept 4.3 satisfied? Explain.
d. The estimated regression is $\hat{Y}_i = 49 + 0.24 X_i$.
i. Compute the estimated regression's prediction for the average score of students given 90 minutes to complete the exam. Repeat for 120 minutes and 150 minutes.
ii. Compute the estimated gain in score for a student who is given an additional 10 minutes on the exam.
4.6 Show that the first least squares assumption, $E(u_i \mid X_i) = 0$, implies that $E(Y_i \mid X_i) = \beta_0 + \beta_1 X_i$.
4.7 Show that $\hat{\beta}_0$ is an unbiased estimator of $\beta_0$. (Hint: Use the fact that $\hat{\beta}_1$ is unbiased, which is shown in Appendix 4.3.)
4.8 Suppose that all of the regression assumptions in Key Concept 4.3 are satisfied except that the first assumption is replaced with $E(u_i \mid X_i) = 2$. Which parts of Key Concept 4.4 continue to hold? Which change? Why? (Is $\hat{\beta}_1$ normally distributed in large samples with mean and variance given in Key Concept 4.4? What about $\hat{\beta}_0$?)
4.9 a. A linear regression yields $\hat{\beta}_1 = 0$. Show that $R^2 = 0$.
b. A linear regression yields $R^2 = 0$. Does this imply that $\hat{\beta}_1 = 0$?
4.10 Suppose that $Y_i = \beta_0 + \beta_1 X_i + u_i$, where $(X_i, u_i)$ are i.i.d., and $X_i$ is a Bernoulli random variable with $\Pr(X = 1) = 0.20$. When $X = 1$, $u_i$ is $N(0, 4)$; when $X = 0$, $u_i$ is $N(0, 1)$.
a. Show that the regression assumptions in Key Concept 4.3 are satisfied.
b. Derive an expression for the large-sample variance of $\hat{\beta}_1$. [Hint: Evaluate the terms in Equation (4.21).]
4.11 Consider the regression model $Y_i = \beta_0 + \beta_1 X_i + u_i$.
a. Suppose you know that $\beta_0 = 0$. Derive a formula for the least squares estimator of $\beta_1$.
b. Suppose you know that $\beta_0 = 4$. Derive a formula for the least squares estimator of $\beta_1$.
4.12 a. Show that the regression $R^2$ in the regression of $Y$ on $X$ is the squared value of the sample correlation between $X$ and $Y$. That is, show that $R^2 = r_{XY}^2$.
b. Show that the $R^2$ from the regression of $Y$ on $X$ is the same as the $R^2$ from the regression of $X$ on $Y$.
c. Show that $\hat{\beta}_1 = r_{XY}(s_Y / s_X)$, where $r_{XY}$ is the sample correlation between $X$ and $Y$, and $s_X$ and $s_Y$ are the sample standard deviations of $X$ and $Y$.
4.13 Suppose that $Y_i = \beta_0 + \beta_1 X_i + \kappa u_i$, where $\kappa$ is a non-zero constant and $(Y_i, X_i)$ satisfy the three least squares assumptions. Show that the large-sample variance of $\hat{\beta}_1$ is given by
$$\sigma^2_{\hat{\beta}_1} = \kappa^2 \, \frac{1}{n} \, \frac{\mathrm{var}[(X_i - \mu_X) u_i]}{[\mathrm{var}(X_i)]^2}.$$
[Hint: This equation is the variance given in Equation (4.21) multiplied by $\kappa^2$.]
4.14 Show that the sample regression line passes through the point $(\bar{X}, \bar{Y})$.
Empirical Exercises
E4.1 On the text Web site http://www.pearsonhighered.com/stock_watson/, you will find a data file CPS08 that contains an extended version of the data set used in Table 3.1 for 2008. It contains data for full-time, full-year workers, age 25-34, with a high school diploma or B.A./B.S. as their highest degree. A detailed description is given in CPS08_Description, also available on the Web site. (These are the same data as in CPS92_08 but are limited to the year 2008.) In this exercise, you will investigate the relationship between a worker's age and earnings. (Generally, older workers have more job experience, leading to higher productivity and earnings.)
a. Run a regression of average hourly earnings (AHE) on age (Age). What is the estimated intercept? What is the estimated slope? Use the estimated regression to answer this question: How much do earnings increase as workers age by 1 year?
b. Bob is a 26-year-old worker. Predict Bob's earnings using the estimated regression. Alexis is a 30-year-old worker. Predict Alexis's earnings using the estimated regression.
c. Does age account for a large fraction of the variance in earnings across individuals? Explain.
E4.2 On the text Web site http://www.pearsonhighered.com/stock_watson/, you will find a data file TeachingRatings that contains data on course evaluations, course characteristics, and professor characteristics for 463 courses at the University of Texas at Austin.¹ A detailed description is given in TeachingRatings_Description, also available on the Web site. One of the characteristics is an index of the professor's "beauty" as rated by a panel of six judges. In this exercise, you will investigate how course evaluations are related to the professor's beauty.
a. Construct a scatterplot of average course evaluations (Course_Eval) on the professor's beauty (Beauty). Does there appear to be a relationship between the variables?
b. Run a regression of average course evaluations (Course_Eval) on the professor's beauty (Beauty). What is the estimated intercept? What is the estimated slope? Explain why the estimated intercept is equal to the sample mean of Course_Eval. (Hint: What is the sample mean of Beauty?)
c. Professor Watson has an average value of Beauty, while Professor Stock's value of Beauty is one standard deviation above the average. Predict Professor Stock's and Professor Watson's course evaluations.
d. Comment on the size of the regression's slope. Is the estimated effect of Beauty on Course_Eval large or small? Explain what you mean by "large" and "small."
e. Does Beauty explain a large fraction of the variance in evaluations across courses? Explain.
E4.3 On the text Web site http://www.pearsonhighered.com/stock_watson/, you will find a data file CollegeDistance that contains data from a random sample of high school seniors interviewed in 1980 and re-interviewed in 1986. In this exercise, you will use these data to investigate the relationship between the number of completed years of education for young adults and the distance from each student's high school to the nearest four-year college. (Proximity to college lowers the cost of education, so that students who live closer to a four-year college should, on average, complete more years of higher education.) A detailed description is given in CollegeDistance_Description, also available on the Web site.²
a. Run a regression of years of completed education (ED) on distance to the nearest college (Dist), where Dist is measured in tens of miles. (For example, Dist = 2 means that the distance is 20 miles.) What is the estimated intercept? What is the estimated slope? Use the estimated regression to answer this question: How does the average value of years of completed schooling change when colleges are built close to where students go to high school?
b. Bob's high school was 20 miles from the nearest college. Predict Bob's years of completed education using the estimated regression. How would the prediction change if Bob lived 10 miles from the nearest college?
c. Does distance to college explain a large fraction of the variance in educational attainment across individuals? Explain.
d. What is the value of the standard error of the regression? What are the units for the standard error (meters, grams, years, dollars, cents, or something else)?
E4.4 On the text Web site http://www.pearsonhighered.com/stock_watson/, you will find a data file Growth that contains data on average growth rates from 1960 through 1995 for 65 countries along with variables that are potentially related to growth. A detailed description is given in Growth_Description, also available on the Web site. In this exercise, you will investigate the relationship between growth and trade.³
a. Construct a scatterplot of average annual growth rate (Growth) on the average trade share (TradeShare). Does there appear to be a relationship between the variables?
b. One country, Malta, has a trade share much larger than the other countries. Find Malta on the scatterplot. Does Malta look like an outlier?
c. Using all observations, run a regression of Growth on TradeShare. What is the estimated slope? What is the estimated intercept? Use the regression to predict the growth rate for a country with a trade share of 0.5 and with a trade share equal to 1.0.
d. Estimate the same regression excluding the data from Malta. Answer the same questions as in (c).
e. Where is Malta? Why is the Malta trade share so large? Should Malta be included or excluded from the analysis?
¹ These data were provided by Professor Daniel Hamermesh of the University of Texas at Austin and were used in his paper with Amy Parker, "Beauty in the Classroom: Instructors' Pulchritude and Putative Pedagogical Productivity," Economics of Education Review, August 2005, 24(4): 369-376.
² These data were provided by Professor Cecilia Rouse of Princeton University and were used in her paper "Democratization or Diversion? The Effect of Community Colleges on Educational Attainment," Journal of Business and Economic Statistics, April 1995, 12(2): 217-224.
³ These data were provided by Professor Ross Levine of Brown University and were used in his paper with Thorsten Beck and Norman Loayza, "Finance and the Sources of Growth," Journal of Financial Economics, 2000, 58: 261-300.
APPENDIX 4.1 The California Test Score Data Set
The California Standardized Testing and Reporting data set contains data on test performance, school characteristics, and student demographic backgrounds. The data used here are from all 420 K-6 and K-8 districts in California with data available for 1999. Test scores are the average of the reading and math scores on the Stanford 9 Achievement Test, a standardized test administered to fifth-grade students. School characteristics (averaged across the district) include enrollment, number of teachers (measured as "full-time equivalents"), number of computers per classroom, and expenditures per student. The student-teacher ratio used here is the number of students in the district divided by the number of full-time equivalent teachers. Demographic variables for the students also are averaged across the district. The demographic variables include the percentage of students who are in the public assistance program CalWorks (formerly AFDC), the percentage of students who qualify for a reduced price lunch, and the percentage of students who are English learners (that is, students for whom English is a second language). All of these data were obtained from the California Department of Education (www.cde.ca.gov).
APPENDIX 4.2 Derivation of the OLS Estimators
This appendix uses calculus to derive the formulas for the OLS estimators given in Key Concept 4.2. To minimize the sum of squared prediction mistakes $\sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2$ [Equation (4.6)], first take the partial derivatives with respect to $b_0$ and $b_1$:
$$\frac{\partial}{\partial b_0} \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2 = -2 \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i) \quad \text{and} \qquad (4.23)$$
$$\frac{\partial}{\partial b_1} \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2 = -2 \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i) X_i. \qquad (4.24)$$
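The two partial derivatives in Equations (4.23) and (4.24) can be checked numerically. The sketch below compares the analytic expressions against central finite differences of the sum of squared prediction mistakes; the function names, the data points, and the evaluation point $(b_0, b_1) = (1.0, 0.5)$ are all arbitrary choices made for illustration.

```python
def sum_sq_mistakes(b0, b1, x, y):
    """Sum of squared prediction mistakes for trial coefficients b0, b1."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


def analytic_gradient(b0, b1, x, y):
    """Partial derivatives with respect to b0 and b1, as in (4.23) and (4.24)."""
    d_b0 = -2.0 * sum(yi - b0 - b1 * xi for xi, yi in zip(x, y))
    d_b1 = -2.0 * sum((yi - b0 - b1 * xi) * xi for xi, yi in zip(x, y))
    return d_b0, d_b1


def numeric_gradient(b0, b1, x, y, h=1e-6):
    """Central finite-difference approximation to the same two derivatives."""
    d_b0 = (sum_sq_mistakes(b0 + h, b1, x, y)
            - sum_sq_mistakes(b0 - h, b1, x, y)) / (2 * h)
    d_b1 = (sum_sq_mistakes(b0, b1 + h, x, y)
            - sum_sq_mistakes(b0, b1 - h, x, y)) / (2 * h)
    return d_b0, d_b1


# Arbitrary illustrative data and trial coefficients.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]
exact = analytic_gradient(1.0, 0.5, x_demo, y_demo)
approx = numeric_gradient(1.0, 0.5, x_demo, y_demo)
print(exact, approx)
```

Because the objective is quadratic in $b_0$ and $b_1$, the central difference should agree with the analytic derivative up to floating-point rounding.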
168 CHAPTER 5 Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals
coefficient multiplying $D_i$ (154)
coefficient on $D_i$ (154)
heteroskedasticity and homoskedasticity (156)
homoskedasticity-only standard errors (158)
heteroskedasticity-robust standard error (159)
Gauss-Markov theorem (162)
best linear unbiased estimator (BLUE) (163)
weighted least squares (163)
homoskedastic normal regression assumptions (164)
Gauss-Markov conditions (176)
Review the Concepts
5.1 Outline the procedures for computing the p-value of a two-sided test of $H_0: \mu_Y = 0$ using an i.i.d. set of observations $Y_i$, $i = 1, \ldots, n$. Outline the procedures for computing the p-value of a two-sided test of $H_0: \beta_1 = 0$ in a regression model using an i.i.d. set of observations $(Y_i, X_i)$, $i = 1, \ldots, n$.
5.2 Explain how you could use a regression model to estimate the wage gender gap using the data on earnings of men and women. What are the dependent and independent variables?
5.3 Define homoskedasticity and heteroskedasticity. Provide a hypothetical empirical example in which you think the errors would be heteroskedastic and explain your reasoning.
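The procedure that Review the Concepts 5.1 asks about can be sketched in a few lines, assuming the large-sample normal approximation for the t-statistic. The function names and the estimate and standard error used below (1.5 and 0.5) are hypothetical and do not come from any exercise in this chapter.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_sided_test(estimate: float, std_error: float, null_value: float = 0.0):
    """Return (t_statistic, p_value) using the large-sample normal approximation."""
    t_stat = (estimate - null_value) / std_error
    p_value = 2.0 * (1.0 - normal_cdf(abs(t_stat)))
    return t_stat, p_value


def confidence_interval_95(estimate: float, std_error: float):
    """95% confidence interval: estimate plus or minus 1.96 standard errors."""
    return estimate - 1.96 * std_error, estimate + 1.96 * std_error


# Hypothetical coefficient estimate and standard error.
t_stat, p_value = two_sided_test(1.5, 0.5)
ci_low, ci_high = confidence_interval_95(1.5, 0.5)
print(t_stat, p_value, (ci_low, ci_high))
```

The same two functions apply to a sample mean or to a regression coefficient; only the estimate and its standard error change.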
Exercises
5.1 Suppose that a researcher, using data on class size (CS) and average test scores from 100 third-grade classes, estimates the OLS regression
$\widehat{TestScore} = \underset{(20.4)}{520.4} - \underset{(2.21)}{5.82} \times CS, \quad R^2 = 0.08, \quad SER = 11.5.$
a. Construct a 95% confidence interval for $\beta_1$, the regression slope coefficient.
b. Calculate the p-value for the two-sided test of the null hypothesis $H_0: \beta_1 = 0$. Do you reject the null hypothesis at the 5% level? At the 1% level?
c. Calculate the p-value for the two-sided test of the null hypothesis $H_0: \beta_1 = -5.6$. Without doing any additional calculations, determine whether -5.6 is contained in the 95% confidence interval for $\beta_1$.
d. Construct a 99% confidence interval for $\beta_0$.
5.2 Suppose that a researcher, using wage data on 250 randomly selected male workers and 280 female workers, estimates the OLS regression
$\widehat{Wage} = \underset{(0.23)}{12.52} + \underset{(0.36)}{2.12} \times Male, \quad R^2 = 0.06, \quad SER = 4.2,$
where Wage is measured in dollars per hour and Male is a binary variable that is equal to 1 if the person is a male and 0 if the person is a female. Define the wage gender gap as the difference in mean earnings between men and women.
a. What is the estimated gender gap?
b. Is the estimated gender gap significantly different from zero? (Compute the p-value for testing the null hypothesis that there is no gender gap.)
c. Construct a 95% confidence interval for the gender gap.
d. In the sample, what is the mean wage of women? Of men?
e. Another researcher uses these same data but regresses Wages on Female, a variable that is equal to 1 if the person is female and 0 if the person is male. What are the regression estimates calculated from this regression?
$\widehat{Wage}$ = ____ + ____ $\times$ Female, $R^2$ = ____, SER = ____.
5.3 Suppose that a random sample of 200 twenty-year-old men is selected from a population and their heights and weights are recorded. A regression of weight on height yields
$\widehat{Weight} = \underset{(2.15)}{-99.41} + \underset{(0.31)}{3.94} \times Height, \quad R^2 = 0.81, \quad SER = 10.2,$
where Weight is measured in pounds and Height is measured in inches. A man has a late growth spurt and grows 1.5 inches over the course of a year. Construct a 99% confidence interval for the person's weight gain.
5.4 Read the box "The Economic Value of a Year of Education: Homoskedasticity or Heteroskedasticity?" in Section 5.4. Use the regression reported in Equation (5.23) to answer the following.
a. A randomly selected 30-year-old worker reports an education level of 16 years. What is the worker's expected average hourly earnings?
b. A high school graduate (12 years of education) is contemplating going to a community college for a 2-year degree. How much is this worker's average hourly earnings expected to increase?
c. A high school counselor tells a student that, on average, college graduates earn $10 per hour more than high school graduates. Is this statement consistent with the regression evidence? What range of values is consistent with the regression evidence?
5.5 In the 1980s, Tennessee conducted an experiment in which kindergarten students were randomly assigned to "regular" and "small" classes, and given standardized tests at the end of the year. (Regular classes contained approximately 24 students, and small classes contained approximately 15 students.) Suppose that, in the population, the standardized tests have a mean score of 925 points and a standard deviation of 75 points. Let SmallClass denote a binary variable equal to 1 if the student is assigned to a small class and equal to 0 otherwise. A regression of TestScore on SmallClass yields
$\widehat{TestScore} = \underset{(1.6)}{918.0} + \underset{(2.5)}{13.9} \times SmallClass, \quad R^2 = 0.01, \quad SER = 74.6.$
a. Do small classes improve test scores? By how much? Is the effect large? Explain.
b. Is the estimated effect of class size on test scores statistically significant? Carry out a test at the 5% level.
c. Construct a 99% confidence interval for the effect of SmallClass on test score.
5.6 Refer to the regression described in Exercise 5.5.
a. Do you think that the regression errors plausibly are homoskedastic? Explain.
b. $SE(\hat{\beta}_1)$ was computed using Equation (5.3). Suppose that the regression errors were homoskedastic: Would this affect the validity of the confidence interval constructed in Exercise 5.5(c)? Explain.
5.7 Suppose that $(Y_i, X_i)$ satisfy the assumptions in Key Concept 4.3. A random sample of size $n = 250$ is drawn and yields
$\hat{Y} = \underset{(3.1)}{5.4} + \underset{(1.5)}{3.2} X, \quad R^2 = 0.26, \quad SER = 6.2.$
a. Test $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$ at the 5% level.
b. Construct a 95% confidence interval for $\beta_1$.
c. Suppose you learned that $Y_i$ and $X_i$ were independent. Would you be surprised? Explain.
d. Suppose that $Y_i$ and $X_i$ are independent and many samples of size $n = 250$ are drawn, regressions estimated, and (a) and (b) answered. In what fraction of the samples would $H_0$ from (a) be rejected? In what fraction of samples would the value $\beta_1 = 0$ be included in the confidence interval from (b)?
5.8 Suppose that $(Y_i, X_i)$ satisfy the assumptions in Key Concept 4.3 and, in addition, $u_i$ is $N(0, \sigma_u^2)$ and is independent of $X_i$. A sample of size $n = 30$ yields
$\hat{Y} = \underset{(10.2)}{43.2} + \underset{(7.4)}{61.5} X, \quad R^2 = 0.54, \quad SER = 1.52,$
where the numbers in parentheses are the homoskedastic-only standard errors for the regression coefficients.
a. Construct a 95% confidence interval for $\beta_0$.
b. Test $H_0: \beta_1 = 55$ vs. $H_1: \beta_1 \neq 55$ at the 5% level.
c. Test $H_0: \beta_1 = 55$ vs. $H_1: \beta_1 > 55$ at the 5% level.
5.9 Consider the regression model
$Y_i = \beta X_i + u_i,$
where $u_i$ and $X_i$ satisfy the assumptions in Key Concept 4.3. Let $\bar{\beta}$ denote an estimator of $\beta$ that is constructed as $\bar{\beta} = \bar{Y} / \bar{X}$, where $\bar{Y}$ and $\bar{X}$ are the sample means of $Y_i$ and $X_i$, respectively.
a. Show that $\bar{\beta}$ is a linear function of $Y_1, Y_2, \ldots, Y_n$.
b. Show that $\bar{\beta}$ is conditionally unbiased.