
ChIIIEstimationandTestingintheSimpleRegressionModel.pptx


Academic year: 2020


(1)

III. Estimation and Testing in Simple Regression

a. Estimation in the Simple Linear Regression Model
b. Sampling Distribution of b1
c. Understanding Standard Errors
d. Confidence Intervals
e. Example: Log-Log Price Elasticity Regressions
f. Hypothesis-testing

(2)

a. Estimation in the Simple Linear Regression Model

Recall the simple linear regression (SLR) model, which assumes that every observation in the dataset was generated by:

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$

We use least squares to estimate β0 and β1. Recall the formulas:

$$\hat{\beta}_1 = b_1 = \frac{\sum_{i=1}^{N}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{N}(X_i-\bar{X})^2}$$

$$\hat{\beta}_0 = b_0 = \bar{Y} - b_1\bar{X}$$
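A minimal sketch of the least squares formulas in R. The vectors x and y are made-up numbers for illustration, not course data. The slope and intercept are computed by hand and then compared with R's built-in lm().

```r
# Hypothetical data, for illustration only
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)

# Slope: sum of cross-deviations divided by sum of squared X deviations
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)

# Intercept: the fitted line passes through the point of means
b0 <- mean(y) - b1 * mean(x)

# Compare the hand-computed values with lm()
fit <- lm(y ~ x)
print(c(b0 = b0, b1 = b1))
print(coef(fit))
```

The two printed pairs of numbers should agree, since lm() minimizes the same sum of squared residuals.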

(3)

a. Estimation in the Simple Linear Regression Model

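NOTE: β0 is not b0, β1 is not b1, and εi is not ei. The β's and ε's belong to the true (unobserved) line; the b's and e's are computed from the sample.

[Figure: plot of Y against X showing the True Line, β0 + β1 X, the Least Squares Line, b0 + b1 X, and a residual ei.]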

(4)

b. Sampling Distribution of b1

[Figure: sampling distributions of b1 for a large and a small variance.]

It is possible to derive the sampling distribution of b1. See Appendix 2: b1 is a weighted average of the Y values!

This distribution describes how the estimator b1 would vary over different samples with the X values fixed.

It turns out that b1 is normally distributed:

$$b_1 \sim N\left(\beta_1, \sigma^2_{b_1}\right)$$

The mean is β1, so b1 is unbiased. The variance of b1 is $\sigma^2_{b_1}$.

The variance term determines how close the estimate will be to the true value. Remember: a large $\sigma_{b_1}$ is bad!
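The idea of "varying over different samples with the X values fixed" can be illustrated with a small simulation. This is only a sketch: the true parameter values, the X values and the number of replications are all assumptions chosen for illustration.

```r
set.seed(1)

# Assumed "true" parameters and fixed X values (hypothetical)
beta0 <- 1
beta1 <- 2
sigma <- 2
x <- 1:20

# Draw one new sample of Y (X held fixed) and return the least squares slope
one_slope <- function() {
  y <- beta0 + beta1 * x + rnorm(length(x), mean = 0, sd = sigma)
  sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
}

b1_draws <- replicate(5000, one_slope())

mean(b1_draws)   # close to beta1, illustrating unbiasedness
sd(b1_draws)     # spread of b1 across samples
hist(b1_draws)   # roughly bell-shaped and centered near beta1
```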

(5)

b. Sampling Distribution of b1

What is the formula for $\sigma^2_{b_1}$? Can we intuit what should be in the formula? (See Appendix 2 for the derivation.)

How should σ figure in the formula? How should N figure in the formula? Anything else?

$$\sigma^2_{b_1} = \mathrm{Var}(b_1) = \frac{\sigma^2}{\sum_{i=1}^{N}(X_i-\bar{X})^2} = \frac{\sigma^2}{(N-1)s_X^2}$$

Three factors:

– N

– σ2

– sX
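A short sketch of how each of the three factors moves the variance of b1. The helper var_b1() and the X values are hypothetical names and numbers introduced only for this illustration.

```r
# Theoretical variance of the slope estimator, given sigma and the X values
var_b1 <- function(sigma, x) {
  sigma^2 / sum((x - mean(x))^2)
}

x <- 1:20                                     # hypothetical X values
base <- var_b1(sigma = 2, x = x)

ratio_sigma  <- var_b1(sigma = 4, x = x) / base          # doubling sigma
ratio_spread <- var_b1(sigma = 2, x = 2 * x) / base      # doubling the spread of X
ratio_n      <- var_b1(sigma = 2, x = rep(x, 2)) / base  # doubling N, same X spread

c(ratio_sigma, ratio_spread, ratio_n)         # 4, 1/4 and 1/2

# The two forms of the formula agree:
# sum of squared deviations = (N - 1) * sample variance of X
all.equal(sum((x - mean(x))^2), (length(x) - 1) * var(x))
```

More noise (larger σ) hurts, while more observations and more spread in X both help.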

(6)

b. Sampling Distribution of b1

[Figure: illustration of how sX, N and σ affect the sampling distribution of b1.]

(7)

c. Understanding Standard Errors

When estimating a quantity, it is vital to develop a notion of the precision of the estimation.

examples:

i. estimate the slope of the regression line

ii. estimate the value of a flat-panel TV given its size

iii. estimate the expected return on a portfolio

iv. estimate the value of a brand name

v. estimate the damages from patent infringement

Why is this important?

We plan on making business decisions based on our estimates.

Some decisions may be very sensitive to the

(8)

c. Understanding Standard Errors

An example from “everyday” life:

When framing a house, we can estimate a required piece of wood to ± ¼”.

When building a fine cabinet, the estimates may have to be accurate to ±1/16” or even ±1/32”.

The standard deviations of the least squares estimators of the slope and intercept give a precise measurement of the accuracy of the estimators.

(9)

c. Understanding Standard Errors

If we insert our estimate of σ, then we have estimated standard deviations, or standard errors, for the least squares estimators:

$$s_{b_1} = \sqrt{\frac{s^2}{(N-1)s_X^2}}$$

Note that this uses s, not σ.

Bottom Line: Now we can summarize the amount of information there is in the sample about the true regression line parameters.
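A minimal sketch of the standard error calculation in R, using made-up data (x and y are hypothetical). It computes s from the residuals as the square root of SSE/(N-2), plugs it into the formula for the standard error of the slope, and compares the result with the "Std. Error" column that R reports.

```r
# Hypothetical data, for illustration only
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)
N <- length(x)

fit <- lm(y ~ x)
e <- resid(fit)

s <- sqrt(sum(e^2) / (N - 2))             # standard error of the regression
s_b1 <- sqrt(s^2 / ((N - 1) * var(x)))    # standard error of the slope

s_b1
summary(fit)$coefficients["x", "Std. Error"]   # same number from the R printout
```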

(10)

c. Understanding Standard Errors

Where can we find the standard errors on the R printout?

[Figure: R regression printout with the standard errors $s_{b_0}$ and $s_{b_1}$ marked.]

We would like to translate the size of the standard errors into probability statements about the likely ranges of true β values.

(11)

d. Confidence Intervals

We want a margin of error in the estimation of the slope. We can use the standard errors to construct a confidence interval which provides the margin of error.

All confidence intervals are of the form:

$$b_1 \pm t^* s_{b_1}$$

t* is a positive number obtained from the t distribution.

So we have the estimate plus or minus a multiple of the standard error.
(12)

d. Confidence Intervals

To define a confidence interval, you must first set the confidence level.

We can never be completely confident that an interval will cover the true value (the 100% confidence interval is everything!). So we set a confidence level. Typically, 95 per cent is used (for very large datasets a 99 per cent level should be used).

We then determine the multiple, t*, so that there is a 95 per cent probability that the interval covers the true value.

(13)

d. Confidence Intervals: finding t*

We find t* by reference to the t distribution. The t distribution is similar to the standard normal (Z) and indexed by the number of degrees of freedom, N-2. For a 100*(1-α)% confidence level:

[Figure: density of the t distribution, for values from -4 to 4, with the cut-off $t^*_{N-2,\alpha/2}$ marked.]

(14)

d. t Distribution and Confidence Intervals

Thus, the 100 × (1-α)% C.I. is given by:

$$b_1 \pm t^*_{N-2,\alpha/2}\, s_{b_1}$$

Confidence intervals provide information about the range of values of the slope consistent with our data. This is much more useful than simply using the slope estimate.

An estimate without some idea of its precision is useless.

The only question is how to find (“look up”) t*.
(15)

d. t Distribution and Confidence Intervals

Let’s compute a confidence interval for the flat-panel TV data

First we choose α. Pick α = 0.05, which gives a 95 per cent level of confidence.

95% CI: 57.13 ± t68,0.025 (6.555)

Finding the t* cut-off value

We need to find the value, t*, such that

$$\Pr\left[-t^*_{N-2,\alpha/2} \le X \le t^*_{N-2,\alpha/2}\right] = 1-\alpha \quad \text{where } X \sim t_{N-2}$$

or

$$\Pr\left[X \le -t^*_{N-2,\alpha/2}\right] = \alpha/2$$

This should remind us of the CDF (Cumulative Distribution Function) …

(16)

Quick Review: CDF

The CDF is a table of probabilities that tells us: for any little x, what is the probability that the random variable X is less than or equal to x?

$$f(x) = \Pr[X \le x]$$

Plotting this “table”…

[Figure: the CDF plotted against X, for X from -4 to 4, with Probability from 0.0 to 1.0 on the vertical axis.]

(17)

Quick Review: CDF

Let’s blow up the boxed area:

[Figure: enlarged lower-left portion of the CDF, for X from -3.0 to -1.0 and probability from 0.00 to 0.15; the probability 0.025 is mapped to the value t68,.025.]

Note that we are using the CDF function or “table” backwards. We are reading from Probability to Value.

In fact, we are using the “Inverse” CDF function

CDF: Value → Probability

Inverse CDF: Probability → Value
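The two directions can be seen side by side in R: pt() is the CDF of the t distribution and qt() is its inverse. The value -2 below is an arbitrary choice for illustration; 68 is the degrees of freedom used in the TV example.

```r
df <- 68

p <- pt(-2, df = df)          # CDF: value -> probability, Pr[X <= -2]
back <- qt(p, df = df)        # inverse CDF: probability -> value, recovers -2

p
back

qt(0.025, df = df)            # the value with 2.5% of the probability to its left
```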

(18)

Quick Review: CDF

Let’s do it in R. We use the qt() function, which is the “quantile” or inverse CDF function. We need to tell R which t distribution to use (the one with 68 df) and feed in α/2.

So our confidence interval is

57.13 ± 1.995(6.555) = [44.05, 70.21]
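A sketch of the calculation in R, using the slope estimate, standard error and degrees of freedom quoted above. The variable names are chosen for illustration.

```r
alpha <- 0.05
df <- 68                                  # degrees of freedom for the TV data

t_lower <- qt(alpha / 2, df = df)         # feeding in alpha/2 gives -t*
t_star <- qt(1 - alpha / 2, df = df)      # the positive cut-off t*, about 1.995

b1 <- 57.13                               # slope estimate
s_b1 <- 6.555                             # its standard error

ci <- c(lower = b1 - t_star * s_b1, upper = b1 + t_star * s_b1)
round(ci, 2)                              # about 44.05 and 70.21
```

Because the t distribution is symmetric, feeding in α/2 returns the negative cut-off; its absolute value is the multiple used in the interval.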

(19)

e. Example: Log-Log Price Elasticity Regressions

Let’s look at some demand data. The dataset detergent has demand data on sales and prices of Tide 128 oz laundry detergent at some 86 stores with 2-5 years of weekly data.

Hard to see much of anything here.

(20)

e. Example: Log-Log Price Elasticity Regressions

Quantity is an odd variable. It can’t be less than zero and is often very small. There appear to be store-weeks where quantity is huge. Are these outliers?

Recall that the logarithm function has the effect of compressing large values and expanding the axis for small values.

Basic Properties:

$$\log(1) = 0$$

$$\log(z \times w) = \log(z) + \log(w)$$

$$\log\left(\frac{z}{w}\right) = \log(z) - \log(w)$$
(21)
(22)

e. Example: Log-Log Price Elasticity Regressions

(23)

e. Example: Log-Log Price Elasticity Regressions

Run regressions with log and non-logged variables.

(24)

e. Example: Elasticities

In the regression with raw variables, we interpret the regression coefficient as the expected change in q for a given change in p.

$$E[q \mid p] = 662 - 69.4p$$

$$\frac{\Delta q}{\Delta p} = -69.4$$

How do we interpret this? Not very meaningful without a

(25)

e. Example: Elasticities

To interpret the coefficient, some would convert it into an elasticity.

$$\frac{\%\Delta q}{\%\Delta p} = \frac{\Delta q / q}{\Delta p / p} = \frac{\Delta q}{\Delta p} \times \frac{p}{q} = -69.4 \times \frac{8.36}{81} = -7.16$$

Here we used the average levels of price and quantity.

A one percent reduction in price yields a 7.16 percent increase in quantity.
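The arithmetic can be checked in R. This sketch takes the slope as -69.4 (quantity falls as price rises) and uses the average price and quantity quoted above; the variable names are illustrative.

```r
slope <- -69.4    # estimated change in q per unit change in p
p_bar <- 8.36     # average price
q_bar <- 81       # average quantity

# Elasticity evaluated at the average price and quantity
elasticity <- slope * p_bar / q_bar
round(elasticity, 2)   # -7.16
```

Note that the answer depends on where it is evaluated: a different price and quantity pair would give a different elasticity from the same slope.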

(26)

e. Example: Elasticities

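In the log-log regression, the coefficient on log-price can be interpreted directly as an elasticity. Why?

$$\frac{\Delta \log q}{\Delta \log p} = -4.4 = \frac{\log(q_1) - \log(q_0)}{\log(p_1) - \log(p_0)} = \frac{\log\left(\frac{q_1}{q_0}\right)}{\log\left(\frac{p_1}{p_0}\right)} = \frac{\log\left(1 + \frac{\Delta q}{q_0}\right)}{\log\left(1 + \frac{\Delta p}{p_0}\right)}$$

Since log(1 + x) ≈ x for small x, the last ratio is approximately (Δq/q0)/(Δp/p0), that is, %Δq/%Δp.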
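A small sketch of why the log-log slope is an elasticity. The demand curve below is hypothetical: it is constructed with a constant elasticity of -4.4 and no noise, so the regression should recover that number. The constant A and the price grid are arbitrary.

```r
# Hypothetical constant-elasticity demand curve: q = A * p^elasticity
A <- 5000
elasticity <- -4.4
p <- seq(6, 10, by = 0.5)
q <- A * p^elasticity

# Regress log quantity on log price
fit <- lm(log(q) ~ log(p))
slope <- unname(coef(fit)["log(p)"])
slope                      # recovers the elasticity, -4.4

# For small changes, the change in log is close to the percentage change
log(1.01)                  # approximately 0.01
```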

(27)

e. Example: Log-log Demand

Let’s compute a confidence interval for the price elasticity. Since the sample size is very large (> 14,000), we will use a 99 percent confidence level.

Good to be a marketer with ample and informative data.

(28)

f. Hypothesis Testing

Suppose that we are interested in a specific value of the slope parameter, β1.

This can be rephrased as a hypothesis:

H0: β1 = β1* (Null, from “no effect”) vs.

HA: β1 ≠ β1* (Alternative)

For example, is there any evidence in the data to support the existence of a relationship between X and Y?

So if we want to test whether X affects Y, we would test whether β1 = 0.

(29)

f. Hypothesis Testing

How can we assess whether or not the data support or refute the null hypothesis?

We can look at our estimate of the true slope and compare it to the hypothesized value:

$$b_1 - \beta_1^* \quad \text{(discrepancy)}$$

What is wrong with just using the discrepancy above? How close is close?

(30)

f. Hypothesis Testing

t statistic:

$$t = \frac{b_1 - \beta_1^*}{s_{b_1}}$$

The basic intuition is that if the null is true then the t statistic should be small (in absolute value).

Get worried when t is large!

(31)

f. Hypothesis Testing

Formal Approach to Hypothesis-Testing: Two Steps:

i. Pick the significance level (α) = Prob(reject when null true) by deciding what level of error of this kind (called type I error) is acceptable.

ii. Use α to choose a rejection region: the set of t statistic values which will lead to a rejection. This is done by picking a critical value, $t^*_{N-2,\alpha/2}$, such that there is α/2 area in each tail, to the right of t* and to the left of -t*.

(32)

f. Hypothesis Testing

This is exactly the same problem as picking the cut-off value in setting up the confidence interval!

[Figure: density of the t distribution, for values from -4 to 4, with the cut-offs $\pm t^*_{N-2,\alpha/2}$ marked; the rejection region in the tails has total probability α.]

(33)

f. Hypothesis Testing

In practice, we take a value of α to be around .05 unless:

The sample size is small, or is large

The cost of making a type I error is large (type I error = reject the null when the null is true)

In summary, we reject if (otherwise, fail to reject):

$$|t| = \left|\frac{b_1 - \beta_1^*}{s_{b_1}}\right| > t^*_{N-2,\alpha/2}$$

(34)

g. Example: Market Model and Hypothesis Testing

Even though we know it to be false, let’s hypothesize that there is no relationship between the Windsor Mutual Fund and the Market.

H0: β1 = 0 HA: β1 ≠ 0

t stat = (.93572 - 0)/.02915 = 32.1

Here .93572 is the slope estimate, 0 is the hypothesized value, and .02915 is the calculated std error of the slope coef.

(35)

g. Example: Market Model and Hypothesis Testing

The t value is huge (32) relative to the null t distribution with 180-2 =178 degrees of freedom!

To illustrate just how big this is, let’s simulate some numbers from the t distribution

In 1000 draws from the t distribution, we didn’t get a single value anywhere near 32.
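A sketch of such a simulation in R, using rt() with the 178 degrees of freedom from this example. The seed is an arbitrary choice made here so that the draws are reproducible.

```r
set.seed(123)

draws <- rt(1000, df = 178)   # 1000 draws from the null t distribution

range(draws)                  # smallest and largest simulated values
max(abs(draws))               # far below the observed t statistic of about 32
hist(draws)                   # almost all of the mass lies between -3 and 3
```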

(36)

g. Example: Market Model and Hypothesis Testing

Now let’s test a more relevant value of β1

In finance, stocks and portfolios are characterized by their betas, which are estimated from regressions very similar to this one. β1 is sometimes used as an estimate of risk or volatility. The value of 1 has central significance.

β1 > 1: volatile assets (amplify market up/down moves)

β1 < 1: non-volatile assets (shrink market movements)

This suggests that we consider the hypothesis that β1 = 1.

H0: β1 = 1

(37)

g. Example: Market Model and Hypothesis Testing

Find the critical value, t*, for the .05 significance level.

We want the value of the t statistic so that:

Pr[ | t | > t* ] = .05

We can use the qt (inverse of CDF) command to compute this for us: -t* is the value such that Pr[t < -t*] = 0.025.

[Figure: density of the t distribution, for values from -4 to 4, with a tail area of 0.025 marked.]

(38)

g. Example: Market Model and Hypothesis Testing

Now compute the value of the t stat:

t = (Estimate of b1 - Hypo Value) / Std Error

In absolute value, this is larger than the 95% critical value for t(178), so we reject the null hypothesis H0: β1 = 1.
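A sketch of this test in R, using the slope estimate and standard error quoted earlier for the Windsor fund. Computed from these rounded inputs the t statistic comes out at about -2.2, which may differ in the second decimal from a value computed from unrounded estimates.

```r
b1 <- 0.93572          # slope estimate
s_b1 <- 0.02915        # standard error of the slope
beta1_null <- 1        # hypothesized value
df <- 180 - 2

t_stat <- (b1 - beta1_null) / s_b1
t_star <- qt(0.975, df = df)     # two-sided critical value at the .05 level

t_stat                           # about -2.2
t_star                           # about 1.97
abs(t_stat) > t_star             # TRUE, so we reject H0: beta1 = 1
```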

(39)

g. Example: Market Model and Hypothesis Testing

Let’s look at the intercept for the Windsor fund regression. Remember this is Jensen’s alpha.

H0: β0 = 0 HA: β0 ≠ 0

We can see that the intercept is significantly different from zero. However, the 95% CI is wide:

(40)

h. P Values

One of the problems with formal hypothesis-testing is that the strength of the information in the data for or against the null is not conveyed by accept/reject!

If the t value is a tiny bit less than the t cutoff, we accept the null. If the t value is a tiny bit bigger, we reject the null.

The information from the data is pretty much the same but we act quite differently.

Therefore, we need some measure of the strength of rejection. The p-value provides this.

(41)

h. P Values

The p-value is the probability of observing a value of the t statistic farther out in the tail than the observed t value.

$$p = \Pr\left[\,|t_{N-2}| > |t|\,\right]$$

where t is the observed t-stat value.

For the standard t-tests printed out by R, the p-value is

(42)

h. P Values

Let’s compute p for the mutual funds example of testing β1 = 1. pt() is the R function for the CDF of a t distribution.

This means that about 2.7% of the area of the null distribution is greater than 2.22 or less than -2.22:

Pr[t180-2 ≥ 2.22] + Pr[t180-2 ≤ -2.22] = 0.0275
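A sketch of the p-value calculation with pt(), taking the observed t statistic to be 2.22 in absolute value as stated above. The variable names are illustrative.

```r
t_obs <- 2.22          # absolute value of the observed t statistic
df <- 180 - 2

# Two-sided p-value: area to the right of |t| plus area to the left of -|t|
p_value <- (1 - pt(t_obs, df = df)) + pt(-t_obs, df = df)
p_value                # close to the 0.0275 reported above

# Equivalent shortcut, using the symmetry of the t distribution
p_short <- 2 * pt(-abs(t_obs), df = df)
p_short
```

Since the p-value is below .05, this agrees with the rejection of β1 = 1 at the 5 per cent level.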

(43)

h. P Values

Small p value (< α) ⇔ large | t | ⇒ reject null

Large p value (≥ α) ⇔ small | t | ⇒ accept null

(44)

Appendix 1: Sampling Distribution of b0, cov(b1,b0)





$$\mathrm{Var}(b_0) = \sigma^2\left(\frac{1}{N} + \frac{\bar{X}^2}{(N-1)s_X^2}\right)$$

$$\mathrm{cov}(b_0, b_1) = -\sigma^2\left(\frac{\bar{X}}{(N-1)s_X^2}\right)$$

(45)

Appendix 2: Derivation of Var(b1)

First, let’s write b1 as a linear combination of the Y’s. This makes b1 very much like a weird sort of sample average.

To make things easier to read, use one symbol for the denominator.

Now we can see that b1 is a linear combination of the Y’s. We will call the

weights, ci.
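The algebra can be sketched as follows, assuming that D denotes the denominator of the slope formula and that the weights are ci = (Xi − X̄)/D, as suggested by the Glossary of Symbols:

```latex
b_1 = \frac{\sum_{i=1}^{N}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{N}(X_i-\bar{X})^2}
    = \frac{\sum_{i=1}^{N}(X_i-\bar{X})\,Y_i}{D}
    = \sum_{i=1}^{N} c_i Y_i,
\qquad
D = \sum_{i=1}^{N}(X_i-\bar{X})^2,
\quad
c_i = \frac{X_i-\bar{X}}{D}
```

The second equality holds because the term involving Ȳ drops out: Ȳ times the sum of the deviations (Xi − X̄) is zero.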

(46)

Appendix: Derivation of Var(b1)

These are not the same weights as in the sample average (1/N vs. ci). Let’s observe some simple properties of the ci.

Property 1:

$$\sum_{i=1}^{N} c_i = \sum_{i=1}^{N}\frac{X_i-\bar{X}}{D} = \frac{1}{D}\sum_{i=1}^{N}(X_i-\bar{X}) = 0$$

Weights sum up to zero. Observations farther from X̄ receive larger weights (in absolute value). Remember: just because something sums to 0 doesn’t mean that it is all zeroes!

Property 2:
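The weights can be checked numerically. This sketch uses made-up data (x and y are hypothetical) and assumes the weights are ci = (Xi − X̄)/D, with D the sum of squared deviations of X. The last check, that the squared weights sum to 1/D, is a fact used in the variance derivation; whether it is the slide's Property 2 is not confirmed by the text.

```r
# Hypothetical data, for illustration only
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)

D <- sum((x - mean(x))^2)     # denominator of the weights
c_i <- (x - mean(x)) / D      # one weight per observation

sum(c_i)                      # zero, up to rounding error
c_i                           # individual weights are not zero

b1_weights <- sum(c_i * y)    # b1 as a linear combination of the Y's
b1_lm <- unname(coef(lm(y ~ x))["x"])
c(b1_weights, b1_lm)          # the two agree

c(sum(c_i^2), 1 / D)          # squared weights sum to 1 / D
```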
(47)

Appendix: Derivation of Var(b1)

Now that we know a few things about the ci weights, we can easily derive the variance formula.

$$\mathrm{Var}(b_1) = \mathrm{Var}\Big(\sum_{i=1}^{N} c_i Y_i\Big)$$

Now we use:

i. the fact that, given Xi, the Yi are independent

ii. the formula from the Math-Stat prereq on the variance of a linear combination of independent random variables

Recalling the properties of the ci, we get the, by now, familiar formula!
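The remaining steps can be sketched as follows, assuming that ci = (Xi − X̄)/D with D the sum of squared deviations of X, and that each Yi has variance σ² given Xi:

```latex
\begin{aligned}
\mathrm{Var}(b_1)
  &= \mathrm{Var}\Big(\sum_{i=1}^{N} c_i Y_i\Big)
   = \sum_{i=1}^{N} c_i^2\,\mathrm{Var}(Y_i)
   = \sigma^2 \sum_{i=1}^{N} c_i^2 \\
\sum_{i=1}^{N} c_i^2
  &= \frac{\sum_{i=1}^{N}(X_i-\bar{X})^2}{D^2}
   = \frac{1}{D}
\quad\Longrightarrow\quad
\mathrm{Var}(b_1) = \frac{\sigma^2}{D} = \frac{\sigma^2}{(N-1)s_X^2}
\end{aligned}
```

The first line uses independence, so that no covariance terms appear, and the last step uses D = (N−1)s_X².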
(48)

Glossary of Symbols

s - standard error of the regression

ci - weights used to compute b1

D - denominator of ci weight

sb - standard errors of least squares estimates

α - significance level

tN-2 - t random variable with N-2 degrees of freedom

tN-2, α/2 - t critical value for t with N-2 df and significance level α

(49)

Important Equations

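Estimate of error variance:

$$s^2 = \frac{SSE}{N-2} = \frac{\sum_{i=1}^{N} e_i^2}{N-2}$$

Standard error of the regression:

$$s = \sqrt{\frac{SSE}{N-2}}$$

Three factors driving the sampling variance of the slope:

$$\mathrm{Var}(b_1) = \frac{\sigma^2}{\sum_{i=1}^{N}(X_i-\bar{X})^2} = \frac{\sigma^2}{(N-1)s_X^2}$$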

(50)

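Important Equations

Std errors of coefs:

$$s_{b_1} = \sqrt{\frac{s^2}{(N-1)s_X^2}}$$

Confidence Interval, 100 × (1-α)% C.I.:

$$b_1 \pm t^*_{N-2,\alpha/2}\, s_{b_1}$$

Rejection Region for the test of β1 = β1*:

$$\left|\frac{b_1 - \beta_1^*}{s_{b_1}}\right| > t^*_{N-2,\alpha/2}$$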
(51)

Important Equations

p-value:

$$p = \Pr\left[\,|t_{N-2}| > |t|\,\right]$$

(52)

Glossary of R Commands

hist(a): Graphs a histogram of the data values of the variable a.

pt(t, df=10): Returns the CDF (left-tail probability) at t for a t distribution with 10 degrees of freedom; used to compute the p-value of a t statistic.

qt(prob, df=10): Returns the t value whose left-tail probability is prob for a t distribution with 10 degrees of freedom.

rt(100, df=10): Generates 100 random numbers from the t distribution with 10 degrees of freedom.
