
(1)

Techniques of Statistical

Analysis I

Lect_6: Test of hypothesis

Bruno Arpino

(2)

Outline

Formulate null and alternative hypotheses to test social science theories

Formulate a decision rule for testing a hypothesis on:

a population mean or proportion

the difference between two population means or proportions

How to use the critical value and p-value approaches to test the null hypothesis

(3)

Goal

Use statistical methods to test hypotheses such as:

“The average monthly salary in Spain is 1,200 Euros” (population mean)

“For treating anorexia, cognitive behavioral and family therapies have the same mean weight change as placebo” (no effect; difference between two means)

“The proportion of people with mental health problems is higher in households with low socioeconomic status”

(4)

The research process

Statistics

Political scientists, demographers, sociologists, economists, etc. generate theories on the basis of their opinions and observations. From these theories they derive specific hypotheses.

(5)

Statistical test of hypothesis: procedure

Theory → Specific hypothesis(es) → Data + statistical decision rule

We use a decision rule to choose between:

Null hypothesis (H₀): a statement that the parameter(s) take specific value(s) (usually: “no effect”)

Alternative hypothesis (Ha): states that the parameter value(s) fall in some alternative range of values (an “effect”)

(6)

Null and alternative hypothesis

The null hypothesis usually refers to a status-quo situation and the alternative corresponds to a change (in some cases choosing Ha might imply implementing a new policy or program).

E.g., we know that until last year the average salary in Spain was 1,300 Euros. We want to test the hypothesis that, because of the crisis, the average salary has decreased this year:

H₀: µ ≥ 1300    Ha: µ < 1300

(7)

Null and alternative hypothesis (cont’d)

Another example: the Ayuntamiento of Barcelona knows that the percentage of users of the Bicing system is 20%. In order to increase this percentage, a new (more costly) type of bike is introduced experimentally in some neighborhoods to test whether having a better bike increases the percentage of users. A survey should be conducted to test whether the percentage of users increased:

H₀: π ≤ 0.20    Ha: π > 0.20

(8)

Null and alternative hypothesis (cont’d)

H₀: µ ≥ 1300    Ha: µ < 1300

H₀: π ≤ 0.20    Ha: π > 0.20

In both cases we should provide “strong evidence” in favor of Ha! Rejecting H₀ would suggest a change in socio-economic policies in the first case, and the adoption of the new bikes in the whole city in the second case.

The statistical decision rule should begin with the assumption that the null hypothesis is true.

The process is analogous to several judicial systems (innocent until proven guilty):

H₀: Defendant is innocent vs Ha: Defendant is guilty

(9)

Only those who never decide never make mistakes!

Each decision might imply an error!

α = significance level of the test = probability of a Type I error (rejecting H₀ when it is true)

1 − β = power of the test, where β = probability of a Type II error (not rejecting H₀ when it is false)

Type I error: “false positive” in medical tests

We want to be conservative about H₀, so the test procedure starts by fixing α at a very low level (1%; 5%; 10%)
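To make the meaning of α concrete, here is a minimal simulation sketch. It assumes an upper-tail z-test with known σ, and the values µ₀ = 50, σ = 20, n = 16 are hypothetical, chosen only for illustration. Samples are drawn with H₀ true, so every rejection is a Type I error and the rejection share should be close to α.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def type_i_error_rate(mu0=50.0, sigma=20.0, n=16, alpha=0.05,
                      trials=20_000, seed=0):
    """Share of samples, drawn with H0 true (mu = mu0), in which an
    upper-tail z-test rejects H0. All defaults are illustrative."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)  # critical value z_alpha
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if z > z_crit:  # rejecting a true H0 is a Type I error
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Empirical Type I error rate: {rate:.3f} (alpha = 0.05)")
```

Fixing α at a low level therefore caps how often a true H₀ is rejected in repeated sampling.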

(10)

Test of Hyp on a mean: the idea

Is there strong evidence against H₀?

We need to decide which values of the sample mean are “plausible” given the hypothesis on the population mean

(11)

Acceptance and rejection regions

Acceptance region = the set of values of the sample statistic that do not allow us to reject H₀

Usually, we define acceptance and rejection regions after having standardized the sample mean.

[Figure: sampling distribution of the sample mean if H₀ is true (µ = 50), showing the acceptance region (probability 1 − α) and the rejection region]

(12)

Acceptance and rejection regions (cont’d)

Depending on how H₁ is specified, the area corresponding to the significance level changes position!

The critical value(s) is (are) the value(s) that limit(s) the rejection region.

(13)

Test of Hypothesis for the Mean (σ Known)

We assume that X is normal (or that the sample size is large) and that the population standard deviation σ is known.

(14)

Test of Hyp for the Mean (σ Known) (cont’d)

Decision rule (in general): reject H₀ if the value of the test statistic (z) falls into the rejection region.

In this case:

z = (x̄ − µ₀) / (σ / √n)

The test statistic measures how many standard errors the sample mean falls from the H₀ value.

(15)

Test of Hyp for the Mean (σ Known): Example

In 2010, adults in Barcelona went to the cinema, on average, 52 times. Data for 2011 are not yet available, but the Ayuntamiento thinks that the average number of times people go to the cinema has increased. To test this hypothesis, data have been collected on a sample of 64 adults residing in Barcelona. In this sample, the average was 53.1.

Assuming that the distribution of the variable “number of times people go to the cinema in a year” is normal with σ = 10, test the claim of the Ayuntamiento at the 10% significance level. (Consider that z_0.05 = 1.645; z_0.10 = 1.28)

(16)

Example (cont’d): rejection region

First step: formalize the hypothesis system:

H₀: µ ≤ 52 (the average is not higher than 52 times per year)

H₁: µ > 52 (the average is greater than 52 times per year, i.e., sufficient evidence exists to support the Ayuntamiento’s claim)

Second step: find the rejection region:

From the system of hypotheses we can see that we have an upper-tail test.

Since we have an upper-tail test and α = 0.10, the critical value is z_α = z_0.10 = 1.28.

Reject H₀ if z > 1.28!

(17)

Example (cont’d): test statistic and decision

Third step: calculate the value of the test statistic (z):

We know that n = 64, x̄ = 53.1, σ = 10. So, z is:

z = (x̄ − µ₀) / (σ / √n) = (53.1 − 52) / (10 / √64) = 0.88

Fourth step: compare the value of the test statistic (z) with the critical value to reach a decision.

Since z = 0.88 < z_α = 1.28, the value of the test statistic falls in the acceptance region. So, we do not reject H₀.

This means that there is no
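The four steps of the cinema example can be checked with a short Python sketch. The function name `upper_tail_z_test` is hypothetical, and the critical value is computed with the standard library's `NormalDist` rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_z_test(x_bar, mu0, sigma, n, alpha):
    """Return (z, critical value, reject H0?) for H0: mu <= mu0 vs H1: mu > mu0,
    assuming a normal population with known sigma."""
    z = (x_bar - mu0) / (sigma / sqrt(n))     # standard errors away from mu0
    z_crit = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value z_alpha
    return z, z_crit, z > z_crit

z, z_crit, reject = upper_tail_z_test(x_bar=53.1, mu0=52, sigma=10, n=64, alpha=0.10)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```

With the data of the example this gives z = 0.88 against a critical value of about 1.28, so H₀ is not rejected.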

(18)

Test of Hypothesis for the Mean (σ unknown)

Very similar to the case of σ known. But here we substitute σ with its estimate, s = sample standard deviation. As for the CI for the mean with σ unknown, the standardized sample mean no longer follows a normal distribution and we have to use the t distribution (with n − 1 degrees of freedom).

The test statistic is:

t = (x̄ − µ₀) / (s / √n)

and the critical values are t-values: t_{n−1, α} for one-tail tests and ±t_{n−1, α/2} for two-tail tests.

(19)

Test for the Mean (σ unknown): Example

In a report, the French Ministry of Welfare claims that the average cost to raise a child aged 0-2 is 168 Euros per month.

You are suspicious about this claim and, from the birth register of 5 randomly drawn hospitals, you draw a sample of 25 households with a baby aged 0-2. In the sample, the average monthly expenditure to raise a child was 172.50 Euros, while s = 15.40 Euros. Test the Ministry’s claim at the 0.05 significance level.

(Consider that: z_0.025 = 1.96; z_0.05 = 1.645; t_{24, 0.025} = 2.064; t_{24, 0.05} = 1.711)

(20)

Example (cont’d)

(21)

Example (cont’d)
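The worked solution is not reproduced in the text here, so the following is only a sketch of how the example could be solved. It assumes a two-tail alternative (H₁: µ ≠ 168), since the example only says the claim is doubted, and it takes the critical value t_{24, 0.025} = 2.064 from the example statement instead of computing it.

```python
from math import sqrt

def t_statistic(x_bar, mu0, s, n):
    """t = (x_bar - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (x_bar - mu0) / (s / sqrt(n))

t = t_statistic(x_bar=172.50, mu0=168, s=15.40, n=25)
t_crit = 2.064            # t_{24, 0.025}, as given in the example statement
reject = abs(t) > t_crit  # two-tail rule (an assumption about H1)
print(f"t = {t:.3f}, critical values = ±{t_crit}, reject H0: {reject}")
```

Under these assumptions t is about 1.46, which lies inside (−2.064, 2.064), so the Ministry's claim would not be rejected at the 0.05 level.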

(22)

Test of Hypothesis for the proportion

The idea and procedure are similar to the tests for the mean.

As for the CI, here too we assume a large sample (see the rules of thumb for the CI) and we always use a normal approximation for the distribution of the sample proportion.

The test statistic is:

z = (p − π₀) / √(π₀(1 − π₀) / n)

and the critical values are z-values:

Upper-tail test: H₀: π ≤ 0.08 vs H₁: π > 0.08; critical value z_α

Lower-tail test: H₀: π ≥ 0.08 vs H₁: π < 0.08; critical value −z_α

Two-tail test: H₀: π = 0.08 vs H₁: π ≠ 0.08; critical values −z_α/2 and z_α/2

(23)

The critical value approach: a summary

Compare the sample statistic value with the hypothesised value of the population parameter by using a test statistic:

test statistic = (value of sample statistic − value of population parameter under H₀) / standard error

We define a region of values that are implausible under H₀: the rejection region. Its limits are called critical values.

General decision rule: if the value of the test statistic for our sample is within the rejection region, we reject H₀:

value of test statistic ∈ rejection region → reject H₀

[Figure: rejection regions delimited by z_α (upper tail), −z_α (lower tail), and −z_α/2 and z_α/2 (two tails)]
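The general decision rule can be sketched as a small Python helper. The function name and the `tail` labels are hypothetical, and the sketch assumes a test statistic that is standard normal under H₀.

```python
from statistics import NormalDist

def in_rejection_region(z, alpha, tail):
    """True if z falls in the rejection region of a z-test.
    tail: 'upper' (H1: >), 'lower' (H1: <) or 'two' (H1: different)."""
    std_normal = NormalDist()
    if tail == "upper":
        return z > std_normal.inv_cdf(1 - alpha)           # z > z_alpha
    if tail == "lower":
        return z < -std_normal.inv_cdf(1 - alpha)          # z < -z_alpha
    if tail == "two":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)  # |z| > z_{alpha/2}
    raise ValueError("tail must be 'upper', 'lower' or 'two'")

print(in_rejection_region(0.88, alpha=0.10, tail="upper"))  # 0.88 < 1.28
print(in_rejection_region(-1.70, alpha=0.05, tail="lower"))
print(in_rejection_region(-1.70, alpha=0.05, tail="two"))
```

The same value z = −1.70 is rejected in the lower-tail test but not in the two-tail test, because the two-tail test splits α between both tails.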

(24)

The p-value approach

An alternative to, but consistent with, the critical value approach.

The p-value is a probability measure of evidence about H₀: “the probability of obtaining a test statistic more extreme (in absolute value) than the observed sample value of the test statistic, given H₀ is true”.

It is also called the observed level of significance because it corresponds to the smallest value of α for which H₀ can be rejected.

The smaller the p-value, the stronger the evidence against H₀.

p-value = prob(test statistic ≥ value assumed by the test statistic in the sample)

p-value ≤ significance level → reject H₀
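As an illustration, here is a minimal sketch of a p-value computation for a z statistic. The function name and the `tail` labels are hypothetical; “more extreme” is read in the direction of the alternative, and in absolute value for a two-tail test.

```python
from statistics import NormalDist

def p_value(z, tail):
    """p-value of an observed z statistic under H0 (standard normal).
    tail: 'upper', 'lower' or 'two'."""
    std_normal = NormalDist()
    if tail == "upper":
        return 1 - std_normal.cdf(z)             # P(Z >= z)
    if tail == "lower":
        return std_normal.cdf(z)                 # P(Z <= z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))  # P(|Z| >= |z|)
    raise ValueError("tail must be 'upper', 'lower' or 'two'")

print(f"{p_value(1.96, 'two'):.3f}")    # about 0.05, matching z_0.025 = 1.96
print(f"{p_value(1.645, 'upper'):.3f}")  # about 0.05, matching z_0.05 = 1.645
```

The two printed values show the link with critical values: a statistic sitting exactly on the critical value has a p-value equal to α.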

(25)

The p-value approach: an example

Consider again the example of slide 15: In 2010, adults in Barcelona went to the cinema, on average, 52 times. Data for 2011 are not yet available, but the Ayuntamiento thinks that the average number of times people go to the cinema has increased. To test this hypothesis, data have been collected on a sample of 64 adults residing in Barcelona. In this sample, the average was 53.1. Assuming that the distribution of the variable “number of times people go to the cinema in a year” is normal with σ = 10, test the claim of the Ayuntamiento at the 10% significance level. (Consider that z_0.05 = 1.645; z_0.10 = 1.28)

(26)

P-value example (cont’d)

We calculated the value of the test statistic in the sample in this way:

z = (x̄ − µ₀) / (σ / √n) = (53.1 − 52) / (10 / √64) = 0.88

We ask: assuming that H₀ is true, how likely was it to obtain values more extreme than 0.88? (Values more in the direction of the alternative hypothesis.)

Under H₀, the probability of obtaining values bigger than 0.88 was about 19%.

To reject H₀, we would have to choose a significance level α such that z > z_α, but this would imply α > 19%, which cannot be accepted! If α is set to any level that can be accepted (i.e. ≤ 10%), z < z_α, so we will not reject H₀.

[Figure: standard normal distribution with z = 0.88 marked]

(27)

P-value example (cont’d)

For example, if we set α = 0.10, z_α = 1.28 and so z = 0.88 < 1.28 and we do not reject H₀.

In general:

if p-value > α we do not reject H₀

if p-value ≤ α we reject H₀
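The agreement between the two approaches can be checked numerically. This sketch reuses z = 0.88 from the cinema example; the list of α values, including the deliberately large 0.20, is an assumption made only to show where the decision flips.

```python
from statistics import NormalDist

std_normal = NormalDist()
z = 0.88                          # test statistic from the cinema example
p = 1 - std_normal.cdf(z)         # upper-tail p-value, about 0.19

decisions = []
for alpha in (0.01, 0.05, 0.10, 0.20):
    reject_by_critical_value = z > std_normal.inv_cdf(1 - alpha)
    reject_by_p_value = p <= alpha
    decisions.append((alpha, reject_by_critical_value, reject_by_p_value))
    print(f"alpha = {alpha:.2f}: critical value rule -> {reject_by_critical_value}, "
          f"p-value rule -> {reject_by_p_value}")
```

Both rules give the same answer at every α, and H₀ would only be rejected once α exceeds the p-value of about 19%.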

(28)

Exercise

A sociological study claims that the percentage of people who define themselves as extremely satisfied with their lives is 8%.

To test this claim, a random sample of 500 persons was surveyed and 25 of them said they were extremely satisfied. On the basis of these sample results, test at the α = 0.05 significance level the claim that the percentage in the population is 8% against the alternative that the true population percentage is different.

Use both the critical value and the p-value approach. (Note that: z_0.025 = 1.96; z_0.05 = 1.645 and p-value = 0.0136)

(29)

Exercise (cont’d)

Let us first use the critical value approach.

First step: write the system of hypotheses:

H₀: π = 0.08

H₁: π ≠ 0.08

Second step: find the rejection region.

α = 0.05 but we have a two-tail test, so the critical values are ±z_0.025 = ±1.96. The rejection region is: (−∞, −1.96) U (1.96, +∞).

Third step: calculate the test statistic:

z = (p − π₀) / √(π₀(1 − π₀) / n) = (0.05 − 0.08) / √(0.08(1 − 0.08) / 500) = −2.47

Fourth step: decide between the hypotheses:

z = −2.47 belongs to the rejection region. So, we reject H₀: there is sufficient evidence to reject the claim that the percentage of extremely satisfied persons is 8%.
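The exercise can be verified with a short Python sketch. The function name `proportion_z_test` is hypothetical, and the p-value is computed from the unrounded z, so it may differ slightly from the 0.0136 quoted in the exercise, which presumably comes from rounded table values.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, pi0, alpha):
    """Two-tail z-test for H0: pi = pi0 using the normal approximation.
    Returns (z, critical value, p-value, reject H0?)."""
    std_normal = NormalDist()
    p_hat = successes / n                             # sample proportion
    z = (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)     # standard error under H0
    z_crit = std_normal.inv_cdf(1 - alpha / 2)        # z_{alpha/2}
    p_val = 2 * (1 - std_normal.cdf(abs(z)))          # two-tail p-value
    return z, z_crit, p_val, abs(z) > z_crit

z, z_crit, p_val, reject = proportion_z_test(successes=25, n=500, pi0=0.08, alpha=0.05)
print(f"z = {z:.2f}, critical values = ±{z_crit:.2f}, "
      f"p-value = {p_val:.4f}, reject H0: {reject}")
```

The statistic is about −2.47, outside (−1.96, 1.96), and the p-value is below 0.05, so both approaches reject H₀.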

(30)

Exercise (cont’d)

Now use the p-value approach.

First step: write the system of hypotheses:

H₀: π = 0.08

H₁: π ≠ 0.08

Second step: calculate the value of the test statistic and the p-value (I will always give you the p-value. Here p-value = 0.0136)

Third step: compare the p-value with the significance level to decide between the two hypotheses:

p-value = 0.0136 < α = 0.05, so we reject H₀: there is sufficient evidence to reject the claim that the percentage of extremely satisfied persons is 8%.

(Assuming H₀ is true, the probability of obtaining values of the test statistic at least as extreme as −2.47, i.e. not smaller than 2.47 in absolute value, is 1.36%. We could set the

(31)

Some examples of tests of hypothesis to

compare two population parameters


(32)

Test of hypothesis for the difference between

two means

(33)

Example of test of hypothesis for the difference between two means

A study in Alachua County, Florida, investigated the relationship between mental health and stressful life events.

Researchers developed an index of mental impairment, which incorporates various dimensions of psychiatric symptoms, including aspects of anxiety and depression. This measure ranged from 17 to 41 in the sample. Higher scores indicate greater psychiatric impairment.

They also built a life events score, a composite measure of both the number and severity of major life events the subject experienced within the past three years. These events range from severe personal disruptions such as a death in the family, a jail sentence, or an extramarital affair, to less severe events such as getting a new job, the birth of a child, moving within the same city, or having a child marry. On the basis of this score, people were

(34)

Example of test of hypothesis for the difference between two means (cont’d)

Investigators were interested in understanding whether people who experience more stressful events (“high stress”) have a higher average mental impairment score.

Formally:

H₀: µ_H − µ_L ≤ 0, i.e. (µ_H ≤ µ_L)

H₁: µ_H − µ_L > 0, i.e. (µ_H > µ_L)

where H = high stress and L = low stress

In their sample, they found that the difference in the average mental impairment score between the two groups was equal to 3.2.

The t-statistic was equal to 2.040, while the t-critical value defining the rejection region was 1.676 (α = 5%). The p-value was 0.0225.

What conclusion should they reach?

(35)

Example of test of hypothesis for the difference between two means (cont’d)

H₀: µ_H − µ_L ≤ 0, i.e. (µ_H ≤ µ_L)

H₁: µ_H − µ_L > 0, i.e. (µ_H > µ_L)

where H = high stress and L = low stress

We have sufficient data to use either the critical value or the p-value approach.

With the critical value approach: the rejection region is (1.676, +∞). t = 2.040 falls in the rejection region, so we reject H₀: there is sufficient evidence to claim that those who experienced more stressful events have higher mental impairment, on average.

Using the p-value approach we reach the same conclusion: p-value = 0.0225 < 0.05, so we reject H₀.

(36)

Example of test of hypothesis for the difference between two proportions

A group of political analysts wants to understand whether there is a difference between the proportion of men and the proportion of women who will vote Yes on Proposition A.

In a random sample, 36 of 72 men and 31 of 50 women indicated they would vote Yes.

They set a level of significance of 0.05 and the critical value defining the rejection region was 1.96. The value of the z-statistic was -1.31 and the p-value was 0.19.

They also estimated a confidence interval at the 95% level: (-0.297, 0.057).

1. Write the system of hypotheses

2. Define the rejection region

3. Decide if H₀ should be rejected using both the critical value and the p-value approach.

(37)

Example of test of hypothesis for the difference between two proportions (cont’d)

1. Write the system of hypotheses

H₀: π_M − π_W = 0 (the two proportions are equal)

H₁: π_M − π_W ≠ 0 (the two proportions are different)

2. Define the rejection region: (−∞, −1.96) U (+1.96, +∞)

3. Decide if H₀ should be rejected using both the critical value and the p-value approach.

Crit. value approach: z = −1.31 does not fall in the rejection region, so we do not reject H₀.

P-value approach: p-value = 0.19 > 0.05, so we do not reject H₀.
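The quantities given in this example can be reproduced with the sketch below. The formulas are not stated in the text, so they are assumptions: the usual textbook choices of a pooled standard error for the test statistic and an unpooled standard error for the confidence interval, which happen to match the reported values.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2, alpha=0.05):
    """Two-tail z-test for H0: pi1 - pi2 = 0, plus a (1 - alpha) CI for pi1 - pi2.
    Assumes large samples (normal approximation)."""
    std_normal = NormalDist()
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                    # common proportion under H0
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se_pooled
    p_val = 2 * (1 - std_normal.cdf(abs(z)))
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    ci = (p1 - p2 - z_crit * se_unpooled, p1 - p2 + z_crit * se_unpooled)
    return z, p_val, ci

# Men: 36 of 72 said Yes; women: 31 of 50 said Yes.
z, p_val, ci = two_proportion_z_test(36, 72, 31, 50)
print(f"z = {z:.2f}, p-value = {p_val:.2f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Under these assumptions the sketch returns z ≈ −1.31, a p-value of about 0.19, and an interval of roughly (−0.297, 0.057), in line with the figures reported by the analysts.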

(38)

Example of test of hypothesis for the difference between two proportions (cont’d)

4. Could you have reached the same conclusion on the basis of the CI?

Yes: we are 95% confident that the difference between the two proportions is between -0.297 and 0.057. This means that it is plausible both that the proportion of “Yes” is higher among women (negative value of the difference) and that the proportion is higher among men (positive value of the difference).

(39)

Relationship between confidence interval and test of hypothesis

When the CI at the 100(1 − α)% confidence level includes “0”, then in the corresponding two-tail test the null hypothesis that the two proportions are equal is NOT rejected at the α level of significance.

(This also holds for the comparison between two means.)

(40)

Relationship between confidence interval and

test of hypothesis (cont’d)

In general, if a CI includes a given value, we would not reject the hypothesis that the parameter is equal to that value.

This relationship is not unexpected because both CI and test of hypothesis use similar quantities.
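The correspondence can be illustrated for the simplest case, a mean with known σ, where the CI and the two-tail z-test use exactly the same standard error. The sketch reuses the sample figures of the cinema example (x̄ = 53.1, σ = 10, n = 64); treating the test as two-tail, the choice α = 0.05, and the list of hypothesised values are assumptions made only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def ci_and_two_tail_test(x_bar, sigma, n, mu0, alpha):
    """Return (CI, reject H0: mu = mu0?, CI contains mu0?) for known sigma."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / sqrt(n)
    ci = (x_bar - z_crit * se, x_bar + z_crit * se)
    reject = abs((x_bar - mu0) / se) > z_crit
    return ci, reject, ci[0] <= mu0 <= ci[1]

results = []
for mu0 in (50, 52, 55, 56):
    ci, reject, contains = ci_and_two_tail_test(53.1, 10, 64, mu0, alpha=0.05)
    results.append((mu0, reject, contains))
    print(f"mu0 = {mu0}: CI = ({ci[0]:.2f}, {ci[1]:.2f}), "
          f"contains mu0: {contains}, reject H0: {reject}")
```

For every hypothesised value, H₀ is rejected exactly when the value falls outside the interval.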