Techniques of Statistical
Analysis I
Lect_6: Test of hypothesis
Bruno Arpino
Outline
Formulate null and alternative hypotheses to test social science theories
Formulate a decision rule for testing a hypothesis on:
  Population mean or proportion
  Difference between two population means or proportions
How to use the critical value and p-value approaches to test the null hypothesis
Goal
Use statistical methods to test hypotheses such as:
“The average monthly salary in Spain is 1,200 Euros” (population mean)
“For treating anorexia, cognitive behavioral and family therapies have the same mean weight change as placebo” (no effect; difference between two means)
“The proportion of people with mental health problems is higher in households with low socioeconomic status”
The research process
Political scientists, demographers, sociologists, economists, etc. generate theories on the basis of their opinions and observations. From these theories they derive specific hypotheses.
Theory → Specific hypothesis(es) → Data + statistical decision rule (statistics)
Statistical test of hypothesis: procedure
We use a decision rule to choose between:
Null hypothesis (H0): a statement that the parameter(s) take specific value(s) (usually: “no effect”)
Alternative hypothesis (Ha): states that the parameter value(s) fall in some alternative range of values (an “effect”)
The null hypothesis usually refers to a status-quo
situation and the alternative corresponds to a change (in
some cases choosing Ha might imply implementing a new
policy or program).
Null and alternative hypothesis
E.g., we know that until last year the average salary in Spain was 1,300 Euros. We want to test the hypothesis that, because of the crisis, the average salary has decreased this year:
H0: µ ≥ 1300
Ha: µ < 1300
Null and alternative hypothesis (cont’d)
Another example: the Ayuntamiento of Barcelona knows that the percentage of users of the Bicing system is 20%. In order to increase this percentage, a new (more costly) type of bike is introduced experimentally in some barrios to test if having a better bike increases the percentage of users. A survey should be conducted to test whether the percentage of users increased:
H0: π ≤ 0.20
Ha: π > 0.20
Null and alternative hypothesis (cont’d)
H0: µ ≥ 1300 (salary example)
H0: π ≤ 0.20 (Bicing example)
In both cases we should provide “strong evidence” in favor of Ha! Rejecting H0 would suggest a change in socio-economic policies, in the first case, and the adoption of the new bikes in the whole city, in the second case.
The statistical decision rule should begin with the
assumption that the null hypothesis is true.
The process is analogous to several judicial systems (innocent until proven guilty):
H0: Defendant is innocent vs Ha: Defendant is guilty
H0: π ≤ 0.20 vs Ha: π > 0.20
Only those who never decide never make mistakes!
Each decision might imply an error!
α = significance level of the test (the probability of a Type I error, i.e., rejecting H0 when it is true)
1-β = power of the test (β is the probability of a Type II error, i.e., not rejecting H0 when it is false)
Type I error: “false positive” in medical tests
We want to be conservative about H0, so the test procedure starts by fixing α at a very low level (1%; 5%; 10%)
Is there strong evidence against H0?
Test of Hyp on a mean: the idea
We need to decide which values of the sample mean are “plausible” given the hypothesis on the population mean.
Acceptance and rejection regions
Usually, we define acceptance and rejection regions after having standardized X.
[Figure: distribution of the sample mean if H0 is true (µ = 50), showing the acceptance region (probability 1 - α) and the rejection region.]
Acceptance and rejection regions (cont’d)
Test of Hypothesis for the Mean (σ Known)
Test of Hyp for the Mean (σ Known) (cont’d)
The test statistic
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
measures how many standard errors the sample mean falls from the H0 value.
Test of Hyp for the Mean (σ Known): Example
In 2010, adults in Barcelona went to the cinema, on average, 52 times. Data for 2011 are not yet available, but the Ayuntamiento thinks that the average number of times people go to the cinema has increased. To test this hypothesis, data have been collected on a sample of 64 adults residing in Barcelona. In this sample, the average was 53.1.
Assuming that the distribution of the variable “number of times people go to the cinema in a year” is normal with σ = 10, test the claim of the Ayuntamiento at the 10% significance level. (Consider that z0.05 = 1.645; z0.10 = 1.28)
Example (cont’d): rejection region
First step: formalize the hypothesis system:
H0: µ ≤ 52 (the average is not higher than 52 times per year)
H1: µ > 52 (the average is greater than 52 times per year, i.e., sufficient evidence exists to support the Ayuntamiento’s claim)
Second step: find the rejection region:
From the system of hypotheses we can see that we have an upper-tail test.
Since we have an upper-tail test and α = 0.10, the critical value is zα = z0.10 = 1.28.
Reject H0 if z > 1.28!
Example (cont’d): test statistic and decision
Third step: calculate the value of the test statistic (z):
We know that n = 64, x̄ = 53.1, σ = 10. So, z is:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{53.1 - 52}{10/\sqrt{64}} = 0.88$$
Fourth step: compare the value of the test statistic (z) and the critical value to get a decision:
Since z = 0.88 < zα = 1.28, the value of the test statistic falls in the acceptance region. So, we do not reject H0.
This means that there is not sufficient evidence to support the Ayuntamiento’s claim that the average has increased.
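The four steps of the cinema example can be sketched in a few lines of Python. This is only an illustration using the standard library; the function name `z_statistic` and the variable names are assumptions made for this sketch, and the critical value is computed from the standard normal distribution rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, mu0, sigma, n):
    """Number of standard errors between the sample mean and the H0 value."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


# Cinema example: H0: mu <= 52 vs H1: mu > 52 (upper-tail test)
alpha = 0.10
z = z_statistic(sample_mean=53.1, mu0=52, sigma=10, n=64)

# Upper-tail critical value z_alpha: the point that leaves
# probability alpha to its right under the standard normal
z_crit = NormalDist().inv_cdf(1 - alpha)

# Decision rule: reject H0 only if z falls in the rejection region (z_crit, +inf)
reject_h0 = z > z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```

With these numbers z stays below the critical value, which is the same "do not reject H0" decision reached by hand.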
Test of Hypothesis for the Mean (σ unknown)
Very similar to the case of σ known. But here we substitute σ with its estimate, s = sample standard deviation. As for the CI for the mean with σ unknown, the standardized sample mean no longer follows a normal distribution and we have to use the t distribution.
The test statistic is:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
and the critical values are t-values with n - 1 degrees of freedom: $t_{n-1,\alpha}$ for a one-tail test and $t_{n-1,\alpha/2}$ for a two-tail test.
Test for the Mean (σ unknown): Example
In a report, the French Ministry of Welfare claims that the average cost to raise a child aged 0-2 is 168 Euros per month.
You are suspicious about this claim and from the birth register of 5 randomly drawn hospitals you draw a sample of 25 households with a baby aged 0-2. In the sample, the average expenditure to raise children per month was 172.50 Euros, while s = 15.40 Euros. Test the Ministry's claim at the 0.05 significance level.
(Consider that: z0.025 = 1.96; z0.05 = 1.645; t24, 0.025 = 2.064; t24, 0.05 = 1.711)
Example (cont’d)
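A minimal sketch of how the Ministry example could be worked through in Python. It assumes a two-tail system, H0: µ = 168 vs H1: µ ≠ 168, because the text says only that the claim is doubted, not in which direction; under that assumption the relevant critical value is t24, 0.025 = 2.064, taken from the values listed in the example. Variable names are chosen for illustration.

```python
from math import sqrt

# Data from the example
n = 25
sample_mean = 172.50
s = 15.40        # sample standard deviation
mu0 = 168        # value claimed by the Ministry
t_crit = 2.064   # t_{24, 0.025}, as listed in the example

# Assumed two-tail system: H0: mu = 168 vs H1: mu != 168
t = (sample_mean - mu0) / (s / sqrt(n))

# Two-tail rejection region: (-inf, -t_crit) U (t_crit, +inf)
reject_h0 = abs(t) > t_crit
print(f"t = {t:.3f}, reject H0 if |t| > {t_crit}: {reject_h0}")
```

Under this assumption the statistic is about 1.46, inside the acceptance region, so the Ministry's claim would not be rejected at the 0.05 level.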
Test of Hypothesis for the proportion
Idea and procedure similar to the tests for the mean.
As for the CI, here too we assume a big sample (see the rule of thumb for the CI) and we always use a normal approximation for the distribution of the sample proportion.
The test statistic is:
$$z = \frac{p - \pi_0}{\sqrt{\pi_0(1-\pi_0)/n}}$$
and the critical values are z-values:
H0: π ≤ 0.08 vs H1: π > 0.08: reject H0 if z > zα
H0: π ≥ 0.08 vs H1: π < 0.08: reject H0 if z < -zα
H0: π = 0.08 vs H1: π ≠ 0.08: reject H0 if z < -zα/2 or z > zα/2
The critical value approach: a summary
$$\text{test statistic} = \frac{\text{value of sample statistic} - \text{value of population parameter under } H_0}{\text{standard error}}$$
The limits of the rejection region are called critical values: -zα/2 and zα/2 for a two-tail test, zα for an upper-tail test, -zα for a lower-tail test.
General decision rule: if the value of the test statistic for our sample is within the rejection region, we reject H0:
value of test statistic ∈ rejection region → reject H0
The p-value approach
Alternative to, but consistent with, the critical value approach.
The p-value is a probability measure of evidence about H0: “the probability of obtaining a test statistic more extreme (in absolute value) than the observed sample value of the test statistic, given H0 is true”.
$$\text{p-value} = \Pr(\text{test statistic} \ge \text{value of the test statistic observed in the sample} \mid H_0)$$
Also called the observed level of significance because it corresponds to the smallest value of α for which H0 can be rejected.
The smaller the p-value, the stronger the evidence against H0.
p-value ≤ significance level → reject H0
The p-value approach: an example
Consider again the cinema example: in 2010, adults in Barcelona went to the cinema, on average, 52 times. Data for 2011 are not yet available, but the Ayuntamiento thinks that the average number of times people go to the cinema has increased. To test this hypothesis, data have been collected on a sample of 64 adults residing in Barcelona. In this sample, the average was 53.1.
Assuming that the distribution of the variable “number of times people go to the cinema in a year” is normal with σ = 10, test the claim of the Ayuntamiento at the 10% significance level. (Consider that z0.05 = 1.645; z0.10 = 1.28)
P-value example (cont’d)
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{53.1 - 52}{10/\sqrt{64}} = 0.88$$
What is the probability, under H0, of obtaining values more extreme than 0.88? (Values more in the direction of the alternative hypothesis.)
Under H0, the probability of obtaining values bigger than z = 0.88 is about 19%.
To reject H0, we would have to choose a significance level α such that z > zα, but this would imply α > 19%, which cannot be accepted! If α is set to any level that can be accepted (i.e. ≤ 10%), then z < zα, so we will not reject H0.
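The 19% figure can be checked numerically. The sketch below computes the upper-tail p-value for the cinema example with Python's standard library; the variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Cinema example: H0: mu <= 52 vs H1: mu > 52, sigma = 10, n = 64
z = (53.1 - 52) / (10 / sqrt(64))

# Upper-tail test: the p-value is P(Z > z) under H0
p_value = 1 - NormalDist().cdf(z)

alpha = 0.10
reject_h0 = p_value <= alpha
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0 at alpha = {alpha}: {reject_h0}")
```

The p-value is well above 10%, so the p-value approach gives the same decision as the critical value approach.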
P-value example (cont’d)
In general:
if p-value > α we do not reject H0
if p-value ≤ α we reject H0
Exercise
… and 25 of them said they were extremely satisfied. On the basis of these sample results, test at the α = 0.05 significance level the claim that the % in the population is 8% against the alternative that the true population % is different.
Use both the critical value and the p-value approach. (Note that: z0.025 = 1.96; z0.05 = 1.645 and p-value = 0.0136)
Exercise (cont’d)
H0: π = 0.08
H1: π ≠ 0.08
α = 0.05 but we have a two-tail test, so the critical values are ±z0.025 = ±1.96. The rejection region is: (-∞, -1.96) U (1.96, +∞).
Third step: calculate the test statistic:
$$z = \frac{p - \pi_0}{\sqrt{\pi_0(1-\pi_0)/n}} = \frac{0.05 - 0.08}{\sqrt{0.08(1-0.08)/500}} = -2.47$$
Fourth step: decide between the hypotheses:
z = -2.47 belongs to the rejection region. So, we reject H0: there is sufficient evidence to reject the claim that the % of extremely satisfied persons is 8%.
Exercise (cont’d)
H0: π = 0.08
H1: π ≠ 0.08
Third step: compare the p-value with the significance level to decide between the two hypotheses:
p-value = 0.0136 < α = 0.05, so we reject H0: there is sufficient evidence to reject the claim that the % of extremely satisfied persons is 8%.
(Assuming H0 is true, the probability of obtaining a test statistic at least as extreme as -2.47, in absolute value, is 1.36%. We could set the significance level as low as 1.36% and still reject H0.)
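Both approaches for the exercise can be reproduced with a short Python sketch. It uses n = 500 and 25 extremely satisfied respondents, as in the solution's formula; the variable names are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

n, successes = 500, 25
pi0 = 0.08       # value of the proportion under H0
alpha = 0.05

# Sample proportion and test statistic (normal approximation)
p_hat = successes / n
z = (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)

std_normal = NormalDist()

# Critical value approach: two-tail test, so alpha is split between the tails
z_crit = std_normal.inv_cdf(1 - alpha / 2)
reject_by_critical_value = abs(z) > z_crit

# p-value approach: probability of a statistic at least as extreme in either tail
p_value = 2 * std_normal.cdf(-abs(z))
reject_by_p_value = p_value <= alpha

print(f"z = {z:.2f}, critical values = ±{z_crit:.2f}, p-value = {p_value:.4f}")
print(f"reject H0: {reject_by_critical_value} (critical value), {reject_by_p_value} (p-value)")
```

The two decisions always agree, since p-value ≤ α exactly when the statistic falls in the rejection region. The computed p-value differs from 0.0136 only in the fourth decimal because z is not rounded to -2.47 here.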
Some examples of tests of hypothesis to compare two population parameters
Test of hypothesis for the difference between
two means
Example of test of hypothesis for the difference between two means
… aspects of anxiety and depression. This measure ranged from 17 to 41 in the sample. Higher scores indicate greater psychiatric impairment.
They also built a life events score, a composite measure of both the number and severity of major life events the subject experienced within the past three years. These events range from severe personal disruptions such as a death in the family, a jail sentence, or an extramarital affair, to less severe events such as getting a new job, the birth of a child, moving within the same city, or having a child marry. On the basis of this score, people were divided into a high stress group and a low stress group.
Example of test of hypothesis for the difference between two means (cont’d)
In their sample, they found that the difference in the average mental impairment score between the two groups was equal to 3.2. The t-statistic was equal to 2.040, while the t-critical value defining the rejection region was 1.676 (α = 5%). The p-value was 0.0225.
What conclusion should they reach?
Example of test of hypothesis for the difference between two means (cont’d)
H0: µH - µL ≤ 0, i.e. (µH ≤ µL)
H1: µH - µL > 0, i.e. (µH > µL)
where H = high stress, L = low stress
With the critical value approach: the rejection region is (1.676, +∞). t = 2.040 falls in the rejection region, so we reject H0: there is sufficient evidence to claim that those who experienced more stressful events have higher mental impairment, on average.
Using the p-value approach we reach the same conclusion: p-value = 0.0225 < 0.05, so we reject H0.
Example of test of hypothesis for the difference between two proportions
A group of political analysts wants to understand if there is a difference between the proportion of men and the proportion of women who will vote Yes on Proposition A.
In a random sample, 36 of 72 men and 31 of 50 women indicated they would vote Yes.
They set a level of significance of 0.05 and the critical value defining the rejection region was 1.96. The value of the z-statistic was -1.31 and the p-value was 0.19.
They also estimated a confidence interval at the 95% level: (-0.297, 0.057).
1. Write the system of hypotheses
2. Define the rejection region
3. Decide if H0 should be rejected using both the critical value and the p-value approach.
Example of test of hypothesis for the difference between two proportions (cont’d)
1. Write the system of hypotheses:
H0: πM – πW = 0 (the two proportions are equal)
H1: πM – πW ≠ 0 (the two proportions are different)
2. Define the rejection region: (-∞, -1.96) U (+1.96, +∞)
3. Decide if H0 should be rejected using both the critical value and the p-value approach.
Critical value approach: z = -1.31 does not fall in the rejection region, so we do not reject H0
P-value approach: p-value = 0.19 > 0.05, so we do not reject H0
Example of test of hypothesis for the difference between two proportions (cont’d)
H0: πM – πW = 0 (the two proportions are equal)
H1: πM – πW ≠ 0 (the two proportions are different)
4. Could you have reached the same conclusion on the basis of the CI?
Yes: we are 95% confident that the difference between the two proportions is between -0.297 and 0.057. This means that it is plausible both that the proportion of “Yes” is higher among women (negative value of the difference) and that the proportion is higher among men (positive value of the difference). Since the interval includes 0, we do not reject H0.
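The reported z-statistic, p-value and confidence interval can be reproduced from the raw counts. The sketch below assumes the usual conventions, which the example does not state explicitly: a pooled proportion in the standard error of the test statistic (because H0 says the two proportions are equal) and an unpooled standard error for the confidence interval. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_m, n_m = 36, 72   # men voting Yes, men sampled
x_w, n_w = 31, 50   # women voting Yes, women sampled
alpha = 0.05

p_m, p_w = x_m / n_m, x_w / n_w
diff = p_m - p_w

# Test statistic: standard error built from the pooled proportion (assumption)
p_pool = (x_m + x_w) / (n_m + n_w)
se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n_m + 1 / n_w))
z = diff / se_pooled

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha / 2)
p_value = 2 * std_normal.cdf(-abs(z))   # two-tail p-value

# Confidence interval: standard error from the two separate proportions (assumption)
se_unpooled = sqrt(p_m * (1 - p_m) / n_m + p_w * (1 - p_w) / n_w)
ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

print(f"z = {z:.2f}, p-value = {p_value:.2f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"reject H0: {abs(z) > z_crit}; CI contains 0: {ci[0] <= 0 <= ci[1]}")
```

Under these assumptions the numbers agree with the ones given in the example, and all three routes (critical value, p-value, CI) lead to not rejecting H0.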
Relationship between confidence interval and test of hypothesis
H0: πM – πW = 0 (the two proportions are equal)
H1: πM – πW ≠ 0 (the two proportions are different)
When the CI at the (1-α)·100% confidence level includes “0”, the null hypothesis that the two proportions are equal is NOT rejected by the corresponding two-tail test at the α·100% level of significance.
(This also holds for the comparison between two means.)
Relationship between confidence interval and test of hypothesis (cont’d)
In general, if a CI includes a given value, we would not reject the hypothesis that the parameter is equal to that value. This relationship is not unexpected because both the CI and the test of hypothesis use similar quantities.
Consider the single population case for the mean with population standard deviation known:
$$CI_{1-\alpha}(\mu) = \left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$
$$\text{do not reject } H_0 \text{ if } -z_{\alpha/2} \le \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \le z_{\alpha/2}$$
E.g., if the CI at 95% is (2.5, 8.5), H0: µ = 7 will not be rejected at the 5% level.
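The equivalence between "µ0 lies inside the CI" and "H0: µ = µ0 is not rejected by the two-tail test" can be checked numerically. The sketch below uses hypothetical values (x̄ = 5.5, σ = 6, n = 16), chosen only so that the 95% CI is close to the (2.5, 8.5) interval mentioned above; the function name `ci_and_test` is also an assumption.

```python
from math import sqrt
from statistics import NormalDist


def ci_and_test(sample_mean, sigma, n, mu0, alpha=0.05):
    """Return the (1 - alpha) CI, whether mu0 lies in it, and whether
    the two-tail z test of H0: mu = mu0 fails to reject at level alpha."""
    z_half = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    se = sigma / sqrt(n)                           # standard error of the mean
    ci = (sample_mean - z_half * se, sample_mean + z_half * se)
    z = (sample_mean - mu0) / se
    in_ci = ci[0] <= mu0 <= ci[1]
    not_rejected = abs(z) <= z_half
    return ci, in_ci, not_rejected


# Hypothetical data: the CI is roughly (2.56, 8.44)
for mu0 in (7.0, 9.0):
    ci, in_ci, not_rejected = ci_and_test(5.5, 6.0, 16, mu0)
    print(f"mu0 = {mu0}: CI = ({ci[0]:.2f}, {ci[1]:.2f}), "
          f"in CI: {in_ci}, H0 not rejected: {not_rejected}")
```

The two flags coincide for every µ0 because both conditions are rearrangements of the same inequality.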
If something is not clear
(or you find mistakes in the slides)