3.4 Statistical inference for 2 populations based on two samples

Academic year: 2021

Tests for a difference between two population means

The first sample is denoted X1, X2, ..., Xm and the second sample Y1, Y2, ..., Yn. The means of the populations from which these samples are taken are denoted by µX and µY, respectively.

We consider 2 cases

1. Dependent samples - in this case we have n pairs of observations (X1, Y1), . . . , (Xn, Yn).

2. Two unrelated samples (X1, . . . , Xm) and (Y1, . . . , Yn).

3.4.1 Tests for the difference between two population means - dependent samples

Each pair of observations comes from either

1. one individual under two different conditions, e.g. the weights of a group of people before (X) and after (Y) a diet, or

2. two related sources, e.g. the height of a father (X) and of his son (Y).

We wish to test the hypothesis

H0: µX = µY

In the first example, this hypothesis states that the diet has no effect on weight.

In the second example, this hypothesis states that on average fathers are as tall as their sons.

One and two-sided tests

As before, we consider both one and two-sided tests. In a two-sided test the alternative is

HA: µX ≠ µY

In the first example this states that the diet has some effect on weight.

In the second example this states that the average height of fathers differs from the average height of their sons.

The alternative in a one-sided test may be of the form HA : µX > µY.

In the first example this states that the diet on average causes weight loss.

In the second example this states that the average height of fathers is greater than the average height of their sons.

The alternative may be of the form HA : µX < µY.

In the first example this states that the diet on average causes a gain in weight.

In the second example, this states that the average height of sons is greater than the average height of their fathers.

Testing procedure for 2 dependent samples

When the two samples are dependent, we calculate the differences Di = Xi − Yi.

We treat these differences as one sample and carry out the appropriate one sample test.

Let µD be the population mean of this difference. We have µD = µX − µY.

The null hypothesis

H0: µX = µY

corresponds to

H0: µD = 0.

The alternatives

HA: µX ≠ µY; HA: µX > µY; HA: µX < µY

correspond to

HA: µD ≠ 0; HA: µD > 0; HA: µD < 0, respectively.

Suppose a company promoting a diet states that, on average, a person loses 4kg on this diet. The corresponding null hypothesis is

H0 : µX − µY = 4 ⇒ H0: µD = 4.

The alternative in this case would be that the diet is not as effective as the company states i.e.

HA : µX − µY < 4 ⇒ HA : µD < 4.

The test statistic for the null hypothesis H0: µD = µ0 is

$$T = \frac{\bar{D} - \mu_0}{\mathrm{S.E.}(\bar{D})},$$

where $\bar{D}$ is the mean of the sample of differences.

This is simply the statistic used for a one-sample test for a population mean.

If the sample size is large (n > 30), then this statistic has approximately a standard normal distribution.

It should be noted that if the sample size is small, then this test assumes that the differences come from a normal distribution. If this condition is satisfied, then the test statistic has a Student t-distribution with n − 1 degrees of freedom.

Example 3.4.1

8 athletes run 400m both at sea level and at altitude. Their times are given below.

Test the hypothesis that altitude does not affect the average times of runners at a significance level of 5%.

Runner       1     2     3     4     5     6     7     8
Sea Level   45.3  45.8  45.2  45.6  45.1  46.2  45.8  45.4
Altitude    45.2  45.5  45.4  45.4  45.0  45.6  45.5  45.2

Since two times are given for each athlete, these samples are dependent.

We calculate the differences

Di = Xi − Yi,

where Xi and Yi are the times of the i -th athlete at sea level and altitude, respectively.

The sample of differences is given by

Runner       1     2     3     4     5     6     7     8
Sea Level   45.3  45.8  45.2  45.6  45.1  46.2  45.8  45.4
Altitude    45.2  45.5  45.4  45.4  45.0  45.6  45.5  45.2
Difference   0.1   0.3  -0.2   0.2   0.1   0.6   0.3   0.2

i) The hypotheses are

H0: µD = 0 against HA: µD ≠ 0.

ii) The test statistic is

$$T = \frac{\bar{D} - \mu_0}{\mathrm{S.E.}(\bar{D})},$$

where µ0 is the mean difference according to the null hypothesis, here µ0 = 0.

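iii) We calculate the realisation of the test statistic.

$$\bar{D} = \frac{0.1 + 0.3 + \ldots + 0.2}{8} = 0.2$$

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(D_i - \bar{D})^2 = \frac{(0.1 - 0.2)^2 + (0.3 - 0.2)^2 + \ldots + (0.2 - 0.2)^2}{7} \approx 0.051$$

$$s = \sqrt{0.051} \approx 0.227.$$

Hence,

$$\mathrm{S.E.}(\bar{D}) \approx \frac{s}{\sqrt{n}} = \frac{0.227}{\sqrt{8}} \approx 0.0802, \qquad t = \frac{0.2 - 0}{0.0802} \approx 2.49.$$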

iv) Since the sample size is small, if these differences come from a normal distribution, this statistic has a Student t-distribution with 7 degrees of freedom.

The test is two-sided, thus the critical value is $t_{n-1,\alpha/2} = t_{7,0.025} = 2.365$.

v) Since $|t| = 2.49 > t_{7,0.025} = 2.365$, we reject H0. We conclude that altitude affects the runners' times.
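The steps of this example can be checked with a short Python sketch. It uses only the standard library. The variable names are illustrative, and the critical value 2.365 is copied from the t-table rather than computed.

```python
import math
import statistics

# Times from Example 3.4.1 (seconds)
sea_level = [45.3, 45.8, 45.2, 45.6, 45.1, 46.2, 45.8, 45.4]
altitude = [45.2, 45.5, 45.4, 45.4, 45.0, 45.6, 45.5, 45.2]

# Dependent samples: reduce to one sample of differences D_i = X_i - Y_i
diffs = [x - y for x, y in zip(sea_level, altitude)]

n = len(diffs)
d_bar = statistics.mean(diffs)   # sample mean of the differences
s = statistics.stdev(diffs)      # sample standard deviation (divisor n - 1)
se = s / math.sqrt(n)            # standard error of the mean difference

mu0 = 0.0                        # value of mu_D under H0
t_stat = (d_bar - mu0) / se

t_crit = 2.365                   # t_{7, 0.025}, read from the table (assumed input)
reject_h0 = abs(t_stat) > t_crit

print(f"mean = {d_bar:.3f}, s = {s:.3f}, S.E. = {se:.4f}, t = {t_stat:.2f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

Setting `mu0` to a non-zero value gives the statistic for a hypothesis such as H0: µD = 4.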

Use of duality for two-sided tests

We can also use the duality between confidence intervals and two sided tests.

In this case, since the significance level is 5%, we calculate a 95% confidence interval for the mean difference.

Since the samples are dependent, we treat the differences as one sample and use the appropriate formula to calculate a confidence interval for the true mean difference (in the appropriate population, e.g. all those on the given diet).

If this interval contains the value from the null hypothesis, then we do not reject H0.

Since the sample is small, this formula is

$$\bar{D} \pm t_{n-1,\alpha/2}\,\mathrm{S.E.}(\bar{D}) = \bar{D} \pm \frac{s\,t_{n-1,\alpha/2}}{\sqrt{n}}.$$

This gives

$$0.2 \pm 0.0802 \times 2.365 = 0.2 \pm 0.19 = [0.01, 0.39].$$

We are testing the hypothesis H0: µD = 0. Since 0 does not belong to this confidence interval, we reject H0 at a significance level of 5%.

We have evidence that altitude affects runners' times.
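As a rough check of the interval, the calculation can be sketched in Python from the summary values of the example. The names are illustrative and the critical value is again taken from the table.

```python
import math

# Summary values from Example 3.4.1
d_bar = 0.2      # mean of the differences
s = 0.227        # sample standard deviation of the differences
n = 8
t_crit = 2.365   # t_{7, 0.025}, read from the table (assumed input)

half_width = t_crit * s / math.sqrt(n)
lower, upper = d_bar - half_width, d_bar + half_width

# Duality: reject H0: mu_D = mu0 iff mu0 lies outside the interval
mu0 = 0.0
reject_h0 = not (lower <= mu0 <= upper)

print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
print("reject H0" if reject_h0 else "do not reject H0")
```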

Assumptions of this test

It should be noted that this test assumes that the differences come from a normal distribution.

It is not clear whether this assumption is satisfied. Since the realisation of the test statistic is close to the critical value, we should be somewhat sceptical of our conclusion (more data should be collected).

3.4.2 Tests for the difference between two population means: independent samples

In this case the two samples come from two unrelated populations.

e.g. the height of Americans and Irish, the times of two different groups of runners.

We consider two cases

1. Large sample tests (both samples have at least 30 observations).

2. Small sample tests.

Large Samples

Assume that we have samples (X1, . . . , Xm) and (Y1, . . . , Yn) from populations with population means µX and µY (where m and n are at least 30).

We use the difference between the two sample means, $\bar{X} - \bar{Y}$, to estimate the difference between the two population means, µX − µY.

The standard error of this estimate is

$$\mathrm{S.E.}(\bar{X} - \bar{Y}) = \sqrt{\frac{\sigma_X^2}{m} + \frac{\sigma_Y^2}{n}}.$$

This is approximated using

$$\mathrm{S.E.}(\bar{X} - \bar{Y}) \approx \sqrt{\frac{s_X^2}{m} + \frac{s_Y^2}{n}}.$$

Suppose we wish to test the hypothesis

H0: µX − µY = d,

i.e. the difference between the two population means is d. In two-tailed tests the alternative is

HA: µX − µY ≠ d.

If the test is one-tailed, we can always label the two samples in such a way that the alternative is

HA: µX − µY > d.

When both samples are large (m, n > 30), the test statistic is

$$Z = \frac{(\bar{X} - \bar{Y}) - d}{\mathrm{S.E.}(\bar{X} - \bar{Y})}.$$

This statistic has approximately a standard normal distribution.

Critical values and p-values are calculated in the same way as in one sample tests.

i.e. The p-value for a two sided test is p = P(|Z | > |t|) = 2P(Z > |t|).

The p-value for a one sided test is p = P(Z > t).

The critical value for a two sided test is Zα/2= t∞,α/2. H0 is rejected if and only if |t| > t∞,α/2.

The critical value for a one sided test is Zα = t∞,α. H0 is rejected if and only if t > t∞,α.

It should be noted that, as before, the realisation of the test statistic is a measure of the distance between the data and H0. e.g. when the difference between the sample means is much greater than d , then the realisation of the test statistic will be much greater than 0.
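The critical values above are normally read from the last row of the t-table. As a minimal sketch, they can also be obtained from the standard normal quantile function in Python's standard library. The helper name `z_critical` is an assumption made for this illustration.

```python
from statistics import NormalDist

def z_critical(alpha: float, two_sided: bool = True) -> float:
    """Upper critical value of the standard normal distribution.

    Two-sided tests put alpha/2 in each tail; one-sided tests put
    alpha in the upper tail.
    """
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)

print(round(z_critical(0.05), 3))                   # two-sided test at 5%
print(round(z_critical(0.01), 3))                   # two-sided test at 1%
print(round(z_critical(0.05, two_sided=False), 3))  # one-sided test at 5%
```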

Example 3.4.2

The average height of 100 Dutch men is 176cm and their standard deviation 12cm.

The average height of 50 Japanese men is 169cm and their standard deviation is 10cm.

Test at a significance level of 1% the hypothesis that the average heights of Dutch men and Japanese men are equal.

i) We have

H0: µX − µY = 0; HA: µX − µY ≠ 0,

where µX is the mean height of all Dutch men and µY the mean height of all Japanese men.

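ii) We use the test statistic

$$Z = \frac{\bar{X} - \bar{Y}}{\mathrm{S.E.}(\bar{X} - \bar{Y})}, \qquad \mathrm{S.E.}(\bar{X} - \bar{Y}) \approx \sqrt{\frac{s_X^2}{m} + \frac{s_Y^2}{n}} = \sqrt{\frac{12^2}{100} + \frac{10^2}{50}} = \sqrt{3.44} \approx 1.8547.$$

iii) The realisation of this test statistic is

$$t = \frac{176 - 169}{1.8547} \approx 3.77.$$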

iv) From the table for the standard normal distribution, the p-value for this test is

p = 2P(Z > 3.77) = 0.0002.

v) Since p < α = 0.01, we reject H0 at a significance level of 1%.

Also, since p < 0.001, we have very strong evidence that the mean heights of Dutchmen and Japanese differ.

From the data we may state that Dutchmen are taller on average than Japanese men.
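The calculation in this example can be reproduced with the following sketch, which works from the summary statistics only. The function name `two_sample_z` is hypothetical, and the p-value is computed from the normal distribution function rather than read from the table.

```python
import math
from statistics import NormalDist

def two_sample_z(xbar, ybar, sx, sy, m, n, d=0.0):
    """Large-sample z statistic for H0: mu_X - mu_Y = d."""
    se = math.sqrt(sx**2 / m + sy**2 / n)   # estimated S.E. of the difference
    return ((xbar - ybar) - d) / se

# Example 3.4.2: 100 Dutch men (176cm, sd 12cm), 50 Japanese men (169cm, sd 10cm)
z = two_sample_z(176, 169, 12, 10, 100, 50)

# Two-sided p-value: p = 2 P(Z > |t|)
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.01
print(f"z = {z:.2f}, p = {p_two_sided:.4f}")
print("reject H0" if p_two_sided < alpha else "do not reject H0")
```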

iv) We can also use the appropriate critical value. Since both samples are large and this is a two sided test, this value is given by

Zα/2= t∞,α/2= 2.576.

v) Since |t| = 3.77 > t∞,α/2 = 2.576, we reject H0 at a significance level of 1%.

We have strong evidence that the mean heights of Dutchmen and Japanese differ.

Duality for two independent samples

When both samples are large, the 100(1 − α)% confidence interval for the difference between two population means is

$$(\bar{X} - \bar{Y}) \pm t_{\infty,\alpha/2}\,\mathrm{S.E.}(\bar{X} - \bar{Y}).$$

We can use the duality between confidence intervals and two sided tests.

Example 3.4.3

Calculate a 95% confidence interval for the difference between the mean height of Dutch and Japanese men.

Test the hypothesis that on average Dutch men are 10cm taller than Japanese men (data from previous example).

We are testing

H0: µX − µY = 10 against HA: µX − µY ≠ 10.

Since the samples are large, the confidence interval for the difference between the population means is

$$(\bar{X} - \bar{Y}) \pm t_{\infty,\alpha/2}\,\mathrm{S.E.}(\bar{X} - \bar{Y}), \qquad t_{\infty,\alpha/2} = t_{\infty,0.025} = 1.96.$$

Hence, the confidence interval is

$$(176 - 169) \pm 1.96 \times 1.8547 = 7 \pm 3.64 = [3.36, 10.64].$$

Since 10 ∈ [3.36, 10.64], we do not reject H0 at a significance level of 5%.

There is no evidence against the hypothesis that on average Dutchmen are 10cm taller than Japanese men.
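A minimal sketch of this interval and of the corresponding test by duality is given below. The variable names are illustrative, and 1.96 is the tabulated value quoted in the example.

```python
import math

# Summary statistics from Example 3.4.2
xbar, sx, m = 176, 12, 100   # Dutch men
ybar, sy, n = 169, 10, 50    # Japanese men

se = math.sqrt(sx**2 / m + sy**2 / n)
z_crit = 1.96                # t_{inf, 0.025}, read from the table (assumed input)

half_width = z_crit * se
lower = (xbar - ybar) - half_width
upper = (xbar - ybar) + half_width

# Duality: do not reject H0: mu_X - mu_Y = d0 when d0 lies in the interval
d0 = 10
reject_h0 = not (lower <= d0 <= upper)

print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
print("reject H0" if reject_h0 else "do not reject H0")
```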

Small sample tests

In the case where at least one of the samples is small, the test for the difference between two population means assumes that the observations come from normal distributions with equal variances (i.e. $\sigma_X^2 = \sigma_Y^2 = \sigma^2$).

Test for equality of variances

Before we carry out the test for a difference between two population means, we should carry out an F test for the equality of two variances. We test

$$H_0: \sigma_X^2 = \sigma_Y^2 \quad \text{against} \quad H_A: \sigma_X^2 \neq \sigma_Y^2.$$

The test statistic, F, is the ratio between the two sample variances:

$$F = \frac{\max\{s_X^2, s_Y^2\}}{\min\{s_X^2, s_Y^2\}}.$$

When this ratio is close to one we do not reject the null hypothesis that the population variances are equal.

Ratios much greater than 1 indicate that the null hypothesis is not true.

Suppose the observations in both samples come from a normal distribution.

F has an F distribution with j − 1 and k − 1 degrees of freedom, where j and k are the number of observations in the sample with the larger and smaller variance, respectively.

We reject H0 if and only if the realisation of the test statistic, f, satisfies

$$f > F_{j-1,k-1,\alpha/2}, \quad \text{where} \quad P(F > F_{j-1,k-1,\alpha/2}) = \alpha/2.$$

Critical values of the Fj −1,k−1,p are given in Table 9. j − 1 and k − 1 correspond to the column and row number, respectively.

This test is normally carried out at a significance level of 5%.

Each cell contains 4 critical values. The first is for p = 0.05, the second (in brackets) for p = 0.025 (this is the appropriate value), the third for p = 0.01 and the fourth for p = 0.001.

Note that if the two variances are not equal, the assumptions of the test for a difference between two means, presented below, do not hold. The appropriate procedure in this case is not covered in the course.

Test for difference between two means (small samples)

Given the hypothesis regarding the equality of variances was not rejected, we use a pooled estimate of the variance, $s_p^2$, where

$$s_p^2 = \frac{(m-1)s_X^2 + (n-1)s_Y^2}{m+n-2}.$$

This is a weighted average of the sample variances, in which the sample with the larger number of observations has the larger weight.

The standard error of the difference between the sample means is

$$\mathrm{S.E.}(\bar{X} - \bar{Y}) = s_p\sqrt{\frac{1}{m} + \frac{1}{n}}.$$

Suppose we wish to test the null hypothesis H0: µX − µY = d. The test statistic used is

$$T = \frac{(\bar{X} - \bar{Y}) - d}{\mathrm{S.E.}(\bar{X} - \bar{Y})}.$$

Given the assumptions of the test are satisfied (the observations come from normal distributions with a common variance), this statistic has a Student t-distribution with m + n − 2 degrees of freedom.

A 100(1 − α)% confidence interval for the difference between the two population means, µX − µY, is given by

$$(\bar{X} - \bar{Y}) \pm t_{m+n-2,\alpha/2}\,\mathrm{S.E.}(\bar{X} - \bar{Y}).$$

The critical value for the two-sided test with HA: µX − µY ≠ d is $t_{m+n-2,\alpha/2}$.

We reject H0 iff $|t| > t_{m+n-2,\alpha/2}$.

If the test is one-sided, we can always label the two samples such that the alternative is of the form

HA: µX − µY > d.

The critical value for such a test is $t_{m+n-2,\alpha}$. We reject H0 iff $t > t_{m+n-2,\alpha}$.

Example 3.4.4

The average height of 13 Dutch men is 176cm and their standard deviation 12cm.

The average height of 11 Japanese men is 169cm and their standard deviation is 10cm.

Test at a significance level of 5% the hypothesis that the average heights of Dutch men and Japanese men are equal.

Since the sample sizes are small, we first test the assumption that the population variances are equal.

i) We have

$$H_0: \sigma_X^2 = \sigma_Y^2 \quad \text{against} \quad H_A: \sigma_X^2 \neq \sigma_Y^2.$$

ii) The test statistic is

$$F = \frac{\max\{s_X^2, s_Y^2\}}{\min\{s_X^2, s_Y^2\}}.$$

iii) The realisation of this test statistic is

$$f = \frac{\max\{12^2, 10^2\}}{\min\{12^2, 10^2\}} = 1.44.$$

iv) We read the appropriate critical value. Since there are 13 observations in the sample with the greatest variance, j − 1 = 12.

Similarly, k − 1 = 10.

Since α = 0.05, the critical value is F12,10,0.025 = 3.62.

v) Since f < F12,10,0.025 = 3.62, we do not reject H0. Hence, we may assume that the two population variances are equal.
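The variance-ratio step can be sketched as follows. The helper `variance_ratio` is a hypothetical name. It returns the ratio together with the degrees of freedom in the order (larger variance, smaller variance), and the critical value 3.62 is the tabulated value quoted above.

```python
def variance_ratio(sx, sy, m, n):
    """Return (f, df_numerator, df_denominator) for the max/min variance ratio.

    sx, sy are sample standard deviations; m, n are the sample sizes.
    """
    vx, vy = sx**2, sy**2
    if vx >= vy:
        return vx / vy, m - 1, n - 1
    return vy / vx, n - 1, m - 1

# Example 3.4.4: 13 Dutch men (sd 12cm), 11 Japanese men (sd 10cm)
f, df1, df2 = variance_ratio(12, 10, 13, 11)

f_crit = 3.62   # F_{12, 10, 0.025}, read from Table 9 (assumed input)
reject_equal_variances = f > f_crit

print(f"f = {f:.2f} with ({df1}, {df2}) degrees of freedom")
print("reject H0" if reject_equal_variances else "do not reject H0")
```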

We now proceed to test the hypothesis regarding the equality of the two means.

i) We have

H0: µX − µY = 0 against HA: µX − µY ≠ 0.

ii) The test statistic for this test is

$$T = \frac{\bar{X} - \bar{Y}}{\mathrm{S.E.}(\bar{X} - \bar{Y})},$$

where

$$\mathrm{S.E.}(\bar{X} - \bar{Y}) = s_p\sqrt{\frac{1}{m} + \frac{1}{n}}, \qquad s_p^2 = \frac{(m-1)s_X^2 + (n-1)s_Y^2}{m+n-2}.$$

iii) We calculate the realisation of the test statistic. The pooled variance is

$$s_p^2 = \frac{(13-1) \times 12^2 + (11-1) \times 10^2}{12 + 10} = 124.$$

The standard error of the difference between the two sample means is

$$\mathrm{S.E.}(\bar{X} - \bar{Y}) = s_p\sqrt{\frac{1}{m} + \frac{1}{n}} = \sqrt{124}\,\sqrt{\frac{1}{13} + \frac{1}{11}} \approx 4.56.$$

The realisation of the test statistic is

$$t = \frac{176 - 169}{4.56} \approx 1.53.$$

iv) The critical value for the test is

tm+n−2,α/2= t22,0.025 = 2.074.

v) Since t = 1.53 < tm+n−2,α/2= 2.074, we do not reject H0. There is no evidence that the mean height of Dutchmen differs from the mean height of Japanese.
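The pooled-variance calculation can be sketched in Python as below. The function name `pooled_t` is an assumption for illustration, and the critical value 2.074 is the tabulated value used in the example.

```python
import math

def pooled_t(xbar, ybar, sx, sy, m, n, d=0.0):
    """Pooled two-sample t statistic for H0: mu_X - mu_Y = d.

    Assumes normal populations with a common variance.
    Returns (pooled variance, standard error, t statistic).
    """
    sp2 = ((m - 1) * sx**2 + (n - 1) * sy**2) / (m + n - 2)
    se = math.sqrt(sp2) * math.sqrt(1 / m + 1 / n)
    t = ((xbar - ybar) - d) / se
    return sp2, se, t

# Example 3.4.4: 13 Dutch men (176cm, sd 12cm), 11 Japanese men (169cm, sd 10cm)
sp2, se, t = pooled_t(176, 169, 12, 10, 13, 11)

t_crit = 2.074   # t_{22, 0.025}, read from the table (assumed input)
reject_h0 = abs(t) > t_crit

print(f"sp^2 = {sp2:.0f}, S.E. = {se:.2f}, t = {t:.2f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

Note that the standard error uses the pooled standard deviation, the square root of 124, not the pooled variance itself.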

Use of duality

It should be noted that we can carry out this test using the duality between two sided tests and confidence intervals.

The formula for a 100(1 − α)% confidence interval for the difference between two population means when the sample sizes are small is

$$(\bar{X} - \bar{Y}) \pm t_{m+n-2,\alpha/2}\,\mathrm{S.E.}(\bar{X} - \bar{Y}).$$

Since we are carrying out a test at a significance level of 5%, we calculate a 95% confidence interval. This is given by

$$(176 - 169) \pm t_{22,0.025}\,\mathrm{S.E.}(\bar{X} - \bar{Y}) = 7 \pm 2.074 \times 4.56 = 7 \pm 9.46 = [-2.46, 16.46].$$

Since we are testing

H0: µX − µY = 0,

we reject the null hypothesis if and only if 0 does not belong to this confidence interval.

Since 0 belongs to this confidence interval, we do not reject H0 at a significance level of 5%.

There is no evidence that the mean height of Dutchmen differs from the mean height of Japanese.

Confidence intervals for the difference between two population proportions

Suppose we have two independent, large samples from distinct populations.

Suppose the i-th sample has $n_i$ observations and the number of individuals showing the trait of interest in the i-th sample is $x_i$. Let the proportion of individuals exhibiting this trait in the i-th population be $p_i$ and the proportion of individuals exhibiting this trait in the i-th sample be $\hat{p}_i$, where

$$\hat{p}_i = \frac{x_i}{n_i}.$$

The difference between the two sample proportions is used to estimate the difference between the two population proportions.

The standard error of the difference between the two sample proportions is

$$\mathrm{S.E.}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}.$$

As before, the standard error of this difference depends on the (unknown) population proportions.

When we calculate a confidence interval for the difference between population proportions, this standard error can be approximated using

$$\mathrm{S.E.}(\hat{p}_1 - \hat{p}_2) \approx \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}.$$

An approximate 100(1 − α)% confidence interval for the difference between two population proportions is given by

$$(\hat{p}_1 - \hat{p}_2) \pm t_{\infty,\alpha/2}\,\mathrm{S.E.}(\hat{p}_1 - \hat{p}_2).$$

Note that this procedure should not be used, for example, to calculate a confidence interval for the difference between the levels of support for two political parties in a single population, since the two proportions then come from the same sample and are not independent.

Example 3.4.5

120 of 300 male applicants for an engineering course were accepted and 40 of 80 female applicants.

Calculate a 95% confidence interval for the difference in the proportion of males and females accepted for the course.

The sample proportions are

$$\hat{p}_1 = \frac{120}{300} = 0.4; \qquad \hat{p}_2 = \frac{40}{80} = 0.5.$$

The estimate of the standard error of the difference between the two proportions is given by

$$\mathrm{S.E.}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} = \sqrt{\frac{0.4 \times 0.6}{300} + \frac{0.5 \times 0.5}{80}} \approx 0.063.$$

The confidence interval for the difference between the two proportions is

$$(\hat{p}_1 - \hat{p}_2) \pm t_{\infty,\alpha/2}\,\mathrm{S.E.}(\hat{p}_1 - \hat{p}_2) = (0.4 - 0.5) \pm 1.96 \times 0.063 = -0.1 \pm 0.123 = [-0.223, 0.023].$$
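The interval can be checked with the sketch below, which takes the difference in the order males minus females, so the point estimate is 0.4 − 0.5 = −0.1. The variable names are illustrative and 1.96 is the tabulated critical value.

```python
import math

# Example 3.4.5: accepted / applicants
x1, n1 = 120, 300   # males
x2, n2 = 40, 80     # females

p1_hat, p2_hat = x1 / n1, x2 / n2

# Unpooled standard error, as used for confidence intervals
se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

z_crit = 1.96       # t_{inf, 0.025}, read from the table (assumed input)
diff = p1_hat - p2_hat
lower, upper = diff - z_crit * se, diff + z_crit * se

print(f"S.E. = {se:.3f}")
print(f"95% CI for p1 - p2: [{lower:.3f}, {upper:.3f}]")
```

Reversing the labels of the two samples only changes the sign of both endpoints, so whether 0 lies in the interval is unaffected.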

Hypothesis testing

Suppose we want to test the hypothesis H0: p1 = p2. The test statistic for this test is

$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\mathrm{S.E.}(\hat{p}_1 - \hat{p}_2)}.$$

This statistic has approximately a standard normal distribution.

Let t be the realisation of this statistic.

In order to estimate the standard error of the difference between the two sample proportions under the null hypothesis, we calculate the pooled proportion p.

This is the total number of individuals with the trait in both samples divided by the total number of individuals in both samples:

$$p = \frac{x_1 + x_2}{n_1 + n_2}.$$

We have

$$\mathrm{S.E.}(\hat{p}_1 - \hat{p}_2) \approx \sqrt{p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}.$$

For two-tailed tests with

HA: p1 ≠ p2,

the critical value is $Z_{\alpha/2} = t_{\infty,\alpha/2}$. We reject H0 if and only if $|t| > t_{\infty,\alpha/2}$. The p-value is 2P(Z > |t|).

In the case of one tailed tests, we can always number the samples such that the alternative is

HA : p1 > p2. The critical value is Zα = t∞,α.

We reject H0 if and only if t > t∞,α. The p-value is P(Z > t).

Example 3.4.6

120 of 300 male applicants for an engineering course were accepted and 40 of 80 female applicants.

Test at a significance level of 5% the hypothesis that the proportion of males accepted equals the proportion of females accepted.

i) The hypotheses are

H0: p1 = p2 against HA: p1 ≠ p2.

ii) The statistic used is

$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\mathrm{S.E.}(\hat{p}_1 - \hat{p}_2)}.$$

iii) We calculate the realisation of the test statistic. The pooled proportion is

$$p = \frac{x_1 + x_2}{n_1 + n_2} = \frac{120 + 40}{300 + 80} \approx 0.421.$$

The estimate of the standard error of the difference between the two sample proportions under H0 is

$$\mathrm{S.E.}(\hat{p}_1 - \hat{p}_2) \approx \sqrt{p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{0.421 \times 0.579\left(\frac{1}{300} + \frac{1}{80}\right)} \approx 0.062.$$

The sample proportions are

$$\hat{p}_1 = \frac{120}{300} = 0.4; \qquad \hat{p}_2 = \frac{40}{80} = 0.5.$$

The realisation of the test statistic is

$$t = \frac{0.4 - 0.5}{0.062} \approx -1.61.$$

iv) The critical value for this test is $t_{\infty,0.025} = 1.96$.

v) Since |t| = 1.61 < t∞,0.025 = 1.96, we do not reject H0 at a significance level of 5%.

There is no evidence that the admission rates vary according to sex.
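A short sketch of this test, using the pooled proportion for the standard error under H0, is given below. The function name `two_proportion_z` is hypothetical, and the critical value is the tabulated 1.96.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled proportion."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se

# Example 3.4.6: 120 of 300 males and 40 of 80 females accepted
z = two_proportion_z(120, 300, 40, 80)

z_crit = 1.96   # t_{inf, 0.025}, read from the table (assumed input)
reject_h0 = abs(z) > z_crit

print(f"z = {z:.2f}")
print("reject H0" if reject_h0 else "do not reject H0")
```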

In iv) we could calculate the p-value

p = 2P(Z > |t|) = 2P(Z > 1.61) = 2 × 0.0537 = 0.1074.

Since p > 0.05, we do not reject H0.

There is no evidence that the admission rates vary.
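Instead of reading the tail probability from the normal table, the p-value can be computed directly. This is a minimal sketch using the rounded realisation t = −1.61 from the example.

```python
from statistics import NormalDist

t = -1.61   # realisation of the test statistic in Example 3.4.6

# Two-sided p-value: p = 2 P(Z > |t|)
p_value = 2 * (1 - NormalDist().cdf(abs(t)))

alpha = 0.05
print(f"p = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```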

As in the one sample case, we can use the duality between two-sided hypothesis tests and confidence intervals.

Since different approximations are used to estimate the standard error of the difference between the sample proportions, this duality is only approximate.

Suppose we wish to test H0 : p1 = p2 (i.e. p1 − p2 = 0).

In this example the 95% confidence interval for the difference between the two proportions, p1 − p2, is (0.4 − 0.5) ± 0.123 = [−0.223, 0.023].

Since 0 belongs to this confidence interval, we do not reject H0 at a significance level of approximately 5%. There is no evidence that the admission rates vary.
