A Decision-Making Approach

(1)

Business Statistics:

A Decision-Making Approach

6

^th

Edition

Estimation and Hypothesis Testing

for Two Population Parameters

(2)

Section Goals

After completing this section, you should be able to:



Test hypotheses or form interval estimates for



two independent population means



Standard deviations known



Standard deviations unknown



two means from paired samples



the difference between two population

proportions

(3)

Estimation for Two Populations

Estimating two population values

Population means, independent

samples

Paired samples

Population proportions

Group 1 vs. Same group Proportion 1 vs.

Examples:

(4)

Difference Between Two Means

Population means, independent

samples

σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30

σ

₁

and σ

₂

unknown, n

₁

or n

₂

< 30

Goal: Form a confidence interval for the difference

between two population means, μ

₁

– μ

₂

The point estimate for the difference is

x ₁ – x ₂

*

(5)

Independent Samples

Population means, independent

samples

σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30



Different data sources



Unrelated



Independent



Sample selected from one population has no effect on the sample selected from the other population



Use the difference between 2 sample means

*

(6)

Population means, independent

samples

σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30

σ

₁

and σ

₂

unknown, n

₁

or n

₂

< 30

σ ₁ and σ ₂ known

Assumptions:

 Samples are randomly and independently drawn

 population distributions are normal or both sample sizes are  30

 Population standard deviations are known

*

(7)

Population means, independent

samples

σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30

…and the standard error of x

₁

– x

₂

is

When σ

₁

and σ

₂

are known and both populations are normal or both sample sizes are at least 30, the test statistic is a z-value…

2 2 2

1

σ

σ  σ 

(continued)

σ ₁ and σ ₂ known

*

(8)

Population means, independent

samples

σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30

σ

₁

and σ

₂

unknown, n

₁

or n

₂

< 30

 

2 22 1

12 2 /2

1

n

σ n

x σ

x   z

_



The confidence interval for

μ

₁

– μ

₂

is:

σ ₁ and σ ₂ known

(continued)

*

(9)

σ ₁ and σ ₂ known

(continued)

Ghana Clinic Company (GCC) operates 20 medical clinics in

Ghana. As part of the company’s ongoing efforts to examine

its customer service, GCC observed 100 male and 100 female

patients at the clinics to estimate the difference in mean time

spent per visit for men and women patients. The sample

means were 34.5 minutes for males and 42.4 minutes for

females. Previous studies indicate that the standard deviation

is 11 minutes for males and 16 minutes for females. Help

GCC to develop a 95% confidence interval estimate for the

difference in mean times.

(10)

σ ₁ and σ ₂ known

(continued)

The measure of interest is μ

₁

-μ

₂

and the point estimate is x

₁

– x

₂

= 34.5 – 42.4 = – 7.9 minutes.

Thus, women in the sample spent an average of 7.9 minutes longer at the clinics.

9416 .

100 1 6 1 100

1 1 n

σ n

σ σ

2 2

2 2 2 1

2 1 x

x1_ 2

    

96 .

025 1

. 0

/2  z 

z _

(11)

σ ₁ and σ ₂ known

(continued)

 

100 6 1 100

1 96 1

. 1 9 . n 7

σ n

x σ x

2 2

2 22 1

12 2 /2

1

  z

_

    

0944 .

4 )

( 7056

. 11

8056 .

3 9

. 7

2 1   











u u

CI

Thus, based on the sample data and the a 95% confidence

(12)

Population means, independent

samples

σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30

σ

₁

and σ

₂

unknown, n

₁

or n

₂

< 30

σ ₁ and σ ₂ unknown, large samples

Assumptions:

 Samples are randomly and independently drawn

 both sample sizes are  30

 Population standard

deviations are unknown

*

(13)

Population means, independent

samples

σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30

σ ₁ and σ ₂ unknown, large samples

Forming interval estimates:

 use sample standard

deviation s to estimate σ

 the test statistic is a z value

(continued)

*

(14)

Population means, independent

samples

σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30

σ

₁

and σ

₂

unknown, n

₁

or n

₂

< 30

 

2 2 2 1

2 1 2 /2

1

n

s n

z s x

x  

_



The confidence interval for

μ

₁

– μ

₂

is:

σ ₁ and σ ₂ unknown, large samples

(continued)

*

(15)

Population means, independent

samples

σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30

σ ₁ and σ ₂ unknown, small samples

Assumptions:

 populations are normally distributed

 the populations have equal variances

 samples are independent

(16)

Population means, independent

samples

σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30

σ

₁

and σ

₂

unknown, n

₁

or n

₂

< 30

σ ₁ and σ ₂ unknown, small samples

Forming interval estimates:

 The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate σ

 the test statistic is a t value with (n

₁

+ n

₂

– 2) degrees of freedom

(continued)

*

(17)

Population means, independent

samples

σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30

σ ₁ and σ ₂ unknown, small samples

The pooled standard deviation is

(continued)

   

2 n

n

s 1 n

s 1 s n

2 1

2 2 2

2 1 1

p

 





 

(18)

Population means, independent

samples

σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30

σ

₁

and σ

₂

unknown, n

₁

or n

₂

< 30

 

2 1

p 2 /2

1

n

1 n

s 1 t

x

x  

_



The confidence interval for μ

₁

– μ

₂

is:

σ ₁ and σ ₂ unknown, small samples

(continued)

Where t_/2 has (n₁ + n₂ – 2) d.f.,

and

   

2 n

n

s 1 n

s 1 s n

2 1

2 2 2

2 1 1

p  





 

*

(19)

σ ₁ and σ ₂ unknown, small samples

(continued)

A recent study was conducted to estimate the difference in mean annual contributions for individuals covered by two pension plan (SSNIT1 and SSNIT2) during the past year. A simple random sample of 15 people was selected from the population who have SSNIT plans with the following sample results:

SSNIT1 SSNIT2

n1 = 15 n2 = 15

x1bar = Ghc 2,119.70 x2bar = Ghc 1,777.70

s1 = Ghc 709.70 s2 = Ghc 593.90

(20)

σ ₁ and σ ₂ unknown, small samples

(continued)

       

2 5 1 5 1

9 . 93 5 1 5 1 7

. 09 7 1 5 1 2

n n

s 1 n s

1 s n

2 2

2 1

22 2 2

1

p 1  





 









 

At a 95% confidence level and n

₁

-n

₂

-2 = 15+15-2=28 d.f, t = 2.0484

  ³⁴² ⁴⁸⁹ ^. ⁴⁵

15 1 15

37 1 . 654

* 0484 .

2 70

. 777 1

70 . 119

2     

The confidence interval for μ

₁

– μ

₂

is:

Thus, the 95% confidence interval estimates for the difference in mean (Ghc) for people who contribute to SSNIT1 versus those who contribute to SSNIT2 is:

-Ghc147.45 ≤ u1- u2 ≤Ghc831.45

(21)

σ ₁ and σ ₂ unknown, small samples

(continued)

The head of research and dev’t. at Ernest Chemist is interested in estimating the difference between individuals age 50 and under and those over 50 years old with respect to the mean time from when a patient takes a new medication until the medication can be detected in the blood. The company would want to use this information to guide doctors in how to advice their patients ti use the new medication

50 or younger Over 50 years

n1 = 6 n2 = 8

x1bar = 13.6 minutes x2bar = 11.2 minutes

(22)

σ ₁ and σ ₂ unknown, small samples

(continued)

       

31 . 2 4

8 6

0 . 5 1 8 1

. 3 1 6 2

n n

s 1 n

s 1 s n

2 2

2 1

22 2 2

1

p 1 









 









 

At a 95% confidence level and n

₁

-n

₂

-2 = 6+8-2=12 d.f, t = 2.1788

  ² ^. ⁴ ⁵ ^. ⁰⁷¹⁵

8 1 6 31 1 . 4

* 1788 .

2 2

. 11 6

.

13     

The confidence interval for μ

₁

– μ

₂

is:

(23)

σ ₁ and σ ₂ unknown, small samples

Thus, the 95% confidence interval estimates for the difference in mean time spent taken for medicine to be detected in the blood between the age group 50 or younger and over 50 is:

-2.6715≤ u1- u2 ≤7.4715.

Since the interval includes zero, the researcher cannot conclude that a difference exist between the age groups wrt. the mean

time needed for the medication to be detected in the blood.

Thus, it does not seem to matter whether the patient is 50 or

younger or over 50. Thus, no advice for doctors is warranted.

(24)

Paired Samples

Tests Means of 2 Related Populations



Paired or matched samples



Repeated measures (before/after)



Use difference between paired values:



Eliminates Variation Among Subjects



Assumptions:



Both Populations Are Normally Distributed



Or, if Not Normal, use large samples Paired

samples

d = x

₁

- x

₂

(25)

Paired Differences

The i

^th

paired difference is d

_i

, where

Paired

samples d

_i

= x

_1i

- x

_2i

The point estimate for the population mean paired difference is d :

) d (d

n

2 i





n d d

n

1 i



i



The sample standard

deviation is

(26)

Paired Differences

The confidence interval for d is

Paired samples

1 n

) d (d

s

n

1 i

2 i

d



 



n t s

d  _ _/2 ^d

Where t

_/2

has

n - 1 d.f. and s

_d

is:

(continued)

n is the number of pairs in the paired sample

(27)

Paired Differences

(continued)

The IT manager of the Ghana Ports and Harbours Authority is trying to convince the chief executive officer that recently

released cargo tracking and payment software by TECHSAVE instruments is faster than that of the existing one by ORASOFT.

Five workers were randomly selected to use both the

TECHSAVE and ORASOFT softwares for cargo tracking and payment. The table below shows the time (in minutes) spent on both softwares.

TECHSAVE 58 52 64 68 57

ORASOFT 47 53 54 60 58

(28)

Hypothesis Tests for the Difference Between Two Means



Testing Hypotheses about μ ₁ – μ ₂



Use the same situations discussed already:



Standard deviations known or unknown



Sample sizes  30 or not  30

(29)

Hypothesis Tests for

Two Population Proportions

Lower tail test:

H

₀

: μ

₁

 μ

₂

H

_A

: μ

₁

< μ

₂

i.e.,

H

₀

: μ

₁

– μ

₂

 0 H

_A

: μ

₁

– μ

₂

< 0

Upper tail test:

H

₀

: μ

₁

≤ μ

₂

H

_A

: μ

₁

> μ

₂

i.e.,

H

₀

: μ

₁

– μ

₂

≤ 0 H

_A

: μ

₁

– μ

₂

> 0

Two-tailed test:

H

₀

: μ

₁

= μ

₂

H

_A

: μ

₁

≠ μ

₂

i.e.,

H

₀

: μ

₁

– μ

₂

= 0

H

_A

: μ

₁

– μ

₂

≠ 0

Two Population Means, Independent Samples

(30)

Hypothesis tests for μ ₁ – μ ₂

Population means, independent samples σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30

σ

₁

and σ

₂

unknown, n

₁

or n

₂

< 30

Use a z test statistic

Use s to estimate unknown σ , approximate with a z

test statistic

Use s to estimate unknown

σ , use a t test statistic and

pooled standard deviation

(31)

Population means, independent

samples

σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30

  ^ ^

2 2 2 1

2 1

2 2 1

1

n σ n

σ

μ μ

x z x





 

The test statistic for

μ

₁

– μ

₂

is:

σ ₁ and σ ₂ known

*

(32)

Population means, independent

samples

σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30

σ

₁

and σ

₂

unknown, n

₁

or n

₂

< 30

σ ₁ and σ ₂ unknown, large samples

The test statistic for

μ

₁

– μ

₂

is:

  ^ ^

2 2 2 1

2 1

2 2 1

1

n s n

s

μ μ

x z x





 

*

(33)

Population means, independent

samples

σ

₁

and σ

₂

known

σ

₁

and σ

₂

unknown, n

₁

and n

₂

 30

σ ₁ and σ ₂ unknown, small samples

Where t_/2 has (n₁ + n₂ – 2) d.f.,

  ^ ^

2 1

p

2 2 1

1

n 1 n

s 1

μ x μ

z x





 

The test statistic for

μ

₁

– μ

₂

is:

(34)

Two Population Means, Independent Samples Lower tail test:

H

₀

: μ

₁

– μ

₂

 0 H

_A

: μ

₁

– μ

₂

< 0

Upper tail test:

H

₀

: μ

₁

– μ

₂

≤ 0 H

_A

: μ

₁

– μ

₂

> 0

Two-tailed test:

H

₀

: μ

₁

– μ

₂

= 0 H

_A

: μ

₁

– μ

₂

≠ 0

  ^/2  ^/2



-z

_

z

_

-z

_/2

z

_/2

Reject H₀ if z < -z_ Reject H₀ if z > z_ Reject H₀ if z < -z_/2 or z > z_/2

Hypothesis tests for μ ₁ – μ ₂

(35)

Pooled s _p t Test: Example

You’re a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE & NASDAQ? You collect the following data:

NYSE NASDAQ

Number 21 25 Sample mean 3.27 2.53 Sample std dev 1.30 1.16 Assuming equal variances, is

there a difference in average

(36)

Calculating the Test Statistic

       

1.2256 2

25 21

1.16 1

25 1.30

1 21 2

n n

s 1 n

s 1 s n

2 2

2 1

2 2 2

2 1 1

p 









 









 

  ^ ^{ } ^ _2.040

25 1 21

1.2256 1

0 2.53

3.27 n 1 n

s 1

μ μ

x z x

2 1

p

2 2 1

1







 





 

The test statistic is:

(37)

Solution

H

₀

: μ

₁

- μ

₂

= 0 i.e. (μ

₁

= μ

₂

) H

_A

: μ

₁

- μ

₂

≠ 0 i.e. (μ

₁

≠ μ

₂

)

 = 0.05

df = 21 + 25 - 2 = 44

Critical Values: t = ± 2.0154

Test Statistic: Decision:

Conclusion:

Reject H

₀

at  = 0.05 t

0

^2.0154 -2.0154

.025

Reject H₀ Reject H₀

.025

2.040 040 .

1 2 2256 1

. 1

53 . 2 27

.

3 



 

z

(38)

The test statistic for d is

Paired samples

1 n

) d (d

s

n

1 i

2 i

d



 



n s

μ t d

d



d



Where t

_/2

has n - 1 d.f.

and s

_d

is:

n is the number of pairs in the paired sample

Hypothesis Testing for

Paired Samples

(39)

Lower tail test:

H

₀

: μ

_d

 0 H

_A

: μ

_d

< 0

Upper tail test:

H

₀

: μ

_d

≤ 0 H

_A

: μ

_d

> 0

Two-tailed test:

H

₀

: μ

_d

= 0 H

_A

: μ

_d

≠ 0 Paired Samples

Hypothesis Testing for Paired Samples

  ^/2  ^/2



-t

_

t

_

-t

_/2

t

_/2

(continued)

(40)



Assume you send your salespeople to a “customer service” training workshop. Is the training effective?

You collect the following data:

Paired Samples Example

Number of Complaints: (2) - (1)

Salesperson Before (1) After (2) Difference, d_i

C.B. 6 4 - 2

T.F. 20 6 -14

M.H. 3 2 - 1

R.K. 0 0 0

M.O. 4 0 - 4

-21

d =  ^d

i

n

5.67 1 n

) d s (d

2 i

d





  

= -4.2

(41)



Has the training made a difference in the number of complaints (at the 0.01 level)?

- 4.2 d =

H

₀

: μ

_d

= 0 H

_A

: μ

_d

 0

Test Statistic:

Critical Value = ± 4.604

d.f. = n - 1 = 4

Reject

/2

- 4.604 4.604

Decision: Do not reject H

₀

(t stat is not in the reject region)

Paired Samples: Solution

Reject

/2

- 1.66

 = .01

(42)

Two Population Proportions

Goal: Form a confidence interval for or test a hypothesis about the

difference between two population proportions, p

₁

– p

₂

The point estimate for

the difference is p ₁ – p ₂

Population proportions

Assumptions:

n

₁

p

₁

 5 , n

₁

(1-p

₁

)  5

n

₂

p

₂

 5 , n

₂

(1-p

₂

)  5

(43)

Confidence Interval for

Two Population Proportions

Population proportions

 

2

2 2

1

1 1

/2 2

1

n

) p (1

p n

) p (1

z p p

p 

 





_

The confidence interval for

p

₁

– p

₂

is:

(44)

Hypothesis Tests for

Two Population Proportions

Population proportions Lower tail test:

H

₀

: p

₁

 p

₂

H

_A

: p

₁

< p

₂

i.e.,

H

₀

: p

₁

– p

₂

 0 H

_A

: p

₁

– p

₂

< 0

Upper tail test:

H

₀

: p

₁

≤ p

₂

H

_A

: p

₁

> p

₂

i.e.,

H

₀

: p

₁

– p

₂

≤ 0 H

_A

: p

₁

– p

₂

> 0

Two-tailed test:

H

₀

: p

₁

= p

₂

H

_A

: p

₁

≠ p

₂

i.e.,

H

₀

: p

₁

– p

₂

= 0

H

_A

: p

₁

– p

₂

≠ 0

(45)

Two Population Proportions

Population proportions

2 1

2 2

1 1

n n

x x

n n

p n

p p n



 



 

The pooled estimate for the overall proportion is:

Since we begin by assuming the null

hypothesis is true, we assume p

₁

= p

₂

and pool the two p estimates

(46)

Two Population Proportions

Population proportions

  ^ ^

 

 



 



 

2 1

n 1 n

) 1 p 1

( p

p p

p z p

The test statistic for

p

₁

– p

₂

is:

(continued)

(47)

Hypothesis Tests for

Two Population Proportions

Population proportions Lower tail test:

H

₀

: p

₁

– p

₂

 0 H

_A

: p

₁

– p

₂

< 0

Upper tail test:

H

₀

: p

₁

– p

₂

≤ 0 H

_A

: p

₁

– p

₂

> 0

Two-tailed test:

H

₀

: p

₁

– p

₂

= 0 H

_A

: p

₁

– p

₂

≠ 0

  ^/2  ^/2



-z

_

z

_

-z

_/2

z

_/2

(48)

Example:

Two population Proportions

Is there a significant difference between the proportion of men and the proportion of

women who will vote Yes on Proposition A?



In a random sample, 36 of 72 men and 31 of 50 women indicated they would vote Yes



Test at the .05 level of significance

(49)



The hypothesis test is:

H

₀

: p

₁

– p

₂

= 0

(the two proportions are equal)

H

_A

: p

₁

– p

₂

≠ 0

(there is a significant difference between proportions)



The sample proportions are:

 Men: p₁ = 36/72 = .50

 Women: p₂ = 31/50 = .62

67 31

36 x

x     



 The pooled estimate for the overall proportion is:

Example:

Two population Proportions

(continued)

(50)

The test statistic for

p₁ – p₂

is:

Example:

Two population Proportions

(continued)

.025

-1.96 1.96

.025

-1.31

Decision: Do not reject H

₀

Conclusion: There is not significant evidence of a difference in proportions who will vote yes between

  ^ ^

   

1.31 50

1 72

.549) 1 (1

.549

0 .62

.50

n 1 n

p) 1 (1

p

p p

p z p

2 1







 



 



 



 



 



 

Reject H₀ Reject H₀

Critical Values = ±1.96

For  = .05

(51)

Section Summary



Compared two independent samples

 Formed confidence intervals for the differences between two means

 Performed Z test for the differences in two means

 Performed t test for the differences in two means



Compared two related samples (paired samples)

 Formed confidence intervals for the paired difference

 Performed paired sample t tests for the mean difference



Compared two population proportions

 Formed confidence intervals for the difference between two

A Decision-Making Approach

Business Statistics: