
(1)

Techniques of Statistical Analysis I

Lect_5: CI to compare two groups

Bruno Arpino

(2)

Outline

Point estimate of the difference between two population means and two population proportions

Confidence interval estimation for the difference between two population means and two population proportions

(3)

Examples

We want to compare the average number of hours women and men spend watching TV (difference between two means)

We want to compare the proportion of “very happy”

(4)

CI for the difference between two means

Goal: Form a confidence interval for the difference between two population means, $\mu_X - \mu_Y$

We assume two independent samples: the sample from one population is drawn independently of the sample selected from the other population

The point estimate is the difference between the two sample means: $\bar{x} - \bar{y}$

(5)

CI for the difference between two means

Three cases (we are not going to see the formulas)

They differ in the way the margin of error is calculated.

Assumption: normal distribution in both

(6)

CI for the difference between two means: example

Does cell phone use while driving impair reaction times?

An article in Psych. Science (2001, p. 462) describes an experiment that randomly assigned 64 Univ. of Utah students to a cell phone group or a control group (32 each). A driving simulator flashed red or green at irregular intervals. Instructions: Press the brake pedal as soon as possible when you detect a red light.

See http://www.psych.utah.edu/AppliedCognitionLab/

Cell phone group: Carried out a conversation about a political issue with someone in a separate room (mean = 585.2 milliseconds)

(7)

CI for the difference between two means: example (cont’d)

Point estimate of the difference between the two mean reaction times:

$\bar{x} - \bar{y} = 585.2 - 533.7 = 51.5$

95% CI was: (12, 91)

We are 95% confident that the population mean reaction time in the cell phone group is between 12 and 91 milliseconds higher than in the control group.
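As a quick check on the cell phone example, the reported numbers can be put side by side. The sketch below uses only the two sample means and the 95% CI quoted on the slide. It recovers the margin of error implied by the (rounded) interval and confirms that the interval is centered on the point estimate. The variable names are illustrative, not from the slides.

```python
# Values reported on the slide (milliseconds)
mean_cell_phone = 585.2
mean_control = 533.7
ci_lower, ci_upper = 12.0, 91.0  # reported 95% CI (rounded)

# Point estimate: difference between the two sample means
point_estimate = mean_cell_phone - mean_control

# Margin of error implied by the interval: half of its width
margin_of_error = (ci_upper - ci_lower) / 2

# A CI of the form "estimate ± ME" is centered on the point estimate
ci_center = (ci_lower + ci_upper) / 2

print(f"point estimate:  {point_estimate:.1f} ms")
print(f"margin of error: {margin_of_error:.1f} ms")
print(f"CI center:       {ci_center:.1f} ms")
```

Because the interval on the slide is rounded, the implied margin of error is only approximate.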

(8)

CI for the difference between two proportions

Goal: Form a confidence interval for the difference between two population proportions, $\pi_X - \pi_Y$

We assume two independent and large samples

The point estimate is the difference between the two sample proportions: $p_X - p_Y$

(9)

CI for the difference between two proportions: example

College Alcohol Study conducted by Harvard School of Public Health (http://www.hsph.harvard.edu/cas/)

Is there a trend over time in the percentage of binge drinking (consumption of 5 or more drinks in a row for men and 4 or more for women, at least once in the past two weeks) and in the activities influenced by it?

“Have you engaged in unplanned sexual activities because of drinking alcohol?”

Data:

(10)

CI for the difference between two proportions: example (cont’d)

Point estimate: $p_X - p_Y = 0.021$

95% CI for the change in the proportion of students saying “yes” was: (0.01, 0.03)

We can be 95% confident that the population proportion saying “yes” was between 0.01 and 0.03 larger in 2001 than in 1993.

When 0 is not in the CI, we can conclude that one population proportion is higher than the other.
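The rule about 0 can be written as a one-line check. The sketch below is only an illustration: the helper name `interval_excludes_zero` is hypothetical. It is applied to the binge-drinking interval from this slide and, for contrast, to the interval (-1.5, 2.7) that appears in Exercise 1 of this lecture.

```python
def interval_excludes_zero(lower: float, upper: float) -> bool:
    """Return True when 0 lies outside the interval [lower, upper]."""
    return lower > 0 or upper < 0


# 95% CI for the change in the proportion saying "yes" (2001 vs 1993)
binge_ci = (0.01, 0.03)

# 95% CI for the difference in mean number of close friends (Exercise 1)
friends_ci = (-1.5, 2.7)

# 0 is outside the first interval: we can conclude the proportions differ
print(interval_excludes_zero(*binge_ci))

# 0 is inside the second interval: the data are compatible with no difference
print(interval_excludes_zero(*friends_ci))
```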

(11)

Exercise 1

The GSS survey also collects data on “number of close friends”.

Among the 486 sampled females the mean number of close friends was 8.3, while in the sample of 354 males the mean was 8.9.

Knowing that the margin of error for a confidence level of 95% was 2.1, calculate the corresponding 95% CI.

Interpret the estimated CI.

What would happen to the CI margin of error if the sample sizes for females and males were, respectively, 650 and 890 (all other things being equal)?

(12)

Exercise 1 (cont’d)

Point estimate (males minus females): $\bar{x} - \bar{y} = 8.9 - 8.3 = 0.6$

Margin of error = 2.1. Let us use the general formula of a CI:

CI = (point estimate ± margin of error) = (0.6 ± 2.1) = (-1.5, 2.7)

We can be 95% confident that the population mean number of close friends for males is between 1.5 less and 2.7 more than the population mean number of close friends for females. The order is arbitrary: the 95% CI comparing the means for females – males is (-2.7, 1.5).
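The arithmetic of this exercise can be reproduced in a few lines. This is a minimal sketch using only the numbers given in the exercise. The helper names `ci_from_margin` and `reverse_order` are made up for illustration.

```python
def ci_from_margin(point_estimate: float, margin_of_error: float) -> tuple:
    """General CI formula: point estimate ± margin of error."""
    return (point_estimate - margin_of_error, point_estimate + margin_of_error)


def reverse_order(ci: tuple) -> tuple:
    """CI for (y - x) given the CI for (x - y): flip the signs and swap the bounds."""
    lower, upper = ci
    return (-upper, -lower)


mean_females, mean_males = 8.3, 8.9  # sample means from the exercise
margin_of_error = 2.1                # given for the 95% confidence level

males_minus_females = ci_from_margin(mean_males - mean_females, margin_of_error)
females_minus_males = reverse_order(males_minus_females)

# Round to one decimal to match the precision used on the slide
print(tuple(round(b, 1) for b in males_minus_females))  # (-1.5, 2.7)
print(tuple(round(b, 1) for b in females_minus_males))  # (-2.7, 1.5)
```

Reversing the order of subtraction leaves the width unchanged. Only the signs of the bounds change, which is why the interpretation is the same either way.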

(13)

Exercise 1 (cont’d)

What would happen to the CI margin of error if the sample sizes for females and males were, respectively, 650 and 890 (all other things being equal)?

For the CI on one population we saw that the bigger the sample, the smaller the margin of error. Here, the bigger the two samples, the smaller the ME (and the width of the CI).

And what if the confidence level was set to 99%?

Similarly to what we said for the CI for a single population, the higher the confidence level, the larger the margin of error (and the wider the CI).
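To get a feel for the size of these effects, here is a rough sketch. The slides do not give the formulas, so it rests on two assumptions that are not in the lecture: the margin of error has the large-sample form z·sqrt(s_x²/n_x + s_y²/n_y), and the two sample standard deviations are equal, so that the ME is proportional to z·sqrt(1/n_x + 1/n_y). The function names are hypothetical, and the results are ballpark figures rather than the exercise's official answer.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided standard normal critical value, e.g. about 1.96 for 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def rescaled_margin(me, n_old, n_new, conf_old=0.95, conf_new=0.95):
    """Rescale a known margin of error to new sample sizes / confidence level.

    ASSUMPTION (not from the slides): equal standard deviations in the two
    groups and a normal-based ME, so ME is proportional to
    z * sqrt(1/n_x + 1/n_y).
    """
    size_factor = sqrt(sum(1 / n for n in n_new)) / sqrt(sum(1 / n for n in n_old))
    level_factor = z_critical(conf_new) / z_critical(conf_old)
    return me * size_factor * level_factor


original_sizes = (486, 354)  # females, males in the exercise
larger_sizes = (650, 890)    # sample sizes proposed in the question

# Larger samples, same 95% level: the ME shrinks
me_larger_samples = rescaled_margin(2.1, original_sizes, larger_sizes)

# Same samples, 99% level instead of 95%: the ME grows
me_99 = rescaled_margin(2.1, original_sizes, original_sizes, conf_new=0.99)

print(f"ME with larger samples: {me_larger_samples:.2f}")
print(f"ME at 99% confidence:   {me_99:.2f}")
```

Under these assumptions the margin of error falls from 2.1 to roughly 1.55 with the larger samples and rises to roughly 2.76 at the 99% level. The direction of both changes holds regardless of the equal-variance assumption.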

(14)

Exercise 2

In a survey, 26 of 50 men and 28 of 40 women had earned a college degree.

Calculate a point estimate for the difference between the proportions of women and men with a college degree.

Knowing that the 90% CI for the difference between the two proportions is (0.01, 0.35), calculate the margin of error.

(15)

Exercise 2 (cont’d)

Point estimates for the two population proportions are:

$p_M = 26/50 = 0.52$

$p_F = 28/40 = 0.70$

And for their difference:

$p_F - p_M = 0.70 - 0.52 = 0.18$

The ME = W/2 = (0.35 - 0.01) / 2 = 0.17 (quite high!)

We are 90% confident that the difference between the percentages of college graduates among women and men is between 1 and 35 percentage points.
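The 90% interval given in the exercise can be reproduced from the raw counts. The slides do not state which formula was used, so the sketch below assumes the standard large-sample (Wald) interval, p_F − p_M ± z·sqrt(p_F(1−p_F)/n_F + p_M(1−p_M)/n_M). With that assumption the result agrees with (0.01, 0.35) after rounding. The function name `diff_proportions_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportions_ci(successes_x, n_x, successes_y, n_y, confidence=0.95):
    """Large-sample (Wald) CI for p_x - p_y from two independent samples.

    ASSUMPTION: this standard formula is not shown on the slides; it is used
    here because it reproduces the interval given in the exercise.
    Returns (point estimate, margin of error, (lower, upper)).
    """
    p_x = successes_x / n_x
    p_y = successes_y / n_y
    estimate = p_x - p_y
    standard_error = sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * standard_error
    return estimate, margin, (estimate - margin, estimate + margin)


# Women: 28 of 40 with a college degree; men: 26 of 50
estimate, margin, (lower, upper) = diff_proportions_ci(28, 40, 26, 50, confidence=0.90)

print(f"point estimate:  {estimate:.2f}")
print(f"margin of error: {margin:.2f}")
print(f"90% CI:          ({lower:.2f}, {upper:.2f})")
```

The samples here are fairly small (40 and 50), which is why the margin of error is so wide. In practice a dedicated statistics library may offer alternative intervals that behave better with small samples.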

(16)

If something is not clear (or you find mistakes in the slides), do not hesitate to come to office hours or e-mail me