Techniques of Statistical
Analysis I
Lect_5: CI to compare two groups
Bruno Arpino
Outline
Point estimate of the difference between two population means and two population proportions
Confidence interval estimation for the difference between two population means and two population proportions
Examples
We want to compare the average number of hours women and men spend watching TV (difference between two means)
We want to compare the proportion of “very happy”
CI for the difference between two means
Goal: Form a confidence interval for the difference between two population means, $\mu_x - \mu_y$
We assume we have two independent samples: the sample from one population is drawn independently from the sample selected from the other population
The point estimate is the difference between the two sample means: $\bar{x} - \bar{y}$
CI for the difference between two means
Three cases (we are not going to see the formulas)
They differ because of the way the margin of error is
calculated.
Assumption: normal distribution in both
CI for the difference btw 2 means: example
Experiment that randomly assigned 64 Univ. of Utah students to cell phone group or control group (32 each). Driving simulating machine flashed red or green at irregular periods.
Instructions: Press brake pedal as soon as possible when you detect a red light.
See http://www.psych.utah.edu/AppliedCognitionLab/
Cell phone group: Carried out conversation about a political issue with someone in a separate room (mean = 585.2 milliseconds)
CI for the difference between two means: example (cont’d)
$\bar{x} - \bar{y} = 585.2 - 533.7 = 51.5$
95% CI was: (12, 91)
We are 95% confident that the mean reaction time in the cell-phone group is between 12 and 91 milliseconds higher than in the control group.
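The arithmetic in the reaction-time example can be checked with a short sketch. It assumes that 533.7 ms, read off the subtraction on the slide, is the control-group mean, and that the reported interval is symmetric around the point estimate (estimate ± margin of error). The variable names are illustrative only.

```python
# Numbers taken from the slide (milliseconds).
mean_cell_phone = 585.2
mean_control = 533.7               # assumed to be the control-group mean
ci_lower, ci_upper = 12.0, 91.0    # reported 95% CI

# Point estimate: difference between the two sample means.
point_estimate = mean_cell_phone - mean_control

# Assumption: symmetric interval, so half the width is the margin of error
# and the midpoint should coincide with the point estimate.
margin_of_error = (ci_upper - ci_lower) / 2
midpoint = (ci_lower + ci_upper) / 2

print(f"point estimate:  {point_estimate:.1f} ms")
print(f"margin of error: {margin_of_error:.1f} ms")
print(f"CI midpoint:     {midpoint:.1f} ms")
```

The midpoint of the interval matching the point estimate is a quick consistency check on any reported CI of this form.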
CI for the difference between two proportions
Goal: Form a confidence interval for the difference between two population proportions, $\pi_X - \pi_Y$
We assume we have two independent and large samples
The point estimate is the difference between the two sample proportions: $p_X - p_Y$
CI for the difference between two proportions: example
College Alcohol Study conducted by Harvard School of Public Health (http://www.hsph.harvard.edu/cas/)
Is there a trend over time in percentage of binge drinking (consumption of 5 or more drinks in a row for men and 4 or more for women, at least once in past two weeks) and activities influenced by it?
“Have you engaged in unplanned sexual activities because of drinking alcohol?”
Data:
CI for the difference between two proportions: example (cont’d)
$p_X - p_Y = 0.021$
We can be 95% confident that the population proportion saying “yes” was between 0.01 larger and 0.03 larger in 2001 than in 1993.
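Whether a CI for a difference contains 0 is what decides if we can say one group is higher than the other. As a minimal sketch, the check for the binge-drinking interval could look like this; `contains_zero` is a hypothetical helper name and the verdict wording is ours, not the slides'.

```python
def contains_zero(lower, upper):
    """Return True if the interval [lower, upper] includes 0."""
    return lower <= 0 <= upper

# Values from the slide: difference in proportions, 2001 minus 1993.
ci_lower, ci_upper = 0.01, 0.03

if contains_zero(ci_lower, ci_upper):
    verdict = "0 is in the CI: the data do not show which proportion is higher"
else:
    verdict = "0 is not in the CI: one proportion is higher than the other"

print(verdict)
```

Here both endpoints are positive, so the interval lies entirely above 0.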
When 0 is not in the CI, we can conclude that one population proportion is higher than the other.
Exercise 1
The GSS survey also collects data on “number of close friends”.
On the 486 sampled females the mean number of close friends was 8.3, while on the sample of 354 males the mean was 8.9
Knowing that the Margin of Error for a confidence level of
95% was 2.1, calculate the corresponding 95% CI.
Interpret the estimated CI.
What would happen to the CI margin of error if the
sample sizes for females and males were, respectively, 650
and 890 (all other things being equal)?
Exercise 1 (cont’d)
$\bar{x} - \bar{y} = 8.9 - 8.3 = 0.6$
95% CI: $0.6 \pm 2.1 = (-1.5, 2.7)$
We can be 95% confident that the population mean number of close friends for males is between 1.5 less and 2.7 more than the population mean number of close friends for females. Order is arbitrary. The 95% CI comparing mean for females – males is (-2.7, 1.5)
Exercise 1 (cont’d)
What would happen to the CI margin of error if the sample sizes for females and males were, respectively, 650 and 890 (all other things being equal)?
For the case of CI on one population we saw that the bigger the sample, the smaller the margin of error. Here, the bigger the two samples, the smaller the ME (and the width) will be.
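The Exercise 1 answers can be reproduced with a short sketch. The first part uses only the numbers given in the exercise. The second part illustrates the sample-size effect and rests on assumptions that are not in the slides: the usual unpooled form of the margin of error, a normal critical value of 1.96, and a purely hypothetical standard deviation of 15 in both groups.

```python
import math

def ci_from_margin(estimate, margin):
    """Symmetric confidence interval: estimate ± margin of error."""
    return (estimate - margin, estimate + margin)

# Exercise 1 numbers from the slides (males minus females).
mean_males, mean_females = 8.9, 8.3
margin_95 = 2.1

diff = mean_males - mean_females
lower, upper = ci_from_margin(diff, margin_95)
print(f"males - females: ({lower:.1f}, {upper:.1f})")
# Reversing the order flips the signs and swaps the endpoints.
print(f"females - males: ({-upper:.1f}, {-lower:.1f})")

def margin_of_error(sd_x, n_x, sd_y, n_y, critical_value=1.96):
    """ME under the ASSUMED form: crit * sqrt(sd_x^2/n_x + sd_y^2/n_y)."""
    return critical_value * math.sqrt(sd_x**2 / n_x + sd_y**2 / n_y)

# HYPOTHETICAL standard deviation, chosen only for illustration.
sd = 15.0
me_original = margin_of_error(sd, 486, sd, 354)   # original sample sizes
me_larger = margin_of_error(sd, 650, sd, 890)     # larger sample sizes
print(f"ME shrinks with larger samples: {me_larger < me_original}")
```

Because each sample size appears in a denominator under the square root, increasing either one (all else equal) can only reduce the margin of error.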
And what if the confidence level was set to 99%?
Similarly to what we said for the CI for a single population, the
Exercise 2
In a survey, 26 of 50 men and 28 of 40 women had earned a college degree.
Calculate a point estimate for the difference between the proportion of women and men with a college degree.
Knowing that the 90% CI for the difference between the two proportions is (0.01, 0.35), calculate the margin of error.
Exercise 2 (cont’d)
Point estimates for the two population proportions are:
$p_M = \frac{26}{50} = 0.52$
$p_F = \frac{28}{40} = 0.70$
And for their difference:
$p_F - p_M = 0.70 - 0.52 = 0.18$
The ME = W/2 = (0.35-0.01) / 2 = 0.17 (quite high!)
We are 90% confident that the difference between the percentage of college graduates among women and men is btw 1 and 35 percentage points.
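The Exercise 2 numbers can be worked through in code. The first part follows the solution directly. The second part is an assumption: the slides do not give the CI formula, so the sketch uses the standard large-sample (Wald) interval for a difference of two proportions to see whether it is consistent with the reported 90% CI.

```python
import math
from statistics import NormalDist

# Survey counts from the exercise.
men_with_degree, n_men = 26, 50
women_with_degree, n_women = 28, 40

# Point estimates of the two proportions and of their difference.
p_m = men_with_degree / n_men
p_f = women_with_degree / n_women
diff = p_f - p_m

# Margin of error recovered from the reported 90% CI (0.01, 0.35).
me_from_ci = (0.35 - 0.01) / 2

# ASSUMPTION: large-sample (Wald) interval, not stated on the slides.
z = NormalDist().inv_cdf(0.95)   # two-sided 90% level uses the 0.95 quantile
se = math.sqrt(p_m * (1 - p_m) / n_men + p_f * (1 - p_f) / n_women)
me_wald = z * se

print(f"p_M = {p_m:.2f}, p_F = {p_f:.2f}, difference = {diff:.2f}")
print(f"ME from reported CI: {me_from_ci:.2f}")
print(f"Wald 90% CI: ({diff - me_wald:.2f}, {diff + me_wald:.2f})")
```

Under this assumed formula the interval rounds to the one given in the exercise, which suggests (but does not prove) that the reported CI was obtained this way.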
If something is not clear
(or you find mistakes in the slides)
do not hesitate to come to office hours
or e-mail me