Activity 2.3: More with Fathom - Chapter 2. Comparing Two Proportions: Randomization Methods

Let’s once again look at the results of the first day of class statistics student survey available in class_survey.ftm. Now we will revisit the results from this survey using tests of significance. As part of the survey, students were asked their gender, whether they had ever purchased textbooks online and whether they had ever been in a car accident while driving. In this activity we will explore whether purchasing textbooks online and being in a car accident while driving are associated with gender. Your instructor will provide you with the Fathom dataset needed to complete the following activity.

1. Let’s first explore whether there is a relationship between gender and buying textbooks online. Note: Activity 2.2B, earlier in this chapter, gives specific Fathom instructions for creating cross-tabulation tables and scrambling; the instructions here assume you have read them.

a) Is this an observational study or an experiment? Why?

b) Assume that the population of interest is all Hope students. Do you have a random sample of the population? Do you think the sample is representative of the population? Why or why not?

c) Fill in the 2x2 cross-tabulation table below.

                Ever bought a textbook online?
Gender          Yes               No                Total
Female          % (   /   )       % (   /   )
Male            % (   /   )       % (   /   )
Total

d) What is the difference (female minus male) in percentages in your sample?

Based on this, do you think that there is a difference, in the population, in the percentage of all Hope males and all Hope females who have ever bought a textbook online?

e) Explain what impact the lack of representativeness of your sample may have on your conclusion in (d). Specifically, how might the percent of females who’ve ever bought a textbook online be different in our sample than in the population? Why? How might the percent of males who’ve ever bought a textbook online be different in our sample than in the population? Why? What might be the impact on the difference in percentages? Why?
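For readers who want to check the table arithmetic outside Fathom, the following is a minimal Python sketch of the cross-tabulation in part (c) and the difference in part (d). The records are made-up placeholders, not the class survey data, and the names cross_tab and percent_yes are illustrative only.

```python
from collections import Counter

# Hypothetical (gender, answer) records; NOT the real class_survey.ftm data.
records = (
    [("Female", "Yes")] * 6 + [("Female", "No")] * 4
    + [("Male", "Yes")] * 3 + [("Male", "No")] * 7
)

def cross_tab(rows):
    """Count how many records fall in each (gender, answer) cell."""
    return Counter(rows)

def percent_yes(table, gender):
    """Percentage of the given gender who answered "Yes"."""
    yes = table[(gender, "Yes")]
    total = yes + table[(gender, "No")]
    return 100 * yes / total

table = cross_tab(records)
female_pct = percent_yes(table, "Female")
male_pct = percent_yes(table, "Male")
difference = female_pct - male_pct  # female minus male, as in part (d)

print(f"Female: {female_pct:.1f}%  Male: {male_pct:.1f}%  Difference: {difference:.1f}")
```

Each cell of the table corresponds to one key of the counter, and each row percentage divides the "Yes" count by that row's total.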

One-sided vs. two-sided alternative hypotheses

At this point in the course, we’ve only considered tests of significance with a one-sided alternative hypothesis. Consider the swimming with dolphins study. The alternative hypothesis was:

Ha: People who swim with dolphins are more likely to improve on their depression symptoms

This is a one-sided alternative hypothesis because we are only looking for evidence that people who swim with dolphins are more likely to improve on their depression symptoms. The two-sided version of this alternative hypothesis would be:

Ha: People who swim with dolphins will improve on their depression symptoms at a different proportion than people who don’t swim with dolphins

In other words, we’re just looking for any effect of swimming with dolphins on depression symptoms: positive or negative.

Key Idea: A two-sided alternative hypothesis looks for differences that are in either direction from the null hypothesis.

How do you carry out a two-sided hypothesis test?

Until now, when we found the p-value for a one-sided test, we looked at the number of outcomes that were as extreme or more extreme than what we observed. For example, in the dolphin study we looked to see how many times we observed a difference in proportions of at least 47% between the two groups when we scrambled. For a two-sided test we also look at the opposite “tail” of the scramble distribution, an equal distance away from the middle. For the dolphin study (and, in general, most studies), the middle of the scramble distribution is at zero, and so we would ask not only how many times we got a difference of at least 47%, but also how many times we got a difference of -47% or less.

Figure 2.2 shows that we are looking at both tails of the distribution when we compute the p-value. In this case, there are 13 times when we get a difference of 47% or greater and 15 times when we get a difference of -47% or less. Thus, the p-value is

(13 + 15)/1000 = 28/1000 = 0.028. Because scramble distributions are typically symmetric, the two-sided test p-value is generally about twice as big as the p-value from the one-sided test.

Figure 2.2. Illustrating a two-sided p-value for the Dolphins study

[Histogram: Measures from Scrambled Dolphins. Horizontal axis: diffprops, from -0.8 to 0.8.]
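The two-tail count described above can be sketched in a few lines of Python. This is only an illustration: the function name two_sided_p_value is hypothetical, and the list of scrambled differences is a stand-in built to reproduce the tail counts reported in the text (13 and 15 out of 1000), not the actual Fathom output. The sketch assumes the scramble distribution is centered at zero.

```python
def two_sided_p_value(diffs, observed):
    """Proportion of scrambled differences at least as far from zero as observed."""
    cutoff = abs(observed)
    upper = sum(1 for d in diffs if d >= cutoff)   # right tail
    lower = sum(1 for d in diffs if d <= -cutoff)  # left tail
    return (upper + lower) / len(diffs)

# Stand-in values: 13 scrambles at +0.47, 15 at -0.47, and the rest at 0.
# Only the tail counts come from the text; the individual values are placeholders.
scrambled = [0.47] * 13 + [-0.47] * 15 + [0.0] * 972

p_value = two_sided_p_value(scrambled, 0.47)
print(p_value)  # prints 0.028
```

Counting only the right tail would give 13/1000 = 0.013, the one-sided p-value, which is roughly half of the two-sided value.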

Why would you use a two-sided test instead of one-sided?

Two-sided tests are used more commonly in practice than one-sided tests. One reason for this is that two-sided tests are more conservative than one-sided tests. In other words, we typically get p-values that are about twice as large with a two-sided test and, thus, we have weaker evidence against the null hypothesis. Since it is harder to reject the null hypothesis, we say two-sided tests are conservative. Another reason for using two-sided tests is so that the researcher is not biased toward a particular result. For example, the researchers in the toy study could have used a two-sided test in case it turned out that, in fact, children preferred the hinderer toy. Because of issues with Type I and Type II errors (see Chapter 3), the decision to use a one- or two-sided alternative should be made prior to analyzing the data and based on a theoretical rationale and prior research.

f) State the null and alternative hypotheses comparing the proportion of males and females who have ever bought a textbook online. Use a two-sided alternative hypothesis. What does it mean to use a two-sided instead of a one-sided alternative?

g) Use the Fathom instructions above to make a dotplot of the difference in percentages from 1000 scrambles. What is the shape of the dotplot of the difference in percentages? What is its center? Why does it make sense for the center to be where it is?

Creating a measure in order to find a p-value using Fathom

In the dolphins activity, we had created a measure already. But, in general, this is something you will need to do yourself.

1. You need to create the appropriate measure for the hypothesis test (measure is the term Fathom uses to describe a summary statistic). In this case, the measure we are interested in is the difference in the percentages of students who have ever bought a textbook online, comparing females to males.

2. To create this measure in Fathom, double click on your collection to open the Inspector.

3. Click on the Measures tab. In the Measure column click on <new> and name your new measure “percentfemale” (for percentage of females who bought textbooks online).

4. Double-click in the “Formula” box.

TECHNICAL NOTES: You probably want to drag the edges of your formula dialog box to make the window bigger; otherwise you will run off the side of the screen. Also, Fathom will automatically add closing parentheses and the second set of quotation marks, so type slowly and carefully!

5. Type the following:

proportion(text="Yes", gender="Female")

Then click OK. You should see the percentage of females who bought textbooks online appear in the “value” box. Confirm that this matches what you got in part c.

6. Repeat steps 3-5 for the males, creating a new measure named “percentmale”.

7. Create a third measure which will be the difference in the two percentages. Name the measure “percentdiff” and for the formula type:

percentfemale-percentmale

Again, confirm that the value you get corresponds to your answer to part d.

8. Now that your measure is created, you can follow the directions given in Activity 2.2B to scramble the dataset and collect measures. Use 1000 scrambles.
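The steps above (define a measure, scramble, collect measures, find the p-value) can also be sketched outside Fathom. The Python below is a rough illustration under stated assumptions, not a replacement for the Fathom procedure: the survey responses are invented placeholders, and the function names proportion_yes, percent_diff, and scramble_test are hypothetical names chosen to echo the Fathom measures.

```python
import random

def proportion_yes(answers, genders, group):
    """Mirror of the Fathom measure: proportion of "Yes" answers within one gender."""
    in_group = [a for a, g in zip(answers, genders) if g == group]
    return sum(a == "Yes" for a in in_group) / len(in_group)

def percent_diff(answers, genders):
    """percentfemale - percentmale, expressed as a proportion."""
    return (proportion_yes(answers, genders, "Female")
            - proportion_yes(answers, genders, "Male"))

def scramble_test(answers, genders, n_scrambles=1000, seed=0):
    """Shuffle the gender labels repeatedly and collect the measure each time."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = percent_diff(answers, genders)
    shuffled = list(genders)
    diffs = []
    for _ in range(n_scrambles):
        rng.shuffle(shuffled)
        diffs.append(percent_diff(answers, shuffled))
    # Two-sided p-value: scrambles at least as far from zero as the observed
    # difference (small tolerance guards against floating-point rounding).
    extreme = sum(1 for d in diffs if abs(d) >= abs(observed) - 1e-12)
    return observed, diffs, extreme / n_scrambles

# Hypothetical responses; NOT the real class_survey.ftm data.
genders = ["Female"] * 10 + ["Male"] * 10
answers = ["Yes"] * 6 + ["No"] * 4 + ["Yes"] * 3 + ["No"] * 7

observed, diffs, p_value = scramble_test(answers, genders)
print(f"observed difference = {observed:.2f}, two-sided p-value = {p_value:.3f}")
```

Shuffling the gender labels while leaving the answers in place is what breaks any association between the two variables, which is why the collected differences should center near zero.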

h) What is the p-value for your test?

i) What is your conclusion about whether males and females have bought textbooks online in different proportions? Make sure you relate your conclusion back to the population of interest (see your answer to part b).

2. Now, let’s explore whether there is a relationship between gender and getting in a car accident while driving.

a) Fill in the 2x2 cross-tabulation table below.

                Ever been in a car accident while driving?
Gender          Yes               No                Total
Female          % (   /   )       % (   /   )
Male            % (   /   )       % (   /   )
Total

b) What is the difference in percentages in the sample? Based on this, do you think that there is a difference, in the population, in the percentage of all Hope males and all Hope females who have ever been in a car accident while driving?

c) State the null and alternative hypotheses. Use a two-sided alternative hypothesis.

d) Create an appropriate measure and scramble the data. Collect measures and then create a dotplot of the difference in percentages. What is its center? Why does it make sense for the center to be where it is? Note: You’ll need to create measures and scramble to answer this question.

e) What is the p-value for your test?

f) What is your conclusion about whether different percentages of males and females have gotten in a car accident while driving? Make sure you relate your conclusion back to the population of interest.

g) Explain what impact the lack of representativeness of your sample may have on your conclusion in (f). Specifically, how might the percent of females who’ve ever gotten in a car accident be different in our sample than in the population? Why? How might the percent of males who’ve ever gotten in a car accident be different in our sample than in the population? Why? What might be the impact on the difference in percentages? Why?
