
Collection and Analysis of Sample Data

In document Reading Statistics Huck (Page 167-171)

A Detailed Look at Each of the Six Steps

Step 4: Collection and Analysis of Sample Data

So far, we have covered Steps 1, 2, and 6 of the hypothesis testing procedure. In the first two steps, the researcher states the null and alternative hypotheses. In Step 6, the researcher either (1) rejects H0 in favor of Ha or (2) fails to reject H0. We now turn our attention to the principal “stepping stone” used to move from the beginning points of the hypothesis testing procedure to the final decision.

Inasmuch as the hypothesis testing procedure is, by its very nature, an empirical strategy, it should come as no surprise that the researcher’s ultimate decision to reject or to retain H0 is based on the collection and analysis of sample data. No crystal ball is used, no Ouija board is relied on, and no eloquent argumentation is permitted. Once H0 and Ha are fixed, only scientific evidence is allowed to affect the disposition of H0.

The fundamental logic of the hypothesis testing procedure can now be laid bare because the connections between H0, the data, and the final decision are as straightforward as what exists between the speed of a car, a traffic light at a busy intersection, and a lawful driver’s decision as the car approaches the intersection.

Just as the driver’s decision to stop or to pass through the intersection is made after observing the color of the traffic light, the researcher’s decision to reject or to retain H0 is made after observing the sample data. To carry this analogy one step further, the researcher looks at the data and asks, “Is the empirical evidence inconsistent with what one would expect if H0 were true?” If the answer to this question is yes, then the researcher has a green light and rejects H0. However, if the data turn out to be consistent with H0, then the data set serves as a red light telling the researcher not to discard H0.

EXCERPTS 7.12–7.13

• Two-Tailed and One-Tailed Tests

All tests of significance were two-tailed.

Source: Miller, K. (2010). Using a computer-based risk assessment tool to identify risk for chemotherapy-induced febrile neutropenia. Clinical Journal of Oncology Nursing, 14(1), 87–91.

To investigate what variables might be important predictors of company support for fathers taking leave, [Pearson] correlations were calculated. . . . One-tailed tests of significance were used.

Source: Haas, L., & Hwang, P. C. (2010). Is fatherhood becoming more visible at work? Trends in corporate support for fathers taking parental leave in Sweden. Fathering: A Journal of Theory, Research, & Practice about Men as Fathers, 7(3), 303–321.

Because the logic of hypothesis testing is so important, let us briefly consider a hypothetical example. Suppose a valid intelligence test is given to a random sample of 100 males and a random sample of 100 females attending the same university. If the null hypothesis was first set up to say H0: μmale = μfemale and if the data reveal that the two sample means (of IQ scores) differ by only two-tenths of a point, the sample data would be consistent with what we expect to happen when two samples are selected from populations having identical means. Clearly, the notion of sampling error could fully explain why the two Ms might differ by two-tenths of an IQ point even if μmale = μfemale. In this situation, there is no empirical justification for making the data-based claim that males at our hypothetical university have a different IQ, on average, than do their female classmates.

Now, let’s consider what would happen if the difference between the two sample means turns out to be equal to 20 IQ points. If the empirical evidence turns out this way, we have a situation where the data are inconsistent with what one would expect if H0 were true. Although the concept of sampling error strongly suggests that neither sample mean will turn out exactly equal to its population parameter, the difference of 20 IQ points between Mmales and Mfemales is quite improbable if, in fact, μmales and μfemales are equal. With results such as this, the researcher would reject the arbitrarily selected null hypothesis.
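The contrast between a 0.2-point and a 20-point difference can be sketched with a small simulation. This is only an illustration under stated assumptions: IQ scores are taken to be normally distributed with a mean of 100 and a standard deviation of 15 in both populations (values not given in the text), and the function names are invented for this sketch.

```python
import random
import statistics

def simulate_mean_differences(n_per_group=100, reps=2000,
                              mu=100.0, sigma=15.0, seed=1):
    """Draw two samples from the SAME population (H0 true) many times
    and record the difference between the two sample means."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        males = [rng.gauss(mu, sigma) for _ in range(n_per_group)]
        females = [rng.gauss(mu, sigma) for _ in range(n_per_group)]
        diffs.append(statistics.fmean(males) - statistics.fmean(females))
    return diffs

def share_at_least(diffs, observed):
    """Fraction of simulated differences at least as far from 0 as `observed`."""
    return sum(abs(d) >= abs(observed) for d in diffs) / len(diffs)

diffs = simulate_mean_differences()
print("share with |difference| >= 0.2:", share_at_least(diffs, 0.2))
print("share with |difference| >= 20: ", share_at_least(diffs, 20.0))
```

Under these assumptions the standard error of the difference is about 2.1 IQ points, so sampling error alone routinely produces a gap of 0.2, whereas a gap of 20 lies more than nine standard errors from zero and should essentially never appear when the population means are equal.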

To drive home the point I am trying to make about the way the sample data influence the researcher’s decision concerning H0, let’s shift our attention to a real study that had Pearson’s correlation as its statistical focus. In Excerpt 7.14, the hypothesis testing procedure was used to evaluate three bivariate correlations based on data that came from 90 men who had surgery after going to an infertility clinic.

Each man was measured in terms of the number of left and right spermatic arteries as well as the number of left and right lymphatic channels. Then, the left-right data were correlated for each of the two kinds of arteries and for the channels.


EXCERPT 7.14

• Rejecting H0 When the Sample Data Are Inconsistent with H0

An analysis of the relationship between the right and left spermatic cord anatomy in the bilateral varicocelectomy cases (n = 90) revealed a significant correlation between the number of right and left internal spermatic arteries (r = 0.42, P < .05). However, we did not identify a significant correlation between the number of right and left external spermatic arteries (r = 0.13, P > .05) or the number of right and left lymphatic channels (r = 0.19, P > .05).

Source: Libman, J. L., Segal, R., Baazeem, A., Boman, J., & Zini, A. (2010). Microanatomy of the left and right spermatic cords at subinguinal microsurgical varicocelectomy: Comparative study of primary and redo repairs. Urology, 75(6), 1324–1327.

In the study associated with Excerpt 7.14, the hypothesis testing procedure was used separately to evaluate each of the three sample rs. In each case, the null hypothesis stated H0: ρ = 0.00. The sample data, once analyzed, yielded correlations of .42, .13, and .19. The first of these rs ended up being quite different from the null hypothesis number of 0.00. Statistically speaking, the r of .42 was so inconsistent with H0 that sampling error alone was considered to be an inadequate explanation for why the observed correlation was so far away from the pinpoint number in the null hypothesis. Although we expect some discrepancy between 0.00 and the data-based value of r even if H0 were true, we do not expect a difference this big. Accordingly, the null hypothesis concerning the internal spermatic arteries (that there was no relationship between the number of left and right arteries) was rejected, as indicated by the phrase significant correlation and the notation P < .05.

The second and third correlations in Excerpt 7.14 turned out to be much closer to the pinpoint number in H0, 0.00. The small differences between the null number and the rs of .13 and .19 could each be explained by sampling error. In other words, if the correlation in the population were truly equal to 0.00, it would not be surprising to have a sample r (with n = 90) be anywhere between −.20 and +.20. Accordingly, the null hypotheses concerning external spermatic arteries and the lymphatic channels were not rejected, as indicated by the notation P > .05.
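The three decisions can be checked roughly with a short calculation. The sketch below uses Fisher's z transformation with a normal approximation, which is an assumption of this illustration and not necessarily the method the authors of the study used; the function name is hypothetical. Only n = 90 and the three r values come from the excerpt.

```python
import math

def two_tailed_p(r, n):
    """Approximate two-tailed p for H0: rho = 0 using Fisher's z
    transformation (a normal approximation, assumed for illustration)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc(|z| / sqrt(2)) equals twice the standard normal area beyond |z|.
    return math.erfc(abs(z) / math.sqrt(2))

n = 90  # sample size reported in Excerpt 7.14
for r in (0.42, 0.13, 0.19):
    p = two_tailed_p(r, n)
    decision = "reject H0" if p < 0.05 else "fail to reject H0"
    print(f"r = {r:.2f}: approximate p = {p:.4f} -> {decision}")
```

With this approximation, the r of .42 yields a p far below .05, whereas the rs of .13 and .19 yield p-values of roughly .22 and .07, both above .05, which agrees with the pattern of decisions reported in the excerpt.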

In Step 4 of the hypothesis testing procedure, the summary of the sample data always leads to a single numerical value. Being based on the data, this number is technically referred to as the calculated value (or the test statistic). Occasionally, the researcher’s task in obtaining the calculated value involves nothing more than computing a value that corresponds to the study’s statistical focus. This was the case in Excerpt 7.14, where the statistical focus was Pearson’s correlation coefficient and where the researcher needed to do nothing more than compute a value for r.

In most applications of the hypothesis testing procedure, the sample data are summarized in such a way that the statistical focus becomes hidden from view. For example, consider Excerpts 7.15 and 7.16. In the first of these excerpts, the calculated value was labeled t and it turned out equal to 13.78. In Excerpt 7.16, the calculated value was F, and it was equal to 0.25. In each of these excerpts, the statistical focus was the mean.

EXCERPTS 7.15–7.16

• The Calculated Value

[A] t-test found that overall satisfaction levels of male students (M = 4.14, SD = 1.10) were significantly higher than those of female students (M = 3.87, SD = 1.00), t(225) = 13.78, p < 0.05 (two-tailed).

Source: Kim, H., Lee, S., Goh, B., & Yuan, J. (2010). Assessing College Students’ Satisfaction with University Foodservice. Proceedings of the 15th Annual Graduate Student Research Conference in Hospitality and Tourism, Washington, DC, 34–46.

There was no difference in girls’ (M = 5 years, 8 months; SD = 1.52) and boys’ (M = 5 years, 10 months; SD = 1.68) ages, F(1, 114) = 0.25, p > .05.

Source: Tenenbaum, T. R., Hill, D. B., Joseph, N., & Roche, E. (2010). “It’s a boy because he’s painting a picture”: Age differences in children’s conventional and unconventional gender schemas. British Journal of Psychology, 101(1), 137–154.

In each of these excerpts, two sample means were compared. In Excerpt 7.15, the mean of 4.14 was compared against the mean of 3.87. In Excerpt 7.16, the means were 5 years, 8 months and 5 years, 10 months. Within each of these studies, the researchers put their sample data into a formula that produced the calculated value. The important thing to notice in these excerpts is that in neither case does the calculated value equal the difference between the two means being compared. In Chapter 10, we consider t-tests and F-tests in more detail, so you should not worry now if you do not currently comprehend everything that is presented in these excerpts. They are shown solely to illustrate the typical situation in which the statistical focus of a study is not reflected directly in the calculated value.
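To see why the calculated value need not equal the difference between the means, consider a minimal sketch of an independent-samples t statistic with a pooled variance estimate. The ratings below are invented purely for illustration and have no connection to the data behind Excerpts 7.15 and 7.16; the function name is likewise hypothetical.

```python
import math
import statistics

def pooled_t(sample_a, sample_b):
    """Return (difference between means, independent-samples t statistic),
    using a pooled variance estimate."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_diff = statistics.fmean(sample_a) - statistics.fmean(sample_b)
    pooled_var = ((n_a - 1) * statistics.variance(sample_a)
                  + (n_b - 1) * statistics.variance(sample_b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff, mean_diff / standard_error

# Hypothetical 1-to-5 satisfaction ratings, invented for this sketch only.
group_a = [4, 5, 3, 4, 5, 4, 3, 5]
group_b = [3, 4, 2, 3, 4, 3, 2, 4]

mean_diff, t_value = pooled_t(group_a, group_b)
print(f"difference between means = {mean_diff:.2f}")
print(f"calculated value t       = {t_value:.2f}")
```

For these made-up ratings the two means differ by 1.00, yet t comes out at about 2.40, because the formula divides the mean difference by its standard error. The statistical focus (the mean) is therefore present in the calculated value only indirectly.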

Before computers were invented, researchers always had a single goal in mind when they turned to Step 4 of the hypothesis testing procedure: the computation of the data-based calculated value. Now that computers are widely available, researchers still are interested in the magnitude of the calculated value derived from the data analysis. Contemporary researchers, however, are also interested in a second piece of information generated by the computer: the data-based p-value.

Whenever researchers use a computer to perform the data analysis, they either (1) tell the computer what the null hypothesis is going to be or (2) accept the computer’s built-in default version of H0. The researcher also specifies whether Ha is directional or nondirectional in nature. Once the computer knows what the researcher’s H0 and Ha are, it can easily analyze the sample data and compute the probability of having a data set that deviates as much or more from H0 as does the data set being analyzed. The computer informs the researcher as to this probability by means of a statement that takes the form p = ____, with the blank being filled by a single decimal value somewhere between 0 and 1.

Excerpt 7.17 illustrates nicely how a p-value is like a calculated value in that either one can be used as a single-number summary of the sample data. As you can see, three Pearson correlation coefficients are in this excerpt. The researchers associated with this passage used a p-value to determine how likely it would be, assuming the null hypothesis to be true, to end up with a sample correlation as large or larger than each of their computed rs. Each p functioned as a measure of how inconsistent the sample data were compared with what would be expected to happen if H0 were true.

EXCERPT 7.17

• Using p as the Calculated Value

Correlation analyses and inspection of scatterplots between the PA composite and speech production variables showed that there was no significant relationship between PA and distortions (r = .129, p = .429), nor between PA and typical sound changes (r = −.171, p = .273). However, a significant relationship was found between PA and atypical sound changes (r = −.362, p = .009).

Source: Preston, J., & Louise Edwards, M. (2010). Phonological awareness and types of sound errors in preschoolers with speech sound disorders. Journal of Speech, Language & Hearing Research, 53(1), 44–60.

Be sure to note in Excerpt 7.17 that there is an inverse relationship between the size of p and the degree to which the sample data deviate from the null hypothesis. The r that is furthest away from 0.00 (the pinpoint number in the null hypothesis) has the smallest p. In contrast, the smallest of the three rs has the largest p.
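The inverse relationship, and the effect of declaring Ha directional or nondirectional, can be sketched as follows. The sample size of 50 is hypothetical (the excerpt does not report n for each correlation), the Fisher z normal approximation is an assumption of this sketch, and the function and its `tail` argument are illustrative names, so the printed p-values are not expected to reproduce those in the excerpt.

```python
import math

def p_from_r(r, n, tail="two-sided"):
    """Approximate p for H0: rho = 0 via Fisher's z transformation.
    `tail` encodes the alternative hypothesis Ha."""
    z = math.atanh(r) * math.sqrt(n - 3)
    upper = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z) for a standard normal Z
    if tail == "greater":      # directional Ha: rho > 0
        return upper
    if tail == "less":         # directional Ha: rho < 0
        return 1.0 - upper
    if tail == "two-sided":    # nondirectional Ha: rho != 0
        return math.erfc(abs(z) / math.sqrt(2))
    raise ValueError("tail must be 'two-sided', 'greater', or 'less'")

n = 50  # hypothetical sample size, NOT taken from the study in Excerpt 7.17
for r in (0.129, -0.171, -0.362):
    print(f"r = {r:+.3f}  two-sided p = {p_from_r(r, n):.3f}  "
          f"one-sided (Ha: rho < 0) p = {p_from_r(r, n, 'less'):.3f}")
```

Whatever n is assumed, the two-sided p shrinks as r moves farther from 0.00. When the sample r falls in the direction predicted by a directional Ha, the one-sided p is half the two-sided p; when it falls in the opposite direction, the one-sided p exceeds .50.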
