2.4 Testing Hypotheses about Outcomes of Experiments
2.4.6 Testing Hypotheses about Quantities Measured on Ordinal Scales
So far, we have tested hypotheses in which a variable takes on real values. We now consider a variable that takes on nominal (categorical) values, such as “UDP” or “TCP,” or ordinal values, such as “bad,” “satisfactory,” and “good.” (These terms are defined in Section 2.1.2.) In such cases, hypothesis testing using the techniques described earlier is meaningless because a sample cannot be described by a mean; nor can we define real confidence intervals about the mean. Instead, for such variables, hypotheses are of the form
H0: the observed values are drawn from the expected distribution
Then, we use a statistical test, such as the Pearson chi-squared test, to reject or not reject the hypothesis, as described next.
EXAMPLE 2.11: HYPOTHESIS FORMULATION WITH NOMINAL SCALES
Suppose that you want to check whether the distribution of packet types on a link from your campus to the Internet is similar to that reported in the literature. For instance, suppose that 42% of the bytes originating at the University of Waterloo (UW) during the measurement period can be attributed to P2P applications. Suppose that you measure 100 GB of traffic and find that 38 GB can be attributed to P2P applications. Then, a reasonable null hypothesis would be
H0: the observed traffic on the campus Internet access link is similar to that at UW
How should we test hypotheses of this form? A clue comes from the following thought experiment. Suppose that we have a possibly biased coin and that we want to determine whether it is biased. The null hypothesis is
H0: P(heads) = P(tails) = 0.5
We assume that we can toss the coin as many times as we want and that the outcome of each toss is independent. Let T denote the outcome “Tails” and H denote the outcome “Heads.” We will represent a set of outcomes, such as “Nine heads and one tails,” by the notation $TH^9$. As we saw earlier, if a coin is unbiased, this outcome has the probability $\binom{10}{1}(0.5)^{1}(0.5)^{9}$. Any outcome from $n$ coin tosses, such as $a$ heads and $n - a$ tails, represented by $H^{a}T^{n-a}$, can be viewed as one sample drawn at random from the set of all possible outcomes when tossing a coin $n$ times. A little thought indicates that the probability of this outcome, given that the probability of heads is $p$ and of tails is $q = 1 - p$, is given by the binomial distribution $\binom{n}{a}p^{a}q^{n-a}$, which is also the term containing $p^{a}$ in the expansion of the expression $(p + q)^n$. As $n \to \infty$, the binomial distribution tends to the normal distribution, so that the probability of each outcome is approximated by the normal distribution.
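The binomial probability described above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `binomial_prob` is an illustrative choice and does not come from the text.

```python
from math import comb


def binomial_prob(a: int, n: int, p: float) -> float:
    """Probability of exactly `a` heads in `n` independent tosses,
    where each toss comes up heads with probability `p`."""
    q = 1.0 - p
    return comb(n, a) * p**a * q**(n - a)


# Nine heads and one tail in ten tosses of an unbiased coin.
p_nine_heads = binomial_prob(9, 10, 0.5)
print(p_nine_heads)

# The probabilities of all possible head counts sum to 1,
# as expected from the expansion of (p + q)^n with p + q = 1.
total = sum(binomial_prob(a, 10, 0.5) for a in range(11))
print(total)
```

Under these assumptions, an outcome as lopsided as nine heads in ten tosses has a probability of about 1%, which is why it would cast doubt on the hypothesis that the coin is unbiased.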
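Now, consider an experiment in which each individual outcome is independent of the others, and an outcome results in one of $k$ ordinal values, $o_1, o_2, \ldots, o_k$. Let the expected probability of the $i$th outcome be $p_i$, so that the expected count for the $i$th outcome is $e_i = np_i$. Suppose that we run the experiment $n$ times and that the $i$th outcome occurs $n_i$ times, with $\sum_{i=1}^{k} n_i = n$. We can represent any particular outcome by $o_1^{n_1}o_2^{n_2}\cdots o_k^{n_k}$, and this outcome can be viewed as one sample drawn at random from the set of all possible outcomes. The probability of such an outcome is given by the multinomial distribution as
$$P(o_1^{n_1}o_2^{n_2}\cdots o_k^{n_k}) = \binom{n}{n_1}\binom{n-n_1}{n_2}\cdots\binom{n-\sum_{i=1}^{k-1}n_i}{n_k}\,p_1^{n_1}p_2^{n_2}\cdots p_k^{n_k}$$ (EQ 2.24)
$$= \frac{n!}{n_1!\,n_2!\cdots n_k!}\,p_1^{n_1}p_2^{n_2}\cdots p_k^{n_k}$$ (EQ 2.25)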
This outcome is one of the terms from the expansion of $(p_1 + p_2 + \cdots + p_k)^n$. As with the binomial distribution, we can use the multinomial distribution to test whether any particular outcome, conditional on a null hypothesis on the $p_i$ being true, is “too unlikely,” indicating that the null hypothesis should be rejected.
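As an illustration, the multinomial probability of a particular set of counts can be computed directly from the factorial form. This is a minimal sketch using only the Python standard library; the function name `multinomial_prob` and the three-category counts and probabilities are hypothetical and chosen only for the example.

```python
from math import factorial, prod


def multinomial_prob(counts, probs):
    """Probability of observing counts n_1..n_k in n = sum(counts)
    independent trials, where outcome i has probability probs[i]."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)
    # Multinomial coefficient n! / (n_1! n_2! ... n_k!).
    # Each intermediate quotient is an integer, so // is exact.
    coeff = factorial(n)
    for c in counts:
        coeff //= factorial(c)
    return coeff * prod(p**c for p, c in zip(probs, counts))


# Hypothetical example: 4 trials, 3 possible outcomes.
p_example = multinomial_prob([2, 1, 1], [0.5, 0.3, 0.2])
print(p_example)

# With k = 2 the multinomial reduces to the binomial distribution.
p_two_outcomes = multinomial_prob([9, 1], [0.5, 0.5])
print(p_two_outcomes)
```

The factorials grow quickly with $n$, which hints at why evaluating this distribution directly becomes cumbersome for large samples.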
In many cases, using the multinomial distribution for testing the validity of a hypothesis can be cumbersome. Instead, we use a standard mathematical result that the variable $X_i = \frac{n_i - e_i}{\sqrt{e_i}}$, for values of $e_i > 5$, closely approximates a standard normal variable with zero mean and unit variance. But we immediately run into a snag: The $n_i$ are not independent. For example, if $n_3 = n$, all the other $n_i$ must be zero. Therefore, the $X_i$ are also not independent. However, it can be proved that this set of $k$ dependent variables $X_i$ can be mapped to a set of $k - 1$ independent standard normal variables while keeping the sum of squares of the variables constant. By definition, the sum of squares of $k - 1$ independent standard normal variables follows the $\chi^2$ (also written chi-squared and pronounced kai-squared) distribution with $k - 1$ degrees of freedom. Therefore, if the null hypothesis is true (that is, the observed quantities are drawn from the distribution specified implicitly by the expected values), the variable
$$X = \sum_{i=1}^{k} X_i^2 = \sum_{i=1}^{k} \frac{(n_i - e_i)^2}{e_i}$$ (EQ 2.26)
is a $\chi^2$ variable with $k - 1$ degrees of freedom. Standard statistical tables tabulate $P(X > a)$, where $X$ is a $\chi^2$ variable with a given number of degrees of freedom. We can use this table to compute the degree to which a set of observations corresponds to a set of expected values for these observations. This test is the Pearson $\chi^2$ test.
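The test statistic is simple to compute once the observed and expected counts are known. The sketch below is one possible way to do so in plain Python; the function names `pearson_statistic` and `approximation_ok` and the three-category counts are hypothetical, introduced only for illustration.

```python
def pearson_statistic(observed, expected):
    """Sum over categories of (n_i - e_i)^2 / e_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e_i <= 0 for e_i in expected):
        raise ValueError("expected counts must be positive")
    return sum((n_i - e_i) ** 2 / e_i for n_i, e_i in zip(observed, expected))


def approximation_ok(expected):
    """Check the rule of thumb that every expected count exceeds 5."""
    return all(e_i > 5 for e_i in expected)


# Hypothetical counts for k = 3 categories (not taken from the text).
observed = [18, 55, 27]
expected = [20, 50, 30]

stat = pearson_statistic(observed, expected)
dof = len(expected) - 1  # k - 1 degrees of freedom
print(stat, dof, approximation_ok(expected))
```

The resulting value would then be compared against a tabulated critical value for `dof` degrees of freedom.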
EXAMPLE 2.12: CHI-SQUARED TEST
We use the Pearson $\chi^2$ test to test whether the observation in Example 2.11 results in rejection of the null hypothesis. Denote P2P traffic by ordinal 1 and non-P2P traffic by ordinal 2. Then, $e_1 = 42$, $e_2 = 58$, $n_1 = 38$, $n_2 = 62$. Therefore, $X = (38 - 42)^2/42 + (62 - 58)^2/58 = 0.66$. From the $\chi^2$ table with 1 degree of freedom, we see that $P(X > 3.84) = 0.05$, so that any value greater than 3.84 occurs with probability less than 5% and is unlikely. Since $0.66 < 3.84$, the observation is not unlikely, which means that we cannot reject the null hypothesis.
In contrast, suppose that the observation was $n_1 = 72$, $n_2 = 28$. Then, $X = (72 - 42)^2/42 + (28 - 58)^2/58 = 36.9$. Since $36.9 > 3.84$, such an observation would suggest that we should reject the null hypothesis at the 5% level.
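The arithmetic in this example can be reproduced with a short script. The sketch below uses only the Python standard library and, instead of a table lookup, relies on the fact that a $\chi^2$ variable with 1 degree of freedom is the square of a standard normal variable, so that $P(X > x) = \operatorname{erfc}(\sqrt{x/2})$. The function names are illustrative, and the closed form applies only to the 1-degree-of-freedom case.

```python
from math import erfc, sqrt


def pearson_statistic(observed, expected):
    """Sum over categories of (n_i - e_i)^2 / e_i."""
    return sum((n_i - e_i) ** 2 / e_i for n_i, e_i in zip(observed, expected))


def chi2_tail_1dof(x: float) -> float:
    """P(X > x) for a chi-squared variable with 1 degree of freedom.
    Valid only for 1 degree of freedom."""
    return erfc(sqrt(x / 2.0))


CRITICAL_5_PERCENT = 3.84  # tabulated value for 1 degree of freedom
expected = [42, 58]        # P2P and non-P2P expected counts

results = {}
for observed in ([38, 62], [72, 28]):
    x = pearson_statistic(observed, expected)
    p_value = chi2_tail_1dof(x)
    reject = x > CRITICAL_5_PERCENT
    results[tuple(observed)] = (x, p_value, reject)
    print(observed, round(x, 2), p_value, "reject" if reject else "cannot reject")
```

Comparing the statistic with the critical value and comparing the tail probability with 0.05 are equivalent decisions; the tail probability additionally shows how extreme the second observation is.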