5. TESTS OF HYPOTHESES
5.5 Chapter 5 Problems
1. The accident rate over a certain stretch of highway was about $\theta = 10$ per year for a period of several years. In the most recent year, however, the number of accidents was 25. We want to know whether this many accidents is very probable if $\theta = 10$; if not, we might conclude that the accident rate has increased for some reason. Investigate this question by assuming that the number of accidents in the current year follows a Poisson distribution with mean $\theta$ and then testing $H_0: \theta = 10$. Use the test statistic
$D = \max(0, Y - 10)$, where $Y$ represents the number of accidents in the most recent year.
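As a rough illustration of how the p-value in Problem 1 might be set up, the sketch below evaluates $P(D \ge d; H_0)$ for the observed $d = \max(0, 25 - 10) = 15$. It relies on the fact that, for $d > 0$, the event $D \ge d$ is the same as $Y \ge 10 + d$. The helper names `poisson_pmf` and `poisson_upper_tail` are illustrative choices, not part of the text, and Python is used only for convenience.

```python
import math

def poisson_pmf(y, theta):
    """P(Y = y) for Y ~ Poisson(theta), computed on the log scale."""
    return math.exp(-theta + y * math.log(theta) - math.lgamma(y + 1))

def poisson_upper_tail(y_min, theta):
    """P(Y >= y_min) for Y ~ Poisson(theta)."""
    return 1.0 - sum(poisson_pmf(y, theta) for y in range(y_min))

theta0 = 10                      # hypothesized mean under H0
y_obs = 25                       # observed number of accidents
d_obs = max(0, y_obs - theta0)   # observed value of D

# For d_obs > 0, the event D >= d_obs is the same as Y >= theta0 + d_obs.
p_value = poisson_upper_tail(theta0 + d_obs, theta0)
print(f"d = {d_obs}, p-value = {p_value:.2e}")
```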
2. A woman who claims to have special guessing abilities is given a test, as follows: a deck which contains five cards with the numbers 1 to 5 is shuffled and a card drawn out of sight of the woman. The woman then guesses the card, the deck is reshuffled with the card replaced, and the procedure is repeated several times.
(a) Let $\theta$ be the probability the woman guesses the card correctly and let $Y$ be the number of correct guesses in $n$ repetitions of the procedure. Discuss why $Y \sim \text{Binomial}(n, \theta)$ would be an appropriate model. If you wanted to test the hypothesis that the woman is guessing at random, what is the appropriate null hypothesis $H_0$ in terms of the parameter $\theta$?
(b) Suppose the woman guessed correctly 8 times in 20 repetitions. Calculate the p-value for your hypothesis $H_0$ in (a) and give a conclusion about whether you think the woman has any special guessing ability.
(c) In a longer sequence of 100 repetitions over two days, the woman guessed correctly 32 times. Calculate the p-value for these data. What would you conclude now?
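One possible way to organize the calculations in (b) and (c) is sketched below. It assumes the null value $\theta_0 = 1/5$ (random guessing among five cards) and a one-sided discrepancy measure, so that the p-value is the upper-tail probability $P(Y \ge y; \theta_0)$. That choice is an assumption made here for illustration; a two-sided measure such as $|Y - n\theta_0|$ would give a different p-value. The function name `binomial_upper_tail` is hypothetical.

```python
import math

def binomial_upper_tail(y_min, n, theta):
    """P(Y >= y_min) for Y ~ Binomial(n, theta)."""
    return sum(math.comb(n, y) * theta**y * (1 - theta)**(n - y)
               for y in range(y_min, n + 1))

theta0 = 1 / 5   # assumed null value: guessing at random among 5 cards

# Part (b): 8 correct guesses in 20 repetitions.
p_b = binomial_upper_tail(8, 20, theta0)

# Part (c): 32 correct guesses in 100 repetitions.
p_c = binomial_upper_tail(32, 100, theta0)

print(f"(b) p-value = {p_b:.4f}")
print(f"(c) p-value = {p_c:.4f}")
```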
3. The R function runif() generates pseudo-random $U(0, 1)$ random variables. The command y <- runif(n) will produce a vector of $n$ values $y_1, \ldots, y_n$.
(a) Give a test statistic which could be used to test that the $y_i$'s, $i = 1, \ldots, n$, are consistent with a random sample from Uniform(0, 1).
(b) Generate 1000 $y_i$'s and carry out the test in (a).
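The following sketch shows one of many test statistics that could be used in Problem 3. It is based only on the sample mean: if the $y_i$'s come from Uniform(0, 1), then $\bar{Y}$ has mean $1/2$ and variance $1/(12n)$, so $D = |\bar{Y} - 1/2| / \sqrt{1/(12n)}$ is approximately the absolute value of a $N(0, 1)$ variate. Python's `random.random()` stands in for R's runif(), and the seed is arbitrary. This statistic checks only the mean, so it would not detect every departure from uniformity; a goodness-of-fit test on binned counts is another option.

```python
import math
import random

random.seed(231)   # arbitrary seed, used only to make the run reproducible
n = 1000
y = [random.random() for _ in range(n)]   # stand-in for y <- runif(n) in R

# Under H0 the sample mean has mean 1/2 and variance 1/(12 n).
y_bar = sum(y) / n
d_obs = abs(y_bar - 0.5) / math.sqrt(1 / (12 * n))

# Approximate p-value P(|Z| >= d_obs) with Z ~ N(0, 1).
p_value = math.erfc(d_obs / math.sqrt(2))
print(f"mean = {y_bar:.4f}, d = {d_obs:.3f}, p-value = {p_value:.3f}")
```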
4. A company that produces power systems for personal computers has to demonstrate a high degree of reliability for its systems. Because the systems are very reliable under normal use conditions, it is customary to 'stress' the systems by running them at a considerably higher temperature than they would normally encounter, and to measure the time until the system fails. According to a contract with one personal computer manufacturer, the average time to failure for systems run at 70°C should be no less than 1,000 hours.
From one production lot, 20 power systems were put on test and observed until failure at 70°C. The 20 failure times $y_1, \ldots, y_{20}$ were (in hours):
 374.2   544.0  1113.9   509.4  1244.3   551.9   853.2  3391.2   297.0    63.1
 250.2   678.1   379.6  1818.9  1191.1   162.8  1060.1  1501.4   332.2  2382.0
(Note: $\sum_{i=1}^{20} y_i = 18{,}698.6$.) Failure times $Y_i$ are known to be approximately Exponential with mean $\theta$.
(a) Use a likelihood ratio test to test the hypothesis that $\theta = 1000$ hours. Is there any evidence that the company's power systems do not meet the contracted standard?
(b) If you were a personal computer manufacturer using these power systems, would you like the company to perform any other statistical analyses besides testing $H_0: \theta = 1000$? Why?
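A minimal sketch of the likelihood ratio calculation in (a) is given below. It assumes the Exponential log likelihood $l(\theta) = -n \log \theta - \sum y_i / \theta$, whose maximum is at $\hat{\theta} = \bar{y}$, and uses the usual approximation that the likelihood ratio statistic has a chi-squared distribution with 1 degree of freedom under $H_0$. For 1 degree of freedom, $P(W \ge \lambda) = P(|Z| \ge \sqrt{\lambda})$, which is evaluated with `math.erfc`. Variable names are illustrative.

```python
import math

n = 20
total = 18698.6          # sum of the 20 failure times (from the problem)
theta0 = 1000.0          # hypothesized mean under H0
theta_hat = total / n    # maximum likelihood estimate of theta

def log_lik(theta):
    """Exponential log likelihood, up to an additive constant."""
    return -n * math.log(theta) - total / theta

# Observed value of the likelihood ratio statistic.
lam = 2 * (log_lik(theta_hat) - log_lik(theta0))

# Approximate p-value from the chi-squared(1) distribution.
p_value = math.erfc(math.sqrt(lam / 2))
print(f"theta_hat = {theta_hat:.2f}, lambda = {lam:.4f}, p-value = {p_value:.3f}")
```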
5. The following data are instrumental measurements of level of dioxin (in parts per billion) in 20 samples of a "standard" water solution known to contain 45 ppb dioxin.
44.1  46.0  46.6  41.3  44.8  47.8  44.5  45.1  42.9  44.5
42.5  41.5  39.6  42.0  45.8  48.9  46.6  42.9  47.0  43.7
(a) Assuming that the measurements are independent and $G(\mu, \sigma)$, obtain a 95% confidence interval for $\mu$ and test the hypothesis that $\mu = 45$.
(b) Obtain a 95% confidence interval for $\sigma$. Of what interest is this scientifically?
6. Radon is a colourless, odourless gas that is naturally released by rocks and soils and may concentrate in highly insulated houses. Because radon is slightly radioactive, there is some concern that it may be a health hazard. Radon detectors are sold to homeowners worried about this risk, but the detectors may be inaccurate. University researchers placed 12 detectors in a chamber where they were exposed to 105 picocuries per liter of radon over 3 days. The readings given by the detectors were:
91.9  97.8  111.4  122.3  105.4  95.0  103.8  99.6  96.6  119.3  104.8  101.7
Let $y_i$ = reading for the $i$th detector, $i = 1, \ldots, 12$. For these data $\sum_{i=1}^{12} y_i = 1249.6$ and $\sum_{i=1}^{12} (y_i - \bar{y})^2 = 971.43$.
To analyze these data assume the model
$$Y_i \sim N(\mu, \sigma^2) = G(\mu, \sigma), \quad i = 1, \ldots, 12 \text{ independently.}$$
7. Data on the number of accidents at a busy intersection in Waterloo over the last 5 years indicated that the average number of accidents at the intersection was 3 accidents per week. After the installation of new traffic signals the number of accidents per week for a 25 week period were recorded as follows:
4 5 0 4 2 0 1 4 1 3 1 1 2 2 2 1 1 3 2 3 2 0 2 2 3
Let $y_i$ = the number of accidents in week $i$, $i = 1, 2, \ldots, 25$. To analyse these data we assume $Y_i$ has a Poisson distribution with mean $\theta$, $i = 1, 2, \ldots, 25$ independently.
(a) To decide whether the mean number of accidents at this intersection has changed after the installation of the new traffic signals we wish to test the hypothesis $H_0: \theta = 3$. Why is the discrepancy measure $D = \left| \sum_{i=1}^{25} Y_i - 75 \right|$ reasonable? Calculate the exact p-value for testing $H_0: \theta = 3$. What would you conclude?
(b) Justify the following statement:
$$P\left( \frac{|\bar{Y} - \theta|}{\sqrt{\theta / n}} \ge c \right) \approx P(|Z| \ge c) \quad \text{where } Z \sim N(0, 1).$$
(c) Why is the discrepancy measure $D = |\bar{Y} - 3|$ reasonable for testing $H_0: \theta = 3$? Calculate the approximate p-value using (b). Compare this to the value in (a).
(d) Suppose that $Y_1, \ldots, Y_n$ is a random sample from a Poisson($\theta$) distribution. Show that the likelihood ratio test statistic for testing $H_0: \theta = \theta_0$ is
$$\Lambda(\theta_0) = 2n \left[ \bar{Y} \log\left( \frac{\bar{Y}}{\theta_0} \right) + \theta_0 - \bar{Y} \right].$$
Use this test statistic for testing $H_0: \theta = 3$ for the data above. Compare your answer to the answers in (a) and (c).
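The three p-values asked for in Problem 7 can be computed side by side, as in the sketch below. It uses the fact that $\sum Y_i$ has a Poisson distribution with mean $25 \times 3 = 75$ under $H_0$ for the exact calculation, the Normal approximation from (b) for the second, and a chi-squared approximation with 1 degree of freedom for the likelihood ratio statistic in (d). Helper names such as `poisson_cdf` are illustrative.

```python
import math

y = [4, 5, 0, 4, 2, 0, 1, 4, 1, 3, 1, 1, 2, 2, 2, 1, 1, 3, 2, 3, 2, 0, 2, 2, 3]
n = len(y)
theta0 = 3.0
total = sum(y)
y_bar = total / n

def poisson_cdf(k, mean):
    """P(S <= k) for S ~ Poisson(mean); returns 0 when k < 0."""
    return sum(math.exp(-mean + s * math.log(mean) - math.lgamma(s + 1))
               for s in range(k + 1))

# (a) Exact p-value: P(|S - 75| >= d) where S ~ Poisson(75) under H0.
mean0 = n * theta0
d_a = abs(total - mean0)
lower = poisson_cdf(math.floor(mean0 - d_a), mean0)           # P(S <= 75 - d)
upper = 1.0 - poisson_cdf(math.ceil(mean0 + d_a) - 1, mean0)  # P(S >= 75 + d)
p_exact = lower + upper

# (c) Normal approximation from (b): P(|Z| >= z_obs).
z_obs = abs(y_bar - theta0) / math.sqrt(theta0 / n)
p_normal = math.erfc(z_obs / math.sqrt(2))

# (d) Likelihood ratio statistic with a chi-squared(1) approximation.
lam = 2 * n * (y_bar * math.log(y_bar / theta0) + theta0 - y_bar)
p_lrt = math.erfc(math.sqrt(lam / 2))

print(f"exact: {p_exact:.4f}  normal: {p_normal:.4f}  LRT: {p_lrt:.4f}")
```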
8. In the Wintario lottery draw, six digit numbers were produced by six machines that operate independently and which each simulate a random selection from the digits $0, 1, \ldots, 9$. Of 736 numbers drawn over a period from 1980-82, the following frequencies were observed for position 1 in the six digit numbers:
Digit ($i$):          0    1    2    3    4    5    6    7    8    9   Total
Frequency ($f_i$):   70   75   63   59   81   92   75  100   63   58    736
Consider the 736 draws as trials in a Multinomial experiment and let
$\theta_j = P(\text{digit } j \text{ is drawn on any trial})$, $j = 0, 1, \ldots, 9$.
If the machines operate in a truly "random" fashion, then we should have $\theta_j = 0.1$, $j = 0, 1, \ldots, 9$.
(a) Test this hypothesis using a likelihood ratio test. What do you conclude?
(b) The data above were for digits in the first position of the six digit Wintario numbers. Suppose you were told that similar likelihood ratio tests had in fact been carried out for each of the six positions, and that position 1 had been singled out for presentation above because it gave the largest observed value of the likelihood ratio statistic $\Lambda$. What would you now do to test the hypothesis $\theta_j = 0.1$, $j = 0, 1, 2, \ldots, 9$? (Hint: Find $P(\text{largest of 6 independent } \Lambda\text{'s is} \ge \lambda)$.)
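A sketch of one way to carry out both parts of Problem 8 follows. For (a) it computes the Multinomial likelihood ratio statistic $\lambda = 2 \sum_j f_j \log(f_j / e_j)$ with expected frequencies $e_j = 736 \times 0.1$, and refers it to a chi-squared distribution with $10 - 1 = 9$ degrees of freedom. For (b) it follows the hint: if the six statistics are independent and each has the same approximate distribution under the hypothesis, then $P(\text{largest} \ge \lambda) = 1 - [1 - P(\Lambda \ge \lambda)]^6$. The function `chi2_sf_odd` is a hypothetical helper that uses the closed-form series for the chi-squared tail probability with an odd number of degrees of freedom.

```python
import math

def chi2_sf_odd(x, df):
    """P(W >= x) for W ~ chi-squared with an odd number df of degrees of freedom."""
    z = math.sqrt(x)
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)   # N(0,1) density at z
    series, term = 0.0, z    # r-th term is x^((2r-1)/2) / (1*3*...*(2r-1))
    for r in range(1, (df - 1) // 2 + 1):
        series += term
        term *= x / (2 * r + 1)
    return math.erfc(z / math.sqrt(2)) + 2 * phi * series

freq = [70, 75, 63, 59, 81, 92, 75, 100, 63, 58]
n = sum(freq)
expected = n * 0.1           # expected frequency for each digit under H0

# (a) Likelihood ratio statistic and approximate p-value (9 degrees of freedom).
lam = 2 * sum(f * math.log(f / expected) for f in freq)
p_single = chi2_sf_odd(lam, df=len(freq) - 1)

# (b) p-value when lam is the largest of 6 independent statistics.
p_largest = 1 - (1 - p_single) ** 6

print(f"lambda = {lam:.2f}, p-value (a) = {p_single:.4f}, p-value (b) = {p_largest:.4f}")
```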
9. Testing a genetic model: Recall the model for the M-N blood types of people, discussed in Examples 2.4.2 and 2.5.2. In a study involving a random sample of $n$ persons, the numbers $Y_1, Y_2, Y_3$ ($Y_1 + Y_2 + Y_3 = n$) who have blood types MM, MN and NN respectively have a Multinomial distribution with joint probability function
$$f(y_1, y_2, y_3) = \frac{n!}{y_1! \, y_2! \, y_3!} \, \theta_1^{y_1} \theta_2^{y_2} \theta_3^{y_3} \quad \text{for } y_j = 0, 1, \ldots; \; \sum_{j=1}^{3} y_j = n$$
and since $\theta_1 + \theta_2 + \theta_3 = 1$ the parameter space
$$\Omega = \left\{ (\theta_1, \theta_2, \theta_3) : \theta_j \ge 0, \; \sum_{j=1}^{3} \theta_j = 1 \right\}$$
has dimension two. The genetic model discussed earlier specified that $\theta_1, \theta_2, \theta_3$ can be expressed in terms of only a single parameter $\alpha$, $0 < \alpha < 1$, as follows:
$$\theta_1 = \alpha^2, \quad \theta_2 = 2\alpha(1 - \alpha), \quad \theta_3 = (1 - \alpha)^2 \qquad (5.11)$$
Consider (5.11) as the hypothesis $H_0$ to be tested. In this case, the dimension of the parameter space for $(\theta_1, \theta_2, \theta_3)$ under $H_0$ is one, and the general methodology of likelihood ratio tests can be applied. This gives a test of the adequacy of the genetic model.
Suppose that a sample with $n = 100$ persons gave observed values $y_1 = 18$, $y_2 = 50$, $y_3 = 32$. Test the hypothesis (5.11) and state your conclusion.
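The test in Problem 9 might be sketched as follows. Under (5.11) the log likelihood is $(2y_1 + y_2) \log \alpha + (y_2 + 2y_3) \log(1 - \alpha)$ plus a constant, which is maximized at $\hat{\alpha} = (2y_1 + y_2)/(2n)$. The likelihood ratio statistic is then $2 \sum_j y_j \log(y_j / e_j)$, where $e_j$ are the expected frequencies under the fitted genetic model, and it is referred to a chi-squared distribution with $2 - 1 = 1$ degree of freedom. Variable names are illustrative.

```python
import math

y = [18, 50, 32]     # observed counts for MM, MN, NN
n = sum(y)

# Maximum likelihood estimate of alpha under the genetic model (5.11).
alpha_hat = (2 * y[0] + y[1]) / (2 * n)

# Expected frequencies under H0.
expected = [n * alpha_hat**2,
            n * 2 * alpha_hat * (1 - alpha_hat),
            n * (1 - alpha_hat)**2]

# Observed likelihood ratio statistic.
lam = 2 * sum(yj * math.log(yj / ej) for yj, ej in zip(y, expected))

# Degrees of freedom = 2 - 1 = 1, so P(W >= lam) = P(|Z| >= sqrt(lam)).
p_value = math.erfc(math.sqrt(lam / 2))
print(f"alpha_hat = {alpha_hat:.2f}, lambda = {lam:.4f}, p-value = {p_value:.3f}")
```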
10. The Poisson model is often used to compare rates of occurrence for certain types of events in different geographic regions. For example, consider $K$ regions with populations $P_1, \ldots, P_K$ and let $\theta_j$, $j = 1, \ldots, K$ be the annual expected number of events per person for region $j$. By assuming that the number of events $Y_j$ for region $j$ in a given $t$-year period has a Poisson distribution with mean $P_j \theta_j t$, we can estimate and compare the $\theta_j$'s or test that they are equal.
(a) Under what conditions might the stated Poisson model be reasonable?
(b) Suppose you observe values $y_1, \ldots, y_K$ for a given $t$-year period. Describe how to test the hypothesis that $\theta_1 = \theta_2 = \cdots = \theta_K$.
(c) The data below show the numbers of children $y_j$ born with "birth defects" for 5 regions, together with the value of $P_j$ for each region. Test the hypothesis that the five rates of birth defects are equal.
$P_j$:  2025  1116  3210  1687  2840
$y_j$:    27    18    41    29    31
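One way the test in (c) could be carried out is with a likelihood ratio statistic, sketched below. Under the hypothesis of a common rate $\theta$, the maximum likelihood estimate is $\hat{\theta} = \sum y_j / (t \sum P_j)$, so the fitted means are $e_j = P_j \sum y_j / \sum P_j$ and the unknown period length $t$ cancels. Because $\sum e_j = \sum y_j$, the statistic reduces to $2 \sum_j y_j \log(y_j / e_j)$, which is referred to a chi-squared distribution with $K - 1 = 4$ degrees of freedom. For 4 degrees of freedom the tail probability has the closed form $e^{-\lambda/2}(1 + \lambda/2)$. Variable names are illustrative.

```python
import math

P = [2025, 1116, 3210, 1687, 2840]   # values of P_j for the 5 regions
y = [27, 18, 41, 29, 31]             # observed numbers of birth defects

# Fitted means under H0 (equal rates); the period length t cancels.
common = sum(y) / sum(P)
expected = [pj * common for pj in P]

# Observed likelihood ratio statistic.
lam = 2 * sum(yj * math.log(yj / ej) for yj, ej in zip(y, expected))

# Chi-squared tail probability with K - 1 = 4 degrees of freedom.
df = len(y) - 1
p_value = math.exp(-lam / 2) * (1 + lam / 2)
print(f"lambda = {lam:.3f}, df = {df}, p-value = {p_value:.3f}")
```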
11. Challenge Problem: Likelihood ratio test statistics for the Gaussian model with $\mu$ and $\sigma$ unknown: Suppose that $Y_1, \ldots, Y_n$ are independent $G(\mu, \sigma)$ observations.
(a) Show that the likelihood ratio test statistic for testing $H_0: \mu = \mu_0$ ($\sigma$ unknown) is given by
$$\Lambda(\mu_0) = n \log\left( 1 + \frac{T^2}{n - 1} \right)$$
where $T = \sqrt{n}(\bar{Y} - \mu_0)/S$ and $S$ is the sample standard deviation. Note: you will want to use the identity
$$\sum_{i=1}^{n} (Y_i - \mu_0)^2 = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 + n(\bar{Y} - \mu_0)^2.$$
(b) Show that the likelihood ratio test statistic for testing $H_0: \sigma = \sigma_0$ ($\mu$ unknown) can be written as $\Lambda(\sigma_0) = U - n \log(U/n) - n$ where
$$U = \frac{(n - 1)S^2}{\sigma_0^2}.$$
See Example 5.4.4.