A general assumption in fixed-effects ANOVA-ANCOVA is the homogeneity of scale of the random errors from level (group) to level (group). For two levels (samples), we discussed the Fligner–Killeen test for homogeneity of scale in Section 3.3. This test generalizes immediately to the k-level (sample) problem.
For discussion, we use the notation of the k-cell (sample) model in Section 5.2 for the one-way ANOVA design. For this section, our full model is Model (5.1) except that the pdf of random errors for the jth level is of the form $f_j(x) = f[(x - \theta_j)/\sigma_j]/\sigma_j$, where $\theta_j$ is the median and $\sigma_j > 0$ is a scale parameter for the jth level, $j = 1, \ldots, k$. A general hypothesis of interest is that the scale parameters are the same for each level, i.e.,
$$H_0: \sigma_1 = \cdots = \sigma_k \quad \text{versus} \quad H_A: \sigma_j \ne \sigma_{j'} \text{ for some } j \ne j'. \tag{5.24}$$
This could be either the hypothesis of interest or the hypothesis for a pretest on homogeneous scales in the one-way ANOVA location model.
The Fligner–Killeen test of scale for two samples, (3.30), easily generalizes to this situation. As in Section 3.3, define the folded-aligned sample as
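$$Y_{ij}^* = Y_{ij} - \mathrm{med}_{j'}\{Y_{ij'}\}, \quad j = 1, \ldots, n_i; \; i = 1, \ldots, k. \tag{5.25}$$
Let $n = \sum_{i=1}^{k} n_i$ denote the total sample size and let $R_{ij} = R|Y_{ij}^*|$ denote the rank of the absolute values of these items, from 1 to $n$. Define the scores $a^*(i)$ as
$$a(i) = \left[\Phi^{-1}\left(\frac{i}{2(n+1)} + \frac{1}{2}\right)\right]^2, \quad \bar{a} = \frac{1}{n}\sum_{i=1}^{n} a(i), \quad a^*(i) = a(i) - \bar{a}. \tag{5.26}$$
Then the Fligner–Killeen test statistic is
$$Q_{FK} = \frac{n-1}{\sum_{i=1}^{n} [a^*(i)]^2} \sum_{i=1}^{k} \frac{1}{n_i}\left[\sum_{j=1}^{n_i} a^*(R_{ij})\right]^2. \tag{5.27}$$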
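An approximate level $\alpha$ test¹² is based on rejecting $H_0$ if $Q_{FK} \ge \chi^2_\alpha(k-1)$, where $\chi^2_\alpha(k-1)$ denotes the upper $\alpha$ critical value of a $\chi^2$-distribution with $k-1$ degrees of freedom.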
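The computation of the test statistic can be sketched in base R. This is a minimal illustration, not the fkk.test function itself: the name fk_stat is hypothetical, and the code assumes squared-normal scores $a(i) = [\Phi^{-1}(i/(2(n+1)) + 1/2)]^2$, centered at their mean, with midranks used for ties. The generated data are for illustration only.

```r
# Sketch of the k-sample Fligner-Killeen statistic (hypothetical helper,
# not the fkk.test wrapper). Assumes centered squared-normal scores.
fk_stat <- function(y, g) {
  g <- factor(g)
  n <- length(y)
  k <- nlevels(g)
  # Align each observation by the median of its own level, then fold.
  ystar <- y - ave(y, g, FUN = median)
  r <- rank(abs(ystar))                    # ranks 1..n (midranks for ties)
  a <- function(i) qnorm(i / (2 * (n + 1)) + 0.5)^2
  abar <- mean(a(1:n))                     # mean score over 1..n
  astar_all <- a(1:n) - abar               # centered scores a*(1), ..., a*(n)
  astar_obs <- a(r) - abar                 # centered scores at observed ranks
  level_sums <- tapply(astar_obs, g, sum)  # sum of scores within each level
  n_i <- tapply(astar_obs, g, length)      # level sample sizes
  q <- (n - 1) * sum(level_sums^2 / n_i) / sum(astar_all^2)
  list(statistic = q, df = k - 1,
       p.value = pchisq(q, df = k - 1, lower.tail = FALSE))
}

# Illustrative data: three normal samples with different scales.
set.seed(1)
y <- c(rnorm(20, 5, 1), rnorm(20, 10, 4), rnorm(20, 10, 8))
g <- rep(1:3, each = 20)
res <- fk_stat(y, g)
res
```

Because the statistic depends on the data only through the ranks of the aligned, folded observations, it is unchanged when a constant is added to any one level or when all responses are multiplied by a common positive constant.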
As in Section 3.3, we can obtain rank-based estimates of the difference in scale. Let $Z_{ij} = \log(|Y_{ij}^*|)$ and let $\Delta_{1i} = \log(\sigma_i/\sigma_1)$, for $i = 2, \ldots, k$, where, without loss of generality, we have referenced the first level. Using $\Delta_{11} = 0$, we can write the log-linear model for the aligned, folded sample as
$$Z_{ij} = \Delta_{1i} + e_{ij}, \quad j = 1, \ldots, n_i; \; i = 1, 2, \ldots, k. \tag{5.28}$$
As discussed in Section 3.3, the scores defined in expression (3.29) are appropriate for the rank-based fit of this model. Recall that they are optimal when the random errors of the original samples are normally distributed. Exponentiation of the regression estimates then leads to estimation (and confidence intervals) for the ratio of scales $\eta_{1i} = \sigma_i/\sigma_1$.
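To illustrate the exponentiation step, the following base R sketch estimates the ratios $\sigma_i/\sigma_1$ from the logs of the aligned, folded sample. It is only an approximation to the method of this section: in place of the rank-based fit with the scores of expression (3.29), it substitutes the Hodges–Lehmann shift estimate (the median of pairwise differences) on the log scale. The function name scale_ratio_hl is hypothetical, and exact zeros of the aligned sample are dropped because their logs are undefined.

```r
# Sketch: scale ratios relative to level 1 from the log-linear model.
# Assumption: a Hodges-Lehmann shift estimate stands in for the rank-based fit.
scale_ratio_hl <- function(y, g) {
  g <- factor(g)
  ystar <- y - ave(y, g, FUN = median)  # aligned sample
  keep <- abs(ystar) > 0                # log(0) is undefined; drop exact zeros
  z <- log(abs(ystar[keep]))            # logs of the folded, aligned sample
  gz <- g[keep]
  ref <- z[gz == levels(g)[1]]          # reference (first) level
  sapply(levels(g)[-1], function(lev) {
    # Median of pairwise differences estimates the shift log(sigma_i / sigma_1)
    delta <- median(outer(z[gz == lev], ref, "-"))
    exp(delta)                          # back-transform to a ratio of scales
  })
}

# Illustrative data: two normal samples whose true scale ratio is 4.
set.seed(2)
y <- c(rnorm(200, 0, 1), rnorm(200, 3, 4))
g <- rep(1:2, each = 200)
est <- scale_ratio_hl(y, g)
est
```

The estimate is returned on the ratio scale, so a value near 1 indicates similar scales for the two levels being compared.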
The function fkk.test is a wrapper which obtains the R fit and analysis for this section. We demonstrate it in the following example.
Example 5.7.1 (Three Generated Samples). For this example, we generated three samples (rounded) from Laplace distributions. The samples have location and scale (5, 1), (10, 4), and (10, 8), respectively. Hence, in the above notation, $\eta_{21} = 4$ and $\eta_{31} = 8$. A comparison boxplot of the three samples is shown in Figure 5.7.
The following code segment computes the Fligner–Killeen test for these three samples. Note the response variables are in the vector response and the vector indicator is a vector of group membership.
> fkk.test(response,indicator)
Table of estimates and 95 percent confidence intervals:
                   estimate  ci.lower ci.upper
xmatas.factor(iu)2 3.595758 0.9836632 13.14421
xmatas.factor(iu)3 8.445785 2.5236985 28.26458
Test statistic = 8.518047 p-value = 0.0141361
Hence, based on the results of the test (p-value = 0.014), there is evidence to reject $H_0$. The estimates of $\eta_{21}$ and $\eta_{31}$ are close to their true values. The respective confidence intervals are (0.98, 13.14) and (2.52, 28.26).
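Data of this kind can be generated along the following lines. This is a sketch under stated assumptions: the helper rlaplace is hypothetical (it draws Laplace variates by inverting the cdf), and the sample size of 20 per level, the seed, and the rounding to one decimal place are arbitrary choices, so the results will not reproduce the output above. As a rough cross-check it calls fligner.test from base R's stats package; that function uses a different set of scores than this section does, so its statistic need not agree with that of fkk.test.

```r
# Hypothetical helper: Laplace variates via the inverse cdf.
rlaplace <- function(n, location, scale) {
  u <- runif(n, -0.5, 0.5)
  location - scale * sign(u) * log(1 - 2 * abs(u))
}

set.seed(57)   # arbitrary seed
n_i <- 20      # assumed common sample size

# Three samples with (location, scale) = (5, 1), (10, 4), (10, 8), rounded.
response <- round(c(rlaplace(n_i, 5, 1),
                    rlaplace(n_i, 10, 4),
                    rlaplace(n_i, 10, 8)), 1)
indicator <- rep(1:3, each = n_i)

# Base R's median-centered Fligner-Killeen test (its scores differ).
ft <- fligner.test(response, factor(indicator))
ft
```

With the npsm package loaded, the call fkk.test(response,indicator) shown in the example could be run on the same two vectors.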
5.8 Exercises
5.8.1. Hollander and Wolfe (1999) report on a study of the length of YOY gizzard shad fish at four different sites of Kokosing Lake in the summer of 1984. The data are:
¹² See page 105 of Hájek and Šidák (1967).
Site 1 Site 2 Site 3 Site 4
46 42 38 31
28 60 33 30
46 32 26 27
37 42 25 29
32 45 28 30
41 58 28 25
42 27 26 25
45 51 27 24
38 42 27 27
44 52 27 30
Let $\mu_i$ be the true mean length of YOY gizzard shad at site i.
(a) Use the rank-based Wilcoxon procedure, (5.7), to test the hypothesis of equal means.
(b) Based on Part (a), use Fisher’s least significant difference to perform a multiple comparison on the differences in the means. As discussed in Hollander and Wolfe, YOY gizzard shad are eaten by game fish and for this purpose smaller fish are better. In this regard, based on the MCP analysis, which sites, if any, are preferred?
FIGURE 5.7: Comparison boxplots of three samples (Sample 1, Sample 2, Sample 3; vertical axis: Response).
5.8.2. For the study discussed in Example 5.2.3, obtain the analysis based on the Wilcoxon test using FW, (5.7). Then obtain the MCP analysis using Tukey’s method. Compare this analysis with the Kruskal–Wallis analysis presented in the example.
5.8.3. For the data of Example 5.6.1 determine the value of the Kruskal–Wallis test and its p-value.
5.8.4. For a one-way design, instead of using the oneway.rfit function, the rank-based test based on FW, expression (5.7), can be computed by the Rfit functions rfit and drop.test.
(a) Suppose the R vector y contains the combined samples and the R vector ind contains the associated levels. Discuss why the following two code segments are equivalent:
oneway.rfit(y,ind)
and
fit<-rfit(y~factor(ind)); drop.test(fit)
(b) Verify this equivalence for the data in Example 5.2.1.
5.8.5. Write an R script which compares empirically the power between the rank-based test based on FW and the corresponding LS test for the following situation: 4 samples of size 10; location centers 10, 11, 12, and 15; and the random errors 3 ∗ eij where the eij are iid with the common t-distribution having 3 degrees of freedom. Use a simulation size of 1000 and the level α = 0.05.
5.8.6. In Exercise 5.8.5, we simulated the empirical power of the rank-based and LS tests for a specific situation. For this exercise, check the validity of the rank-based and LS tests; i.e., set the location centers to be the same.
5.8.7. Suppose that we want a descriptive plot for a one-way design. Comparison boxplots are one such plot; however, if the level sample sizes $n_i$ are small then these plots can be misleading (quartiles and, hence, lengths of boxplots, can be adversely affected by a few outliers). Hence, for $n_i \le 10$, we recommend comparison dotplots instead of boxplots. Consider the following data from a one-way design.
Level 1  66 45 42 53 71
Level 2  38 53 47 23 42 50
Level 3  82 26 95 70 80 82 75
(a) Obtain the comparison dotplots for the above data.
(b) Compute the Fligner–Killeen test of equal scales for these data.
5.8.8. Milliken and Johnson (1984) discuss a study pertaining to an unbalanced 2 × 3 crossed factorial design. For convenience, we present the data below. For their LS analysis, Milliken and Johnson recommend Type III hypotheses. As discussed in Section 5.5, the rank-based analysis based on the Rfit function raov obtains tests based on Type III hypotheses. The R function lm, however, does not. In this exercise, we show how to easily obtain LS analyses for Type III hypotheses using the functions redmod and cellx from Rfit. For the data, the factors are labeled T and B and the responses are tabled as:
      B1          B2          B3
T1    19 20 21    24 26       22 25 25
T2    25 27       21 24 24    31 32 33
The code below assumes that the response and indicator vectors are:
resp = c(19, 20, 21, 24, 26, 22, 25, 25, 25, 27, 21, 24, 24, 31, 32, 33)
a = c(rep(1,8),rep(2,8))
b = c(1, 1, 1, 2, 2, 3, 3, 3, 1, 1, 2, 2, 2, 3, 3, 3)
(a) First obtain the analysis for interaction and main effects using the Rfit function raov. The hypotheses of this analysis are of Type III.
(b) The following script will obtain the LS Type III analysis for Factor T:
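fitls <- lm(resp ~ factor(a):factor(b))        # full (cell-means) LS fit
cell <- rep(1:6, times = c(3, 2, 3, 2, 3, 3))  # cell labels; cell sizes are unequal
cellmean <- cellx(cell)                        # cell-means design matrix
ha <- c(1, 1, 1, -1, -1, -1)                   # contrast: T1 cells versus T2 cells
xa <- redmod(cellmean, ha)                     # design matrix of the reduced model
lmred <- lm(resp ~ xa)                         # reduced-model LS fit
anova(lmred, fitls)                            # F test of reduced versus full model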
Run this code and show that the LS test statistic computes to F = 30.857. Notice that this differs from the LS ANOVA based on fitls.
(c) Write code and run it for the LS Type III analysis of Factor B.
Hint: the hypothesis matrix hb has two rows.
(d) Write code and run it for the LS Type III analysis of interaction.
Hint: the hypothesis matrix hint has two rows. Notice that it agrees with the LS ANOVA based on fitls.
5.8.9. Page 436 of Hollander and Wolfe (1999) presents part of a study on the effects of cloud seeding on cyclones; see Wells and Wells (1967) for the original reference. For the reader’s convenience, the data are contained in the dataset SCUD. The first column is an indicator for Control (2) or Seeded (1); column 2 is the predictor M, the geostrophic meridional circulation index; and column 3 is the response RI, which is a measure of precipitation.
(a) Obtain a scatterplot of RI versus M. Use different plotting symbols for the Control and Seeded. Add the rank-based fits of the linear models for each. Comment on the plot.
(b) Using a rank-based analysis, test for homogeneous slopes for the two groups.
(c) If homogeneous slopes is “accepted” in (b), use a rank-based analysis to test for homogeneous groups.
(d) The test in Part (c) is adjusted for M. Is this adjustment necessary? Test at level α = 0.05.
5.8.10. Using a simulation study, investigate the powers of the Jonckheere–Terpstra test and the Kruskal–Wallis test for the following situation: Samples of size 10 from four normal populations each having variance 1 and with the respective means of $\mu_1 = 0$, $\mu_2 = 0.45$, $\mu_3 = 0.90$, and $\mu_4 = 1.0$. Use the level of α = 0.05 and a simulation size of 10,000.
5.8.11. In reading through Section 5.6 on ordered alternatives, the reader may have noticed the simplicity of the test based on Spearman’s ρS over the test using the Jonckheere–Terpstra test statistic. Is it as powerful? As a partial answer, this exercise provides some empirical evidence. One may use cor.test to obtain a test based on Spearman’s ρ. See, for example, the following code.
group <- c(rep(1,ni),rep(2,ni),rep(3,ni),rep(4,ni))
y1 <- rnorm(ni,0,1); y2 <- rnorm(ni,.15,1)
y3 <- rnorm(ni,.35,1); y4 <- rnorm(ni,.55,1)
y <- c(y1,y2,y3,y4)
cor.test(group,y,method='spearman',
         continuity=FALSE,exact=FALSE,alternative='less')
(a) Determine the situation (distributions, alternative, etc.) which the above code simulates.
(b) Based on the above situation, run a simulation to compare the empirical powers of the Jonckheere–Terpstra test and Spearman’s ρ.
(c) Run a simulation where the error distribution is a t-distribution with 3 degrees of freedom.
(d) Run a simulation where the error distribution is a χ²-distribution with 1.5 degrees of freedom.
5.8.12. Besides simplicity, another advantage of the test based on Spearman’s ρS over the Jonckheere–Terpstra test is that the estimate of ρS is an easily understood correlational measure. In this setting, it offers a measure of the “strength” of the relationship. Use the function cor.boot.ci to obtain a bootstrap confidence interval.
5.8.13. Consider the malignant melanoma data in Example 3.1.1. See if the association found there still holds after adjusting for latitude and longitude.