In this section, we return to the two-sample location problem, (3.23). We discuss the optimal score function in the event that the user knows the form of the population distributions. In a first reading, this section can be skipped, but even a brief browsing serves as motivation for the next section on adaptive score selection schemes. Recently, Doksum (2013) developed an optimality result for the Hodges–Lehmann estimate based on a rank-based likelihood.
3.5.1 Efficiency
In this subsection, we briefly discuss the robustness and efficiency of the rank-based estimates. This leads to a discussion of optimizing the analysis by the suitable choice of a score function.
When evaluating estimators, an essential property is their robustness. In Section 2.5, we discussed the influence function of an estimator. Recall that it is a measure of the sensitivity of the estimator to an outlier. We say an estimator is robust if its influence function is bounded for all possible outliers.
Consider the rank-based estimator $\hat{\Delta}_\varphi$ based on the score function $\varphi(u)$. Recall that its standard deviation is $\tau_\varphi/\sqrt{n}$; see expression (3.19). Then for an outlier at $z$, $\hat{\Delta}_\varphi$ has the influence function
$\mathrm{IF}(z; \hat{\Delta}_\varphi) = -I_X(z)$ where $I_X(z)$ is 1 or 0 depending on whether $z$ is regarded as coming from the sample of $X$s or $Y$s, respectively.³ The indicator $I_Y(z)$ is defined similarly.
For a rank-based estimator, provided its score function $\varphi(u)$ is bounded, its influence function is a bounded function of $z$; hence, the rank-based estimator is robust. A few scores that are occasionally used in practice have unbounded score functions. In particular, the normal scores, discussed in Section 3.5.1, belong to this category, although they are technically robust. In practice, though, scores with bounded score functions are generally used.
In contrast, the influence function for the LS estimator $\bar{Y} - \bar{X}$ of $\Delta$ is given by
which is an unbounded function of $z$. Hence, the LS estimator is not robust.
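The contrast between bounded and unbounded influence can be seen numerically. The following is a minimal sketch in base R, using made-up toy samples (`x`, `y_clean`) and two hypothetical helper functions: `ls_shift` is the difference in sample means, and `hl_shift` is the Hodges–Lehmann estimate (the median of all pairwise differences), which is the rank-based estimate of shift for Wilcoxon scores. One observation `z` is appended to the second sample and allowed to grow.

```r
# Hypothetical toy data (not from the text): true shift is about 2.
x <- 1:10
y_clean <- 3:11

# LS estimate of shift: difference in sample means.
ls_shift <- function(x, y) mean(y) - mean(x)

# Wilcoxon (Hodges-Lehmann) estimate of shift:
# median of all pairwise differences y_j - x_i.
hl_shift <- function(x, y) median(as.vector(outer(y, x, "-")))

# Append a single outlier z to the y sample and let it grow.
z_values <- c(12, 100, 1000, 1e6)
ls_est <- sapply(z_values, function(z) ls_shift(x, c(y_clean, z)))
hl_est <- sapply(z_values, function(z) hl_shift(x, c(y_clean, z)))

print(data.frame(z = z_values, LS = ls_est, HL = hl_est))
```

In this toy example the LS column grows without bound as `z` increases, while the HL column does not move once `z` exceeds the rest of the data.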
We next briefly consider the relative efficiency of estimators of $\Delta$ for the two-sample location model (3.23). Suppose we have two estimators, $\hat{\Delta}_1$ and $\hat{\Delta}_2$, such that, for $i = 1, 2$, $\sqrt{n}(\hat{\Delta}_i - \Delta)$ converges in distribution to the $N(0, \sigma_i^2)$ distribution. Then the asymptotic relative efficiency (ARE) between the estimators is the ratio of their asymptotic variances; i.e.,
$$\mathrm{ARE}(\hat{\Delta}_1, \hat{\Delta}_2) = \frac{\sigma_2^2}{\sigma_1^2}. \tag{3.41}$$
Note that $\hat{\Delta}_2$ is more efficient than $\hat{\Delta}_1$ if this ratio is less than 1. Provided that the estimators are location and scale equivariant, this ratio is invariant to location and scale transformations, so for its computation only the form of the pdf needs to be known. Because they minimize norms, all rank-based estimates and the LS estimate are location and scale equivariant. Also, all these estimators have an asymptotic variance of the form $\kappa^2\{(1/n_1) + (1/n_2)\}$. The scale parameter $\kappa$ is often called the constant of proportionality. Hence, in the ARE ratio, the design part (the part in braces) of the asymptotic variance cancels out and the ratio simplifies to the ratio of the parameters of proportionality; i.e., the $\kappa^2$s.
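The cancellation can be written out in one line. Here $\kappa_1$ and $\kappa_2$ are labels introduced only for this illustration; they denote the constants of proportionality of $\hat{\Delta}_1$ and $\hat{\Delta}_2$, respectively.

```latex
\[
\mathrm{ARE}(\hat{\Delta}_1, \hat{\Delta}_2)
  = \frac{\kappa_2^2 \,\{(1/n_1) + (1/n_2)\}}{\kappa_1^2 \,\{(1/n_1) + (1/n_2)\}}
  = \frac{\kappa_2^2}{\kappa_1^2}.
\]
```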
First, consider the ARE between a rank-based estimator and the LS estimator. Assume that the random errors in Model (3.23) have variance $\sigma^2$ and pdf $f(t)$. Then, from the above discussion, the ARE between the rank-based estimator with score generating function $\varphi(u)$ and the LS estimator is
$$\mathrm{ARE}(\hat{\Delta}_\varphi, \hat{\Delta}_{LS}) = \frac{\sigma^2}{\tau_\varphi^2}, \tag{3.42}$$
where $\tau_\varphi$ is defined in expression (3.19).

³ See Chapter 2 of Hettmansperger and McKean (2011).

TABLE 3.2
AREs Among the Wilcoxon (W), Sign (S), and LS Estimators When the Errors Have a Contaminated Normal Distribution with $\sigma_c = 3$ and Proportion of Contamination $\epsilon$

                     ε (Proportion of Contamination)
              0.00   0.01   0.02   0.03   0.05   0.10   0.15   0.25
ARE(W, LS)   0.955  1.009  1.060  1.108  1.196  1.373  1.497  1.616
ARE(S, LS)   0.637  0.678  0.719  0.758  0.833  0.998  1.134  1.326
ARE(W, S)    1.500  1.487  1.474  1.461  1.436  1.376  1.319  1.218
If the Wilcoxon scores are chosen, then $\tau_W = \left[\sqrt{12}\int f^2(t)\,dt\right]^{-1}$. Hence, the ARE between the Wilcoxon and the LS estimators is
$$\mathrm{ARE}(\hat{\Delta}_W, \hat{\Delta}_{LS}) = 12\sigma^2\left(\int f^2(t)\,dt\right)^2. \tag{3.43}$$
If the error distribution is normal then this ratio is 0.955; see, for example, Hettmansperger and McKean (2011). Thus, the Wilcoxon estimator loses less than 5% efficiency to the LS estimator if the errors have a normal distribution.
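The value 0.955 can be checked numerically from expression (3.43). The sketch below uses only base R; `are_wilcoxon_ls` is a hypothetical helper written for this illustration, not a function from Rfit. It integrates $f^2$ with `integrate` and compares the result with the closed form $3/\pi$ for the normal pdf. The second call rescales the normal to standard deviation 5 to illustrate the scale invariance noted above.

```r
# ARE of the Wilcoxon estimator relative to LS, expression (3.43):
# 12 * sigma^2 * (integral of f^2)^2.
# `pdf` is a density function and `sigma2` is its variance.
are_wilcoxon_ls <- function(pdf, sigma2) {
  int_f2 <- integrate(function(t) pdf(t)^2, -Inf, Inf)$value
  12 * sigma2 * int_f2^2
}

# Standard normal errors.
are_normal <- are_wilcoxon_ls(dnorm, sigma2 = 1)

# Same form of pdf with a different scale (sd = 5, variance = 25).
are_normal_sd5 <- are_wilcoxon_ls(function(t) dnorm(t, sd = 5), sigma2 = 25)

print(c(sd1 = are_normal, sd5 = are_normal_sd5, closed_form = 3 / pi))
```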
In general, the Wilcoxon estimator has a substantial gain in efficiency over the LS estimator for error distributions with heavier tails than the normal distribution. To see this, consider a family of contaminated normal distributions with cdfs
$$F(x) = (1 - \epsilon)\Phi(x) + \epsilon\,\Phi(x/\sigma_c), \tag{3.44}$$
where $0 < \epsilon < 0.5$ and $\sigma_c > 1$. If we are sampling from this cdf, then $(1 - \epsilon)100\%$ of the time we are sampling from a $N(0, 1)$ distribution, while $\epsilon 100\%$ of the time we are sampling from a heavier tailed $N(0, \sigma_c^2)$ distribution. For illustration, consider a contaminated normal with $\sigma_c = 3$. In the first row of Table 3.2 are the AREs between the Wilcoxon and the LS estimators for a sequence of increasing contamination. Note that even if $\epsilon = 0.01$, i.e., only 1% contamination, the Wilcoxon is more efficient than LS.
If sign scores are selected, then $\tau_S = [2f(0)]^{-1}$. Thus, the ARE between the sign and the LS estimators is $4\sigma^2 f^2(0)$. This is only 0.64 at the normal distribution, so medians are much less efficient than means if the errors have a normal distribution. Notice from Table 3.2 that for this mildly contaminated normal, the proportion of contamination must exceed 10% for the sign estimator to be more efficient than the LS estimator. The third row of the table contains the AREs between the Wilcoxon and sign estimators. In all of these situations of the contaminated normal distribution, the Wilcoxon estimator is more efficient than the sign estimator.
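The three rows of Table 3.2 can be reproduced from the formulas above. The following base R sketch assumes the contaminated normal (3.44) with both components centered at 0, so that the variance, $f(0)$, and $\int f^2(t)\,dt$ all have closed forms; the function name `are_contaminated_normal` is introduced here for illustration only.

```r
# Contaminated normal (3.44):
# F(x) = (1 - eps) * pnorm(x) + eps * pnorm(x / sigma_c).
are_contaminated_normal <- function(eps, sigma_c = 3) {
  # Variance of the mixture (both components have mean 0).
  sigma2 <- (1 - eps) + eps * sigma_c^2
  # Mixture density evaluated at 0.
  f0 <- ((1 - eps) + eps / sigma_c) / sqrt(2 * pi)
  # Integral of f^2, using the fact that the integral of the product of
  # N(0, a^2) and N(0, b^2) densities is 1 / sqrt(2 * pi * (a^2 + b^2)).
  int_f2 <- (1 - eps)^2 / sqrt(4 * pi) +
    eps^2 / sqrt(4 * pi * sigma_c^2) +
    2 * eps * (1 - eps) / sqrt(2 * pi * (1 + sigma_c^2))
  are_w_ls <- 12 * sigma2 * int_f2^2  # Wilcoxon versus LS, expression (3.43)
  are_s_ls <- 4 * sigma2 * f0^2       # sign versus LS
  c(W_LS = are_w_ls, S_LS = are_s_ls, W_S = are_w_ls / are_s_ls)
}

eps <- c(0, 0.01, 0.02, 0.03, 0.05, 0.10, 0.15, 0.25)
are_table <- sapply(eps, are_contaminated_normal)
colnames(are_table) <- format(eps, nsmall = 2)
print(round(are_table, 3))
```

The third row is the ratio of the first two, which follows from the definition (3.41) because the LS variance cancels.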
If the true pdf of the errors is $f(t)$, then the optimal score function is $\varphi_f(u)$ defined in expression⁴ (3.20). These scores are asymptotically efficient; i.e., they achieve the Rao–Cramér lower bound asymptotically, as do maximum likelihood estimates. For example, the normal scores are optimal if $f(t)$ is a normal pdf; the sign scores are optimal if the errors have a Laplace (double exponential) distribution; and the Wilcoxon scores are optimal for errors with a logistic distribution. The logistic pdf is
$$f(x) = \frac{1}{b}\,\frac{\exp\{-[x - a]/b\}}{(1 + \exp\{-[x - a]/b\})^2}, \quad -\infty < x < \infty. \tag{3.45}$$
At the logistic distribution the ARE between the sign scores estimator and the Wilcoxon estimator is (3/4), while at the Laplace distribution this ARE is (4/3). This range of efficiencies suggests a simple family of score functions called the Winsorized Wilcoxons. These scores are generated by a nondecreasing piecewise continuous function defined on (0, 1) which is flat at both ends and linear in the middle. As discussed in McKean, Vidmar, and Sievers (1989), these scores are optimal for densities with “logistic” middles and “Laplace” tails. Four typical score functions from this family are displayed in Figure 3.7. If the score function is odd about 1/2, as in Panel (a), then it is optimal for a symmetric distribution; otherwise, it is optimal for a skewed distribution. Those in Panels (b) and (c) are optimal for distributions which are skewed right and skewed left, respectively. The type in Panel (d) is optimal for light-tailed distributions (lighter than the normal distribution), which occasionally are of use in practice. In Rfit, these scores are used via the options: scores=bentscores4 for the type in Panel (a); scores=bentscores1 for the type in Panel (b); scores=bentscores3 for the type in Panel (c); and scores=bentscores2 for the type in Panel (d).
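The efficiencies 3/4 and 4/3 quoted for the logistic and Laplace distributions follow from $\tau_W$ and $\tau_S$ given earlier, since by (3.41) the ARE of the sign estimator relative to the Wilcoxon estimator is $\tau_W^2/\tau_S^2 = 4f^2(0)/\{12[\int f^2(t)\,dt]^2\}$. A minimal base R check is sketched below; `are_sign_wilcoxon` and `laplace_pdf` are names made up for this illustration, and the helper assumes a pdf that is symmetric about 0.

```r
# ARE of the sign estimator relative to the Wilcoxon estimator:
# tau_W^2 / tau_S^2 = 4 * f(0)^2 / (12 * (integral of f^2)^2).
# Assumes `pdf` is symmetric about 0, so the integral of f^2 over the
# real line is twice its integral over (0, Inf).
are_sign_wilcoxon <- function(pdf) {
  int_f2 <- 2 * integrate(function(t) pdf(t)^2, 0, Inf)$value
  4 * pdf(0)^2 / (12 * int_f2^2)
}

# Standard Laplace (double exponential) pdf, written out by hand.
laplace_pdf <- function(t) 0.5 * exp(-abs(t))

are_logistic <- are_sign_wilcoxon(dlogis)      # standard logistic, a = 0, b = 1
are_laplace <- are_sign_wilcoxon(laplace_pdf)

print(c(logistic = are_logistic, laplace = are_laplace))
```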
In the two-sample location model, as well as in all fixed effects linear models, the ARE between two estimators summarizes the difference in analyses based on the estimators. For example, the asymptotic ratio of the squares of the lengths of the confidence intervals based on the two estimators is their ARE. Once we have data, we call the estimated ARE between two estimators the precision coefficient or the estimated precision of one analysis over the other. For instance, if we use the two rank-based estimators of $\Delta$ based on their respective score functions $\varphi_1$ and $\varphi_2$, then
$$\mathrm{Precision}(\text{Analysis based on } \hat{\Delta}_1, \text{Analysis based on } \hat{\Delta}_2) = \frac{\hat{\tau}_{\varphi_2}^2}{\hat{\tau}_{\varphi_1}^2}. \tag{3.46}$$
For a summary example of this section, we reconsider the quail data. This time we select an appropriate score function which results in an increase in precision (empirical efficiency) over the Wilcoxon analysis.
⁴ See Chapter 2 of Hettmansperger and McKean (2011).
FIGURE 3.7
Plots of four bent score functions, one of each type as described in the text. Panel (a): bentscores4; Panel (b): bentscores1; Panel (c): bentscores3; Panel (d): bentscores2. In each panel the horizontal axis is u and the vertical axis is phi(u).
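To make the "flat at the ends, linear in the middle" shape concrete, the following base R sketch builds a score function of that form by clamping $u$ before applying the linear Wilcoxon score $\sqrt{12}(u - 1/2)$. The function `bent_score_sketch` and its arguments are hypothetical and are not the Rfit implementation of the bentscores objects; in particular, no recentering or rescaling of the scores is attempted, and the light-tailed type of Panel (d) is not covered.

```r
# Hypothetical helper (not the Rfit implementation): a Wilcoxon-type linear
# score held constant below `lower_bend` and above `upper_bend`.
bent_score_sketch <- function(u, lower_bend = 0, upper_bend = 1) {
  stopifnot(lower_bend < upper_bend)
  u_clamped <- pmin(pmax(u, lower_bend), upper_bend)
  sqrt(12) * (u_clamped - 0.5)
}

u <- seq(0.05, 0.95, by = 0.05)

# Flat at both ends, linear in the middle (odd about 1/2).
phi_both <- bent_score_sketch(u, lower_bend = 0.25, upper_bend = 0.75)

# Linear, then flat beyond u = 3/4 (a bend on the right only).
phi_right <- bent_score_sketch(u, upper_bend = 0.75)

print(round(cbind(u, phi_both, phi_right), 3))
```

Because the score is constant beyond a bend, observations with ranks in that region all receive the same weight, which is how these scores limit the effect of a heavy tail.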
Example 3.5.1 (Quail Data, Continued). The data discussed in Example 3.2.3 were drawn from a large study involving many datasets, which was discussed in some detail in McKean et al. (1989). Upon examination of the residuals from many of these models, it appeared that the underlying error distribution is skewed with heavy right tails, although outliers in the left tail were not usual. Hence, scores of the type bentscores1 were considered. The final recommended score was of this type with a bend at u = 3/4, which is the score graphed in Panel (b) of Figure 3.7. The Rfit of the data, using these scores, computes the estimate of the shift $\Delta$ in the following code segment:
> mybentscores = bentscores1
> mybentscores@param<-c(0.75,-2,1)
> fit = rfit(z~xmat,scores=mybentscores)
> summary(fit)
Call:
rfit.default(formula = z ~ xmat, scores = mybentscores)

Coefficients:
            Estimate Std. Error t.value   p.value
(Intercept)  66.0000     4.1159 16.0355 1.216e-15 ***
xmat        -16.0000     7.6647 -2.0875   0.04606 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Multiple R-squared (Robust): 0.128132
Reduction in Dispersion Test: 4.11495 p-value: 0.05211

The estimate of tau for the bentscores is

> fit$tauhat
[1] 19.79027

while the estimate of tau for the Wilcoxon scores is

> rfit(z~xmat)$tauhat
[1] 21.64925
In summary, using these bent scores, the rank-based estimate of shift is $\hat{\Delta} = -16$ with standard error 7.66, which is significant at the 5% level for a two-sided test. Note that this estimate of shift is closer to the estimate of shift based on the boxplots than is the estimate based on the Wilcoxon scores; see Example 3.2.3. From the code segment, the estimate of the relative precision (3.46) of the Wilcoxon analysis versus the bent scores analysis is $(19.790/21.649)^2 = 0.836$. Hence, the Wilcoxon analysis is about 16% less precise than the bent scores analysis for this dataset.
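The arithmetic behind the precision coefficient is short enough to spell out. This base R sketch simply reuses the two `tauhat` values printed in the code segment above; the variable names are introduced here for illustration. Following (3.46), with the Wilcoxon analysis listed first and the bent scores analysis second, the precision is the squared ratio of the bent scores estimate of tau to the Wilcoxon estimate of tau.

```r
# Estimates of tau copied from the output shown in the example.
tau_bent <- 19.79027      # fit$tauhat for the bent scores fit
tau_wilcoxon <- 21.64925  # rfit(z~xmat)$tauhat for the Wilcoxon fit

# Precision(Wilcoxon analysis, bent scores analysis), expression (3.46).
precision_wilcoxon_vs_bent <- (tau_bent / tau_wilcoxon)^2

# Percentage loss in precision from using Wilcoxon scores on these data.
percent_loss <- 100 * (1 - precision_wilcoxon_vs_bent)

print(round(c(precision = precision_wilcoxon_vs_bent,
              percent_loss = percent_loss), 3))
```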
The use of Winsorized Wilcoxon scores for the quail data was based on an extensive investigation of many datasets. Estimation of the score function based on the rank-based residuals is discussed in Naranjo and McKean (1987). As with density estimation, though, large sample sizes are needed for these score function estimators. Adaptive procedures for score selection are discussed in the next section.
As we discussed above, if no assumptions about the distribution can be reasonably made, then we recommend using Wilcoxon scores. The resulting rank-based inference is robust and is highly efficient relative to least squares based inference. Wilcoxon scores are the default in Rfit and are used for most of the examples in this book.