Chapter 3 Global testing
3.2 P-value adjustment methods
The above global hypotheses can be evaluated using p-value adjustment methods, such as the Bonferroni test. In the well-known Bonferroni correction test, the local outcomes, x1, x2, ..., xK, are used to construct the statistics t1, t2, ..., tK, with corresponding p-values p1, p2, ..., pK, to test the local null hypotheses H0k : µk = 0, k = 1, 2, ..., K.
The Bonferroni test rejects the k-th local null hypothesis H0k if and only if pk ≤ α/K. Note that the global null hypothesis H0 in (3.1) can be written as the intersection of the local null hypotheses, formally
\[ H_0 = \bigcap_{k=1}^{K} H_{0k}, \tag{3.5} \]
and thus rejection of any local null hypothesis implies rejection of the global null hypothesis.
Ordering the p-values as p(1) ≤ p(2) ≤ ··· ≤ p(K), we can therefore write the global Bonferroni test as
reject H0 iff p(1) ≤ α/K. (3.6)
Due to the first-order Bonferroni inequality (Boole’s inequality), this procedure controls the type I error at the nominal α level [D’Agostino and Russell, 2005].
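The type I error bound can be written out in one short chain. The following is a sketch under the assumption, stated here for the derivation and not taken from the text, that each local p-value is valid under its own null, that is, P(p_k ≤ u | H0k) ≤ u for every u in [0, 1]:

```latex
\begin{align*}
P(\text{reject } H_0 \mid H_0)
  &= P\Big( \bigcup_{k=1}^{K} \{ p_k \le \alpha/K \} \,\Big|\, H_0 \Big) \\
  &\le \sum_{k=1}^{K} P\big( p_k \le \alpha/K \mid H_0 \big)
     && \text{(Boole's inequality)} \\
  &\le K \cdot \frac{\alpha}{K} = \alpha .
\end{align*}
```

Only the marginal distributions of the p-values enter the bound. Boole's inequality holds with equality only when the K rejection events are mutually exclusive, so any overlap between them leaves the actual type I error below α.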
The global Bonferroni test is easy and convenient to apply, and it does not require any distributional assumption to control the type I error. It can even be used if the observations of each dimension are measured on a different scale. On the other hand, the Bonferroni method relies entirely on the smallest p-value and can often be conservative (that is, with a type I error rate substantially lower than the nominal α level) and inefficient (that is, with unexpectedly low power). This is the case when, as in our motivating example, high correlations exist between the local outcomes. The problem becomes worse as the dimension, K, of the observation vectors increases. Pocock et al. [1987] and Dmitrienko et al. [2010] found, through simulation studies, that for large positive correlations, especially those higher than 0.5, the type I error rate of the Bonferroni method is substantially lower than the nominal level. The results in Dmitrienko et al. [2010] show that the conservatism grows considerably even when the observation dimension increases only from K = 2 to K = 5. We next consider the application of the global Bonferroni test to the examples in Sections 2.3.2 and 2.4.2.
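The conservatism under positive correlation can be explored with a small Monte Carlo experiment. The sketch below is illustrative only and does not reproduce the designs of the cited simulation studies: it assumes equicorrelated standard-normal z statistics (known variance, common correlation rho ≥ 0) under the global null, and the function name `bonferroni_type1_error` and all parameter values are hypothetical choices.

```python
import random
from statistics import NormalDist


def bonferroni_type1_error(K, rho, alpha=0.05, n_sim=20000, seed=1):
    """Monte Carlo estimate of the type I error of the global Bonferroni test.

    Assumption (illustrative): K equicorrelated standard-normal z statistics
    with common correlation rho >= 0, generated under the global null.
    """
    rng = random.Random(seed)
    # Two-sided critical value: |z_k| >= crit is equivalent to p_k <= alpha / K.
    crit = NormalDist().inv_cdf(1 - alpha / (2 * K))
    a, b = rho ** 0.5, (1 - rho) ** 0.5
    rejections = 0
    for _ in range(n_sim):
        shared = rng.gauss(0, 1)  # common factor that induces correlation rho
        z = [a * shared + b * rng.gauss(0, 1) for _ in range(K)]
        # The global test rejects when the largest |z_k| crosses the threshold,
        # i.e. when the smallest p-value is at most alpha / K.
        if max(abs(v) for v in z) >= crit:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    for K in (2, 5):
        for rho in (0.0, 0.5, 0.9):
            print(f"K={K}, rho={rho}: {bonferroni_type1_error(K, rho):.4f}")
```

Under these assumptions the estimate should sit close to α for rho = 0 and drop as rho grows, which is the pattern described above. The exact figures depend on the seed and on the number of replications.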
Example: Global Bonferroni test for fMRI and EEG study data
We compute the values of the t-test statistics and the corresponding two-sided p-values (12 degrees of freedom) at each of the 11 ROIs.
Table 3.1: The local t- and p-values for the observations collected at each ROI used in the fMRI study.
ROI     AC      A      C     DL     GP      I    OFC      P     SA      T     VS
tk    0.03  -0.72   1.80   1.28   1.55   0.21   0.76   0.59   1.14   1.10   1.46
pk    0.98   0.48   0.10   0.23   0.15   0.84   0.46   0.57   0.28   0.29   0.17
The smallest p-value, p(1) = p3 = 0.10, is clearly larger than α/K ≈ 0.0045 for α = 0.05, K = 11, and thus the Bonferroni method fails to reject H0.
Similarly, in the next table, we present the t- and p-values of the theta frequency observations recorded at each channel used in the EEG depression study.
Table 3.2: The local t- and p-values for the observations collected at each channel used in the EEG depression study.
ch.      3      4      5      6      7      8     17     18     19
tk    1.29   1.97   1.92   2.22   1.63   1.80   1.90   1.24   1.84
pk    0.22   0.06   0.07   0.04   0.12   0.09   0.07   0.23   0.08
The smallest p-value is p(1) = p4 = 0.04 (channel 6), which exceeds α/K ≈ 0.0056 for α = 0.05, K = 9, and thus the Bonferroni method fails to reject H0.
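The two decisions above can be checked with a few lines of code. This is a minimal sketch of rule (3.6) applied to the rounded p-values printed in Tables 3.1 and 3.2. The function name `global_bonferroni` and the variable names are illustrative, not part of the original analysis.

```python
def global_bonferroni(p_values, alpha=0.05):
    """Global Bonferroni test (3.6): reject H0 iff the smallest p-value <= alpha / K.

    Returns the decision together with p(1) and the threshold alpha / K.
    """
    K = len(p_values)
    threshold = alpha / K
    p_min = min(p_values)
    return p_min <= threshold, p_min, threshold


# Rounded p-values copied from Table 3.1 (fMRI, K = 11) and Table 3.2 (EEG, K = 9).
p_fmri = [0.98, 0.48, 0.10, 0.23, 0.15, 0.84, 0.46, 0.57, 0.28, 0.29, 0.17]
p_eeg = [0.22, 0.06, 0.07, 0.04, 0.12, 0.09, 0.07, 0.23, 0.08]

for name, p in (("fMRI", p_fmri), ("EEG", p_eeg)):
    reject, p_min, threshold = global_bonferroni(p)
    print(f"{name}: p(1) = {p_min:.2f}, alpha/K = {threshold:.4f}, reject H0: {reject}")
```

Because the tabulated p-values are rounded to two decimals, the sketch reproduces the decisions but not the exact p-values of the original computation.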
A number of modifications of the Bonferroni method exist in the literature. The global test of Simes [1986] rejects H0 if and only if p(k) ≤ kα/K for at least one k, k = 1, 2, ..., K. This test does not rely as heavily on the smallest p-value, and it is less conservative and more efficient than the Bonferroni method. Further, despite the slight increase in computation, it is still very easy and convenient to apply. However, Simes’ global test does not always control the type I error. Simes [1986] proved analytically that his test controls the type I error for independent outcomes, and showed through simulations that the type I error is also controlled for specific correlation structures under various distributions, including the multivariate normal. Hommel [1988] also proposed two p-value adjustment methods, which control the type I error and are less conservative than the Bonferroni method but more conservative than Simes’ global test [D’Agostino and Russell, 2005].
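Simes' rule is equally short to implement. The sketch below is an illustration rather than the original analysis: the function name `global_simes` is a hypothetical choice, the vector `p_toy` is made-up data chosen to show a case where Simes rejects while Bonferroni does not, and the remaining inputs are the rounded p-values from Tables 3.1 and 3.2.

```python
def global_simes(p_values, alpha=0.05):
    """Simes global test: reject H0 iff p(k) <= k * alpha / K for at least one k."""
    K = len(p_values)
    ordered = sorted(p_values)  # p(1) <= p(2) <= ... <= p(K)
    return any(p <= k * alpha / K for k, p in enumerate(ordered, start=1))


# Hypothetical p-values (not from the studies): with alpha = 0.05 and K = 3,
# Bonferroni needs p(1) <= 0.0167, whereas Simes also accepts p(2) <= 0.0333.
p_toy = [0.02, 0.03, 0.04]
print("toy, Bonferroni rejects:", min(p_toy) <= 0.05 / len(p_toy))
print("toy, Simes rejects:", global_simes(p_toy))

# Rounded p-values copied from Table 3.1 (fMRI) and Table 3.2 (EEG).
p_fmri = [0.98, 0.48, 0.10, 0.23, 0.15, 0.84, 0.46, 0.57, 0.28, 0.29, 0.17]
p_eeg = [0.22, 0.06, 0.07, 0.04, 0.12, 0.09, 0.07, 0.23, 0.08]
print("fMRI, Simes rejects:", global_simes(p_fmri))
print("EEG, Simes rejects:", global_simes(p_eeg))
```

On the rounded table values, no ordered p-value falls below its threshold kα/K, so this sketch does not reject H0 for either data set. Any Bonferroni rejection is also a Simes rejection, since the k = 1 condition of Simes' rule is exactly (3.6).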
The above methods completely ignore correlations and thus they all become conservative when correlations are high. Some p-value adjustment methods accounting for correlations exist in the literature (for example James [1991], random field theory [Friston et al., 2007] and non-parametric [Westfall and Young, 1993] methods), but these tend to require complex calculations and they often rely on assumptions for specific observation structures in order to be efficient and/or to maintain the type I error.
More generally, any multiple testing method can be re-written as a global test, because rejection of a single local null hypothesis implies rejection of the global null hypothesis. However, multiple testing methods treat local outcomes as independent entities rather than as components of a multivariate observation, and they focus on detecting one or a few distinct local effects. Hence, they are more appropriate when we are interested in assessing local rather than global effects [D’Agostino and Russell, 2005; Dmitrienko et al., 2010]. Since these characteristics do not fit our motivating application, we are driven to the multivariate tests considered next.