4.6 Prior results on the Latent Roots of sample covariance matrix
4.6.4 Determining whether the statistic is a pivot
For bootstrap analysis of any test statistic, the only requirement is that its distribution under the null should not be affected by any model-specific features; see MacKinnon [2002] and MacKinnon [2006] for a philosophical discussion of bootstrap inference. That is, we do not need to know the true model structure to identify the distribution of the test statistic under the null. For instance, suppose we have an unbiased estimate of the mean of a random variable and of its variance.
If the variable is IID normally distributed, then the distribution of the mean divided by the square root of the variance, under the assumption that the mean is zero, can be determined by drawing random samples of the same length as the data from a distribution with zero mean and variance equal to the sample variance, and constructing the same ratio for each sample. Inference is then based on the 100α percentiles as needed. Furthermore, as the series is presumed to be normal, we can resample the series with replacement, subtract the resampled mean and compute the resampled variance, to give a distribution of the test statistic. In both cases the test statistic is generated under the null of the mean being zero. In both cases the resampling is normally distributed, in the first case by construction and in the second case using the presumed properties of the actual data.
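The two resampling schemes described above can be sketched as follows. This is a minimal illustration and not the procedure used elsewhere in this chapter: the function names, the seed, the sample size and the number of draws are all assumptions made for the example. The centring step in the second scheme is one reading of "subtract the resampled mean", namely that the series is recentred at its sample mean before resampling so that the null of a zero mean holds.

```python
import math
import random
import statistics


def ratio(x):
    """Sample mean divided by the square root of the sample variance."""
    return statistics.fmean(x) / math.sqrt(statistics.variance(x))


def parametric_null(x, draws, rng):
    """Scheme 1: draw normal samples with zero mean and the sample variance."""
    n, s = len(x), statistics.stdev(x)
    return [ratio([rng.gauss(0.0, s) for _ in range(n)]) for _ in range(draws)]


def resampling_null(x, draws, rng):
    """Scheme 2: resample the recentred series with replacement.

    Assumption: the data are centred at the sample mean first, which
    imposes the null of a zero mean on the resampling distribution.
    """
    centre = statistics.fmean(x)
    centred = [v - centre for v in x]
    return [ratio(rng.choices(centred, k=len(centred))) for _ in range(draws)]


def upper_percentile(null_draws, alpha):
    """Empirical upper 100*alpha percent point of the simulated statistics."""
    ordered = sorted(null_draws)
    index = min(len(ordered) - 1, math.ceil((1.0 - alpha) * len(ordered)) - 1)
    return ordered[index]


# Hypothetical data: 50 normal observations with a non-zero mean.
rng = random.Random(0)
data = [rng.gauss(0.3, 2.0) for _ in range(50)]
observed = ratio(data)
crit_parametric = upper_percentile(parametric_null(data, 500, rng), 0.05)
crit_resampled = upper_percentile(resampling_null(data, 500, rng), 0.05)
```

A one-sided test at level 0.05 would compare `observed` with either critical value; under normality the two critical values should be close to each other.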
Unfortunately, bootstrapping a z-score is one of the few tests where the derivation of the pivot under the null is simple enough to illustrate without recourse to the underlying stochastic properties. Theorems 9 and 10 are re-derived from Muirhead [1982] in our notation to show the pivotal features. The intention of Muirhead [1982] was to derive the Neyman-Pearson form of the statistic, but the derivation is useful here because it serves the same purpose for the bootstrap.
Theorem 8 (Muirhead [1982]). When the null hypothesis Hk is true, the limiting distribution, as n → ∞, of the statistic
Pk = −
Proof of Theorem 8 [Muirhead, 1982].
\[
E_c(T_k) = -\left.\frac{\partial}{\partial h}\, E_c\!\left(e^{-hT_k}\right)\right|_{h=0}
\]
We can exchange the order of differentiation and integration (see Apostol [1969, Volume 2, Chapter 14]) because in a neighborhood of h = 0
h−1 1 −
This can obviously be done by finding
EN
Since Pm and finally, to get the expectation of quadratic variation we need to decompose the following expectation:
This problem is addressed in the following lemma using the seminal result of Balakrishnan [2006, Volume 2, Chapter 4].
where
q, from which it follows easily that
E(¯lqr) = 1 where we have used the fact that
m
Now
the accumulation of second-order (2×2) principal minors of S. Since the principal minors all give the same expectation, I need only to find the expectation involving the first one to find the bound
∆ = det
We can then multiply by
upper-triangular matrix and by construction t211 are independent χ2n0
−i+1 random
and
=
which completes the proof of Theorem 9.
Let us now explain the proof of Theorem 8:
Ec(e−hTk) = θ(h)
Substitution gives
\[
E_0(T_k) = \frac{d}{n - k - \dfrac{2q^2 + q + 2}{6q}} - \frac{\alpha d}{n^2} + O(n^{-3}) \tag{4.137}
\]
from which it follows that if $P_k$ is the statistic defined in Theorem 8, then
\[
E_c(P_k) = d + O(n^{-2}) \tag{4.138}
\]
and the proof is complete.
It follows from Theorem 3 that if $n$ is large an approximate test of size $\alpha$ of the null hypothesis
\[
H_k : \lambda_{k+1} = \cdots = \lambda_m \tag{4.139}
\]
is to reject $H_k$ if $P_k > c(\alpha; (q+2)(q-1)/2)$, where $P_k$ is given, $q = m - k$ and $c(\alpha; r)$ is the upper $100\alpha\%$ point of the $\chi^2_r$ distribution. An estimate of $\lambda$, the common value of these roots under $H_k$, is provided by
\[
\bar{l}_q = q^{-1} \sum_{i=k+1}^{m} l_i \tag{4.140}
\]
and it is easy to show, for example, that as $n \to \infty$ the asymptotic distribution of $(\tfrac{1}{2}nq)^{1/2}(\bar{l}_q - \lambda)/\lambda$ is standard normal $N(0, 1)$. Let $z_\alpha$ be the upper $100\alpha\%$ point of the $N(0, 1)$ distribution, that is, such that $\Phi(z_\alpha) = 1 - \alpha$, where $\Phi(\cdot)$ denotes the standard normal distribution function. Then asymptotically,
\[
P\left( \left(\frac{nq}{2}\right)^{1/2} \frac{\bar{l}_q - \lambda}{\lambda} \ge -z_\alpha \right) = 1 - \alpha \tag{4.141}
\]
which leads to a one-sided confidence interval for λ, namely,
\[
\lambda \le \frac{\bar{l}_q}{1 - z_\alpha (2/nq)^{1/2}} \tag{4.142}
\]
with asymptotic confidence coefficient 1 − α. If the upper limit of this confidence interval is sufficiently small we might decide that λ is negligible and study only the first k principal components.
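As a numerical illustration of the bound in (4.142), the following sketch computes the mean of the q smallest sample roots and the resulting upper confidence limit. The function names and the example roots are hypothetical. The roots are assumed to be supplied in decreasing order, and the bound is only meaningful when the denominator is positive, which the code checks explicitly.

```python
import math
from statistics import NormalDist


def mean_small_roots(roots, k):
    """Average of the last q = m - k sample latent roots, as in (4.140)."""
    tail = roots[k:]
    return sum(tail) / len(tail)


def upper_bound_lambda(roots, k, n, alpha):
    """Asymptotic upper confidence limit for lambda, as in (4.142).

    roots : sample latent roots l_1 >= ... >= l_m (assumed sorted)
    k     : number of leading roots excluded from the average
    n     : the sample-size quantity n used in the text
    alpha : one minus the asymptotic confidence coefficient
    """
    q = len(roots) - k
    l_bar = mean_small_roots(roots, k)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # Phi(z_alpha) = 1 - alpha
    denom = 1.0 - z_alpha * math.sqrt(2.0 / (n * q))
    if denom <= 0.0:
        raise ValueError("n * q is too small for this bound at the chosen alpha")
    return l_bar / denom


# Hypothetical sample roots: two large roots and three small ones.
example_roots = [5.0, 2.0, 0.11, 0.10, 0.09]
l_bar = mean_small_roots(example_roots, k=2)
bound = upper_bound_lambda(example_roots, k=2, n=200, alpha=0.05)
```

With these illustrative numbers the average of the small roots is 0.1 and the upper limit is a little above 0.11, so the bound widens the point estimate by roughly ten percent.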
Even if we cannot conclude that some of the smallest latent roots of Σ are equal, it still may be possible that the variation explained by the last q = m − k principal components, namely $\sum_{i=k+1}^{m} \lambda_i$, is small compared with the total variation $\sum_{i=1}^{m} \lambda_i$, in which case we might decide to study only the first k principal components. Thus it is of interest to consider the null hypothesis
\[
H_k^{*} : \frac{\sum_{i=k+1}^{m} \lambda_i}{\sum_{i=1}^{m} \lambda_i} = h \tag{4.143}
\]
where $h$ ($0 < h < 1$) is a number to be specified by the experimenter. This can be tested using the statistic
Mk ≡
Assuming the latent roots of Σ are distinct, Corollary 2 shows that the limiting distribution as n → ∞ of
is normal with mean 0 and variance approximate test of Hk∗ and to give confidence intervals for
m
Finally, let me derive an asymptotic test for a given principal component. Let $H^{**}$ be the null hypothesis that the vector of coefficients $h_1$ of the first principal component is equal to a specified $m \times 1$ vector $h_1^0$, i.e.,
\[
H^{**} : h_1 = h_1^0, \qquad h_1^{0\prime} h_1^0 = 1 \tag{4.148}
\]
Recall that $h_1$ is the eigenvector of Σ corresponding to the largest latent root $\lambda_1$; we will assume that $\lambda_1$ is a distinct root. A test of $H^{**}$ can be constructed using the result of Theorem 7, namely, that if $q_1$ is the normalized eigenvector of the sample covariance matrix $S$ corresponding to the largest latent root $l_1$ of $S$ then the asymptotic distribution of $y = n^{1/2}(q_1 - h_1)$ is $N_m(0, \Gamma)$, where
with $H_2 = [h_2 \ldots h_m]$ and
Note that the covariance matrix $\Gamma$ in this asymptotic distribution is singular, as is to be expected. Put $z = B^{-1} H_2' y$; then the limiting distribution of $z$ is $N_{m-1}(0, I_{m-1})$, and hence the limiting distribution of $z'z$ is $\chi^2_{m-1}$. Now note that
\[
z'z = y' H_2 B^{-2} H_2' y \tag{4.151}
\]
and the matrix of this quadratic form in $y$ is
\[
H_2 B^{-2} H_2'
\]
Putting $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_m)$ and using
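Hence the limiting distribution of
\[
n (q_1 - h_1)' \left( \lambda_1 \Sigma^{-1} - 2I + \lambda_1^{-1} \Sigma \right) (q_1 - h_1)
\]
is $\chi^2_{m-1}$. Since $S$, $S^{-1}$, and $l_1$ are consistent estimates of $\Sigma$, $\Sigma^{-1}$, and $\lambda_1$, they can be substituted for $\Sigma$, $\Sigma^{-1}$, and $\lambda_1$ without affecting the limiting distribution.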
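Hence, when $H^{**} : h_1 = h_1^0$ is true, the limiting distribution of
\[
W = n (q_1 - h_1^0)' \left( l_1 S^{-1} - 2I + \frac{1}{l_1} S \right) (q_1 - h_1^0)
  = n \left( l_1 h_1^{0\prime} S^{-1} h_1^0 + \frac{1}{l_1} h_1^{0\prime} S h_1^0 - 2 \right)
\]
is $\chi^2_{m-1}$. It follows that a test of $H^{**}$ of asymptotic size $\alpha$ is to reject $H^{**}$ if $W > c(\alpha; m - 1)$, where $c(\alpha; m - 1)$ is the upper $100\alpha\%$ point of the $\chi^2_{m-1}$ distribution.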
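The statistic W can be computed directly from a sample covariance matrix. The sketch below evaluates both expressions for W and is meant only as a numerical check that they agree. The function name, the simulated data and the choice n = 200 are assumptions made for the example, and the comparison with the critical value c(α; m − 1) is left to the reader. Because the middle matrix annihilates $q_1$, the arbitrary sign of the computed eigenvector does not affect W.

```python
import numpy as np


def w_statistic(S, h0, n):
    """Return W computed as a quadratic form and in its closed form.

    S  : m x m sample covariance matrix (symmetric, positive definite)
    h0 : hypothesised first principal component; normalised to unit length here
    n  : the sample-size quantity n used in the text
    """
    h0 = np.asarray(h0, dtype=float)
    h0 = h0 / np.linalg.norm(h0)            # enforce h0' h0 = 1
    eigvals, eigvecs = np.linalg.eigh(S)    # eigenvalues in ascending order
    l1 = eigvals[-1]                        # largest sample latent root
    q1 = eigvecs[:, -1]                     # corresponding unit eigenvector
    S_inv = np.linalg.inv(S)
    # Middle matrix l1 * S^{-1} - 2I + S / l1; note that it maps q1 to zero.
    middle = l1 * S_inv - 2.0 * np.eye(S.shape[0]) + S / l1
    diff = q1 - h0
    w_quadratic = n * (diff @ middle @ diff)
    w_closed = n * (l1 * (h0 @ S_inv @ h0) + (h0 @ S @ h0) / l1 - 2.0)
    return w_quadratic, w_closed


# Hypothetical data whose first principal axis is the first coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(201, 3)) * np.array([3.0, 1.0, 0.5])
S = np.cov(X, rowvar=False)                 # divisor N - 1 = 200
w_quad, w_closed = w_statistic(S, [1.0, 0.0, 0.0], n=200)
```

In this simulated example the null hypothesis is true by construction, so W should behave like a draw from the $\chi^2_2$ distribution.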