Leave-One-Out Perturbations - Efficient cross-validatory computations and influence mea

The downdated principal component matrix $Q_{(i)}$ for the deletion of the $i$th datum is obtained from the orthogonal eigendecomposition of the downdated cross-products matrix $C_{(i)} = X_{(i)}^{T} X_{(i)}$, where

\[
X_{(i)} =
\begin{pmatrix}
x_1 - \bar{x}_{(i)} \\
\vdots \\
x_{i-1} - \bar{x}_{(i)} \\
x_{i+1} - \bar{x}_{(i)} \\
\vdots \\
x_n - \bar{x}_{(i)}
\end{pmatrix},
\]

with $\bar{x}_{(i)}$ the mean of the leave-one-out downdated predictor data. We have

\[
C_{(i)} = C - \nu\, x_i^{T} x_i,
\]

with $\nu = n/(n-1)$ [Seb84, page 15] and hence, employing the principal component decomposition $C = Q E Q^{T}$, the leave-one-out modification problem reduces to the eigendecomposition of the matrix

\[
E - \rho\, z^{T} z,
\]

where $z = x_i Q / \|x_i Q\|$ and $\rho = \nu \|x_i Q\|^{2} = \nu \|x_i\|^{2}$.
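The rank-one identity above can be checked numerically. The following is a minimal sketch, assuming that $x_i$ denotes the $i$th row of the column-centred data matrix and that $Q$ holds all eigenvectors of $C$; the data, sizes and variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, i = 8, 3, 2                      # hypothetical sizes and left-out row (0-based)

raw = rng.normal(size=(n, p))
X = raw - raw.mean(axis=0)             # column-centred predictor data
C = X.T @ X                            # full cross-products matrix

# Direct route: delete row i and re-centre with the leave-one-out mean.
rest = np.delete(raw, i, axis=0)
X_i = rest - rest.mean(axis=0)
C_direct = X_i.T @ X_i

# Rank-one route: C_(i) = C - nu * x_i^T x_i, with x_i the centred row.
nu = n / (n - 1)
x_i = X[i:i + 1, :]                    # kept as a 1 x p row vector
C_rank_one = C - nu * (x_i.T @ x_i)

# Reduction to E - rho z^T z in the principal component basis.
e, Q = np.linalg.eigh(C)
e, Q = e[::-1], Q[:, ::-1]             # eigenvalues in descending order
t = x_i @ Q                            # scores of the left-out row
rho = nu * np.sum(t ** 2)
z = t / np.linalg.norm(t)
M = np.diag(e) - rho * (z.T @ z)

print(np.allclose(C_direct, C_rank_one))      # both routes agree
print(np.allclose(Q @ M @ Q.T, C_direct))     # Q (E - rho z^T z) Q^T = C_(i)
```

Because $Q$ is orthogonal, `rho` computed from the scores equals $\nu \|x_i\|^2$, which is the second expression for $\rho$ in the text.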

From the principal component decomposition

\[
E - \rho\, z^{T} z = V E_{(i)} V^{T},
\]

we obtain the downdated matrix $E_{(i)}$ with the eigenvalues $e_{1(i)}, \ldots, e_{r(i)}$ on the diagonal. The downdated principal components may then be derived from

\[
Q_{(i)} = Q V.
\]

This determines the framework for leave-one-out principal component downdating. The downdating principal components $V$ will be computed in two stages. With this terminology we distinguish the components $V$, which are applied to downdate $Q$, from the downdated matrix $Q_{(i)}$ itself. The downdated principal component eigenvalues are computed first. The principal components $V$ are then obtained as a function of the downdated eigenvalues $e_{1(i)}, \ldots, e_{r(i)}$, the eigenvalues $e_1, \ldots, e_r$ and the principal components $Q$.
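As an illustration of this framework, the sketch below obtains the downdating components $V$ from the small matrix $E - \rho z^{T} z$ and checks that $QV$ diagonalises the leave-one-out cross-products matrix. It deliberately swaps in a dense eigensolver (`numpy.linalg.eigh`) for the two-stage computation described in the text, so it shows only the algebraic relation, not the efficient algorithm. The data and names are hypothetical, and $x_i$ is assumed to be the centred row.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, i = 10, 4, 3                     # hypothetical sizes and left-out row (0-based)

raw = rng.normal(size=(n, p))
X = raw - raw.mean(axis=0)
e, Q = np.linalg.eigh(X.T @ X)
e, Q = e[::-1], Q[:, ::-1]             # descending eigenvalues e_1 >= ... >= e_r

t = X[i] @ Q                           # scores of the left-out observation
nu = n / (n - 1)
rho = nu * (t @ t)
z = t / np.linalg.norm(t)

# Eigendecomposition of the r x r matrix E - rho z^T z = V E_(i) V^T.
e_i, V = np.linalg.eigh(np.diag(e) - rho * np.outer(z, z))
e_i, V = e_i[::-1], V[:, ::-1]

Q_i = Q @ V                            # downdated principal components

# Reference: cross-products matrix recomputed without observation i.
rest = np.delete(raw, i, axis=0)
X_i = rest - rest.mean(axis=0)
C_i = X_i.T @ X_i

print(np.allclose(Q_i @ np.diag(e_i) @ Q_i.T, C_i))
```

The eigenvectors agree with a direct decomposition of `C_i` only up to sign, which is why the check compares the reassembled matrix rather than the columns themselves.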

Deflation

It is important that we study those situations for which the downdating problem is trivial, so as to identify a well-behaved problem. This will also reduce the workload in any implementation, as those components that do not downdate may be removed from the computations immediately.

We may always assume that $\rho$ is strictly positive. Indeed, $\rho = 0$ is equivalent to $z = 0$, which corresponds to the trivial downdating problem in which none of the principal component eigenvalues and principal components are downdated. The geometric interpretation of this phenomenon is that the cross-validated observation is situated at the mean of the data. Hence this observation does not contribute to the sum of squared projections of the data on the principal component directions and, as a consequence, it cannot influence any of the principal component directions.
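A small numerical illustration of this trivial case, as a sketch with hypothetical data: one observation is placed exactly at the mean of the remaining observations, so its centred row vanishes and the cross-products matrix is unaffected by its deletion.

```python
import numpy as np

rng = np.random.default_rng(2)
raw = rng.normal(size=(6, 3))          # hypothetical data
i = 4                                  # left-out row (0-based)

# Put observation i at the mean of the other observations.
raw[i] = np.delete(raw, i, axis=0).mean(axis=0)

n = raw.shape[0]
X = raw - raw.mean(axis=0)             # centred data; row i is (numerically) zero
C = X.T @ X

rest = np.delete(raw, i, axis=0)
X_i = rest - rest.mean(axis=0)
C_i = X_i.T @ X_i                      # leave-one-out cross-products matrix

rho = n / (n - 1) * (X[i] @ X[i])      # rho = nu * ||x_i||^2

print(np.isclose(rho, 0.0), np.allclose(C, C_i))
```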

The second situation is that in which $z_l = 0$ for some $1 \le l \le r$. In this case $e_l = e_{l(i)}$, since

\[
(E - \rho\, z^{T} z)\, g_l = E g_l = e_l g_l,
\]

where $g_l = (0, \ldots, 0, 1, 0, \ldots, 0)^{T}$ is the standard basis vector, and hence, $v_l = g_l$. This follows as the ordering of the downdated components is unaffected, as may be seen from the present deflation algebra and theorem …

If $z_l = 0$ for some $1 \le l < m \le r$, then we can find a permutation $P(l, m)$ of elements,

\[
P(l, m) =
\begin{pmatrix}
1 &        &        &        &        &        &   \\
  & \ddots &        &        &        &        &   \\
  &        & 0      & \cdots & 1      &        &   \\
  &        & \vdots & \ddots & \vdots &        &   \\
  &        & 1      & \cdots & 0      &        &   \\
  &        &        &        &        & \ddots &   \\
  &        &        &        &        &        & 1
\end{pmatrix},
\]

with the two off-diagonal ones in positions $(l, m)$ and $(m, l)$, that interchanges the positions of $z_l$ and $z_m$ in $z$, while leaving all other elements of $z$ unchanged. This amounts to a permutation of the principal component directions, and hence, the same permutation must also be applied to the matrix of principal component coefficients $Q$ as well as to the diagonal matrix of eigenvalues $E$. We have that

\[
P(l, m)^{T} E\, P(l, m)
\]

is a diagonal matrix. As a consequence, all zero elements of $z$ may be permuted to the ‘bottom’ of the eigensystem.
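The zero-score case and the permutation step can be sketched as follows. The eigenvalues, scores and the helper `swap_permutation` are hypothetical and serve only to illustrate the algebra; indices in the code are 0-based.

```python
import numpy as np

def swap_permutation(r, l, m):
    """Identity matrix of order r with columns l and m interchanged."""
    P = np.eye(r)
    P[:, [l, m]] = P[:, [m, l]]
    return P

e = np.array([5.0, 3.0, 2.0, 1.0])     # hypothetical eigenvalues
z = np.array([0.6, 0.0, 0.8, 0.0])     # z[1] = 0: component 1 does not downdate
rho = 1.0
E = np.diag(e)
M = E - rho * np.outer(z, z)

# g is the standard basis vector for the zero component: M g = e_l g.
g = np.zeros(4)
g[1] = 1.0
print(np.allclose(M @ g, e[1] * g))

# Swap components 1 and 2 so that all zeros of z sit at the bottom.
P = swap_permutation(4, 1, 2)
z_perm = z @ P                         # zeros now in the last two positions
E_perm = P.T @ E @ P                   # still diagonal, entries swapped

# Applying the same permutation to Q leaves Q (E - rho z^T z) Q^T unchanged.
Q = np.linalg.qr(np.random.default_rng(0).normal(size=(4, 4)))[0]
QP = Q @ P
print(np.allclose(QP @ (E_perm - rho * np.outer(z_perm, z_perm)) @ QP.T,
                  Q @ M @ Q.T))
```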

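The last case is that for which $e_l = e_m$ for some $1 \le l < m \le r$. In this case we can find a rotation $J(l, m, \theta)$ of elements,

\[
J(l, m, \theta) =
\begin{pmatrix}
1 &        &               &        &              &        &   \\
  & \ddots &               &        &              &        &   \\
  &        & \cos(\theta)  & \cdots & \sin(\theta) &        &   \\
  &        & \vdots        & \ddots & \vdots       &        &   \\
  &        & -\sin(\theta) & \cdots & \cos(\theta) &        &   \\
  &        &               &        &              & \ddots &   \\
  &        &               &        &              &        & 1
\end{pmatrix},
\]

with the trigonometric entries in rows and columns $l$ and $m$, such that one of the components $l$ and $m$ of

\[
z\, J(l, m, \theta)
\]

is zero. As for the previous case, this rotation must be applied to the diagonal matrix of eigenvalues $E$ and to the matrix of eigenvectors $Q$. As for the previous case, we find that

\[
J(l, m, \theta)^{T} E\, J(l, m, \theta)
\]

is diagonal. We may now apply the previous analysis and permute the rotated dimension to the ‘bottom’ of the eigensystem.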
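The rotation step can be illustrated with the short sketch below. It assumes one particular sign convention for the rotation (sine in position $(l, m)$, minus sine in position $(m, l)$) and chooses the angle so that component $m$ of the rotated score vector vanishes; both choices, the helper name `rotation` and the numbers are assumptions made for illustration. Indices are 0-based.

```python
import numpy as np

def rotation(r, l, m, theta):
    """Plane rotation of order r acting on coordinates l and m."""
    J = np.eye(r)
    c, s = np.cos(theta), np.sin(theta)
    J[l, l] = c
    J[m, m] = c
    J[l, m] = s
    J[m, l] = -s
    return J

e = np.array([4.0, 2.0, 2.0, 1.0])     # e[1] == e[2]: repeated eigenvalue
z = np.array([0.5, 0.5, 0.5, 0.5])     # hypothetical unit score vector
l, m = 1, 2

# Angle chosen so that (z J)[m] = sin(theta) z[l] + cos(theta) z[m] = 0.
theta = np.arctan2(-z[m], z[l])
J = rotation(4, l, m, theta)

zeta = z @ J                           # component m is rotated to zero
E = np.diag(e)
E_rot = J.T @ E @ J                    # unchanged, since e[l] == e[m]

print(np.isclose(zeta[m], 0.0), np.allclose(E_rot, E))
```

Because the rotation acts inside an eigenspace of $E$ with a repeated eigenvalue, $J^{T} E J$ equals $E$ itself, and the weight of both score components is collected in `zeta[l]`.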

From a repeated application of the above steps we may find an orthogonal transformation matrix $T$ such that $T^{T} E T = \mathrm{diag}(d_1, \ldots, d_r)$ with $d_1 > \cdots > d_k$ and $\zeta = zT$ with $\zeta_{k+1} = \cdots = \zeta_r = 0$. We have shown that the downdating of the last $r - k$ components of this transformed eigensystem is trivial. The downdating of the first $k$ components is a subproblem of the same type as before, but with distinct eigenvalues and no zero principal component scores for the left-out observation. Hence, we may restrict ourselves to the decomposition of such eigensystems.
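The repeated application of rotations and permutations can be sketched as a single routine. This is a minimal illustration under stated assumptions, not the implementation used in the source: the function name `deflate`, the tolerance `tol`, the rotation sign convention and the test data are all hypothetical, and the eigenvalues are assumed to be supplied in descending order.

```python
import numpy as np

def deflate(e, z, tol=1e-12):
    """Return (d, zeta, T, k) with T^T diag(e) T = diag(d) and zeta = z T.

    The first k entries of d are distinct with non-zero scores zeta[:k];
    the remaining scores zeta[k:] are zero up to the tolerance.
    """
    e = np.asarray(e, dtype=float)
    zeta = np.asarray(z, dtype=float).copy()
    r = len(e)
    T = np.eye(r)
    # Equal eigenvalues: rotate within the pair (l, m) so that zeta[m] vanishes.
    for l in range(r):
        for m in range(l + 1, r):
            if abs(e[l] - e[m]) <= tol and abs(zeta[m]) > tol:
                h = np.hypot(zeta[l], zeta[m])
                c, s = zeta[l] / h, -zeta[m] / h
                J = np.eye(r)
                J[l, l] = J[m, m] = c
                J[l, m], J[m, l] = s, -s
                zeta = zeta @ J
                T = T @ J
    # Zero scores: permute them to the bottom, keeping the order of the rest.
    nonzero = np.abs(zeta) > tol
    order = np.concatenate([np.flatnonzero(nonzero), np.flatnonzero(~nonzero)])
    P = np.eye(r)[:, order]
    return e[order], zeta @ P, T @ P, int(nonzero.sum())

e = np.array([5.0, 3.0, 3.0, 2.0, 1.0])     # repeated eigenvalue 3
z = np.array([0.5, 0.5, 0.5, 0.0, 0.5])     # one zero score
rho = 0.8
d, zeta, T, k = deflate(e, z)

# The spectrum of E - rho z^T z splits into the k x k subproblem plus d[k:].
full = np.linalg.eigvalsh(np.diag(e) - rho * np.outer(z, z))
sub = np.linalg.eigvalsh(np.diag(d[:k]) - rho * np.outer(zeta[:k], zeta[:k]))
print(k, np.allclose(np.sort(full), np.sort(np.concatenate([sub, d[k:]]))))
```

In this example two of the five components deflate, so only a subproblem of order `k = 3` with distinct eigenvalues and non-zero scores remains.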

The eigensystem of $E - \rho\, z^{T} z$

We will assume that all eigenvalues are distinct, such that $e_1 > \cdots > e_r$ and that $z$ has no zero components.

Bunch, Nielsen and Sorensen [BNS78] considered the following theorem for the computation of the matrix of downdating principal components $V$.

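Theorem 2.1 If $A$ is a square invertible real $n$ by $n$ matrix, $u$ and $v$ are real $n$ by 1 vectors and $\rho$ is a real number different from zero, then the statements

\[
(A + \rho\, u v^{T})\, x = b
\]

and

\[
\begin{pmatrix} A & u \\ v^{T} & -\rho^{-1} \end{pmatrix}
\begin{pmatrix} x \\ \theta \end{pmatrix}
=
\begin{pmatrix} b \\ 0 \end{pmatrix}
\]

are equivalent, with

\[
x = A^{-1} b - \theta A^{-1} u, \qquad \left( \rho^{-1} + v^{T} A^{-1} u \right) \theta = v^{T} A^{-1} b.
\]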
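The equivalence between a rank-one modified system and a bordered system can be checked numerically. The sketch below assumes the bordered form with second block row $(v^{T}, -\rho^{-1})$, which gives $\theta = \rho\, v^{T} x$; the matrices and vectors are hypothetical test data.

```python
import numpy as np

# Hypothetical, well-conditioned test data.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
u = np.array([[1.0], [0.0], [1.0]])
v = np.array([[0.0], [1.0], [1.0]])
b = np.array([[1.0], [2.0], [3.0]])
rho = 0.5
n = A.shape[0]

# Statement 1: solve the rank-one modified system directly.
x_direct = np.linalg.solve(A + rho * (u @ v.T), b)

# Statement 2: solve the bordered (n + 1) x (n + 1) system for (x, theta).
K = np.block([[A, u], [v.T, np.array([[-1.0 / rho]])]])
sol = np.linalg.solve(K, np.vstack([b, [[0.0]]]))
x_bordered, theta = sol[:n], sol[n, 0]

# Closed form: x = A^{-1} b - theta A^{-1} u with
# (1/rho + v^T A^{-1} u) theta = v^T A^{-1} b.
Ab = np.linalg.solve(A, b)
Au = np.linalg.solve(A, u)
theta_formula = (v.T @ Ab).item() / (1.0 / rho + (v.T @ Au).item())
x_formula = Ab - theta_formula * Au

print(np.allclose(x_direct, x_bordered), np.allclose(x_direct, x_formula))
```

Only solves with $A$ itself are needed in the closed form, which is the practical appeal of such a result when $A$ is diagonal, as it is for the eigenvalue matrix $E$.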

In document Efficient cross-validatory computations and influence measures for principal component and partial least squares decompositions with applications in chemometrics (Page 32-36)