
1.6 Some needed linear algebra results

1.6.7 Projection

All projections are defined in terms of inner products. Therefore, before discussing projections, the definition and characteristics of an inner product must be considered. Every inner product on R^p can be expressed as a bilinear form. A bilinear form, on the other hand, only qualifies as an inner product if the matrix of the bilinear form is symmetric and positive definite. Consider two vectors, a and b, in R^p. The function a′Mb is called a bilinear form in a and b, and the matrix M is called the matrix of the bilinear form. Only when the matrix M is symmetric and positive definite does the bilinear form a′Mb qualify as an inner product for R^p, because only then are the following four conditions satisfied for all vectors a, b and g in R^p and every scalar k:

a′Mb = b′Ma,

a′Ma > 0 for a ≠ 0,

(ka)′Mb = k(a′Mb), and

(a+g)′Mb = a′Mb + g′Mb.

Note that in this thesis all positive definite matrices are considered to be symmetric. Hence, the requirement for a bilinear form to qualify as an inner product for R^p is that the matrix of the bilinear form must be positive definite. The inner product a′Mb is said to be with respect to M or in the metric M. Let the inner product in the metric M be denoted by ⟨a,b⟩_M, that is

⟨a,b⟩_M=a′Mb.
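As a numerical illustration, the four conditions listed above can be checked for one particular symmetric positive definite matrix. The following is a minimal sketch in Python; the function name `inner` and the example values of M, a, b, g and k are assumptions made for this illustration only, and checking a single example does not prove the conditions in general.

```python
from fractions import Fraction

def inner(a, b, M):
    """Bilinear form a'Mb for vectors a, b (lists) and matrix M (list of rows)."""
    return sum(a[i] * M[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))

# Hypothetical example data: M is symmetric and positive definite.
M = [[2, 1], [1, 3]]
a, b, g = [1, 2], [3, -1], [0, 4]
k = Fraction(5, 2)

symmetric = inner(a, b, M) == inner(b, a, M)                   # a'Mb = b'Ma
positive = inner(a, a, M) > 0                                  # a'Ma > 0 for a != 0
homogeneous = inner([k * x for x in a], b, M) == k * inner(a, b, M)
additive = (inner([x + y for x, y in zip(a, g)], b, M)
            == inner(a, b, M) + inner(g, b, M))

print(symmetric, positive, homogeneous, additive)  # True True True True
```

For these values a′Mb = 5 and a′Ma = 18.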

The Euclidean inner product, often referred to as the usual inner product, is the inner product given by the bilinear form whose matrix is the identity matrix, I. That is, the Euclidean inner product between two vectors, a and b, is given by

⟨a,b⟩_I=a′b.

It will be assumed that when the subscript is omitted from the inner product notation, the inner product being referred to is the inner product in the metric I, i.e. the usual (Euclidean) inner product, that is

⟨a,b⟩=⟨a,b⟩_I =a′b.

A vector space in which the inner product is defined by the Euclidean inner product, is called a Euclidean inner product vector space.

When M is positive definite and the inner product in the metric M is chosen to be the inner product for R^p, then two vectors, a and b, are orthogonal if and only if a′Mb = 0. When a′Mb = 0, a and b are said to be orthogonal with respect to M or orthogonal in the metric M. Let the orthogonality of a and b in the metric M be denoted by a ⊥_M b, that is

a ⊥_M b ≡ a′Mb = 0.

Consider a p×q matrix, L, which is such that M = LL′. It is shown below that two vectors a and b in R^p are orthogonal in the metric M if and only if the vectors L′a and L′b are orthogonal in the metric I, i.e. orthogonal with respect to the usual inner product (Harville, 1997):

a ⊥_M b ≡ a′Mb = 0

⟺ a′LL′b = 0

⟺ (L′a)′(L′b) = 0

⟺ (L′a) ⊥_I (L′b).
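The equivalence above rests on the identity a′Mb = (L′a)′(L′b) whenever M = LL′. A minimal Python sketch of this identity follows; the helper functions and the particular L, a, b and c are hypothetical choices made only for illustration.

```python
def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, v):
    return [sum(x * y for x, y in zip(row, v)) for row in A]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# Hypothetical example: build the metric M from a chosen L, so that M = LL'.
L = [[1, 0], [1, 1]]
Lt = transpose(L)
M = matmul(L, Lt)            # [[1, 1], [1, 2]]

a, b, c = [1, -1], [1, 0], [2, 3]

# a and b are orthogonal in the metric M, and L'a and L'b in the metric I.
lhs = dot(a, matvec(M, b))                  # a'Mb
rhs = dot(matvec(Lt, a), matvec(Lt, b))     # (L'a)'(L'b)
print(lhs, rhs)  # 0 0

# The identity also holds for a pair that is not orthogonal.
print(dot(c, matvec(M, b)), dot(matvec(Lt, c), matvec(Lt, b)))  # 5 5
```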

When the inner product on R^p is defined to be the inner product in the metric M, the projection of a vector a in R^p onto another vector, b, in R^p is given by

$$\frac{\langle a,b\rangle_M}{\langle b,b\rangle_M}\,b = \frac{a'Mb}{b'Mb}\,b. \tag{1.6.9}$$
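Equation (1.6.9) can be illustrated numerically. The sketch below assumes vectors and matrices with integer (or `Fraction`) entries so that the arithmetic is exact; the names `inner` and `project_onto_vector` and the example values are hypothetical and are not taken from the thesis.

```python
from fractions import Fraction

def inner(a, b, M):
    """Inner product <a, b>_M = a'Mb."""
    return sum(a[i] * M[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))

def project_onto_vector(a, b, M):
    """Projection of a onto b in the metric M, as in equation (1.6.9)."""
    coefficient = Fraction(inner(a, b, M), inner(b, b, M))
    return [coefficient * x for x in b]

# Hypothetical example data: M is symmetric and positive definite.
M = [[2, 1], [1, 3]]
a, b = [1, 2], [3, -1]

z = project_onto_vector(a, b, M)
residual = [x - y for x, y in zip(a, z)]

print(z)                      # [Fraction(1, 1), Fraction(-1, 3)]
print(inner(residual, b, M))  # 0
```

The residual a − z is orthogonal to b in the metric M, which is the defining property of the projection.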

When a and b are two vectors in a p-dimensional Euclidean inner product vector space, the projection of a onto b is given by

$$\frac{\langle a,b\rangle_I}{\langle b,b\rangle_I}\,b = \frac{a'b}{b'b}\,b. \tag{1.6.10}$$

Let a and b be elements of a p-dimensional inner product vector space W, in which the inner product between a and b is defined in the metric M, and let V(B), the column space of the matrix B, be a subspace of W. The projection of a onto V(B) in the metric M is given by z = By, where y is any solution of the linear system

B′MBy=B′Ma. (1.6.11)

The linear system in equation (1.6.11) is always consistent (Harville, 1997). Every solution of the linear system in (1.6.11) is of the form

y = (B′MB)⁻B′Ma

where (B′MB)⁻ is a generalised inverse of B′MB. When B has full column rank (that is, when its columns are linearly independent), the matrix B′MB is non-singular and hence the linear system in (1.6.11) has the unique solution

y = (B′MB)⁻¹B′Ma.

When B does not have full column rank, the projection of a onto V(B) in the metric M is therefore given by

B(B′MB)⁻B′Ma.

The matrix B(B′MB)⁻¹B′M is called the projection matrix for projection onto the column space of B in the metric M. The projection matrix, B(B′MB)⁻B′M, is invariant to the specific generalised inverse of the matrix B′MB that is used (Harville, 1997). When B has full column rank, the projection of a onto V(B) in the metric M is given by

B(B′MB)⁻¹B′Ma. (1.6.12)

Note that if the matrix B is reduced to consist of one column vector only, then equation (1.6.12) simplifies to equation (1.6.9).
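The reduction of (1.6.12) to (1.6.9) can be written out as follows. This is a sketch of the algebra, assuming that B consists of the single non-zero column b, so that b′Mb is a positive scalar; primes denote transposes.

```latex
\begin{align*}
B(B'MB)^{-1}B'Ma
  &= b\,(b'Mb)^{-1}\,b'Ma \\
  &= \frac{b'Ma}{b'Mb}\,b \\
  &= \frac{a'Mb}{b'Mb}\,b ,
\end{align*}
% The last step uses b'Ma = a'Mb, which holds because M is symmetric.
```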

It can be shown that the weighted sum of squares,

(a−By)′M(a−By),

is minimised when y is a solution of the linear system of equations in (1.6.11), and hence when By is the projection of a onto the column space of B in the metric M (Jolliffe, 2002).
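The linear system (1.6.11) and the minimising property can be illustrated with a small numerical example. The sketch below assumes that B has linearly independent columns, so that B′MB can be solved directly by elimination, and uses exact rational arithmetic; all function names and example values are hypothetical.

```python
from fractions import Fraction

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, v):
    return [sum(x * y for x, y in zip(row, v)) for row in A]

def solve(A, c):
    """Solve the square, non-singular system A y = c by Gauss-Jordan elimination."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(ci)] for row, ci in zip(A, c)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def weighted_ss(a, B, M, y):
    """Weighted sum of squares (a - By)'M(a - By)."""
    r = [ai - zi for ai, zi in zip(a, matvec(B, y))]
    return sum(ri * mi for ri, mi in zip(r, matvec(M, r)))

# Hypothetical example data: M is symmetric positive definite, B has rank 2.
M = [[2, 1, 0], [1, 2, 0], [0, 0, 1]]
B = [[1, 0], [1, 1], [0, 1]]
a = [1, 2, 3]

BtM = matmul(transpose(B), M)
y = solve(matmul(BtM, B), matvec(BtM, a))      # solves B'MB y = B'Ma
z = matvec(B, y)                               # projection of a onto V(B)
residual = [ai - zi for ai, zi in zip(a, z)]
normal_check = matvec(BtM, residual)           # B'M(a - By), expected to be zero

print(y)             # [Fraction(1, 3), Fraction(7, 3)]
print(normal_check)  # [Fraction(0, 1), Fraction(0, 1)]
```

Moving y away from this solution, for example to `[y[0] + 1, y[1]]`, increases the weighted sum of squares from 4/3 to 22/3 for these values.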

Consider again the p×q matrix L which is such that LL′ = M. If z is the projection of a onto the column space of B in the metric M, then L′z is the projection of L′a onto the column space of L′B in the metric I. This is evident upon substituting LL′ for M in the linear system of equations in (1.6.11):

B′MBy = B′Ma

⟺ B′LL′By = B′LL′a

⟺ (L′B)′(L′B)y = (L′B)′(L′a). (1.6.13)

It is evident that if y is a solution of the linear system in (1.6.13), then L′By = L′z will equal the projection of L′a onto the column space of L′B in the metric I.

If a and b are elements of a p-dimensional Euclidean inner product vector space, W, and V(B) is a subspace of W, then the projection of a onto V(B) is given by z = By, where y is any solution of the linear system

B′By = B′a. (1.6.14)

Every solution, y, of the linear system in equation (1.6.14) has the form

y = (B′B)⁻B′a

where (B′B)⁻ is a generalised inverse of B′B. When B has full column rank (that is, when its columns are linearly independent), the matrix B′B is non-singular and hence the linear system in equation (1.6.14) has a unique solution given by

y = (B′B)⁻¹B′a.

When B does not have full column rank, the projection of a onto the subspace V(B) is therefore given by

B(B′B)⁻B′a.

The projection matrix, B(B′B)⁻B′, is invariant to the choice of generalised inverse, (B′B)⁻, of B′B (Harville, 1997). When B has full column rank, the projection of a onto the subspace V(B) is given by

B(B′B)⁻¹B′a. (1.6.15)

The matrix B(B′B)⁻¹B′ is called the projection matrix for projection onto the column space of B in the metric I, or simply the projection matrix for projection onto the column space of B. Note that, if the matrix B is reduced to consist of one column vector only, then equation (1.6.15) simplifies to equation (1.6.10).

It can be shown that the sum of squares

(a−By)′(a−By) (1.6.16)

is minimised when y is a solution of the linear system in (1.6.14), and hence when By is the projection of a onto the column space of B in the metric I.
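One way to see why (1.6.16) is minimised at a solution of (1.6.14) is sketched below. The symbol ŷ is introduced here, as an assumption for this illustration, to denote any solution of (1.6.14), and y is an arbitrary vector of coefficients.

```latex
\begin{align*}
(a - By)'(a - By)
  &= \bigl[(a - B\hat{y}) + B(\hat{y} - y)\bigr]'
     \bigl[(a - B\hat{y}) + B(\hat{y} - y)\bigr] \\
  &= (a - B\hat{y})'(a - B\hat{y})
     + 2(\hat{y} - y)'B'(a - B\hat{y})
     + (\hat{y} - y)'B'B(\hat{y} - y) \\
  &= (a - B\hat{y})'(a - B\hat{y}) + (\hat{y} - y)'B'B(\hat{y} - y) \\
  &\ge (a - B\hat{y})'(a - B\hat{y}).
\end{align*}
% The cross term vanishes because B'(a - B\hat{y}) = B'a - B'B\hat{y} = 0 by (1.6.14).
% The final inequality holds because (\hat{y} - y)'B'B(\hat{y} - y) is a sum of squares.
```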

1.6.7.1 Projection onto an affine subspace

Let x be some vector in R^p and let W denote a linear subspace of R^p spanned by the column vectors of the matrix V. Each point in W can therefore be expressed in the form α′V′, where α is a real vector with one element for each column of V. Let L denote the affine subspace obtained by offsetting W by a point p (a vector in R^p, not to be confused with the dimension p). Each point y that lies in L can therefore be expressed in the form

y = p + α′V′.

The point y ∈ L which is closest to x in terms of Pythagorean distance is the orthogonal projection of x onto L. Let x̂ denote the orthogonal projection of x onto L. That is,

$$\hat{x} = \underset{y \in L}{\operatorname{argmin}} \; \lVert x - y \rVert^2 .$$

Since x̂ lies in L, x̂ can be expressed in the form x̂ = p + α̂′V′, where

$$\hat{\alpha} = \underset{\alpha}{\operatorname{argmin}} \; \lVert x - (p + \alpha'V') \rVert^2 = \underset{\alpha}{\operatorname{argmin}} \; \lVert (x - p) - \alpha'V' \rVert^2 .$$

The point α′V′ in W which is closest to the point x−p in R^p in terms of Pythagorean distance, that is the point which minimises ∥(x−p)−α′V′∥² over all α, is given by the orthogonal projection of x−p onto W. When the columns of V are orthonormal, so that V′V = I, this projection is

α̂′V′ = (x−p)′VV′;

for a general V with linearly independent columns, equation (1.6.15) gives (x−p)′V(V′V)⁻¹V′ instead. It follows that the point in L which minimises ∥x−y∥² over all y ∈ L, that is the orthogonal projection of x onto L, is given by

x̂ = p + (x−p)′VV′.
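The final expression can be illustrated with a small example. The sketch below assumes that the columns of V are orthonormal (V′V = I), which the formula with VV′ requires; the function name `project_affine` and the example values are hypothetical. Points are stored as plain lists, so no distinction is made between row and column vectors.

```python
from fractions import Fraction

def project_affine(x, p, V):
    """Orthogonal projection of x onto the affine subspace p + V(V).

    Assumes the columns of V are orthonormal, so the projection is
    p + V V'(x - p).
    """
    d = [xi - pi for xi, pi in zip(x, p)]
    columns = list(zip(*V))
    # Coefficients alpha = V'(x - p), one per column of V.
    alpha = [sum(vi * di for vi, di in zip(col, d)) for col in columns]
    return [pi + sum(aj * col[i] for aj, col in zip(alpha, columns))
            for i, pi in enumerate(p)]

# Hypothetical example in R^2: a line through p with unit direction (3/5, 4/5).
V = [[Fraction(3, 5)], [Fraction(4, 5)]]
p = [1, 0]
x = [6, 0]

x_hat = project_affine(x, p, V)
residual = [xi - hi for xi, hi in zip(x, x_hat)]
direction = [row[0] for row in V]

print(x_hat)                                          # [Fraction(14, 5), Fraction(12, 5)]
print(sum(r * v for r, v in zip(residual, direction)))  # 0
```

The residual x − x̂ is orthogonal to the direction spanning W, as expected of an orthogonal projection.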

In document PCA and CVA biplots : a study of their underlying theory and quality measures (Page 35-40)