2.2 Canonical Parallel Direction (CPD) and Canonical Orthogonal Direction (COD)

2.2.2 Algorithm for the CPD and COD

In this part, the algorithms for the computations of the two canonical directions are developed. We will also discuss the existence and uniqueness of these two directions. The discussion results in the proofs of Theorem 2.1.1 and Theorem 2.1.2.

Again, we assume that $X_{d \times n} = (x_1, \dots, x_n)$ and $Y_{d \times n} = (y_1, \dots, y_n)$ are paired HDLSS data sets, which means that $x_i$ and $y_i$ ($i = 1, \dots, n$) are the expression vectors for associated samples. For example, for the NCI60 data, $X$ is the expression matrix for the cDNA samples and $Y$ is the expression matrix for the corresponding Affy samples, measured on the same list of genes. The direction vectors of the line segments which connect the same sample from different platforms are the columns of $X - Y$.

Algorithm for the CPD

We intend to find a vector $v_{cpd}$ which maximizes the sum of squared lengths of the projected line segments in this direction. That is, we maximize

$$\sum_{i=1}^{n} \|P_{v_{cpd}}(x_i - y_i)\|^2 = v_{cpd}^T (X - Y)(X - Y)^T v_{cpd} \quad (\text{over } v_{cpd}).$$

According to Lemma 2.2.2, $v_{cpd}$ is the first eigenvector of $(X - Y)(X - Y)^T$, which can be easily calculated by eigenvalue analysis.

If the first eigenvalue of $(X - Y)(X - Y)^T$ is strictly larger than all the other eigenvalues, the first eigenvector of $(X - Y)(X - Y)^T$ exists and is unique (modulo the flip of direction), which means that the CPD exists and is unique. This proves Theorem 2.1.1.
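The CPD computation can be sketched numerically as follows. This is a minimal illustration, not the implementation used for the NCI60 analysis: the function name `cpd`, the toy dimensions and the simulated data are assumptions made here, and the eigenvector is obtained from the singular value decomposition of $X - Y$ rather than from an explicit eigenvalue analysis of the $d \times d$ matrix (the two agree because the left singular vectors of $X - Y$ are the eigenvectors of $(X - Y)(X - Y)^T$).

```python
import numpy as np

def cpd(X, Y):
    """First eigenvector of (X - Y)(X - Y)^T for paired d x n matrices."""
    D = X - Y                      # columns are the segment directions x_i - y_i
    # The left singular vectors of D are the eigenvectors of D D^T, so the
    # d x d matrix never has to be formed when d is much larger than n.
    U, s, _ = np.linalg.svd(D, full_matrices=False)
    return U[:, 0], s[0] ** 2      # direction and its (largest) eigenvalue

# Simulated toy data (an assumption for illustration only).
rng = np.random.default_rng(0)
d, n = 50, 6                       # HDLSS-like shape: d > n
X = rng.normal(size=(d, n))
Y = X + rng.normal(size=(d, n))
v_cpd, top_eig = cpd(X, Y)

# Objective from the text: sum_i ||P_v(x_i - y_i)||^2 = v^T (X-Y)(X-Y)^T v
objective = float(np.sum((v_cpd @ (X - Y)) ** 2))
```

In this sketch `objective` should coincide with `top_eig` up to rounding error, and the sign of `v_cpd` is arbitrary, in line with the "modulo the flip of direction" remark.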

Algorithm for the Canonical Orthogonal Direction

Before we give the algorithm for the COD, we first introduce some definitions and lemmas from linear algebra.

Definition 2.2.2. A nonzero vector $\xi \in R^d$ is called a normalized direction vector if $\|\xi\| = 1$.

Definition 2.2.3. We define the following notations:

$H_X$: the space spanned by the column vectors of $X$.

$H_{[X,Y]}$: the space spanned by all the column vectors of $X$ and $Y$.

$H_{X-Y}$: the space spanned by the column vectors of $X - Y$.

Definition 2.2.4. $H_X^{\perp}$: the orthogonal complement of the space $H_X$ in $R^d$, which means $H_X \oplus H_X^{\perp} = R^d$.

$H_{[X,Y]/X}$ is defined as the orthogonal complement of the space $H_X$ in the space $H_{[X,Y]}$, which means $H_X \oplus H_{[X,Y]/X} = H_{[X,Y]}$.

Lemma 2.2.4. Let $H$ be any proper subspace of $R^d$, and let $H^{\perp}$ be the orthogonal complement of $H$. For any nonzero vector $\xi \in R^d$, there exist two normalized vectors $\xi_1 \in H$ and $\xi_2 \in H^{\perp}$ such that $\xi$ has the orthogonal decomposition

$$\xi = \langle \xi, \xi_1 \rangle \xi_1 + \langle \xi, \xi_2 \rangle \xi_2.$$

Note that if $\xi \notin H$ and $\xi \notin H^{\perp}$, the two directions $\xi_1$ and $\xi_2$ are unique (modulo the flip of direction). In fact, $\xi_1$ is the direction vector of the projection of $\xi$ onto the space $H$, and $\xi_2$ is the direction vector of the projection of $\xi$ onto the space $H^{\perp}$.
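The decomposition in Lemma 2.2.4 can be checked numerically. The sketch below is only an illustration under assumed inputs: the subspace $H$ is taken to be the span of three random vectors in $R^8$, and the names `Qh`, `xi`, `xi1`, `xi2` and `recon` are introduced here for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k = 8, 3
B = rng.normal(size=(d, k))            # columns span a proper subspace H of R^d
Qh, _ = np.linalg.qr(B)                # orthonormal basis of H (d x k)

xi = rng.normal(size=d)                # a generic nonzero vector
p_H = Qh @ (Qh.T @ xi)                 # projection of xi onto H
p_perp = xi - p_H                      # projection of xi onto the complement of H

xi1 = p_H / np.linalg.norm(p_H)        # normalized direction in H
xi2 = p_perp / np.linalg.norm(p_perp)  # normalized direction in H-perp

# Orthogonal decomposition: xi = <xi, xi1> xi1 + <xi, xi2> xi2
recon = (xi @ xi1) * xi1 + (xi @ xi2) * xi2
```

Here `recon` should reproduce `xi` up to rounding error, because $\langle \xi, \xi_1 \rangle$ equals the length of the projection of $\xi$ onto $H$, and similarly for $\xi_2$.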

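Let $v_{cod}$ be a normalized direction vector in $R^d$. By Lemma 2.2.4, it can be orthogonally decomposed into two directions such that

$$v_{cod} = \langle v_{cod}, \xi_1 \rangle \xi_1 + \langle v_{cod}, \xi_2 \rangle \xi_2,$$

where $\xi_1$, $\xi_2$ are normalized vectors with $\xi_1 \in H_{[X,Y]}$ and $\xi_2 \in H_{[X,Y]}^{\perp}$. Then the projection of any vector $v \in H_{[X,Y]}$ on this normalized direction $v_{cod}$ can be expressed as:

$$\begin{aligned}
P_{v_{cod}}(v) &= \langle v, v_{cod} \rangle v_{cod} \\
&= \langle v, \langle v_{cod}, \xi_1 \rangle \xi_1 + \langle v_{cod}, \xi_2 \rangle \xi_2 \rangle v_{cod} \\
&= \langle v, \langle v_{cod}, \xi_1 \rangle \xi_1 \rangle v_{cod} + \langle v, \langle v_{cod}, \xi_2 \rangle \xi_2 \rangle v_{cod} \\
&= \langle v_{cod}, \xi_1 \rangle \langle v, \xi_1 \rangle v_{cod} + \langle v_{cod}, \xi_2 \rangle \langle v, \xi_2 \rangle v_{cod}.
\end{aligned}$$

Since $v \in H_{[X,Y]}$, we have $\langle v, \xi_2 \rangle = 0$ (because $\xi_2 \in H_{[X,Y]}^{\perp}$). Thus,

$$P_{v_{cod}}(v) = \langle v_{cod}, \xi_1 \rangle \langle v, \xi_1 \rangle v_{cod}. \qquad (2.1)$$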

Recall that Definition 2.1.2 first requires $v_{cod}$ to be orthogonal to all the direction vectors of the line segments, which means it is orthogonal to the space $H_{X-Y}$. Thus

$$P_{v_{cod}}(X - Y) = 0 \implies P_{v_{cod}}(X) = P_{v_{cod}}(Y).$$

Since $X$ and $Y$ have exactly the same projections on the direction $v_{cod}$, the second condition in Definition 2.1.2 assures that the COD is the direction which maximizes the variability of the projections of the data. The projection of the $i$th sample of $X$ can be expressed as $P_{v_{cod}}(x_i)$. The center of the samples in $X$ is $\bar{x}$. Thus the variability of the projected data on $v_{cod}$ is

$$\sum_{i=1}^{n} \|P_{v_{cod}}(x_i - \bar{x})\|^2.$$

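Since $x_i - \bar{x} \in H_{[X,Y]}$, equation (2.1) gives

$$\begin{aligned}
\sum_{i=1}^{n} \|P_{v_{cod}}(x_i - \bar{x})\|^2 &= \sum_{i=1}^{n} \|\langle v_{cod}, \xi_1 \rangle \langle x_i - \bar{x}, \xi_1 \rangle v_{cod}\|^2 \\
&= \sum_{i=1}^{n} \langle v_{cod}, \xi_1 \rangle^2 \, \xi_1^T (x_i - \bar{x})(x_i - \bar{x})^T \xi_1,
\end{aligned}$$

where $\xi_1 \in H_{[X,Y]}$.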

In order to maximize this variation, we choose $v_{cod}$ such that $\langle v_{cod}, \xi_1 \rangle = 1$. Since both are unit vectors, equality in the Cauchy-Schwarz inequality forces $v_{cod} = \xi_1$. This means that $v_{cod} \in H_{[X,Y]}$, i.e. the maximizing direction is in the subspace generated by the data. Considering $v_{cod} \perp H_{X-Y}$ (because $v_{cod}$ is orthogonal to all the direction vectors of the line segments), we have $v_{cod} \in H_{[X,Y]/(X-Y)}$. This also means $v_{cod} \in H_{[X-Y,Y]/(X-Y)}$, since $H_{[X,Y]} = H_{[X-Y,Y]}$.

Next, we will derive a set of basis vectors for the space $H_{[X-Y,Y]/(X-Y)}$. Suppose the matrix $[X - Y, Y]$ has an orthogonal-triangular decomposition

$$[X - Y, Y]_{d \times 2n} = Q_{d \times 2n} R_{2n \times 2n},$$

where $R$ is an upper triangular matrix, and $Q$ is a $d \times 2n$ matrix with orthonormal columns ($Q^T Q = I_{2n \times 2n}$). As we mentioned in Theorem 2.1.2, the columns of $X$ and $Y$ are from continuous distributions, which implies that $[X - Y, Y]$ is a full rank matrix a.s., and hence both $Q$ and $R$ are full rank matrices a.s. These two matrices exist and are unique if we ignore the direction flip in $Q$ and ignore the sign of the corresponding entries in $R$. We decompose $Q$ as $Q = [Q_1, Q_2]$, where $Q_1$ consists of the first $n$ columns and $Q_2$ of the last $n$ columns of $Q$. Because $R$ is a full rank upper triangular matrix, the columns of $Q_1$ form a basis for the space $H_{X-Y}$ and the columns of $Q_2$ form a set of basis vectors for the space $H_{[X-Y,Y]/(X-Y)}$, i.e.

$$H_{Q_1} = H_{X-Y}, \qquad H_{Q_2} = H_{[X-Y,Y]/(X-Y)}.$$

Since $v_{cod} \in H_{[X-Y,Y]/(X-Y)} = H_{Q_2}$, it can be expressed as a linear combination of the columns of $Q_2$, say

$$v_{cod} = Q_2 C,$$

where $C$ is an $n \times 1$ vector.
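The role of $Q_1$ and $Q_2$ can be illustrated with a reduced QR decomposition. The sketch below uses simulated data and assumes $d \geq 2n$ so that $[X - Y, Y]$ has full column rank; the variable names (`A`, `resid_Q1`, `cross`) and the particular unit vector `C` are chosen here only for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 40, 5                                  # assumes d >= 2n (full column rank)
X = rng.normal(size=(d, n))
Y = rng.normal(size=(d, n))

A = np.hstack([X - Y, Y])                     # d x 2n matrix [X - Y, Y]
Q, R = np.linalg.qr(A)                        # reduced QR: Q is d x 2n, R is 2n x 2n
Q1, Q2 = Q[:, :n], Q[:, n:]                   # first n and last n columns of Q

# Q1 spans H_{X-Y}: projecting X - Y onto span(Q1) leaves it unchanged.
resid_Q1 = np.linalg.norm(Q1 @ (Q1.T @ (X - Y)) - (X - Y))
# Q2 is orthogonal to every segment direction x_i - y_i.
cross = np.abs(Q2.T @ (X - Y)).max()

# Any unit-length C gives a unit vector v = Q2 C in H_{[X-Y,Y]/(X-Y)}.
C = np.ones(n) / np.sqrt(n)
v = Q2 @ C
```

Both `resid_Q1` and `cross` should be zero up to rounding error. Because $Q_2$ has orthonormal columns, $\|Q_2 C\| = \|C\|$, so restricting $C$ to unit length keeps $v_{cod}$ a normalized direction vector.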

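The variation to be maximized (over $v_{cod}$, i.e. over $C$) is:

$$\begin{aligned}
\sum_{i=1}^{n} \|P_{v_{cod}}(x_i - \bar{x})\|^2 &= \sum_{i=1}^{n} v_{cod}^T (x_i - \bar{x})(x_i - \bar{x})^T v_{cod} \\
&= \sum_{i=1}^{n} C^T \left(Q_2^T (x_i - \bar{x})\right)\left((x_i - \bar{x})^T Q_2\right) C \\
&= C^T \left(Q_2^T (X - \bar{X})\right)\left(Q_2^T (X - \bar{X})\right)^T C,
\end{aligned}$$

where $\bar{X}$ denotes the $d \times n$ matrix whose columns all equal $\bar{x}$.

From Lemma 2.2.3, in order to maximize the above variability, we choose $C$ as the first eigenvector of

$$Q_2^T (X - \bar{X})(X - \bar{X})^T Q_2.$$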

To produce the canonical orthogonal direction, we first calculate $Q_2$ by the orthogonal-triangular decomposition of $[X - Y, Y]$, then we obtain $C$ as the first eigenvector of $Q_2^T (X - \bar{X})(X - \bar{X})^T Q_2$ by eigenvalue analysis. The canonical orthogonal direction is

$$v_{cod} = Q_2 C.$$
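The two steps above can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the original implementation: the function name `cod`, the simulated data and the use of `numpy.linalg.qr` and `numpy.linalg.eigh` are choices made here, and the code assumes $d \geq 2n$ so that the reduced QR decomposition yields $2n$ columns.

```python
import numpy as np

def cod(X, Y):
    """Sketch of the COD steps: QR of [X - Y, Y], then an n x n eigenproblem."""
    d, n = X.shape
    Q, _ = np.linalg.qr(np.hstack([X - Y, Y]))   # reduced QR, Q is d x 2n
    Q2 = Q[:, n:]                                # basis of H_{[X-Y,Y]/(X-Y)}
    Xc = X - X.mean(axis=1, keepdims=True)       # X minus the matrix of column means
    G = Q2.T @ Xc                                # n x n coordinates in the Q2 basis
    w, V = np.linalg.eigh(G @ G.T)               # eigenvalues in ascending order
    C = V[:, -1]                                 # eigenvector of the largest eigenvalue
    return Q2 @ C, w[-1]

# Simulated paired data (an assumption for illustration only).
rng = np.random.default_rng(3)
d, n = 60, 5
X = rng.normal(size=(d, n))
Y = X + 0.5 * rng.normal(size=(d, n))
v_cod, var_cod = cod(X, Y)

proj_X = v_cod @ X       # scores of the X samples along v_cod
proj_Y = v_cod @ Y       # scores of the Y samples along v_cod
```

In this sketch `proj_X` and `proj_Y` should agree up to rounding error, reflecting $P_{v_{cod}}(X) = P_{v_{cod}}(Y)$, and `var_cod` is the maximized variability of the projected, centered data.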

When the columns of $X$ and $Y$ are independent of each other and are from distributions which are absolutely continuous with respect to $d$-dimensional Lebesgue measure, each of $X$, $Y$, and $X - Y$ is a full rank matrix a.s. Thus, the orthogonal-triangular decomposition exists and is unique a.s. (modulo the flip of directions). Also, the first eigenvector of $Q_2^T (X - \bar{X})(X - \bar{X})^T Q_2$ exists and is unique a.s. This establishes the existence and uniqueness of the COD, and can be treated as the proof of Theorem 2.1.2.

Note that $v_{cpd}$ is the first eigenvector of $(X - Y)(X - Y)^T$, thus $v_{cpd} \in H_{X-Y}$. The COD is orthogonal to all the directions of the line segments, i.e. $v_{cod} \in H_{[X-Y,Y]/(X-Y)}$. Thus we have $v_{cod} \perp v_{cpd}$. The definitions of these two directions assure that they are orthogonal to each other, hence we can derive the CPD and the COD separately.
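The orthogonality of the two directions can also be verified numerically when each is computed on its own. The following compact check uses simulated data and assumes $d \geq 2n$; the CPD is taken as the first left singular vector of $X - Y$, which is an equivalent route to the first eigenvector of $(X - Y)(X - Y)^T$, and all variable names are introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
d, n = 30, 4
X = rng.normal(size=(d, n))
Y = rng.normal(size=(d, n))
D = X - Y

# CPD: first eigenvector of D D^T, taken as the first left singular vector of D.
v_cpd = np.linalg.svd(D, full_matrices=False)[0][:, 0]

# COD: last n columns of Q from the QR of [X - Y, Y], then the top eigenvector.
Q2 = np.linalg.qr(np.hstack([D, Y]))[0][:, n:]
G = Q2.T @ (X - X.mean(axis=1, keepdims=True))
v_cod = Q2 @ np.linalg.eigh(G @ G.T)[1][:, -1]

inner = float(v_cod @ v_cpd)   # expected to vanish up to rounding error
```

Neither computation uses the result of the other, yet `inner` should be numerically zero, since `v_cpd` lies in the span of the first $n$ columns of $Q$ while `v_cod` lies in the span of the last $n$.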

In document New statistical tools for microarray data and comparison with existing tools (Page 43-48)