Canonical Correlation Analysis

Lecture 11 August 4, 2011

Advanced Multivariate Statistical Methods ICPSR Summer Session #2

Overview: Today’s Lecture, Canonical Correlations, Computation, Interpretation, Another Example, Other Analyses, Wrapping Up

Today’s Lecture

Canonical Correlation Analysis

What it is

How it works

How to do such an analysis


Purpose

In general, there are times when we would like to measure the linear relationship between variables.

The simplest case is when we have two variables and all we are interested in is measuring their linear relationship. Here we would just use the bivariate correlation.

Another case is multiple regression, where we have several independent variables and one dependent variable. In this case we would use the multiple correlation coefficient (R, whose square is R²).

So it would be nice if we could extend the idea used in these cases to a situation where we have several y variables and several x variables.


Concept

From Webster’s Dictionary: canonical: reduced to the simplest or clearest schema possible.

What do we mean by basic ideas?

In describing canonical correlation, we will start with the basic case where we have only two variables and build on it until we get to canonical correlations.

1. First we will look at the bivariate correlation

2. Then we will see what was done to generalize bivariate correlation to the multiple correlation coefficient

3. Finally, these discussions will lead us right to what happens in canonical correlation analysis


Bivariate Correlation

Begin by thinking of just two variables, y and x.

In this case the correlation describes the extent to which one variable relates to (can predict) the other.

That is, the stronger the correlation, the more we will know about y just by knowing x.
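As a small illustration of the point above, the following Python sketch computes the sample correlation for two made-up data sets. The function name `pearson_r` and the numbers are hypothetical and are not taken from the lecture; the only purpose is to show that a stronger linear relationship yields a correlation closer to 1 in absolute value.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of the centered values.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y_strong follows x closely, y_weak does not.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_strong = [2.1, 3.9, 6.2, 7.8, 10.1]
y_weak = [3.0, 1.0, 4.0, 1.0, 5.0]

print(pearson_r(x, y_strong))
print(pearson_r(x, y_weak))
```

With data like these, knowing x tells us a great deal about `y_strong` and rather little about `y_weak`.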


Multiple Correlation

On the other hand, if we have one y and multiple x variables, we can no longer look at a simple relationship between two variables.

But we can look at how well the set of x variables predicts y by computing the regression line.

Using the regression line, we can compute the predicted ŷ and compare it to the y variable.

Specifically, we now have only two variables, y and ŷ = xb, so we can compute a simple correlation.

Note: we started with something more complicated (many x variables) and changed it into something for which we could compute a simple correlation (between y and ŷ).
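The reduction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it handles exactly two predictors, solves the normal equations by hand, and uses made-up numbers rather than any data from the lecture. The helper names (`center`, `dot`, `multiple_correlation`) are hypothetical.

```python
import math

def center(v):
    """Subtract the mean from every element."""
    m = sum(v) / len(v)
    return [a - m for a in v]

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def multiple_correlation(x1, x2, y):
    """Multiple correlation R: fit least squares, then correlate y with y-hat."""
    c1, c2, cy = center(x1), center(x2), center(y)
    # Normal equations for the two slopes on centered data
    # (centering removes the intercept from the system).
    s11, s12, s22 = dot(c1, c1), dot(c1, c2), dot(c2, c2)
    s1y, s2y = dot(c1, cy), dot(c2, cy)
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    # Centered predicted values; centering does not change a correlation.
    yhat = [b1 * a + b2 * b for a, b in zip(c1, c2)]
    return dot(cy, yhat) / math.sqrt(dot(cy, cy) * dot(yhat, yhat))

# Hypothetical data, not the house data from the lecture.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.2, 4.8, 5.1, 8.9, 9.0, 12.1]

print(multiple_correlation(x1, x2, y))
```

Because ŷ is the best linear predictor of y, the value printed is at least as large as the absolute simple correlation of y with either predictor alone.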


Multiple Correlation Example

From Weisberg (1985, p. 240).

“Property taxes on a house are supposedly dependent on the current market value of the house. Since houses actually sell only rarely, the sale price of each house must be estimated every year when property taxes are set. Regression methods are sometimes used to make up a prediction function.”

We have data for 27 houses sold in the mid 1970’s in Erie, Pennsylvania:

x1: Current taxes (local, school, and county) ÷ 100 (dollars)

x2: Number of bathrooms

x3: Living space ÷ 1000 (square feet)

x4: Age of house (years)


To compute the multiple correlation of x1, x2, x3, and x4 with y, first compute the multiple regression of y on all of the x variables:

proc reg data=house;
  model y = x1-x4;              /* regress y on x1 through x4 */
  output out=newdata p=yhat;    /* save the predicted values as yhat */
run;

Then take the predicted values given by the model, ŷ, and correlate them with y:

proc corr data=newdata;
  var yhat y;                   /* this correlation is the multiple correlation */
run;


Canonical Correlation

Canonical correlation seeks to find the correlation between multiple x variables and multiple y variables

Now we have several y variables and several x variables, so neither of our previous two examples can be applied directly, BUT we can take the points from the previous cases and use them for this new case.

So we could look at how well the set of x variables can predict the set of y variables, but in doing this we still will not be able to compute a simple correlation.

On the other hand, in multiple regression we found a linear combination of the variables, bx, to get a single variable.

In our case we have two sets of variables, so it makes sense that we can define two linear combinations: one for the x variables (b1) and one for the y variables (a1).


In the simple case where we have only a single linear combination for each set of variables, we can compute the simple correlation between these two linear combinations.

The first canonical correlation describes the correlation between these two new variables (b1x and a1y).

So how do we pick the linear transformations?

These linear transformations (b1 and a1) are picked such that the correlation between the two new variables is maximized.

Notice that this idea is really no different from what we did in multiple regression

This also sounds similar to something we have done in PCA
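The maximization described on this slide can be written compactly. The notation here is an assumption added for illustration: a prime denotes a transpose, and S_yy, S_xx, and S_yx stand for the sample covariance matrices within the y set, within the x set, and between the two sets.

```latex
r_1 \;=\; \max_{a,\,b}\; \operatorname{corr}(a'y,\; b'x)
    \;=\; \max_{a,\,b}\; \frac{a' S_{yx}\, b}{\sqrt{(a' S_{yy}\, a)\,(b' S_{xx}\, b)}}
```

The weight vectors that attain the maximum are a1 and b1. With a single y variable, a is a scalar that cancels from the ratio, and the maximum reduces to the multiple correlation between y and the x variables.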


ONE LAST THING

Think back to PCA, when we said that a single linear combination did not account for all of the information present in a data set...

Then we could determine how many linear combinations were needed to capture more information (where the linear combinations were all uncorrelated).

We can do the same thing here...

We can define more pairs of linear combinations (bi and ai, i = 1, . . . , s, where s = min(p, q), p is the number of variables in the x group, and q is the number of variables in the y group).

Each pair of linear combinations maximizes the correlation between the new variables under the constraint that they are uncorrelated with all previous linear combinations.


Computation

To show how to compute canonical correlations, first consider the covariance matrix from our example:

           x1         x2         x3         x4          y
x1      8.3100     1.0700     1.3400   −15.0300    37.7400
x2      1.0700     0.1800     0.2100    −1.2500     5.6000
x3      1.3400     0.2100     0.3100    −1.4000     7.4200
x4    −15.0300    −1.2500    −1.4000   197.4900   −62.3900
y      37.7400     5.6000     7.4200   −62.3900   204.7000


From this matrix, we will define four sub-matrices, from which we will calculate our correlations:

                 x1 x2 x3 x4      y
x1 x2 x3 x4         Sxx          Sxy
y                   Syx          Syy

Here Syx is the transpose of Sxy.


So how do we compute the canonical correlations?

To begin, note that we could define the squared multiple correlation $R^2_M$ as

$$R^2_M = \frac{|S_{yx} S_{xx}^{-1} S_{xy}|}{|S_{yy}|}$$

which can be rewritten as:

$$R^2_M = |S_{yy}^{-1} S_{yx} S_{xx}^{-1} S_{xy}|$$

For canonical correlations, however, we will focus on the matrix formed by the part of the equation within the | · | (note that this was just a scalar when y has only one variable).


We first compute the square roots of the eigenvalues ($r_1, r_2, \ldots, r_s$) and the eigenvectors ($a_1, a_2, \ldots, a_s$) of:

$$S_{yy}^{-1} S_{yx} S_{xx}^{-1} S_{xy}$$

Then we compute the square roots of the eigenvalues ($r_1, r_2, \ldots, r_s$) and the eigenvectors ($b_1, b_2, \ldots, b_s$) of:

$$S_{xx}^{-1} S_{xy} S_{yy}^{-1} S_{yx}$$

Conveniently, the nonzero eigenvalues of both matrices are equal (and lie between zero and one)!

The square roots of the eigenvalues are the successive canonical correlations between the successive pairs of linear combinations.

From the eigenvectors we have determined the linear combinations.
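The eigenvalue recipe above can be sketched in plain Python for the smallest interesting case, two x variables and two y variables. This is an illustrative sketch only: the helper functions are hypothetical, the covariance blocks are made-up numbers rather than the lecture's data, and the 2 × 2 eigenvalues come from the characteristic polynomial instead of a general eigen-solver.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(p, q):
    """Product of two small matrices stored as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix with real eigenvalues, largest first."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return [(trace + disc) / 2.0, (trace - disc) / 2.0]

def canonical_correlations(sxx, sxy, syy):
    """Square roots of the eigenvalues of Syy^-1 Syx Sxx^-1 Sxy."""
    syx = transpose(sxy)
    m = matmul(matmul(inv2(syy), syx), matmul(inv2(sxx), sxy))
    return [math.sqrt(max(ev, 0.0)) for ev in eigenvalues_2x2(m)]

# Hypothetical covariance blocks (rows of sxy index x, columns index y).
sxx = [[1.0, 0.5], [0.5, 1.0]]
syy = [[1.0, 0.2], [0.2, 1.0]]
sxy = [[0.6, 0.1], [0.2, 0.4]]

print(canonical_correlations(sxx, sxy, syy))
# Swapping the roles of x and y uses the other matrix product
# and gives the same canonical correlations.
print(canonical_correlations(syy, transpose(sxy), sxx))
```

For larger problems a general eigenvalue routine would be needed; the 2 × 2 shortcut is used here only to keep the sketch self-contained.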


Example #2

To illustrate canonical correlations, consider the following analysis:

Three physiological and three exercise variables are measured on 27 middle-aged men in a fitness club.

The variables collected are:

Weight (in pounds - x1)

Waist size (in inches - x2)

Pulse rate (in beats-per-minute - x3)

Number of chin-ups performed (y1)

Number of sit-ups performed (y2)

Number of jumps performed (y3)


To run a canonical correlation analysis, use the following code:

proc cancorr data=Fit all
     vprefix=Physiological vname='Physiological Measurements'
     wprefix=Exercises wname='Exercises';
  var Weight Waist Pulse;       /* first set: physiological variables */
  with Chins Situps Jumps;      /* second set: exercise variables */
run;


Standardized Weights

Just as in PCA and factor analysis, we are interested in interpreting the weights of the linear combinations.

However, if our variables are on different scales, the weights are difficult to interpret.

So we can standardize them, which is the same as computing the canonical correlations and linear combinations from the correlation matrix instead of the variance/covariance matrix.

We can also compute the standardized coefficients (c and d) directly:

$$c = \operatorname{diag}(S_{yy})^{1/2}\, a, \qquad d = \operatorname{diag}(S_{xx})^{1/2}\, b$$
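The direct computation of standardized coefficients amounts to multiplying each raw weight by the standard deviation of its variable. A minimal Python sketch follows, with hypothetical weights and a hypothetical covariance matrix (neither comes from the lecture's examples).

```python
import math

def standardized_coefficients(weights, cov):
    """Scale each raw weight by the standard deviation of its variable."""
    return [math.sqrt(cov[i][i]) * w for i, w in enumerate(weights)]

# Hypothetical covariance matrix of two y variables and raw weights a.
syy = [[4.0, 1.0], [1.0, 9.0]]
a = [0.50, 0.10]

c = standardized_coefficients(a, syy)
print(c)
```

Here the first variable has standard deviation 2 and the second has standard deviation 3, so the raw weights are rescaled by those factors before being compared.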


Canonical Corr. Properties

1. Canonical correlations are invariant to changes of scale.

This means that, like any correlation, scale changes (such as standardizing) will not change the correlation.

However, they will change the eigenvectors...

2. The first canonical correlation is the best we can do with associations.

This means it is at least as large as any of the simple correlations or any multiple correlation among the variables under study.


Hypothesis Test for Corr.

We begin by testing whether at least the first (the largest) correlation is significantly different from zero.

If we cannot get a significant relationship out of the optimal linear combinations of the variables, then there is no linear relationship between the two sets. This is the same as testing

$$H_0 : \Sigma_{xy} = 0 \quad \text{or} \quad B_1 = 0$$

This is tested using Wilks’ Lambda:

$$\Lambda_1 = \frac{|S|}{|S_{yy}|\,|S_{xx}|}$$

Or, equivalently (where $r_i^2$ is the $i$th eigenvalue of the matrix formed from the sub-matrices of the covariance matrix):

$$\Lambda_1 = \prod_{i=1}^{s} (1 - r_i^2)$$


The Rest

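In this case we compute Λ1 as

$$\Lambda_1 = \prod_{i=1}^{s} (1 - r_i^2)$$

which can be compared to the critical value $\Lambda_{\alpha,p,q,n-1-q}$ (or to $\Lambda_{\alpha,q,p,n-1-p}$).

In general we can compute

$$\Lambda_k = \prod_{i=k}^{s} (1 - r_i^2)$$

which can be compared to $\Lambda_{\alpha,p-k+1,q-k+1,n-k-q}$ (or to $\Lambda_{\alpha,q-k+1,p-k+1,n-k-p}$).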
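The product form of Wilks' Lambda is easy to evaluate once the canonical correlations are known. The Python sketch below uses a hypothetical function name and hypothetical correlations; it computes only the statistic, and the critical value must still be looked up separately.

```python
def wilks_lambda(canonical_corrs, k=1):
    """Product of (1 - r_i^2) over correlations k, k+1, ..., s (k is 1-based)."""
    lam = 1.0
    for r in canonical_corrs[k - 1:]:
        lam *= 1.0 - r * r
    return lam

# Hypothetical canonical correlations, largest first.
corrs = [0.8, 0.3]

print(wilks_lambda(corrs))        # tests all correlations jointly
print(wilks_lambda(corrs, k=2))   # tests the second correlation onward
```

Values of Lambda near zero indicate a strong relationship between the two sets, while values near one indicate little or none.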


Interpretation

Because a canonical correlation analysis is in many ways similar to what we discussed in PCA, the interpretation methods are also similar.

Specifically, we will discuss four methods that are used to interpret the results:

1. Standardized Coefficients

2. Correlation between Canonical Variates (the linear combinations) and each variable

3. Rotation

4. Redundancy Analysis


Standardized

Because the standardized coefficients are on the same scale, they can be compared directly.

The variables that are most important to the association are the ones whose coefficients have the largest absolute values (that is, the magnitude of a coefficient indicates importance).

To interpret what the linear combination is capturing we will


Correlation of Linear Combination with Variables

This was mentioned in PCA and EFA...

That is, we compute our linear combinations (canonical variates) and then compute the correlation between each linear combination and each of the actual variables.

The correlations are typically called the loadings or structure coefficients.

As was the case in PCA, this ignores the overall multidimensional structure, so it is not a recommended basis for interpretation.


Rotation

We could try rotating the weights of the analysis to provide an interpretable result...

For this we begin to rely on the spatial representation of what is going on with the data.

Every linear combination projects our observations onto a different dimension.

Sometimes these dimensions are difficult to interpret (i.e., based on the sign and magnitude of the weights).

Sometimes we can rotate these dimensions so that the weights are easier to interpret, with some large and some small.

Rotations in CCA are not recommended, because we lose


Redundancy

Another method for interpretation is a redundancy analysis (this, again, is often not liked by statisticians because it only summarizes univariate relationships).


Another Example

In a study of social support and mental health, measures of the following seven variables were taken on 405 subjects:

Total Social Support

Family Social Support

Friend Social Support

Significant Other Social Support

Depression

Loneliness

Stress

The researchers were interested in determining the relationship between social support and mental health... how about using a canonical correlation analysis?


*SAS Example #3;

data depress (type=corr);
  _type_='CORR';
  input _name_ $ v1-v7;
  label v1='total social support'
        v2='family social support'
        v3='friend social support'
        v4='significant other social support'
        v5='depression'
        v6='loneliness'
        v7='stress';
  datalines;
v1  1.0000   .       .       .       .       .       .
v2  0.8280  1.0000   .       .       .       .       .
v3  0.8136  0.5192  1.0000   .       .       .       .
v4  0.8569  0.5972  0.6109  1.0000   .       .       .
v5 -0.3691 -0.3218 -0.3150 -0.3044  1.0000   .       .
v6 -0.6282 -0.4945 -0.5774 -0.5266  0.5368  1.0000   .
v7 -0.1849 -0.2049 -0.1132 -0.1291  0.4872  0.2846  1.0000
;

/* vprefix/vname label the VAR set (v1-v4, social support);  */
/* wprefix/wname label the WITH set (v5-v7, mental health).  */
proc cancorr data=depress all corr edf=404
     vprefix=Social_Support vname='Social Support'
     wprefix=Mental_Health wname='Mental Health';
  var v1-v4;
  with v5-v7;
run;


Other Analyses

In general, the results from a canonical correlations routine are related to:

1. Regression

2. Discriminant Analysis (we will learn this next week)

3. MANOVA

However, the goals of canonical correlation overlap with the information provided by a confirmatory factor analysis or structural equation model...


Final Thought

The midterm was accomplished using MANOVA and MANCOVA.

Canonical correlation analysis is a complicated analysis that provides many results of interest to researchers.

Perhaps because of its complicated nature, canonical correlation analysis is not often used.

Last week: Nebraska... This week: Texas... After that: The world.

Tomorrow: Lab Day! Meet in Helen Newberry’s Michigan Lab.
