FACTOR ANALYSIS NASC

(1)


(2)

Factor Analysis…

A data reduction technique designed to represent a wide range of attributes on a smaller number of dimensions.

The aim is to identify groups of variables that are relatively homogeneous.

Groups of related variables are called ‘factors’.

(3)

Purposes

The main applications of factor analytic techniques are:

(1) to reduce the number of variables, and

(2) to detect structure in the relationships between variables, that is, to classify variables.

NASC@Courtesy Dr. Thagunna

(4)

Conceptual Model for a Factor Analysis with a Simple Model

[Figure: items grouped under Factor 1, Factor 2, and Factor 3]

e.g., 12 test items might actually tap only 3 underlying factors

(5)

Conceptual Model for Factor Analysis (with cross-loadings)

(6)

Common Factor Model

It is suggested that $X_1$, $X_2$, and $X_3$ are functions of two underlying factors, $F_1$ and $F_2$. It is assumed that each X variable is linearly related to the two factors, as shown in the following model.

$$X_1 = \beta_{11} F_1 + \beta_{12} F_2 + e_1$$

$$X_2 = \beta_{21} F_1 + \beta_{22} F_2 + e_2$$

$$X_3 = \beta_{31} F_1 + \beta_{32} F_2 + e_3$$

The error terms $e_1$, $e_2$, and $e_3$ serve to indicate that the hypothesized relationships are not exact. In the vocabulary of factor analysis, the

(7)

Expected Structure of Loadings

It is expected that the loadings have roughly the structure shown in the table.

Variable   Loading on F1 ($\beta_{i1}$)   Loading on F2 ($\beta_{i2}$)
X1         +                              0
X2         0                              +
X3         0                              +

Of course, the zeros in the preceding table are not expected to be exactly equal to zero. By `0' we mean approximately equal to zero and by `+' a positive number substantially different from zero.

(8)

Model Assumptions

A1: The error terms $e_i$ are independent of one another, with $E(e_i) = 0$ and $\mathrm{Var}(e_i) = \sigma_i^2$.

A2: The unobservable factors are independent of one another.

It is also assumed that the factors and error terms are independent.

As for the factor means and variances, the assumption is that the factors are standardized: $E(F_j) = 0$ and $\mathrm{Var}(F_j) = 1$. It is an assumption made for convenience; since the factors are unobservable, we might as well think of them as measured in

(9)

Implications of Assumptions…

The variance of $X_i$ from the model can be expressed as

$$\mathrm{Var}(X_i) = \beta_{i1}^2 \mathrm{Var}(F_1) + \beta_{i2}^2 \mathrm{Var}(F_2) + \mathrm{Var}(e_i) = \beta_{i1}^2 + \beta_{i2}^2 + \sigma_i^2$$

We see that the variance of $X_i$ consists of two parts: $(\beta_{i1}^2 + \beta_{i2}^2)$ and $\sigma_i^2$.

• The first part is called the communality of the variable. It is the part of $\mathrm{Var}(X_i)$ explained by the common factors $F_1$ and $F_2$.

• The second part is called the specific variance of the variable. It is the part of $\mathrm{Var}(X_i)$ not explained by the common factors.

The covariance of any two observable variables, $X_i$ and $X_j$, from the model can be expressed as

$$\mathrm{Cov}(X_i, X_j) = \beta_{i1}\beta_{j1} \mathrm{Var}(F_1) + \beta_{i2}\beta_{j2} \mathrm{Var}(F_2) = \beta_{i1}\beta_{j1} + \beta_{i2}\beta_{j2}$$
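The variance and covariance identities on this slide can be checked numerically. The following is a minimal sketch only; the loadings and specific variances are hypothetical values chosen for illustration and do not come from the slides.

```python
# Hypothetical loadings (rows: X1..X3, columns: F1, F2) and specific variances.
# These numbers are illustrative assumptions, not data from the slides.
loadings = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.1, 0.7],
]
specific_var = [0.18, 0.32, 0.50]


def implied_cov(i, j):
    """Model-implied Cov(X_i, X_j) with standardized, independent factors."""
    common = sum(bi * bj for bi, bj in zip(loadings[i], loadings[j]))
    # The specific variance contributes only on the diagonal (i == j).
    return common + (specific_var[i] if i == j else 0.0)


# Communality of each variable: sum of its squared loadings.
communalities = [sum(b * b for b in row) for row in loadings]
# Full model-implied covariance matrix.
sigma = [[implied_cov(i, j) for j in range(3)] for i in range(3)]

for name, h2, row in zip(("X1", "X2", "X3"), communalities, sigma):
    print(name, "communality:", round(h2, 4), "implied row:", [round(v, 4) for v in row])
```

With these values each variable has unit variance, split into a communality and a specific variance, while every off-diagonal entry depends on the loadings alone.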

(10)

History of Factor Analysis?

Invented by Spearman (1904)

Usage hampered by onerousness of hand calculation

Since the advent of computers, usage has thrived, especially to develop:

• Theory

– e.g., determining the structure of personality

• Practice

– e.g., development of tens of thousands of psychological screening and measurement tests

(11)

Assumption Testing – Factorability

It is important to check the factorability of the correlation matrix

(i.e., how suitable is the data for factor analysis?)

• Check correlation matrix for correlations

• Check the anti-image matrix for diagonals

• Check measures of sampling adequacy (MSAs)

– Bartlett’s test of sphericity

– KMO

(12)

Rule of thumb: Measures of Sampling Adequacy

Are there several correlations over .3?

Are the diagonals of anti-image matrix > .5?

Is Bartlett’s test significant?

Is KMO > .5 ?

(13)

Assumption Testing – Factorability (Correlation and partial correlation)

Medium effort, reasonably accurate

Examine the diagonals of the anti-image correlation matrix to assess the sampling adequacy of each variable

Variables with diagonal anti-image correlations of less than .5 should be excluded from the analysis – they lack sufficient correlation with other variables

(14)

Assumption Testing – Factorability (Bartlett’s and KMO measure)

Quickest method, but least reliable

Sampling adequacy predicts whether the data you have collected are likely to "factor well", based on correlations and partial correlations. It is measured by the Kaiser-Meyer-Olkin (KMO) statistic.

Global diagnostic indicators - the correlation matrix is factorable if:

Bartlett’s test of sphericity is significant (null hypothesis: no correlation among the variables, i.e., R is an identity matrix) and/or

the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy is > .5
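Both diagnostics can be computed directly from a correlation matrix. The sketch below uses the standard textbook definitions (KMO from squared correlations and squared partial correlations; Bartlett's chi-square statistic from the determinant of R). The function names, the example matrix, and the sample size of 100 are assumptions made for illustration. The p-value of Bartlett's test needs a chi-square distribution and is not computed here.

```python
import math


def invert_with_det(matrix):
    """Gauss-Jordan inverse and determinant of a small square matrix."""
    n = len(matrix)
    # Augment the matrix with the identity.
    a = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        pivot_value = a[col][col]
        det *= pivot_value
        a[col] = [v / pivot_value for v in a[col]]
        for r in range(n):
            if r != col:
                factor = a[r][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a], det


def sampling_adequacy(corr):
    """Return (overall KMO, per-variable MSA list) for a correlation matrix."""
    inv, _ = invert_with_det(corr)
    p = len(corr)
    r2 = [[0.0] * p for _ in range(p)]  # squared correlations
    q2 = [[0.0] * p for _ in range(p)]  # squared partial correlations
    for i in range(p):
        for j in range(p):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2[i][j] = corr[i][j] ** 2
                q2[i][j] = partial ** 2
    per_variable = [sum(r2[i]) / (sum(r2[i]) + sum(q2[i])) for i in range(p)]
    total_r2 = sum(map(sum, r2))
    total_q2 = sum(map(sum, q2))
    return total_r2 / (total_r2 + total_q2), per_variable


def bartlett_statistic(corr, n_obs):
    """Chi-square statistic and degrees of freedom for Bartlett's test."""
    p = len(corr)
    _, det = invert_with_det(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    return chi2, p * (p - 1) // 2


# Hypothetical correlation matrix: three variables, all pairwise r = 0.5.
R = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
kmo, msa = sampling_adequacy(R)
chi2, dof = bartlett_statistic(R, n_obs=100)
print(round(kmo, 4), [round(m, 4) for m in msa], round(chi2, 2), dof)
```

The per-variable values play the role of the anti-image diagonals discussed earlier: a variable whose value falls below .5 would be a candidate for exclusion.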

(15)

(16)

Communalities

The proportion of variance in each variable which can be explained by the factors

Also called the explained variation due to factor.

Communalities range between 0 and 1

High communalities (> .5) show that the factors extracted explain most of the variance in the variables being analysed.

Low communalities (< .5) mean there is considerable

variance unexplained by the factors extracted

(17)

Eigen Values

EV = sum of squared correlations for each factor

EV = overall strength of relationship between a factor and the variables

Successive EVs have lower values

Eigen values over 1 are ‘stable’

(18)

Explained Variance

A good factor solution is one that explains the most variance with the fewest factors

Realistically, be satisfied with 50-75% of the variance explained

(19)

Example: interpreting the communality

Variable (1)   Variance (2)   Loadings on F1 (3)   Loadings on F2 (4)   Communality (5)   % explained (6) = 100×(5)/(2)
Finance        1.0000         0.0299               0.9995               0.9999            99.9910
Marketing      1.0000         0.9941              -0.0815               0.9949            99.4940
Policy         1.0000         0.9961               0.0514               0.9949            99.4920
Overall        3.0000         1.9815               1.0083               2.9898            99.6590

In the Overall row, columns (3) and (4) give the sums of squared loadings.

The loadings on $F_1$ are relatively large for marketing and policy but close to zero for finance. Conversely, the loadings on $F_2$ are relatively large for finance but relatively low for marketing and policy. This solution supports the expectation.

$F_1$ could be interpreted as verbal ability, and $F_2$ as quantitative ability.
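The communality and percentage columns of the table can be recomputed from the two loading columns. This is a small sketch using the loadings as printed; because they are rounded to four decimals, the recomputed sums may differ from the table in the last digit or two.

```python
# Loadings on F1 and F2 copied from the table above.
loadings = {
    "Finance": (0.0299, 0.9995),
    "Marketing": (0.9941, -0.0815),
    "Policy": (0.9961, 0.0514),
}
variance = 1.0  # each standardized grade has unit variance

# Communality: sum of squared loadings across the two factors.
communality = {name: b1 ** 2 + b2 ** 2 for name, (b1, b2) in loadings.items()}
pct_explained = {name: 100 * h2 / variance for name, h2 in communality.items()}

# Column sums of squared loadings: the contribution of each factor.
contribution_f1 = sum(b1 ** 2 for b1, _ in loadings.values())
contribution_f2 = sum(b2 ** 2 for _, b2 in loadings.values())

for name in loadings:
    print(f"{name:<10} communality={communality[name]:.4f} "
          f"explained={pct_explained[name]:.2f}%")
print(f"F1 contribution={contribution_f1:.4f}  F2 contribution={contribution_f2:.4f}")
```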

(20)

Assessment of the First Solution based on R

The communalities show that the factor model explains nearly 100%, 99.5%, and 99.5%, respectively, of the observed variance of finance, marketing and policy grades. Overall, the two factors explain 99.66% of the sum of all observed variances.

The sum of squared loadings on $F_1$ can be interpreted as the contribution of $F_1$, and that on $F_2$ as the contribution of $F_2$, in explaining the sum of the observed variances.

In our example, $F_1$ explains about 1.9815/3 or 66%, and $F_2$ about 1.0083/3 or 33.6%, of the sum of the observed variances.

Theoretically,

• the sum of squared loadings on $F_1$, 1.9815, is the largest eigenvalue of R, and the loadings on $F_1$ constitute the corresponding eigenvector (scaled so that its squared length equals the eigenvalue).

• the sum of squared loadings on $F_2$, 1.0083, is the second largest eigenvalue of R and

(21)

(22)

How Many Factors?

A subjective process.

Seek to explain maximum variance using fewest factors, considering:

1. Theory – what is predicted/expected?

2. Eigen Values > 1? (Kaiser’s criterion)

3. Scree Plot – where does it drop off?

4. Factors must be able to be meaningfully interpreted and make theoretical sense
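Criterion 2 and the "maximum variance with fewest factors" idea can be sketched as follows. The first two eigenvalues are those of the finance/marketing/policy example; the third is an assumption, inferred here from the fact that the eigenvalues of a 3 × 3 correlation matrix sum to 3.

```python
# First two eigenvalues from the worked example; the third is inferred from
# "sum of eigenvalues = p" (3 - 1.9815 - 1.0083) and is an assumption.
eigenvalues = [1.9815, 1.0083, 0.0102]
p = len(eigenvalues)

# Kaiser's criterion: keep factors whose eigenvalue exceeds 1.
retained = [ev for ev in eigenvalues if ev > 1.0]

# Cumulative percentage of total variance explained by the first k factors.
cumulative = []
running = 0.0
for ev in eigenvalues:
    running += ev
    cumulative.append(100 * running / p)

print("factors retained:", len(retained))
print("cumulative % explained:", [round(c, 2) for c in cumulative])
```

Kaiser's criterion is only one of the four considerations listed; theory, the scree plot, and interpretability may point to a different number of factors.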

(23)

Cattell & Jaspers (1967) suggest that the number of factors be taken as the number of eigenvalues immediately before the straight line begins.

(24)

Scree Plot

A bar graph of Eigen Values

Depicts the amount of variance explained by each factor.

Look for the point where additional factors fail to add appreciably to the cumulative explained variance.

1st factor explains the most variance

Last factor explains the least amount of variance

(25)

Factor Rotation

Factor loadings are not unique. There exist infinitely many sets of factor loadings yielding the same theoretical dispersion (covariance) matrix. The process of obtaining a new set of loadings with some specific objective is called factor rotation.

Orthogonal (e.g., Varimax)

Oblique (e.g., Oblimin)
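The non-uniqueness of loadings can be illustrated with a plain orthogonal rotation of two factor axes. This sketch is not Varimax itself (which searches for a particular angle); it only shows that any such rotation leaves the model-implied common covariance unchanged. The loadings and the 40-degree angle are hypothetical.

```python
import math

# Unrotated loadings for three variables on two factors (hypothetical values).
loadings = [
    [0.70, 0.50],
    [0.65, 0.55],
    [0.60, -0.60],
]


def rotate(matrix, degrees):
    """Apply an orthogonal rotation of the two factor axes."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [[b1 * c + b2 * s, -b1 * s + b2 * c] for b1, b2 in matrix]


def common_part(matrix):
    """Loadings times their transpose: the model-implied common covariance."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in matrix] for ri in matrix]


rotated = rotate(loadings, 40)
before, after = common_part(loadings), common_part(rotated)
unchanged = all(abs(x - y) < 1e-12
                for rb, ra in zip(before, after) for x, y in zip(rb, ra))

print([[round(v, 4) for v in row] for row in rotated])
print("common covariance unchanged:", unchanged)
```

The diagonal of the common part holds the communalities, so these are preserved as well; only the split of each communality between the two factors changes.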

(26)

Factor loading stages

In practice, FA can be carried out in two stages.

• In the first stage, one set of loadings is estimated. These loadings may not agree with the prior expectations, or may not lend themselves to a reasonable interpretation.

• In the second stage, the first set of factor loadings is "rotated" in an effort to arrive at another set that is more consistent with prior expectations or more easily interpretable.

• variables with cross-loading shall be omitted from the

(27)

How do I eliminate items?

A subjective process, but consider:

Size of main loading (min = .5)

Size of cross-loadings (max = .3?)

Eliminate 1 variable at a time, then re-run, before deciding which (if any) items to eliminate next

Number of items already in the factor

More items in a factor -> greater reliability

Minimum = 3

Maximum = unlimited
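The loading rules of thumb above could be applied mechanically as a first pass, as in the sketch below. The item names and loadings are hypothetical, and picking the "worst" item by how far it misses a threshold is an assumption of this sketch rather than a rule from the slides; the final decision remains subjective.

```python
MIN_MAIN = 0.5   # minimum size of the main loading (from the slide)
MAX_CROSS = 0.3  # maximum size of any cross-loading (from the slide)

# Hypothetical rotated loadings: item -> loadings on three factors.
rotated_loadings = {
    "item_a": [0.82, 0.10, 0.05],
    "item_b": [0.45, 0.20, 0.15],  # main loading too small
    "item_c": [0.61, 0.42, 0.08],  # cross-loading too large
    "item_d": [0.12, 0.77, 0.21],
}


def screen(loadings):
    """Return (reason, margin) if the item fails a rule of thumb, else None."""
    sizes = sorted((abs(v) for v in loadings), reverse=True)
    main, cross = sizes[0], sizes[1]
    if main < MIN_MAIN:
        return ("weak main loading", MIN_MAIN - main)
    if cross > MAX_CROSS:
        return ("high cross-loading", cross - MAX_CROSS)
    return None


flagged = {item: screen(v) for item, v in rotated_loadings.items() if screen(v)}
# Drop only one item (here: the largest margin), then re-run the analysis.
worst = max(flagged, key=lambda item: flagged[item][1]) if flagged else None

print(flagged)
print("eliminate first:", worst)
```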

(28)

Factor Analysis: an example

Suppose that an automobile company asked a large number of questions about different vehicles.

Consider how the different items (features) might be more parsimoniously represented by just a few constructs (factors).

- Ideally, interval data (e.g., ratings on a k-point scale) on consumers' perceptions of a number of features are required

(29)

(30)

(31)

(32)

We are looking for an eigenvalue above 1.0.

Cumulative percent of variance explained.

(33)

(34)

(35)

Component 1: Expensive, Exciting, Luxury, Distinctive, Not Conservative, Not Family, Not Basic

Component 2: Appeals to Others, Attractive Looking, Trend Setting

Component 3: Reliable, Latest Features, Trust

(36)

What shall these components be called?

(37)

EXCLUSIVE: Expensive, Exciting, Luxury, Distinctive, Not Conservative, Not Family, Not Basic

TRENDY: Appeals to Others, Attractive Looking, Trend Setting

RELIABLE: Reliable, Latest Features, Trust

(38)

Calculate Component Scores (summated score)

EXCLUSIVE = (Expensive + Exciting + Luxury + Distinctive – Conservative – Family – Basic)/7

TRENDY = (Appeals to Others + Attractive Looking + Trend Setting)/3

RELIABLE = (Reliable + Latest Features + Trust)/3
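The three summated scores can be computed as signed averages of the item ratings. In this sketch the ratings for one vehicle are hypothetical; only the item groupings and signs follow the formulas on this slide.

```python
# Hypothetical ratings for one vehicle (not taken from the slides).
ratings = {
    "Expensive": 6, "Exciting": 7, "Luxury": 6, "Distinctive": 7,
    "Conservative": 2, "Family": 1, "Basic": 2,
    "Appeals to Others": 6, "Attractive Looking": 7, "Trend Setting": 5,
    "Reliable": 6, "Latest Features": 7, "Trust": 5,
}

# Sign +1 adds the item, -1 subtracts it, mirroring the slide's formulas.
components = {
    "EXCLUSIVE": {"Expensive": 1, "Exciting": 1, "Luxury": 1, "Distinctive": 1,
                  "Conservative": -1, "Family": -1, "Basic": -1},
    "TRENDY": {"Appeals to Others": 1, "Attractive Looking": 1, "Trend Setting": 1},
    "RELIABLE": {"Reliable": 1, "Latest Features": 1, "Trust": 1},
}


def summated_score(items, data):
    """Signed sum of the item ratings divided by the number of items."""
    return sum(sign * data[name] for name, sign in items.items()) / len(items)


scores = {name: summated_score(items, ratings) for name, items in components.items()}
print(scores)
```

Because three items enter EXCLUSIVE with a negative sign, that score can be negative even when all ratings are positive.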

(39)

(40)

(41)

Vehicle    Exclusive   Trendy   Reliable
Beetle      1.4        6.7      6.9
Hummer      3.9        6.2      6.7
Lotus       4.1        7.3      6.7
Minivan    -1.67       4.83     6.5
Pick-Up    -0.43       4.93     6.3

Not much difference on the Reliable dimension.

(42)


(43)

Practical session : using SPSS

Step 1: Open the data file, for example, Example.SAV

Step 2: Click in sequence: Analyze → Data Reduction → Factor…

Step 3: Move the three variables – X1, X2 & X3 - from Source to Variable box

(44)

Step 4: Click on Descriptives. Activate

• Coefficients

• Significance levels

• KMO and Bartlett’s test of sphericity

• Anti-image

Click on Continue.

This will produce the correlation matrix and significance of correlations, sampling adequacy, and the test of sphericity.

Step 5: Click on Extraction. Activate

• Correlation Matrix

• Unrotated factor solution

• Eigenvalues greater than 1

Click on Continue.

This will produce loadings from the correlation matrix, and the number of factors will be the same as the number of eigenvalues greater than 1.

(45)

Step 6: Click on Rotation. Activate

• Varimax

• Rotated Solution

Click on Continue.

Step 7: Click on OK.

SPSS will produce 8 tables as output, with the following titles:

1. Correlation Matrix

2. KMO & Bartlett’s Test

3. Anti-image Matrices

4. Communalities

5. Total Variance Explained

6. Component Matrix

7. Rotated Component Matrix

8. Component Transformation Matrix

(46)

Composite Factor Values

Frequently, FA is not an end in itself but an intermediate step on the way to further analysis of the data. In such cases we may require composite values for each factor based on the original or standardized data. In recent years, composite values have been generated through three techniques.

• Surrogate variables (the surrogate variable of a factor is the single variable that has the highest loading on that factor)

• Summated scales (the values of the several variables defining a factor are summed, and their total or average score is used)

• Factor scores (computer generated scores available under
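Of the three techniques, the surrogate variable is the simplest to illustrate. The sketch below picks, for each factor, the variable with the largest absolute loading, using the loadings from the finance/marketing/policy example earlier in the slides; taking the absolute value is an assumption made here so that a strong negative loading would also qualify.

```python
# Loadings from the finance/marketing/policy example earlier in the slides.
loadings = {
    "Finance": (0.0299, 0.9995),
    "Marketing": (0.9941, -0.0815),
    "Policy": (0.9961, 0.0514),
}


def surrogate(factor_index):
    """Variable with the largest absolute loading on the given factor."""
    return max(loadings, key=lambda name: abs(loadings[name][factor_index]))


surrogates = {f"F{j + 1}": surrogate(j) for j in range(2)}
print(surrogates)
```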

(47)

Advantages & Disadvantages of the Techniques

Surrogate Variables

Advantages:
• Simple to administer and interpret

Disadvantages:
• Does not represent all facets of a factor
• Prone to measurement error

Factor Scores

Advantages:
• Represent all variables through loadings
• Best method for complete data reduction
• By default orthogonal

Disadvantages:
• Interpretation more difficult because all variables contribute through loadings

Summated Scales

Advantages:
• Compromise between the surrogate variable and factor score options
• Reduce measurement error
• Represent multiple facets of a concept

Disadvantages:
• Include only the variables that load highly on the factor and exclude those having little or marginal impact
• Not necessarily orthogonal
• Require extensive analysis of reliability and validity

Source: Hair et al

(48)

Judging Practical Significance of FA

In interpreting factors, a decision must be made regarding the factor loadings. A factor loading is the correlation of the variable and the factor; the squared loading is the amount of the variable's total variation accounted for by the factor. Thus, a 0.3 loading translates to 9 per cent explanation, and a 0.5 loading denotes that 25% of the variation is accounted for by the factor. The loading must exceed 0.7 for the factor to account for 50% of the variation of the variable. Thus, the larger the absolute size of the factor loading, the more important the loading is in interpreting the factor matrix. Using practical significance as the criterion, we can assess the loadings as follows.

• Factor loadings in the range of ± 0.3 to ± 0.4 are considered to meet the minimal level for interpretation of structure

• Loadings with an absolute value of 0.5 or greater are considered practically significant

•
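The arithmetic behind these thresholds is just the square of the loading. A short sketch (the helper name `explained_pct` is made up for this illustration):

```python
import math


def explained_pct(loading):
    """Percentage of a variable's variance accounted for by one factor."""
    return 100 * loading ** 2


for loading in (0.3, 0.5, 0.7):
    print(f"loading {loading:.1f} -> {explained_pct(loading):.0f}% of variance")

# Loading needed for the factor to account for half of a variable's variance.
threshold = math.sqrt(0.5)
print(f"50% requires |loading| >= {threshold:.4f}")
```

A loading of exactly 0.7 falls just short of 50%, which is why the text says the loading must exceed 0.7.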

(49)

Some Relations Among Output Values

A number of relations exist among the outputs, which help us to understand and interpret them better. The major relations, when the input matrix is a p × p correlation matrix, are the following.

1. Sum of all eigenvalues = p = total variance of the p standardized variables.

2. Sum of squared factor loadings for the j-th factor = $\lambda_j$ = j-th largest eigenvalue.

3. $\lambda_j$ = amount of variance the j-th factor explains.

4. $\lambda_j / p$ = proportion of variance explained by the j-th factor.

5. Sum of squared factor loadings for the i-th variable = i-th communality.

6. i-th communality = proportion of the variance of the i-th standardized variable explained by the common factor model.

7. The (i, j)-th factor loading is the correlation between the i-th variable and the j-th factor.
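Relations 1 to 5 can be checked by hand in the smallest possible case. The sketch below assumes principal component extraction from a 2 × 2 correlation matrix with a hypothetical correlation r = 0.6, for which the eigenvalues and eigenvectors are known in closed form, and keeps both factors.

```python
import math

r = 0.6  # hypothetical correlation between two standardized variables
p = 2

# For R = [[1, r], [r, 1]] the eigenvalues are 1 + r and 1 - r, with
# unit eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
eigenvalues = [1 + r, 1 - r]
eigenvectors = [(1 / math.sqrt(2), 1 / math.sqrt(2)),
                (1 / math.sqrt(2), -1 / math.sqrt(2))]

# Principal-component loadings: eigenvector scaled by sqrt(eigenvalue).
# loadings[i][j] is the loading of variable i on factor j.
loadings = [[eigenvectors[j][i] * math.sqrt(eigenvalues[j]) for j in range(p)]
            for i in range(p)]

sum_sq_by_factor = [sum(loadings[i][j] ** 2 for i in range(p)) for j in range(p)]
communalities = [sum(loadings[i][j] ** 2 for j in range(p)) for i in range(p)]
proportion = [ev / p for ev in eigenvalues]

print("sum of eigenvalues:", sum(eigenvalues))
print("sum of squared loadings per factor:", [round(v, 4) for v in sum_sq_by_factor])
print("proportion of variance per factor:", proportion)
print("communalities (all factors kept):", [round(v, 4) for v in communalities])
```

With every factor retained the communalities equal 1; dropping the second factor would leave each variable with a communality of (1 + r)/2.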