
Statistics for Business Decision Making


Academic year: 2021


Factor Analysis Introduction Model Specification Factor Model Assumptions Matrix notation Model estimation Initial Factor Solution PC Method Rotation Factor scores Marketing Applications Further illustrative applications

Statistics for Business Decision Making

Faculty of Economics University of Siena


You should be able to:

• Summarize and uncover any patterns in a set of multivariate data using the Factor Model (FM)

• Apply factor analysis to business decision-making situations


Data reduction

• Factor Analysis (FA) is a multivariate statistical technique for data reduction

• Starting point: a large dataset with many correlated variables $X_1, X_2, \ldots, X_k$. The interdependence among the variables is explored. Because the variables are correlated, the information content of a given variable may overlap with that of any other variable, so the same information is counted twice in the original dataset

• Through FA a smaller set of new unobserved variables (the common factors) is identified that can be used to explain the interrelationships among the original variables.


How do the factors explain the association among the original variables

To say that the factors explain the associations among the original variables means that the original variables are assumed to be conditionally independent given the factors.

In other words, any correlation between each pair of measured (manifest) variables arises because of their mutual association with the common factors.


Aim of Factor Analysis

The definition and interpretation of a smaller number ($m < k$) of new variables $F_1, F_2, \ldots, F_m$ (called factors, often to be thought of as latent constructs) that capture the statistical information contained in the original variables

• Advantage: reduction in the complexity of the data, greater simplicity in describing the observed phenomenon

• Disadvantage: loss of information plus the introduction of an error component

Trade-off: how much of the original information are we willing to lose in order to achieve a more parsimonious data summary?

Usually, the stronger the correlations among the original variables, the smaller the number of factors needed to adequately summarize the information


Exploratory FA vs. Confirmatory FA

Exploratory Factor Analysis: starts from observed data to identify unobservable underlying factors, unknown to the researcher but expected to exist from theory

Confirmatory Factor Analysis: the researcher wants to test one or more specific underlying structures, specified prior to the analysis. This is frequently the case in psychometric studies


Latent Variable Models (LVM)

Factor Analysis may be classified within the framework of Latent Variable Models (LVM).

LVM are used to represent the complex relations among several manifest variables by simple relations between the variables and an underlying latent structure.

Factor Analysis is a Latent Variable Model where both manifest and latent variables are measured on a metrical scale


Factor Analysis in Marketing

Many steps are involved:

1 Identify the main attributes used to evaluate a product/service (for a toothpaste these may be the benefits provided in preventing plaque and tartar, freshening the breath, keeping the gums healthy, keeping the mouth clean, etc.)

2 Collect data from a random sample of potential customers on their ratings of all the product attributes (for example on a Likert scale ranging from 1 to 5)

3 Run a factor analysis to find a set of underlying factors that summarize the respondents' attitudes towards that product/service

4 Use the new, smaller set of factors either to construct perceptual maps and other product positioning services or to simplify subsequent analysis of the data (through regression models or clustering methods)


Example 1 - Attitude and consumer behaviour towards supermarkets

Original variables: items that measure consumers' attitudes towards supermarkets

• convenience in reaching the store

• product prices

• store location

• sales promotion

• width of aisle in the store

• store atmosphere and decoration

• store size

Aim:

1 to summarize the original dataset into a smaller number of dimensions (through FA)

2 to evaluate the effect of the summary dimensions on the choice of the preferred kind of supermarket (through logit regression). Because the factors are uncorrelated, multicollinearity is not a concern


Example 2 - Buying behaviour towards local products

Original variables: a set of attitudinal statements relating to different aspects of consumers' buying behaviour towards local products

• production methods

• appearance of a special label

• use of no chemical additives

• help of local economy

• price, quality and nutrition value

• environmental and health protection

• external appearance

• attractiveness of packing

• freshness and taste

• prestige and curiosity

Aim:

1 to identify a smaller number of underlying factors that affect consumers' buying behaviour towards local products (through FA)

2 to use the new factors for grouping consumers with similar patterns


Linear Factor Model

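Each observed variable $X_j$ is linearly related to $m$ common factors $F_1, F_2, \ldots, F_m$ and a unique component $\varepsilon_j$:

$$
\begin{aligned}
X_1 &= \gamma_{11}F_1 + \gamma_{12}F_2 + \ldots + \gamma_{1m}F_m + \varepsilon_1 \\
X_2 &= \gamma_{21}F_1 + \gamma_{22}F_2 + \ldots + \gamma_{2m}F_m + \varepsilon_2 \\
&\;\;\vdots \\
X_j &= \gamma_{j1}F_1 + \gamma_{j2}F_2 + \ldots + \gamma_{jm}F_m + \varepsilon_j \\
&\;\;\vdots \\
X_k &= \gamma_{k1}F_1 + \gamma_{k2}F_2 + \ldots + \gamma_{km}F_m + \varepsilon_k
\end{aligned}
$$

$X_j$ ($j = 1, 2, \ldots, k$) is the original (standardized) variable

$F_h$ ($h = 1, 2, \ldots, m$) denotes the unobserved common factor

$\gamma_{j1}, \gamma_{j2}, \ldots, \gamma_{jm}$ are the factor loadings of $X_j$ on the common factors

$\varepsilon_j$ is the residual or unique (as opposed to common) component. It measures the error committed when the original data are summarized by $m$ factors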


Comments on the variables in the model

• The standardization of the original variables is needed when they are not measured in the same units (and also when they are on very different scales). If they are not standardized, the variables with the larger variances would have a greater weight in the estimation method of the factor model.

• The variables must be quantitative. For qualitative variables, different methods of data reduction must be applied (correspondence analysis, multidimensional scaling)
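As a minimal sketch of the standardization step, assuming NumPy and a purely hypothetical two-variable dataset (the variable meanings in the comments are illustrative, not taken from the slides), each column can be converted to z-scores so that every variable has zero mean and unit variance:

```python
import numpy as np

def standardize(X):
    """Return column-wise z-scores: zero mean and unit variance per variable."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Hypothetical data: two variables measured on very different scales.
rng = np.random.default_rng(0)
X_raw = np.column_stack([
    rng.normal(50_000, 12_000, 200),  # e.g. a monetary amount
    rng.normal(3, 1, 200),            # e.g. a rating on a short scale
])
Z = standardize(X_raw)
```

After this step both columns carry the same weight in any method that works on variances and covariances.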


Model assumptions

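• A1. Linearity of the relationship

• A2. $E[F_h] = 0$; $\mathrm{Var}[F_h] = 1$; $\mathrm{Cov}(F_h, F_s) = 0$, for $h, s = 1, 2, \ldots, m$; $s \neq h$

• A3. $E[\varepsilon_j] = 0$; $\mathrm{Cov}(\varepsilon_j, \varepsilon_t) = 0$, for $j, t = 1, 2, \ldots, k$; $t \neq j$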


Comments on assumptions

A1. Linear models are widely used in statistical data analysis

A2. Since the factors are not observable, we might as well think of them as measured in standardized form. Being uncorrelated, each factor has its own information content that does not overlap with the information content of the other factors

A3. The unique term can be considered as the error term in a linear regression model, since it represents the part of an observed variable not accounted for by the common factors. Homoskedasticity is not required

A3 and A4 imply that the correlation between any two observed variables is due solely to the common factors


Consequences of assumptions: variances

The variances of the observed variables are functions of:

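• the factor loadings ($\gamma$-coefficients)

• the variances of the unique terms.

$$
\begin{aligned}
\mathrm{Var}(X_j) = 1 &= \gamma_{j1}^2\,\mathrm{Var}(F_1) + \ldots + \gamma_{jm}^2\,\mathrm{Var}(F_m) + \mathrm{Var}(\varepsilon_j) \\
&= \gamma_{j1}^2 + \ldots + \gamma_{jm}^2 + \mathrm{Var}(\varepsilon_j) \\
&= \underbrace{\sum_{h=1}^{m} \gamma_{jh}^2}_{\text{communality}} + \underbrace{\mathrm{Var}(\varepsilon_j)}_{\text{uniqueness}}
\end{aligned}
\tag{1}
$$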


Communality and uniqueness

The communality of an observed variable is the proportion of its variance that is explained by the common factors.

The larger the communality, the more successful the factor model can be in explaining the variable.

The uniqueness (or specific variance) is the part of the variance of $X_j$ that is not accounted for by the common factors but it's due


Consequences of assumptions: covariances

The covariances between the observed variables are only functions of the factor loadings:

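$$
\mathrm{Cov}(X_j, X_t) = \gamma_{j1}\gamma_{t1} + \gamma_{j2}\gamma_{t2} + \ldots + \gamma_{jm}\gamma_{tm} = \sum_{h=1}^{m} \gamma_{jh}\gamma_{th}
\tag{2}
$$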
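The step from the linear factor model to equation (2) can be sketched as follows. The derivation assumes, in addition to A2 and A3, that the unique terms are uncorrelated with the common factors, $\mathrm{Cov}(F_h, \varepsilon_j) = 0$; this condition is not listed among A1-A3 in this text and is stated here as an assumption. For $j \neq t$:

$$
\begin{aligned}
\mathrm{Cov}(X_j, X_t)
&= \mathrm{Cov}\Big(\sum_{h=1}^{m} \gamma_{jh}F_h + \varepsilon_j,\; \sum_{s=1}^{m} \gamma_{ts}F_s + \varepsilon_t\Big) \\
&= \sum_{h=1}^{m}\sum_{s=1}^{m} \gamma_{jh}\gamma_{ts}\,\mathrm{Cov}(F_h, F_s)
 + \sum_{h=1}^{m} \gamma_{jh}\,\mathrm{Cov}(F_h, \varepsilon_t)
 + \sum_{s=1}^{m} \gamma_{ts}\,\mathrm{Cov}(\varepsilon_j, F_s)
 + \mathrm{Cov}(\varepsilon_j, \varepsilon_t) \\
&= \sum_{h=1}^{m} \gamma_{jh}\gamma_{th}\,\mathrm{Var}(F_h)
 = \sum_{h=1}^{m} \gamma_{jh}\gamma_{th}
\end{aligned}
$$

The double sum collapses because the factors are mutually uncorrelated with unit variance (A2), the last term vanishes by A3, and the two middle terms vanish under the additional assumption above.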

The covariances between observed variables and factors are expressed by the factor loadings:


FM in matrix notation

$$
X = F\Gamma' + E
\tag{4}
$$

$X$: $(n \times k)$ matrix of $k$ original variables

$F$: $(n \times m)$ matrix of $m$ factors

$\Gamma$: $(k \times m)$ rectangular matrix of factor loadings whose generic element is $\{\gamma_{jh}\}$, $j = 1, \ldots, k$; $h = 1, \ldots, m$


X, F and E matrices

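$$
X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1k} \\
x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nk}
\end{pmatrix}
= \begin{pmatrix} X_1 & X_2 & \cdots & X_k \end{pmatrix}
\tag{5}
$$

$$
E = \begin{pmatrix} \varepsilon_1 & \varepsilon_2 & \cdots & \varepsilon_k \end{pmatrix}
\tag{6}
$$

$$
F = \begin{pmatrix}
F_{11} & F_{12} & \cdots & F_{1m} \\
F_{21} & F_{22} & \cdots & F_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
F_{n1} & F_{n2} & \cdots & F_{nm}
\end{pmatrix}
= \begin{pmatrix} F_1 & F_2 & \cdots & F_m \end{pmatrix}
\tag{7}
$$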


$\Gamma$ matrix

$$
\Gamma = \begin{pmatrix}
\gamma_{11} & \gamma_{12} & \cdots & \gamma_{1m} \\
\gamma_{21} & \gamma_{22} & \cdots & \gamma_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
\gamma_{k1} & \gamma_{k2} & \cdots & \gamma_{km}
\end{pmatrix}
\tag{8}
$$

is the matrix of factor loadings

$\gamma_{jh}$ ($j = 1, \ldots, k$; $h = 1, \ldots, m$) is the loading of $X_j$ on $F_h$. It is a measure of the correlation between the $j$-th variable and the $h$-th factor.

The $\Gamma$ matrix tells us which variables are mainly related to the different factors by detecting the strength and the sign of these links.


Communalities

Matrix of the squared factor loadings

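$$
\begin{array}{c|ccccc|c}
 & F_1 & \cdots & F_h & \cdots & F_m & \text{Communality} \\
\hline
X_1 & \gamma_{11}^2 & \cdots & \gamma_{1h}^2 & \cdots & \gamma_{1m}^2 & \sum_{h=1}^{m} \gamma_{1h}^2 \\
X_j & \gamma_{j1}^2 & \cdots & \gamma_{jh}^2 & \cdots & \gamma_{jm}^2 & \sum_{h=1}^{m} \gamma_{jh}^2 \\
X_k & \gamma_{k1}^2 & \cdots & \gamma_{kh}^2 & \cdots & \gamma_{km}^2 & \sum_{h=1}^{m} \gamma_{kh}^2
\end{array}
$$

The sum by row gives the communality. With reference to the $j$-th row, $\sum_{h=1}^{m} \gamma_{jh}^2$ is the communality of $X_j$, that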


Theoretical Variance-Covariance Matrices

In the light of the model assumptions

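$$
\Sigma = \mathrm{Var}(X) = \Gamma\Gamma' + \Psi
\tag{9}
$$

$\Sigma$: $(k \times k)$ var-cov matrix of the original variables; symmetric, unit variances on the main diagonal, covariances off-diagonal

$$
\mathrm{Var}(X_j) = 1 = \sum_{h=1}^{m} \gamma_{jh}^2 + \mathrm{Var}(\varepsilon_j)
\tag{10}
$$

$$
\mathrm{Cov}(X_j, X_t) = \sum_{h=1}^{m} \gamma_{jh}\gamma_{th}
\tag{11}
$$

$\Psi$: $(k \times k)$ var-cov matrix of the unique components; diagonal,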
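To make the decomposition in equation (9) concrete, here is a small sketch assuming NumPy and a hypothetical loading matrix with $k = 4$ variables and $m = 2$ factors (the numbers are invented for illustration). It builds the uniquenesses from equation (10) and then the implied var-cov matrix:

```python
import numpy as np

# Hypothetical loadings for k = 4 standardized variables on m = 2 factors.
Gamma = np.array([[0.8, 0.1],
                  [0.7, 0.2],
                  [0.1, 0.9],
                  [0.2, 0.6]])

communality = (Gamma ** 2).sum(axis=1)   # sum over h of gamma_jh^2
uniqueness = 1.0 - communality           # Var(eps_j), from equation (10)
Psi = np.diag(uniqueness)                # diagonal var-cov matrix of unique terms

Sigma = Gamma @ Gamma.T + Psi            # equation (9)
```

The diagonal of `Sigma` is one by construction, and each off-diagonal entry is the sum of products of loadings given by equation (11).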


Observed vs. Theoretical variances

On the one hand we have the observed variances and covariances of the $X$ variables.

The observed var-cov matrix contains $\frac{k(k-1)}{2}$ distinct values (the elements above the diagonal)

On the other, the variances and covariances implied by the factor model.

The theoretical var-cov matrix contains $km$ parameters (only the factor loadings, since the specific variances are functions of them)

The model is useful for reducing the complexity if $km < \frac{k(k-1)}{2}$
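The counting condition can be checked directly. The helper below is a hypothetical illustration (its name is not from the slides) that compares the number of loadings with the number of distinct covariances:

```python
def is_parsimonious(k, m):
    """True when the k*m loadings are fewer than the k(k-1)/2 distinct covariances."""
    return k * m < k * (k - 1) / 2

# With k = 10 variables there are 45 distinct covariances,
# so the condition holds for m = 1..4 and fails from m = 5 onwards.
results = {m: is_parsimonious(10, m) for m in range(1, 7)}
```

Dividing both sides by $k$ shows the condition is equivalent to $m < (k-1)/2$.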


Three estimation stages

1 estimating the factor loadings $\gamma_{jh}$ (initial solution) as well as the communalities

2 trying to simplify the initial solution through a process known as factor rotation. After the rotation the final factor solution is supposed to be more easily interpreted. Interpretation is useful to derive a meaningful label for each of the factors

3 estimating the factor scores so that these can be used in


Model Estimation - First stage

If the model's assumptions are true, we should be able to estimate the loadings $\gamma_{jh}$ and the communalities so that the resulting estimates of the theoretical variances and covariances are close to the observed ones.

Most common estimation methods:

• Principal components method


Principal components

The principal component variables $y_1, y_2, \ldots, y_k$ are defined to be linear combinations of the original variables $X_1, X_2, \ldots, X_k$ that are uncorrelated and account for maximal proportions of the variation in the original data,

i.e., $y_1$ accounts for the maximum amount of the variance among all possible linear combinations of $X_1, X_2, \ldots, X_k$ (that is, it conveys the maximum informative contribution about the original variables)

$y_2$ accounts for the maximum of the remaining variance subject


Principal Components Method: Initial Factor Solution

Given $X$: $(n \times k)$ matrix of $k$ original variables and $\Sigma$: $(k \times k)$ var-cov matrix of the original variables, the first principal component to be extracted is a linear combination of the $X_j$ of the following kind:

$$
y_1 = v_{11}X_1 + v_{12}X_2 + \ldots + v_{1k}X_k
\tag{12}
$$

or

$$
y_1 = Xv_1
\tag{13}
$$

where $y_1$ is the $(n \times 1)$ vector of the values of the first principal component and

$$
v_1 = \begin{pmatrix} v_{11} \\ v_{12} \\ \vdots \\ v_{1k} \end{pmatrix}
$$

is the $(k \times 1)$ vector of the coefficients of the linear combination

$v_1$ has to be estimated in such a way that $\mathrm{Var}(y_1)$ is maximal under the constraint $v_1'v_1 = 1$


First principal component

The solution of the constrained maximization problem (that is, the vector $v_1$ that maximizes the variance of the first principal component subject to the constraint) is the first eigenvector of the $\Sigma$ matrix. Moreover, $\mathrm{Var}(y_1) = \lambda_1$, where $\lambda_1$ is the first eigenvalue of $\Sigma$.

It holds that

$$
\Sigma v_1 = \lambda_1 v_1
\tag{14}
$$

Since the total variability of the original variables (i.e. the sum of their variances) is equal to $k$ (remember: they are standardized variables, each one has a variance equal to one), the ratio $\frac{\lambda_1}{k}$ gives the share of total variability that is explained by the first principal component.


Second principal component

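The second principal component is $y_2 = Xv_2$, where $v_2$ is estimated in such a way that $\mathrm{Var}(y_2)$ is maximal under the constraints $v_2'v_2 = 1$ and $\mathrm{Cov}(y_1, y_2) = 0$.

$v_2$ is the second eigenvector of the $\Sigma$ matrix.

Moreover, $\mathrm{Var}(y_2) = \lambda_2$, where $\lambda_2$ is the second eigenvalue of $\Sigma$.

The ratio $\frac{\lambda_2}{k}$ gives the share of total variability that is explained by the second principal component.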


$i$-th principal component

The $i$-th principal component is $y_i = Xv_i$, where $v_i$ is estimated in such a way that $\mathrm{Var}(y_i)$ is maximal under the constraints $v_i'v_i = 1$ and $\mathrm{Cov}(y_i, y_l) = 0$ ($l = 1, 2, \ldots, i-1$).

$v_i$ is the $i$-th eigenvector of the $\Sigma$ matrix, whereas for the corresponding eigenvalue $\lambda_i$ it holds that $\mathrm{Var}(y_i) = \lambda_i$.

The ratio $\frac{\lambda_i}{k}$ gives the share of total variability that is explained by the $i$-th principal component.

The cumulative ratio $\frac{\lambda_1 + \lambda_2 + \ldots + \lambda_i}{k}$ measures the share of total variability that is explained by the principal components up to the $i$-th.


Extraction of all the principal components

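The method could in principle stop only when the number of extracted components equals the number of initial variables.

$$
Y = XV
\tag{15}
$$

where

$Y$: $(n \times k)$ matrix of principal components; $Y = \begin{pmatrix} y_1 & y_2 & \cdots & y_k \end{pmatrix}$

$V$: $(k \times k)$ matrix of eigenvectors of $\Sigma$; $V = \begin{pmatrix} v_1 & v_2 & \cdots & v_k \end{pmatrix}$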


Covariance matrix of principal components

$$
L = \mathrm{Cov}(Y) = \begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_k
\end{pmatrix}
\tag{16}
$$

where $\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_k$ and $\sum_{i=1}^{k} \lambda_i = k$

$y_1$ shows the greatest information content, $y_2$ shows the second greatest information content, ...

Each principal component brings an information content which is not greater than the one brought by the previous principal component

The $k$ principal components explain 100% of the original variability

However, in order for the method to actually produce a data reduction, the number of extracted components should be smaller than the original data dimension ($m < k$).
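The extraction of all principal components can be sketched numerically. The example below assumes NumPy and uses simulated, hypothetical data; the sample covariance matrix of the standardized variables plays the role of $\Sigma$. It uses `numpy.linalg.eigh`, which returns eigenvalues in ascending order, so they are re-sorted to match $\lambda_1 \geq \ldots \geq \lambda_k$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 500, 4

# Hypothetical correlated data: one shared component plus independent noise.
common = rng.normal(size=(n, 1))
X_raw = common @ np.array([[1.0, 0.8, 0.6, 0.4]]) + rng.normal(size=(n, k))
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0, ddof=1)  # standardized

R = np.cov(X, rowvar=False)              # sample counterpart of Sigma
eigvals, eigvecs = np.linalg.eigh(R)     # ascending eigenvalues
order = np.argsort(eigvals)[::-1]        # reorder so lambda_1 is the largest
lam, V = eigvals[order], eigvecs[:, order]

Y = X @ V                                # equation (15)
L = np.cov(Y, rowvar=False)              # diagonal, with lam on the diagonal
share = lam / k                          # share of total variability per component
```

The components are uncorrelated, their variances equal the eigenvalues, and the shares add up to one because the eigenvalues sum to $k$.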


The choice of the number of components to be retained

The number of principal components can be either directly specified or determined through a statistical/heuristic criterion. In the former case, the estimation can be repeated with a different number of components and the solutions can then be compared according to goodness-of-fit statistics in order to choose the one that best describes the data.


In the latter case, examples of heuristic criteria are:

1 to extract and retain only those components whose associated eigenvalues exceed one (one is the mean value of the eigenvalues)

2 to retain those components that explain a given share, usually higher than 70-75%, of the original variability (a 30% loss of variability can usually be accepted in exchange for a reduction in the data dimensions)

3 to use the scree plot (the plot of the eigenvalues, on the y axis, against the order of extraction, on the x axis); the extraction should be stopped when the plot becomes flat (the elbow rule)


Reading the FA output

Factor   Eigenvalue λi   Difference λi−λi+1   Proportion λi/k   Cumulative proportion (λ1+...+λi)/k
1        5.363           3.789                0.536             0.536
2        1.574           0.248                0.157             0.693
3        1.326           0.439                0.133             0.826
4        0.887           0.347                0.089             0.915
5        0.540           0.332                0.054             0.969
6        0.208           0.132                0.021             0.990
7        0.076           0.055                0.008             0.998
8        0.021           0.016                0.002             1.000
9        0.005           0.005                0.000             1.000
10       0.000           -                    0.000             1.000

Based on the rule of eigenvalues greater than the average, three factors may be retained.
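The first two heuristic criteria can be applied mechanically to the eigenvalues in the output above. This is a small sketch assuming NumPy; the 70% threshold is the lower bound mentioned in the criteria and is an adjustable assumption:

```python
import numpy as np

# Eigenvalues copied from the output table above (k = 10 variables).
lam = np.array([5.363, 1.574, 1.326, 0.887, 0.540,
                0.208, 0.076, 0.021, 0.005, 0.000])
k = lam.size

# Criterion 1: keep components whose eigenvalue exceeds one (the mean eigenvalue).
n_eigen_rule = int((lam > 1).sum())

# Criterion 2: keep the smallest number of components reaching a 70% cumulative share.
cum_share = np.cumsum(lam) / k
n_share_rule = int(np.argmax(cum_share >= 0.70)) + 1
```

On these data both criteria point to three components; the scree plot criterion requires visual inspection and is not automated here.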


Scree plot


From principal components to factor loadings

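Once we have retained the first $m$ principal components,

$Y$: $(n \times m)$ matrix of $m$ principal components; $Y = \begin{pmatrix} y_1 & y_2 & \cdots & y_m \end{pmatrix}$

$V$: $(k \times m)$ matrix of eigenvectors of $\Sigma$; $V = \begin{pmatrix} v_1 & v_2 & \cdots & v_m \end{pmatrix}$

$$
L = \mathrm{Cov}(Y) = \begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_m
\end{pmatrix}
$$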


Interpretation of factor solution

Factors are artificial constructs.

Meaning is assigned to a factor through the subset of observed variables that have high loadings on that factor.

The interpretation of the factors could be an easy task if every one of them was strongly correlated with a limited number of original variables and weakly correlated with the remaining variables (the higher the loadings of a few variables on one factor the more interpretable the factor).


Statistical relevance of a factor loading

Rule of thumb:

with a sample size of $n = 200$ units, a reasonable threshold for a factor loading to be relevant is 0.40.

It rises to 0.55 with $n = 100$ and to 0.75 with $n = 50$.

Usually the initial factors show average correlations with many original variables.

The initial factor solution can then be rotated with the purpose of creating new factors that are associated with few original variables and for this reason are more interpretable than the initial ones.


Aim of the rotation

The factor rotation takes advantage of a property of the factor model: there exists an infinite number of sets of values for the factor loadings yielding the same covariance matrix as that of the original model. Any new set of loadings is produced by a rotation of the initial solution.

Let the initial factor solution represent an $m$-dimensional hyperplane: each original variable corresponds to a point whose coordinates are its loadings on the $m$ factors.

To obtain more interpretable factors, the rotation looks for new coordinate axes such that every point-variable is as close as possible to one of the new axes.
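The invariance property behind rotation can be illustrated with a minimal sketch, assuming NumPy, a hypothetical initial loading matrix and an arbitrarily chosen rotation angle of 45 degrees. For any orthogonal matrix $T$, the rotated loadings $\Gamma T$ imply the same common part of the covariance matrix, because $(\Gamma T)(\Gamma T)' = \Gamma T T' \Gamma' = \Gamma\Gamma'$:

```python
import numpy as np

# Hypothetical initial loadings (k = 4 variables, m = 2 factors):
# every variable loads moderately on both factors.
Gamma = np.array([[0.7,  0.5],
                  [0.6,  0.6],
                  [0.5, -0.6],
                  [0.6, -0.5]])

theta = np.deg2rad(45)                   # rotation angle, chosen for illustration
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # orthogonal rotation matrix

Gamma_rot = Gamma @ T                    # loadings on the rotated axes

implied_before = Gamma @ Gamma.T
implied_after = Gamma_rot @ Gamma_rot.T  # identical to implied_before
```

In this example the rotated loadings concentrate each variable on a single factor, while the communalities (row sums of squared loadings) are unchanged.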


Rotation of the Initial Factor Solution

Orthogonal vs. oblique rotation

• Orthogonal rotation methods: the factors remain mutually uncorrelated


Orthogonal rotation methods

• Varimax method: ensures that only one or a few observed variables have large loadings on any given factor. The aim is to maximize the variability of the columns of the initial loading matrix. The rotated factor loadings will be very close either to one (in absolute value) or to zero, which facilitates the matching of the variables to a given factor

• Quartimax method: ensures that each variable has large loadings only on one or a few factors. The objective is to maximize the variability of the rows of the initial loading matrix. Several variables may turn out to be strongly related to the same factor

• Equamax method (something in between the two previous methods)
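A varimax rotation can be sketched with a commonly used SVD-based iteration. This is an illustrative implementation under stated assumptions, not the routine of any particular statistical package: it assumes NumPy, works on raw (not row-normalized) loadings, and the function names, iteration limit and tolerance are hypothetical choices. The criterion is measured as the sum over factors of the variance of the squared loadings in each column:

```python
import numpy as np

def varimax(loadings, max_iter=500, tol=1e-8):
    """Orthogonally rotate a (k x m) loading matrix toward the varimax criterion."""
    A = np.asarray(loadings, dtype=float)
    k, m = A.shape
    T = np.eye(m)                        # start from the unrotated solution
    previous = 0.0
    for _ in range(max_iter):
        B = A @ T
        # Matrix whose orthogonal polar factor gives the next rotation
        G = A.T @ (B ** 3 - B @ np.diag((B ** 2).sum(axis=0)) / k)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt
        if s.sum() < previous * (1 + tol):   # no further meaningful improvement
            break
        previous = s.sum()
    return A @ T, T

def varimax_criterion(B):
    """Sum over factors of the variance of the squared loadings in each column."""
    return (np.asarray(B) ** 2).var(axis=0).sum()

# Hypothetical example: a clean two-factor structure hidden by a 30-degree rotation.
simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.7, 0.0],
                   [0.0, 0.9], [0.0, 0.8], [0.0, 0.7]])
angle = np.deg2rad(30)
spin = np.array([[np.cos(angle), -np.sin(angle)],
                 [np.sin(angle),  np.cos(angle)]])
initial = simple @ spin
rotated, T = varimax(initial)
```

In this constructed case the rotation brings each variable back close to a single axis, so every row of `rotated` has one large loading and one loading near zero, and the communalities are preserved.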


From factor loadings to factor scores

Let $\Gamma' = \Sigma V' L^{-1/2}$ indicate the rotated loading matrix (here the prime marks the rotated quantities, not transposition).

The matrix of factor scores is then derived as $F = X V' L^{-1/2}$.

The principal components after the rotation are rescaled in order for them to have unit variance
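The two formulas can be tried out numerically. The sketch below assumes NumPy and simulated, hypothetical data, and for brevity it applies the formulas to the unrotated eigenvectors of the first $m$ components (the rotation step is skipped, which is a simplification relative to the text). It also uses the fact that $\Sigma V = V L$, so that $\Sigma V L^{-1/2} = V L^{1/2}$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, m = 400, 5, 2

# Hypothetical data driven by two underlying sources of common variation.
base = rng.normal(size=(n, 2))
mix = np.array([[0.9, 0.8, 0.7, 0.1, 0.0],
                [0.0, 0.1, 0.2, 0.9, 0.8]])
X_raw = base @ mix + 0.5 * rng.normal(size=(n, k))
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0, ddof=1)

R = np.cov(X, rowvar=False)                    # sample counterpart of Sigma
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1][:m]          # indices of the m largest eigenvalues
lam, V = eigvals[order], eigvecs[:, order]     # V is (k x m)

L_inv_sqrt = np.diag(lam ** -0.5)
Gamma = R @ V @ L_inv_sqrt                     # loadings; equals V @ diag(sqrt(lam))
F = X @ V @ L_inv_sqrt                         # factor scores, (n x m)
```

The scores in `F` are uncorrelated with unit variance, and each loading equals the sample covariance between a standardized variable and a factor score.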


Example 1 - Supermarkets - Factor Solution

Rotated Factor Solution

Items                              F1 (setting)   F2 (position)   F3 (price)   Communality
Convenience in going to store      0.139          0.845           0.025        0.734
Product price                      0.084          0.178           0.834        0.734
Store location                     0.076          0.873           0.059        0.771
Sales promotion                    0.269          0.094           0.764        0.665
Width of aisle in the store        0.841          0.037           0.122        0.723
Store atmosphere and decoration    0.830          0.114           0.016        0.702
Store size                         0.791          0.123           0.062        0.645
% of variance                      30.378         22.085          18.595
Cumulative % of variance           30.378         52.463          71.058
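The communalities and the percentages of variance in the table above can be recomputed from the rotated loadings, which is a useful check when reading such output. This sketch assumes NumPy; small discrepancies in the last digit are expected because the published loadings are rounded to three decimals:

```python
import numpy as np

# Rotated loadings copied from the table above (columns: F1, F2, F3).
loadings = np.array([
    [0.139, 0.845, 0.025],   # convenience in going to the store
    [0.084, 0.178, 0.834],   # product price
    [0.076, 0.873, 0.059],   # store location
    [0.269, 0.094, 0.764],   # sales promotion
    [0.841, 0.037, 0.122],   # width of aisle in the store
    [0.830, 0.114, 0.016],   # store atmosphere and decoration
    [0.791, 0.123, 0.062],   # store size
])
reported_communality = np.array([0.734, 0.734, 0.771, 0.665, 0.723, 0.702, 0.645])
reported_pct = np.array([30.378, 22.085, 18.595])

# Communality of each item: row sum of squared loadings.
communality = (loadings ** 2).sum(axis=1)

# Share of total variance per factor: column sum of squared loadings over k = 7 items.
pct_variance = 100 * (loadings ** 2).sum(axis=0) / loadings.shape[0]
```

The recomputed values agree with the reported ones up to rounding, which confirms that the row sums give the communalities and the column sums give the variance explained by each factor.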


Example 1- Use of Factor Solution and Results

The three factor scores resulting from the factor analysis are then used as independent variables for a logit regression analysis.

Dependent variable: Store Preference (Binary Choice: e.g. Supermarkets in a Department store vs. Stand-alone Supermarkets)

The results can be used to elaborate management strategies: when interested in expanding supermarket outlets in department stores, the factors which most influence the probability of preferring the department stores should be the primary focus.


Example 2 - Local Products - Factor Solution

Factor   Eigenvalue λi   Difference λi−λi+1   Proportion λi/k   Cumulative proportion (λ1+...+λi)/k
1        5.484           3.520                0.323             0.323
2        1.964           0.407                0.115             0.438
3        1.557           0.300                0.092             0.530
4        1.257           0.174                0.074             0.604
5        1.083           0.285                0.064             0.668
6        0.798           0.005                0.047             0.715
7        0.793           0.112                0.047             0.762
8        0.681           ...                  0.040             0.802
...      ...             ...                  ...               ...

Five factors, explaining 66.8% of the total variance, were extracted; they represent the key consumption dimensions


Example 2 - Rotated Factor Loadings

Factor 1: Topicality

Original variables                  Factor Loading
Production methods                  0.824
Appearance of a special label       0.725
Products with chemical additives    0.677
Help to the local economy           0.650
Price                               0.575
High value                          0.562

Factor 2: Quality and Health Issues

Original variables                  Factor Loading
Quality                             0.832
Health protection                   0.703
Environmental protection            0.680
Nutrition value                     0.459


Factor 3: Appearance

Original variables                          Factor Loading
Appearance                                  0.877
Attractiveness of product's packing         0.834

Factor 4: Freshness and Taste Issues

Original variables                          Factor Loading
Freshness of the product                    0.723
Taste of the product                        0.612
Interest about the product being clean      0.570

Factor 5: Curiosity and Prestige

Original variables                          Factor Loading
Curiosity                                   0.862


Example 2 - Input for a segmentation analysis

A segmentation analysis (through cluster analysis) was then performed, with the 5 factors replacing the original 17 variables, in order to identify homogeneous groups of consumers. Two groups emerged, named according to their behaviour towards local products:

• Consumers influenced by curiosity, prestige and freshness of the product, as well as by marketing aspects (the attractiveness of the packaging and the general appearance of the product)

• Consumers interested in the topicality of the product, in product certification and in environmental protection. They pay attention to the ingredients of the product as well as to its price
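
The segmentation step can be sketched roughly as follows. The slides do not say which clustering algorithm was used, so k-means with two clusters is an assumption here, and the factor scores are invented numbers for six hypothetical respondents, not data from the study.

```python
import math

# Hypothetical factor scores (rows: respondents; columns: the 5 factors in the
# order Topicality, Quality/Health, Appearance, Freshness/Taste, Curiosity/Prestige).
# These numbers are invented for illustration only.
scores = [
    [-0.9, -0.2, 1.1, 0.8, 1.2],
    [-1.1, 0.1, 0.9, 1.0, 0.9],
    [-0.8, -0.1, 1.2, 0.7, 1.1],
    [1.0, 0.3, -1.0, -0.9, -1.1],
    [0.9, -0.1, -1.1, -0.8, -1.0],
    [0.9, 0.0, -1.1, -0.8, -1.1],
]


def kmeans(points, centroids, n_iter=20):
    """Plain k-means (Lloyd's algorithm) with caller-supplied starting centroids."""
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Start from the first and last respondents (an arbitrary, deterministic choice).
labels, centroids = kmeans(scores, centroids=[scores[0], scores[-1]])
print(labels)
```

Each cluster can then be described by the signs of its centroid on the five factors, which is how group labels such as the two above would be obtained.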

Observed data - Beach resorts

The following illustrative applications are based on: Bracalente B., Cossignani M., Mulas A. (2009), Statistica aziendale, McGraw-Hill.

On a sample of beach resorts, the prices of several beach facilities were observed.

Variable name   Description
bed_d           Bed per day
chair_d         Chair per day
umb2beds_d      Umbrella and two beds per day
bed_a           Bed (afternoon only)
bed_w           Bed per week
umb+2beds_w     Umbrella and two beds per week
paddle_h        Paddle boat per hour

FA output

Factor  Eigenvalue λ_i  Difference λ_i − λ_{i+1}  Proportion λ_i/k  Cumulative proportion (Σ_{j=1}^{i} λ_j)/k
F1      4.351           3.287                     0.622             0.622
F2      1.064           0.443                     0.152             0.774
F3      0.621           0.002                     0.089             0.862
F4      0.619           0.432                     0.088             0.951
F5      0.187           0.066                     0.027             0.978
F6      0.121           0.084                     0.017             0.995
F7      0.037           -                         0.005             1

The first two eigenvalues are greater than one. The corresponding factors explain 77.4% of the original variability.
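
The columns of this output can be recomputed from the eigenvalues alone. The sketch below assumes the eigenvalues come from a correlation matrix of k = 7 standardized variables (so they sum to k) and applies the eigenvalue-greater-than-one rule, often called the Kaiser criterion; the variable names are illustrative.

```python
# Eigenvalues copied from the FA output table for the beach resort data.
eigenvalues = [4.351, 1.064, 0.621, 0.619, 0.187, 0.121, 0.037]
k = len(eigenvalues)  # number of observed variables (7)

# Proportion of total variance attached to each factor: lambda_i / k.
proportions = [lam / k for lam in eigenvalues]

# Cumulative proportion: running sum of the proportions.
cumulative = []
running = 0.0
for p in proportions:
    running += p
    cumulative.append(running)

# Difference between consecutive eigenvalues: lambda_i - lambda_{i+1}.
differences = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]

# Eigenvalue-greater-than-one rule: retain factors with lambda_i > 1.
n_factors = sum(1 for lam in eigenvalues if lam > 1)
explained = cumulative[n_factors - 1]

print(n_factors, round(explained, 3))  # prints: 2 0.774
```

The large drop after the first eigenvalue and the near-zero difference between the third and fourth are also what a scree plot would display.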

Scree Plot

Loading matrix - Initial solution

Variable      F1       F2        Communality
bed_d         0.9588   0.1094    0.9588² + 0.1094² = 0.9313
chair_d       0.9251   0.0831    0.9251² + 0.0831² = 0.8627
umb2beds_d    0.8662   -0.3390   0.8662² + (−0.3390)² = 0.8652
bed_a         0.7799   0.1148    0.7799² + 0.1148² = 0.6214
bed_w         0.7684   0.0482    0.7684² + 0.0482² = 0.5928
umb+2beds_w   0.7492   -0.3277   0.7492² + (−0.3277)² = 0.6686
paddle_h      0.2567   0.8987    0.2567² + 0.8987² = 0.8735

For all the observed variables, the proportion of variance accounted for by the common factors (the communality) is high, ranging from 59.3% to 93.1%. The first factor is positively related to the prices of beds, umbrellas and chairs. The second factor accounts for the price of the paddle boat.
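
The communality column is the row-wise sum of squared loadings. A minimal sketch using the rounded loadings from the table is given below; because the inputs are rounded to four decimals, a result may differ from the slide in the last digit.

```python
# Initial (unrotated) loadings on F1 and F2, copied from the loading matrix.
loadings = {
    "bed_d": (0.9588, 0.1094),
    "chair_d": (0.9251, 0.0831),
    "umb2beds_d": (0.8662, -0.3390),
    "bed_a": (0.7799, 0.1148),
    "bed_w": (0.7684, 0.0482),
    "umb+2beds_w": (0.7492, -0.3277),
    "paddle_h": (0.2567, 0.8987),
}

# Communality of a variable = sum of its squared loadings on the retained factors.
communality = {name: sum(l * l for l in row) for name, row in loadings.items()}

for name, h2 in communality.items():
    print(f"{name:12s} {h2:.4f}")

lowest = min(communality, key=communality.get)
highest = max(communality, key=communality.get)
print(lowest, highest)  # prints: bed_w bed_d
```

Squaring removes the sign, so the negative loadings of the two umbrella variables contribute positively to their communalities.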

Loading matrix after rotation

Variable      F1       F2
bed_d         0.9198   0.2917
chair_d       0.8919   0.2594
umb2beds_d    0.9152   -0.1661
bed_a         0.7432   0.2626
bed_w         0.7448   0.1950
umb+2beds_w   0.7982   -0.1775
paddle_h      0.0792   0.9313

After the rotation, the first factor shows strong positive correlations with the first six original variables. The second factor is strongly associated with the last variable (the paddle boat price).
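
With two factors, an orthogonal rotation turns every variable's loading vector by the same angle and leaves its length (hence its communality) unchanged. This section does not state the angle or the rotation criterion, so the sketch below infers the angle from the initial and rotated loading tables and then checks how closely a single rotation reproduces the rotated loadings; all names are illustrative.

```python
import math

# Initial and rotated loadings (F1, F2), copied from the two loading matrices.
initial = {
    "bed_d": (0.9588, 0.1094), "chair_d": (0.9251, 0.0831),
    "umb2beds_d": (0.8662, -0.3390), "bed_a": (0.7799, 0.1148),
    "bed_w": (0.7684, 0.0482), "umb+2beds_w": (0.7492, -0.3277),
    "paddle_h": (0.2567, 0.8987),
}
rotated_table = {
    "bed_d": (0.9198, 0.2917), "chair_d": (0.8919, 0.2594),
    "umb2beds_d": (0.9152, -0.1661), "bed_a": (0.7432, 0.2626),
    "bed_w": (0.7448, 0.1950), "umb+2beds_w": (0.7982, -0.1775),
    "paddle_h": (0.0792, 0.9313),
}


def direction(v):
    """Angle of a loading vector in the (F1, F2) plane, in radians."""
    return math.atan2(v[1], v[0])


def rotate(v, t):
    """Rotate a 2-D loading vector counter-clockwise by angle t."""
    f1, f2 = v
    return (f1 * math.cos(t) - f2 * math.sin(t),
            f1 * math.sin(t) + f2 * math.cos(t))


# Inferred rotation angle: average change in direction across variables.
shifts = [direction(rotated_table[n]) - direction(initial[n]) for n in initial]
theta = sum(shifts) / len(shifts)

# Apply that single rotation to the initial loadings and compare with the table.
recomputed = {n: rotate(v, theta) for n, v in initial.items()}
max_error = max(
    abs(a - b) for n in initial for a, b in zip(recomputed[n], rotated_table[n])
)

# Communalities before and after rotation should agree up to rounding.
max_h2_gap = max(
    abs(sum(l * l for l in initial[n]) - sum(l * l for l in rotated_table[n]))
    for n in initial
)

print(round(math.degrees(theta), 1), max_error < 0.002, max_h2_gap < 0.002)
```

The rotation moves the six bed, chair and umbrella variables closer to the F1 axis and paddle_h closer to the F2 axis, which is what makes the rotated solution easier to read.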

Retailer Customers

A retailer asks a sample of customers about their monthly income and consumption expenditure (in thousands of euros) and their opinion (score from 0 to 10) of three sections of the store (meat, fish and frozen food).

Can the five original variables be summarized by a smaller number of factors?

How many factors are needed, and what percentage of the original variability do they explain?

How can the resulting factors be interpreted?

Factor  Eigenvalue λ_i  Difference λ_i − λ_{i+1}  Proportion λ_i/k  Cumulative proportion (Σ_{j=1}^{i} λ_j)/k
F1      3.0217          1.7060                    0.604             0.604
F2      1.3157          0.9266                    0.263             0.868
F3      0.3891          0.1263                    0.078             0.945
F4      0.2628          0.2521                    0.053             0.998
F5      0.0107          -                         0.002             1

Loading matrix - Initial solution

Variable      F1       F2        Communality   Unexplained
income        0.7911   -0.6016   0.9877        0.0123
consumption   0.7869   -0.6087   0.9896        0.0104
q_meat        0.7768   0.4035    0.7662        0.2338
q_fish        0.6691   0.5735    0.7766        0.2234

Loading plot - Initial solution

Loading matrix after rotation

Variable      F1       F2       Communality   Unexplained
income        0.1683   0.9795   0.9877        0.0123
consumption   0.1604   0.9818   0.9896        0.0104
q_meat        0.8433   0.2346   0.7662        0.2338
q_fish        0.8805   0.0369   0.7766        0.2234
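
One simple way to read a rotated loading matrix is to assign each variable to the factor on which it has the largest absolute loading, and to compute the unexplained share as one minus the communality. The sketch below does this for the four variables listed in the table; `q_fish` denotes the fish-section score, and the grouping rule is an illustrative convention rather than something stated on the slides.

```python
# Rotated loadings (F1, F2), copied from the loading matrix after rotation.
rotated = {
    "income": (0.1683, 0.9795),
    "consumption": (0.1604, 0.9818),
    "q_meat": (0.8433, 0.2346),
    "q_fish": (0.8805, 0.0369),
}
factor_names = ("F1", "F2")

groups = {f: [] for f in factor_names}
unexplained = {}
for name, row in rotated.items():
    # Dominant factor: the one with the largest absolute loading.
    best = max(range(len(row)), key=lambda j: abs(row[j]))
    groups[factor_names[best]].append(name)
    # Unexplained variance = 1 - communality (sum of squared loadings).
    unexplained[name] = 1 - sum(l * l for l in row)

print(groups)
for name, u in unexplained.items():
    print(f"{name:12s} {u:.4f}")
```

Under this reading, F1 groups the opinion scores on the store sections and F2 groups income and consumption expenditure.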

Loading plot after rotation

References

Bartholomew D.J. (1987), Latent Variable Models and Factor Analysis, Charles Griffin & Company Ltd., London.

Bracalente B., Cossignani M., Mulas A. (2009), Statistica aziendale, McGraw-Hill.

Tryfos P. (1998), Methods for Business Analysis and Forecasting: Text and Cases, John Wiley & Sons.
