GRAPHICAL METHOD OF FACTORING THE CORRELATION MATRIX
BY L. L. THURSTONE
THE PSYCHOMETRIC LABORATORY, THE UNIVERSITY OF CHICAGO
Communicated April 14, 1944
Multiple factor analysis starts with a square matrix $R$ of order $n \times n$ with cell entries $r_{jk}$ which are experimentally determined. The matrix $R$ shows the correlation of each variable with every other variable in the test battery. The correlation matrix is symmetric since $r_{jk} = r_{kj}$. Some students write unity in the diagonal cells and this is legitimate for some problems. More often the diagonals contain the communalities $h_k^2$ which are so chosen as to be consistent with the rank of the matrix as determined by the side entries. The factoring methods to be described can be applied for any set of diagonal entries.1

The first object of multiple factor analysis is to find a factor matrix $F$ of order $n \times r$ where $r$ is the rank of the correlation matrix $R_{jk}$. The factor matrix $F$ must be written so as to satisfy the fundamental relation $FF' = R$. The correlation matrix $R$ can be factored into a factor matrix $F$ in many different ways and $F$ is not here unique. The current geometrical interpretation of $R$ is that its entries $r_{jk}$ show the scalar products of pairs of test vectors $J$ and $K$ which are not necessarily of unit length unless this restriction is imposed by unit diagonals $r_{jj}$. Since the correlation coefficients are scalar products, it follows that no reference frame is implied by $R$. Each row $j$ of the factor matrix $F$ shows the entries $a_{jm}$ which are the projections of the test vector $J$ on a set of $r$ orthogonal reference axes. The lack of uniqueness of $F$ is geometrically interpreted in the free choice of an orthogonal reference frame for the test configuration of $n$ test vectors $J$. The reference frame is explicitly defined in $F$ relative to the test vectors but it is not defined in $R$.

One factorial solution for $F$ is to locate the first reference axis so as to maximize the sum of squares of test vector projections. The second reference axis can be located by the same criterion applied to first factor correlation residuals, and so on until a set of $r$ orthogonal axes have been determined which correspond to rank $r$ of the correlation matrix. In the writer's original formulation of this problem, it was called the principal axes solution.2 In the first paper on this problem a computational method was described for finding the principal axes solution $F$. It involved first factoring the correlation matrix into a factor matrix $F$ with any arbitrary orthogonal reference frame and a subsequent orthogonal transformation representing a rigid rotation to the principal axes. That solution involved the roots. The next year Hotelling described an iterative method of factoring the correlation matrix directly into the principal axes solution which he renamed principal components.3 Since the iterative methods, as well as all other available methods, of obtaining the principal axes solution are very laborious, it is of interest to students of multiple factor analysis to find shorter methods.
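The statement that $F$ is not unique, because $R$ implies no reference frame, can be checked numerically. The following is a minimal sketch under stated assumptions: the 4 x 2 factor matrix `F`, the rotation angle `theta`, and the helper names are hypothetical illustrations and are not taken from the paper.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

# Hypothetical factor matrix F (4 tests, 2 factors); not data from the paper.
F = [[0.8, 0.1],
     [0.7, 0.3],
     [0.2, 0.6],
     [0.1, 0.9]]

# A rigid rotation of the orthogonal reference frame by an arbitrary angle.
theta = 0.7
T = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

G = matmul(F, T)                    # same test vectors, different reference axes
R_from_F = matmul(F, transpose(F))  # R = F F'
R_from_G = matmul(G, transpose(G))  # R = G G' as well

# Largest discrepancy between the two reproduced correlation matrices.
max_diff = max(abs(x - y)
               for row_f, row_g in zip(R_from_F, R_from_G)
               for x, y in zip(row_f, row_g))
print(max_diff < 1e-12)
```

Both factor matrices reproduce the same scalar products, which is why an extra criterion such as the principal axes is needed to single out one $F$.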
Consider the first column of $F$ for the principal axes solution. Its elements are $a_{j1}$. It is desired to find the values of $a_{j1}$ so that the first factor shall account for a maximum of the variance in the correlation matrix $R$. Then

$$r_{jk} - a_{j1} a_{k1} = \rho_{jk}, \tag{1}$$

where $a_{j1}$ and $a_{k1}$ are the first factor saturations of tests $j$ and $k$, and $\rho_{jk}$ is the residual to be minimized. Squaring and summing for all correlation coefficients, we have

$$\sum_j \sum_k r_{jk}^2 - 2 \sum_j \sum_k a_{j1} a_{k1} r_{jk} + \sum_j \sum_k a_{j1}^2 a_{k1}^2 = \sum_j \sum_k \rho_{jk}^2 = z, \tag{2}$$

where the sum of the squares of the correlation residuals $\rho_{jk}$ may be denoted $z$. The partial derivative with respect to $a_{j1}$ is then

$$\frac{\partial z}{\partial a_{j1}} = -4 \sum_k r_{jk} a_{k1} + 4 a_{j1} \sum_k a_{k1}^2, \tag{3}$$

from which, on setting the derivative equal to zero, we get

$$a_{j1} \sum_k a_{k1}^2 = \sum_k r_{jk} a_{k1}. \tag{4}$$

Let the sum $\sum a^2 = t$. The summation $\sum ra$ can be written as a matrix product $Ra$. Then

$$Ra = at, \tag{5}$$

where $R$ is the given square matrix, $a$ is a column vector, and $t$ is a scalar, namely, the largest latent root. The iterative process consists in starting with a trial vector $u$ so that

$$Ru = v \sum u^2, \tag{6}$$

where $(v \sum u^2)$, or numbers proportional to these, constitute the trial vector $u$ for the next iteration. The process continues until $v \sum u^2$ becomes proportional to the trial values $u$, at which time the values of $v$ $(= u)$ are the desired values of $a_{j1}$ and the first column of $F$ is then determined. The number of iterations can be considerably reduced by starting with $R^2$, or $R^4$, or even $R^8$. While that looks effective in a textbook example, it is not so inviting a procedure when we are confronted with a correlation matrix of order $70 \times 70$.
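The iteration of equations (5) and (6) can be sketched in a few lines. This is an illustration only, under stated assumptions: the matrix is built from a hypothetical 4 x 2 factor matrix `F_hyp` (not the paper's data), the trial vector is rescaled to unit length at each step, which is one choice of "numbers proportional to these", and the function names are invented for the example.

```python
import math

def mat_vec(R, u):
    """Scalar product of every row of R with the column vector u."""
    return [sum(r * x for r, x in zip(row, u)) for row in R]

def first_principal_factor(R, n_iter=200):
    """Iterate u <- Ru (rescaled) and return loadings a and latent root t."""
    # Start from the column sums of R, as the paper's example does.
    u = [sum(col) for col in zip(*R)]
    for _ in range(n_iter):
        w = mat_vec(R, u)
        norm = math.sqrt(sum(x * x for x in w))
        u = [x / norm for x in w]          # keep only the direction
    # With u of unit length, t = u'Ru estimates the largest latent root.
    t = sum(x * y for x, y in zip(u, mat_vec(R, u)))
    # Scale so that the sum of squared loadings equals t, as in equation (5).
    a = [math.sqrt(t) * x for x in u]
    return a, t

# Hypothetical factor matrix; R = F F' has communalities in the diagonal.
F_hyp = [[0.8, 0.1], [0.7, 0.3], [0.2, 0.6], [0.1, 0.9]]
R = [[sum(x * y for x, y in zip(fj, fk)) for fk in F_hyp] for fj in F_hyp]

a, t = first_principal_factor(R)
Ra = mat_vec(R, a)
print(all(abs(lhs - aj * t) < 1e-9 for lhs, aj in zip(Ra, a)))  # Ra = at
```

At the solution the sum of squares of `a` equals `t`, which is the sense in which the latent root is the variance accounted for by the factor.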
Consider the matrix multiplication $Ru$ in equation (6) and, in particular, the scalar product of the row $j$ of $R_{jk}$ and the column vector $u$. This product is

$$R_j u = v_j \sum u^2, \tag{7}$$

and this can be written briefly as

$$\frac{R_j u}{\sum u^2} = v_j, \tag{8}$$

but this is also the well-known equation for the slope of the regression line $r$ on $u$ through the origin for a plot of $r$ against $u$. If we plot columns of $r_{jk}$ against the trial values $u_k$, we get a plot with $n$ points. The best fitting straight line through the origin (regression $r$ on $u$) can be drawn by inspection and the slope of the line is easily read graphically with sufficient accuracy for the first few iterations. That slope is the value $v_j$. To plot, say, 30 points and to draw by inspection the best fitting straight line through the origin takes less time than to add cumulatively 30 products on a calculating machine.
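The identity between equation (8) and a regression slope can be illustrated numerically. The sketch below assumes hypothetical values for one column of correlations `r_col` and for the trial vector `u`; the function names are illustrative only. It computes the slope through the origin by the formula and confirms that nearby slopes give a larger sum of squared deviations.

```python
def slope_through_origin(r, u):
    """Least-squares slope of r on u for a line forced through the origin."""
    return sum(x * y for x, y in zip(r, u)) / sum(x * x for x in u)

def sum_squared_error(r, u, b):
    """Sum of squared vertical deviations of the points from the line r = b*u."""
    return sum((y - b * x) ** 2 for y, x in zip(r, u))

# Hypothetical column of correlations and trial vector (not from the paper).
r_col = [0.65, 0.59, 0.22, 0.17]
u = [1.0, 0.9, 0.4, 0.3]

v_j = slope_through_origin(r_col, u)      # the value read from the graph
best = sum_squared_error(r_col, u, v_j)

# Any other slope fits the n plotted points worse.
print(best < sum_squared_error(r_col, u, v_j + 0.05))
print(best < sum_squared_error(r_col, u, v_j - 0.05))
```

A line drawn by eye approximates this slope, which is why the graph can stand in for the cumulative sum of products during the early iterations.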
By equations (5) and (6) it is evident that at the solution the values $v_j = u_j = a_j$. When a set of values of $v$ has been determined graphically, we plot $v$ against $u$ and find the slope $m$. In the first few graphical iterations, this slope will not be unity because the trial values $u$ are probably too large or too small. If the true values are better represented by $ku$, in which $k$ is a stretching factor, then a plot of $r$ against $ku$ will give a slope of $v/k$ instead of $v$. When these graphs have been drawn, the slope of $v/k$ against $ku$ should be unity. But the obtained slope of $v$ against $u$ is $m$ for the values actually used. Hence the observed slope $m = k^2$. Having found the slope $m$ of the plot of $v$ against $u$, we find $k = \sqrt{m}$. Then if we should take a new set of trial numbers $x_j = k u_j$, the iteration would give, instead of equation (8),

$$\frac{Rx}{\sum x^2} = y, \tag{9}$$

where $y = v/k$. The slope of $y$ against $x$ should now be unity and the values of $y$ should be used as the trial vector for the next graphical iteration. The determination of $m$ and $k$ can be done with a slide rule and the new trial values $y = v/k$ can also be determined by the slide rule. The slope $m$ can be found either by inspection or by simple summing with the method of averages in which $m = \sum v / \sum u$ for like signed pairs of $v$ and $u$.

The graphical method here described for the principal axes solution has also been adapted for a centroid method of factoring the correlation matrix by which the computations for a factor matrix can be done by a computer without excessive labor even for a large matrix and without the use of tabulating machine equipment. That procedure is considerably faster and it will be described in a subsequent paper.
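The rescaling by the stretching factor $k$ described above for the principal axes iteration can be sketched as follows. This is a hedged illustration: the slopes $v_j$ are computed exactly from equation (8) rather than read from a graph, the matrix comes from a hypothetical factor matrix `F_hyp`, and `graphical_step` is an invented name.

```python
import math

def graphical_step(R, u):
    """One iteration: slopes v by equation (8), then y = v / k with k = sqrt(m)."""
    s = sum(x * x for x in u)
    v = [sum(r * x for r, x in zip(row, u)) / s for row in R]
    # Method of averages: absolute sums over like signed pairs of v and u.
    like = [(abs(vj), abs(uj)) for vj, uj in zip(v, u) if vj * uj > 0]
    m = sum(p[0] for p in like) / sum(p[1] for p in like)
    k = math.sqrt(m)                       # stretching factor, since m = k^2
    y = [vj / k for vj in v]               # trial vector for the next iteration
    return y, m

# Hypothetical factor matrix; R = F F' has communalities in the diagonal.
F_hyp = [[0.8, 0.1], [0.7, 0.3], [0.2, 0.6], [0.1, 0.9]]
R = [[sum(x * y for x, y in zip(fj, fk)) for fk in F_hyp] for fj in F_hyp]

u = [sum(col) for col in zip(*R)]          # deliberately too large a start
slopes = []
for _ in range(60):
    u, m = graphical_step(R, u)
    slopes.append(m)

print(slopes[0] < 1.0, abs(slopes[-1] - 1.0) < 1e-9)
```

The first slope is far from unity because the column sums overstate the loadings; once the trial values have the right scale the slope of $v$ against $u$ settles at unity.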
A numerical example is here given. In table 1 we have a correlation matrix $R$ of order $10 \times 10$ and rank 3. The corresponding factor matrix $F$ for the principal axes is shown in the same table. It is the solution to be found from $R$. The latent roots of $R$ are 2.85, 1.47 and 0.63. These are the sums of squares of columns of $F$. In table 2 the initial trial vector $u$ for the major principal factor is taken roughly proportional to the sums of columns of $R$. Column 1 of $R$ is plotted against column $u$ and the slope is estimated to be approximately +0.05 and it is recorded in column $v_1$. Ten such plots give the values in column $v_1$. The slope is read directly from each graph by noting the ordinate of a straight line fit at $u_1 = 1$. Next obtain the two column sums $\sum u_1$ and $\sum v_1$. The ratio $\sum v / \sum u = m$ and $k = \sqrt{m}$. This summation can be absolute sums for like signed pairs of $u$ and $v$. Then compute $y_1 = v_1/k$ with a slide rule. These are also the trial values $u_2$ for the next trial. Proceed likewise for three trials which give the desired values of $a_{j1}$ to two decimals.

The first factor residuals are then computed. These are $\rho_{jk} = r_{jk} - a_{j1} a_{k1}$. Choose as a starting vector that column of the first factor residuals which has the largest absolute sum, ignoring diagonals. This is column 1. Its values are recorded in column $u_1$ for factor two. The procedure is now the same as before. The third iteration was taken on a calculating machine and it gave the second factor loadings $a_{j2}$, the second column of $F$, to two decimals. The last iterations for each factor can be done on a calculating machine to obtain greater accuracy while the first few iterations can be done graphically to save time and labor. The computation for the third column of $F$ was done in the same manner.
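The sequence of steps in the example (first factor, residuals, starting column, second factor) can be sketched on a small case. The data here are an assumption: a hypothetical 4 x 2 factor matrix `F_hyp` replaces the 10 x 10 matrix of table 1, the iterations are carried out arithmetically rather than graphically, and all function names are illustrative.

```python
import math

def mat_vec(M, u):
    return [sum(r * x for r, x in zip(row, u)) for row in M]

def extract_factor(M, u, n_iter=200):
    """Iterate from trial vector u; return loadings a and latent root t."""
    for _ in range(n_iter):
        w = mat_vec(M, u)
        norm = math.sqrt(sum(x * x for x in w))
        u = [x / norm for x in w]
    t = sum(x * y for x, y in zip(u, mat_vec(M, u)))
    return [math.sqrt(t) * x for x in u], t

def residuals(M, a):
    """Residuals r_jk - a_j * a_k after removing one factor."""
    n = len(M)
    return [[M[j][k] - a[j] * a[k] for k in range(n)] for j in range(n)]

def largest_column(M):
    """Column with the largest absolute sum, ignoring the diagonal."""
    n = len(M)
    sums = [sum(abs(M[j][k]) for j in range(n) if j != k) for k in range(n)]
    k_best = sums.index(max(sums))
    return [M[j][k_best] for j in range(n)]

# Hypothetical rank-2 example (not the matrix of table 1).
F_hyp = [[0.8, 0.1], [0.7, 0.3], [0.2, 0.6], [0.1, 0.9]]
R = [[sum(x * y for x, y in zip(fj, fk)) for fk in F_hyp] for fj in F_hyp]

a1, t1 = extract_factor(R, [sum(col) for col in zip(*R)])  # column sums start
P = residuals(R, a1)                                       # first factor residuals
a2, t2 = extract_factor(P, largest_column(P))              # second factor

n = len(R)
R_hat = [[a1[j] * a1[k] + a2[j] * a2[k] for k in range(n)] for j in range(n)]
max_err = max(abs(R[j][k] - R_hat[j][k]) for j in range(n) for k in range(n))
print(max_err < 1e-9)
```

In this sketch the two latent roots are the sums of squares of the two extracted columns, and together they reproduce the matrix, mirroring the relation the paper states for the columns of $F$.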
Occasionally, when two or three latent roots are nearly the same, the iterative process will be found to oscillate or the convergence will be slow. This is an indication that the test configuration has nearly equal thicknesses in these dimensions, so that the configuration is nearly circular or spherical in these dimensions. If the purpose is to extract the maximum variance from the correlation matrix, it does not matter where we place a set of orthogonal axes in those dimensions which are represented by nearly equal latent roots. In such a situation it is desirable to reduce the obtained vector $v$ to a unit vector in the system without waiting for complete convergence. That can be done with a stretching factor $p$ on $v$ so that $pv = a$, where $a$ are the desired factor loadings. The value of $p$ can be found from the relation
1/p
= V2YBy
this device the computer need not be unduly delayed by slow convergence in obtaining a factorially useful solution.
[Table 1: correlation matrix $R$ of order $10 \times 10$ and the corresponding principal axes factor matrix $F$. Table 2: trial vectors and slopes for the successive iterations on each factor. The numerical entries are not recoverable.]

The graphical method here described for estimating inner products can be applied in a variety of computational problems.
1 The author wishes to express his indebtedness to the Carnegie Corporation of New York for a research grant in support of our development of multiple factor analysis and its application to the study of primary mental abilities.
2 L. L. Thurstone, "Theory of Multiple Factors," 1932, pp. 17-27, Edwards Brothers, Ann Arbor, Mich.
3 Harold Hotelling, "Analysis of a complex of statistical variables into principal components," J. Educ. Psych., 24, 417-441, 498-520 (1933).
THE DEVELOPMENT OF NORMAL AND HOMOZYGOUS BRACHY (T/T) MOUSE EMBRYOS IN THE EXTRAEMBRYONIC COELOM OF THE CHICK†
BY S. GLUECKSOHN-SCHOENHEIMER*
DEPARTMENT OF ZOOLOGY, COLUMBIA UNIVERSITY
Communicated April 15, 1944
The study of the causal morphology of mammalian embryos, in spite of the great interest attached to it, has not progressed very far because of the technical difficulties with which any experimental approach to the problem has met. Waddington and Waterman1 grew rabbit embryos in vitro for a limited time and Nicholas and Rudnick2 devised a method for raising rat embryos in a culture medium. In spite of these attempts, it was not possible to operate on mammalian embryos in early stages and let them continue their development inside the uterus, nor could such embryos be raised and develop normally in a suitable extra-uterine medium for any length of time. Methods which would accomplish this would be of importance not only for the study of normal causal morphology of mammalian embryos, but also for an experimental study of the embryogeny of certain hereditary abnormalities. Searching for such a method, a new procedure for raising entire mouse embryos outside the uterus was developed and described in detail (Gluecksohn-Schoenheimer).3 It consisted in removing the embryos from the uterus of the mother and transplanting them into the extra-embryonic coelom of the chick embryo where they remained and developed for one or several days. This method has been applied in the experiments reported here to the study of embryos homozygous for the Brachyury mutation TT.

As described by Chesley4 the homozygous mutants (T/T) show extreme morphological abnormalities from the age of about 8 days on and die at about 10 days. The posterior body region, including posterior limb buds, is missing completely and extensive abnormalities are found in