
Chapter 5: A Theoretical Overview of GDA Methods

5.4 The CA Biplot

5.4.1 CA Biplot Variants

The various types of CA biplot are discussed in detail in Chapter 7 of Understanding Biplots (Gower et al., 2011:289). The variants of the independence model that can be approximated in the CA biplot are explained below.

5.4.1.1 Pearson’s chi-squared:

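By weighting the deviations from the independence model, the contributions to Pearson’s chi-squared statistic can be approximated. Following from equation 5-17, the quantity to be minimised is:

$$\left\| \mathbf{R}^{-1/2}\left(\mathbf{X} - \frac{\mathbf{R}\mathbf{1}\mathbf{1}'\mathbf{C}}{n}\right)\mathbf{C}^{-1/2} - \hat{\mathbf{X}} \right\|^2, \tag{5-26}$$

where $\hat{\mathbf{X}}$ is the $r$-dimensional approximation of $\mathbf{R}^{-1/2}\left(\mathbf{X} - \mathbf{R}\mathbf{1}\mathbf{1}'\mathbf{C}/n\right)\mathbf{C}^{-1/2}$ (Gower et al., 2011:291). Note that, with $\mathbf{U}\boldsymbol{\Sigma}\mathbf{V}'$ denoting the SVD of the weighted deviations, the $r$-dimensional approximation $\hat{\mathbf{X}}$ is written as $\mathbf{U}\boldsymbol{\Sigma}\mathbf{J}\mathbf{V}'$, where $\mathbf{J}$ selects the first $r$ singular values. The $r$-dimensional approximation of the $\chi^2$ contributions requires plotting the first $r$ columns of $\mathbf{U}\boldsymbol{\Sigma}^{1/2}$ and $\mathbf{V}\boldsymbol{\Sigma}^{1/2}$. Such biplots provide a display from which it is possible to identify the elements of $\hat{\mathbf{X}}$ which contribute most to the total inertia, as well as which elements diverge from the assumption of independence.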
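The construction above can be sketched numerically. The following is a minimal illustration, assuming NumPy and a small hypothetical 3x3 table; the names `X`, `row_tot`, `col_tot`, `S` and `r` are introduced here for illustration only and do not come from the source.

```python
import numpy as np

# Hypothetical 3x3 contingency table (illustrative counts, not from the source).
X = np.array([[20.0, 10.0, 5.0],
              [10.0, 15.0, 10.0],
              [5.0, 10.0, 25.0]])
n = X.sum()
row_tot = X.sum(axis=1)                 # diagonal of R
col_tot = X.sum(axis=0)                 # diagonal of C
E = np.outer(row_tot, col_tot) / n      # expected counts, R 1 1' C / n

# Weighted deviations R^{-1/2} (X - E) C^{-1/2} and their SVD.
S = (X - E) / np.sqrt(np.outer(row_tot, col_tot))
U, sing, Vt = np.linalg.svd(S, full_matrices=False)

r = 2                                            # dimensions kept in the display
row_coords = U[:, :r] * np.sqrt(sing[:r])        # first r columns of U Sigma^{1/2}
col_coords = Vt.T[:, :r] * np.sqrt(sing[:r])     # first r columns of V Sigma^{1/2}
S_hat = row_coords @ col_coords.T                # rank-r approximation U Sigma J V'

# The squared entries of S sum to the Pearson chi-squared statistic divided by n.
chi2 = ((X - E) ** 2 / E).sum()
print(np.isclose((S ** 2).sum(), chi2 / n))
```

For a 3x3 table the weighted deviations have rank at most 2, so `r = 2` reproduces `S` exactly; with larger tables `S_hat` would be a genuine approximation.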

5.4.1.2 Approximating the deviations from independence:

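This method seeks to approximate the deviations from independence, $\mathbf{X} - \mathbf{E}$, which are weighted by the inverse square roots of the row and column totals. This translates to the minimisation of:

$$\left\| \mathbf{R}^{-1/2}\left\{\left(\mathbf{X} - \frac{\mathbf{R}\mathbf{1}\mathbf{1}'\mathbf{C}}{n}\right) - \hat{\mathbf{X}}\right\}\mathbf{C}^{-1/2} \right\|^2 \tag{5-27}$$

(Gower et al., 2011:291). Equation 5-27 is minimised by $\mathbf{R}^{-1/2}\hat{\mathbf{X}}\mathbf{C}^{-1/2} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{J}\mathbf{V}'$. Solving for the $r$-dimensional approximation gives $\hat{\mathbf{X}} = \mathbf{R}^{1/2}\mathbf{U}\boldsymbol{\Sigma}\mathbf{J}\mathbf{V}'\mathbf{C}^{1/2}$. Plotting the first $r$ columns of $\mathbf{R}^{1/2}\mathbf{U}\boldsymbol{\Sigma}^{1/2}$ and $\mathbf{C}^{1/2}\mathbf{V}\boldsymbol{\Sigma}^{1/2}$ yields a biplot which represents the deviations from independence.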
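As a rough check of this variant, the sketch below (assuming NumPy and a hypothetical 3x3 table, with all variable names chosen here for illustration) forms the two coordinate sets and confirms that their inner product recovers the deviations from independence.

```python
import numpy as np

# Hypothetical 3x3 contingency table (illustrative counts, not from the source).
X = np.array([[20.0, 10.0, 5.0],
              [10.0, 15.0, 10.0],
              [5.0, 10.0, 25.0]])
n = X.sum()
row_tot = X.sum(axis=1)                 # diagonal of R
col_tot = X.sum(axis=0)                 # diagonal of C
E = np.outer(row_tot, col_tot) / n      # expected counts under independence

# SVD of the weighted deviations R^{-1/2} (X - E) C^{-1/2}.
S = (X - E) / np.sqrt(np.outer(row_tot, col_tot))
U, sing, Vt = np.linalg.svd(S, full_matrices=False)

r = 2
# First r columns of R^{1/2} U Sigma^{1/2} and C^{1/2} V Sigma^{1/2}.
row_coords = np.sqrt(row_tot)[:, None] * U[:, :r] * np.sqrt(sing[:r])
col_coords = np.sqrt(col_tot)[:, None] * Vt.T[:, :r] * np.sqrt(sing[:r])

# Inner products of the plotted points approximate X - E.
X_hat = row_coords @ col_coords.T
print(np.allclose(X_hat, X - E))
```

Because the weighted deviations of a 3x3 table have rank at most 2, the two-dimensional inner products match `X - E` up to rounding in this example.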

5.4.1.3 Approximation to the contingency ratio:

The contingency ratio, as defined in Correspondence Analysis in Practice (Greenacre, 2007:101), is the ratio of the observed proportions of a contingency table to the expected proportions. A CA biplot which approximates the contingency ratios is obtained from the minimisation of:

$$\left\| \mathbf{R}^{1/2}\left\{\mathbf{R}^{-1}\left(\mathbf{X} - \frac{\mathbf{R}\mathbf{1}\mathbf{1}'\mathbf{C}}{n}\right)\mathbf{C}^{-1} - \hat{\mathbf{X}}\right\}\mathbf{C}^{1/2} \right\|^2 \tag{5-28}$$

(Gower et al., 2011:292). In equation 5-28 the matrix $\hat{\mathbf{X}}$ approximates $\mathbf{R}^{-1}\left(\mathbf{X} - \mathbf{R}\mathbf{1}\mathbf{1}'\mathbf{C}/n\right)\mathbf{C}^{-1} = \mathbf{R}^{-1/2}\mathbf{U}\boldsymbol{\Sigma}\mathbf{V}'\mathbf{C}^{-1/2}$. Note that a contingency ratio of unity corresponds to independence, and the closer the value is to unity the closer the data are to independence (Greenacre, 2007:101). Plotting the first $r$ columns of $\mathbf{R}^{-1/2}\mathbf{U}\boldsymbol{\Sigma}^{1/2}$ and $\mathbf{C}^{-1/2}\mathbf{V}\boldsymbol{\Sigma}^{1/2}$ yields a biplot which depicts the deviation from the unity contingency ratio. Note that the biplot does not display the deviation from unity directly but rather the deviation divided by $n$. This scaling has no material influence on the biplot but must be accounted for when calibrating the axes.
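The division by $n$ mentioned above can be made concrete with a short sketch. It assumes NumPy and a hypothetical 3x3 table; `ratio` and the other names are illustrative and not taken from the source.

```python
import numpy as np

# Hypothetical 3x3 contingency table (illustrative counts, not from the source).
X = np.array([[20.0, 10.0, 5.0],
              [10.0, 15.0, 10.0],
              [5.0, 10.0, 25.0]])
n = X.sum()
row_tot = X.sum(axis=1)                 # diagonal of R
col_tot = X.sum(axis=0)                 # diagonal of C
E = np.outer(row_tot, col_tot) / n

# Contingency ratios: observed proportions over expected proportions.
ratio = X / E

# SVD of the weighted deviations R^{-1/2} (X - E) C^{-1/2}.
S = (X - E) / np.sqrt(np.outer(row_tot, col_tot))
U, sing, Vt = np.linalg.svd(S, full_matrices=False)

r = 2
# First r columns of R^{-1/2} U Sigma^{1/2} and C^{-1/2} V Sigma^{1/2}.
row_coords = U[:, :r] * np.sqrt(sing[:r]) / np.sqrt(row_tot)[:, None]
col_coords = Vt.T[:, :r] * np.sqrt(sing[:r]) / np.sqrt(col_tot)[:, None]

# Inner products give (ratio - 1) / n, so rescaling by n recovers the deviation from unity.
X_hat = row_coords @ col_coords.T
print(np.allclose(n * X_hat, ratio - 1.0))
```

This is why calibrated axes in such a display would need their markers multiplied by $n$ (and shifted by one) to read off contingency ratios directly.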

5.4.1.4 Approximation to the Chi-squared distance:

The independence assumption is evaluated by the $\chi^2$ statistic. For any two-way table, the assumption can be examined via the $\chi^2$ distances between the rows or between the columns of the table. There are therefore two possible solutions when approximating the chi-squared distances in a contingency table.

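The chi-squared distances between the $i$th and $i'$th rows of $\mathbf{X}$ are defined as:

$$d_{ii'}^2 = \left(\frac{\mathbf{x}_i}{x_{i.}} - \frac{\mathbf{x}_{i'}}{x_{i'.}}\right)' \mathbf{C}^{-1} \left(\frac{\mathbf{x}_i}{x_{i.}} - \frac{\mathbf{x}_{i'}}{x_{i'.}}\right) \tag{5-29}$$

(Gower et al., 2011:293). Equation 5-29 has the form of a weighted Euclidean distance between the rows of $\mathbf{X}$. The chi-squared distances can therefore be calculated as the ordinary Euclidean distances between the rows of $\mathbf{R}^{-1}\mathbf{X}\mathbf{C}^{-1/2}$. The same distances are obtained from $\mathbf{R}^{-1}\mathbf{X}\mathbf{C}^{-1/2} - \mathbf{1}\mathbf{1}'\mathbf{C}^{1/2}/n$, where $\mathbf{1}\mathbf{1}'\mathbf{C}^{1/2}/n$ is the translation term. The translation term has no effect on the chi-squared distances between the rows of $\mathbf{X}$ but centres the points at the centroid in the biplot. The translated form can then be written as:

$$\mathbf{R}^{-1}\left(\mathbf{X} - \frac{\mathbf{R}\mathbf{1}\mathbf{1}'\mathbf{C}}{n}\right)\mathbf{C}^{-1/2} = \mathbf{R}^{-1}(\mathbf{X} - \mathbf{E})\mathbf{C}^{-1/2} \tag{5-30}$$

(Gower et al., 2011:294). The biplot of the chi-squared distances between the rows of matrix $\mathbf{X}$ is then based upon the minimisation of:

$$\left\| \mathbf{R}^{1/2}\left\{\mathbf{R}^{-1}\left(\mathbf{X} - \frac{\mathbf{R}\mathbf{1}\mathbf{1}'\mathbf{C}}{n}\right)\mathbf{C}^{-1/2} - \hat{\mathbf{X}}\right\} \right\|^2, \tag{5-31}$$

which simplifies to:

$$\left\| \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}' - \mathbf{R}^{1/2}\hat{\mathbf{X}} \right\|^2. \tag{5-32}$$

The approximation, $\hat{\mathbf{X}}$, is then obtained from the inner product of $\mathbf{R}^{-1/2}\mathbf{U}\boldsymbol{\Sigma}$ and $\mathbf{V}$, and the $r$-dimensional biplot is obtained by plotting the first $r$ columns of $\mathbf{R}^{-1/2}\mathbf{U}\boldsymbol{\Sigma}$ for the row points and $\mathbf{V}$ for the column points (Gower et al., 2011:294). Note that $\mathbf{V}$ is an orthogonal matrix and has no effect on the distances between the row coordinates.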
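The claim that Euclidean distances between the plotted row points equal the chi-squared distances of equation 5-29 can be checked with the following sketch. It assumes NumPy and a hypothetical 3x3 table; the variable names are illustrative, and all available dimensions are kept so that the match is exact.

```python
import numpy as np

# Hypothetical 3x3 contingency table (illustrative counts, not from the source).
X = np.array([[20.0, 10.0, 5.0],
              [10.0, 15.0, 10.0],
              [5.0, 10.0, 25.0]])
n = X.sum()
row_tot = X.sum(axis=1)                 # diagonal of R
col_tot = X.sum(axis=0)                 # diagonal of C
E = np.outer(row_tot, col_tot) / n

# Squared chi-squared distances between rows, computed directly from equation 5-29.
profiles = X / row_tot[:, None]                        # row profiles R^{-1} X
diff = profiles[:, None, :] - profiles[None, :, :]     # all pairwise profile differences
d2_direct = (diff ** 2 / col_tot).sum(axis=2)          # weighted by C^{-1}

# Row points R^{-1/2} U Sigma from the SVD of the weighted deviations.
S = (X - E) / np.sqrt(np.outer(row_tot, col_tot))
U, sing, Vt = np.linalg.svd(S, full_matrices=False)
F = U * sing / np.sqrt(row_tot)[:, None]

# Ordinary squared Euclidean distances between the plotted row points.
d2_plot = ((F[:, None, :] - F[None, :, :]) ** 2).sum(axis=2)
print(np.allclose(d2_direct, d2_plot))
```

Keeping only the first $r$ columns of `F` would give the best $r$-dimensional approximation of these distances instead of the exact values.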

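Similar arguments follow for the chi-squared distances between the columns of $\mathbf{X}$. The chi-squared distances between the $j$th and $j'$th columns of $\mathbf{X}$ are defined as:

$$d_{jj'}^2 = \left(\frac{\mathbf{x}_j}{x_{.j}} - \frac{\mathbf{x}_{j'}}{x_{.j'}}\right)' \mathbf{R}^{-1} \left(\frac{\mathbf{x}_j}{x_{.j}} - \frac{\mathbf{x}_{j'}}{x_{.j'}}\right). \tag{5-33}$$

This leads to a biplot which attempts to minimise:

$$\left\| \left\{\mathbf{R}^{-1/2}\left(\mathbf{X} - \frac{\mathbf{R}\mathbf{1}\mathbf{1}'\mathbf{C}}{n}\right)\mathbf{C}^{-1} - \hat{\mathbf{X}}\right\}\mathbf{C}^{1/2} \right\|^2, \tag{5-34}$$

which simplifies to:

$$\left\| \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}' - \hat{\mathbf{X}}\mathbf{C}^{1/2} \right\|^2. \tag{5-35}$$

The approximation, $\hat{\mathbf{X}}$, is then obtained from the inner product of $\mathbf{U}$ and $\mathbf{C}^{-1/2}\mathbf{V}\boldsymbol{\Sigma}$, and the $r$-dimensional biplot is created by plotting the first $r$ columns of $\mathbf{C}^{-1/2}\mathbf{V}\boldsymbol{\Sigma}$ for the column points and $\mathbf{U}$ for the row points. Similarly, $\mathbf{U}$ is also an orthogonal matrix which has no effect on the distances between the points representing the columns of $\hat{\mathbf{X}}$.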

It is common practice to plot both the row and column chi-squared distances in a single plot, although the relationship between these two sets of points has no meaningful interpretation. This plot is achieved simply by plotting $\mathbf{R}^{-1/2}\mathbf{U}\boldsymbol{\Sigma}$ and $\mathbf{C}^{-1/2}\mathbf{V}\boldsymbol{\Sigma}$ simultaneously as two sets of points.
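The column case and the joint display can be sketched in the same way. The snippet below assumes NumPy and a hypothetical 3x3 table, with illustrative variable names; it builds both point sets from a single SVD and checks the column chi-squared distances of equation 5-33.

```python
import numpy as np

# Hypothetical 3x3 contingency table (illustrative counts, not from the source).
X = np.array([[20.0, 10.0, 5.0],
              [10.0, 15.0, 10.0],
              [5.0, 10.0, 25.0]])
n = X.sum()
row_tot = X.sum(axis=1)                 # diagonal of R
col_tot = X.sum(axis=0)                 # diagonal of C
E = np.outer(row_tot, col_tot) / n

S = (X - E) / np.sqrt(np.outer(row_tot, col_tot))
U, sing, Vt = np.linalg.svd(S, full_matrices=False)

# The two point sets of the joint plot: R^{-1/2} U Sigma and C^{-1/2} V Sigma.
F = U * sing / np.sqrt(row_tot)[:, None]       # row points
G = Vt.T * sing / np.sqrt(col_tot)[:, None]    # column points

# Squared chi-squared distances between columns, directly from equation 5-33.
col_profiles = (X / col_tot).T                           # one column profile per row
diff = col_profiles[:, None, :] - col_profiles[None, :, :]
d2_direct = (diff ** 2 / row_tot).sum(axis=2)            # weighted by R^{-1}

# Ordinary squared Euclidean distances between the plotted column points.
d2_plot = ((G[:, None, :] - G[None, :, :]) ** 2).sum(axis=2)
print(np.allclose(d2_direct, d2_plot))
```

Distances within `F` and within `G` are each chi-squared distances, but a distance between a point of `F` and a point of `G` is not, which is the caveat stated above.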

5.4.1.5 Canonical Correlation Approximation:

Canonical correlation is associated with the optimal CA solution, which can be viewed as a function of the correlation between the rows and columns of the two-way table. When the CA solution is chosen to maximise this correlation, the resulting correlation is known as the canonical correlation. This variant of the CA biplot therefore searches for the row and column scores that maximise the correlation between rows and columns. Because the derivation is lengthy, it is not reproduced here; the reader is referred to Chapter 7 of Understanding Biplots for a detailed explanation (Gower et al., 2011:296).
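Although the derivation is omitted, the idea can be illustrated numerically. The sketch below relies on a standard property of correspondence analysis that is not derived in this section and is stated here as an assumption: the largest singular value of the weighted deviations equals the correlation between the corresponding row and column scores. NumPy, the hypothetical 3x3 table and all variable names are illustrative choices.

```python
import numpy as np

# Hypothetical 3x3 contingency table (illustrative counts, not from the source).
X = np.array([[20.0, 10.0, 5.0],
              [10.0, 15.0, 10.0],
              [5.0, 10.0, 25.0]])
n = X.sum()
row_tot = X.sum(axis=1)                 # diagonal of R
col_tot = X.sum(axis=0)                 # diagonal of C
E = np.outer(row_tot, col_tot) / n

S = (X - E) / np.sqrt(np.outer(row_tot, col_tot))
U, sing, Vt = np.linalg.svd(S, full_matrices=False)

# Row and column scores from the first singular vectors (assumed scaling: unit weighted variance).
a = np.sqrt(n) * U[:, 0] / np.sqrt(row_tot)
b = np.sqrt(n) * Vt[0] / np.sqrt(col_tot)

# Weighted correlation between the scores, treating each cell count as a frequency.
P = X / n
r_mass, c_mass = row_tot / n, col_tot / n
mean_a, mean_b = r_mass @ a, c_mass @ b
cov = a @ P @ b - mean_a * mean_b
var_a = r_mass @ a ** 2 - mean_a ** 2
var_b = c_mass @ b ** 2 - mean_b ** 2
corr = cov / np.sqrt(var_a * var_b)

print(np.isclose(corr, sing[0]))
```

This only verifies the correlation attained by the first pair of scores; the argument that no other scores do better is the part left to Gower et al. (2011).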

5.4.1.6 Approximating the row profiles:

The final variant of the CA biplot is centred upon the row profiles, $\mathbf{R}^{-1}\mathbf{X}$, of the two-way table. The column profiles can be approximated in the same way by replacing the input matrix with the transpose of $\mathbf{X}$. For the row profiles the suggested biplot model is:

$$\left\| \mathbf{R}^{1/2}\left\{\mathbf{R}^{-1}\left(\mathbf{X} - \frac{\mathbf{R}\mathbf{1}\mathbf{1}'\mathbf{C}}{n}\right) - \hat{\mathbf{X}}\right\}\mathbf{C}^{-1/2} \right\|^2, \tag{5-36}$$

and the approximation $\hat{\mathbf{X}}$ takes the form $\mathbf{R}^{-1/2}\mathbf{U}\boldsymbol{\Sigma}\mathbf{J}\mathbf{V}'\mathbf{C}^{1/2}$ (Gower et al., 2011:298). The accompanying biplot is obtained by plotting the first $r$ columns of $\mathbf{R}^{-1/2}\mathbf{U}\boldsymbol{\Sigma}$ for the rows and $\mathbf{C}^{1/2}\mathbf{V}$ for the columns. The distances expressed in this biplot are chi-squared distances, and the points for the columns provide axes for the row profile points. Here the row profiles are not the pure row profiles but rather the deviations from the marginal row profile, $\mathbf{1}'\mathbf{C}/n$.
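The statement that this biplot displays deviations from the marginal row profile can be checked with a short sketch, assuming NumPy and a hypothetical 3x3 table; the names `centred_profiles`, `row_coords` and `col_coords` are illustrative.

```python
import numpy as np

# Hypothetical 3x3 contingency table (illustrative counts, not from the source).
X = np.array([[20.0, 10.0, 5.0],
              [10.0, 15.0, 10.0],
              [5.0, 10.0, 25.0]])
n = X.sum()
row_tot = X.sum(axis=1)                 # diagonal of R
col_tot = X.sum(axis=0)                 # diagonal of C
E = np.outer(row_tot, col_tot) / n

# Row profiles minus the marginal row profile 1'C/n.
centred_profiles = X / row_tot[:, None] - col_tot / n

S = (X - E) / np.sqrt(np.outer(row_tot, col_tot))
U, sing, Vt = np.linalg.svd(S, full_matrices=False)

r = 2
row_coords = U[:, :r] * sing[:r] / np.sqrt(row_tot)[:, None]   # R^{-1/2} U Sigma
col_coords = np.sqrt(col_tot)[:, None] * Vt.T[:, :r]           # C^{1/2} V

# Inner products of row points with column points recover the centred row profiles.
X_hat = row_coords @ col_coords.T
print(np.allclose(X_hat, centred_profiles))
```

Adding `col_tot / n` back to each row of `X_hat` would return the uncentred row profiles, which is the adjustment needed if axes are to be calibrated in profile units.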

In document An application of geometric data analysis techniques to South African crime data (Page 68-71)