Structural Equation Modelling for small samples

Full text

(1)Structural Equation Modelling for small samples Michel Tenenhaus HEC School of Management (GRECHEC), 1 rue de la Libération, Jouy-en-Josas, France [[email protected]]. Abstract Two complementary schools have come to the fore in the field of Structural Equation Modelling (SEM): covariance-based SEM and component-based SEM. The first approach developed around Karl Jöreskog. It can be considered as a generalisation of both principal component analysis and factor analysis to the case of several data tables connected by causal links. The second approach developed around Herman Wold under the name "PLS" (Partial Least Squares). More recently Hwang and Takane (2004) have proposed a new method named Generalized Structural Component Analysis. This second approach is a generalisation of principal component analysis (PCA) to the case of several data tables connected by causal links. Covariance-based SEM is usually used with an objective of model validation and needs a large sample (what is large varies from an author to another: more than 100 subjects and preferably more than 200 subjects are often mentioned). Component-based SEM is mainly used for score computation and can be carried out on very small samples. A research based on 6 subjects has been published by Tenenhaus, Pagès, Ambroisine & Guinot (2005) and will be used in this paper. In 1996, Roderick McDonald published a paper in which he showed how to carry out a PCA using the ULS (Unweighted Least Squares) criterion in the covariance-based SEM approach. He concluded from this that he could in fact use the covariance-based SEM approach to obtain results similar to those of the PLS approach, but with a precise optimisation criterion in place of an algorithm with not well known properties. In this research, we will explore the use of ULS-SEM and PLS on small samples. First experiences have already shown that score computation and bootstrap validation are very insensitive to the choice of the method. We will also study the very important contribution of these methods to multiblock analysis. Key words: Multi-block analysis, PLS path modelling, Structural Equation Modelling, Unweighted Least Squares Introduction Compare to covariance-based SEM, PLS suffers from several handicaps: (1) the diffusion of path modelling softwares is much more confidential than that of covariance-based SEM softwares, (2) the PLS algorithm is more an heuristic than an algorithm with well known properties and (3) the possibility of imposing value or equality constraints on path coefficients is easily managed in covariance-based SEM and does not exist in PLS. Of course, PLS has also some advantages on covariance-based SEM (that’s why PLS exists) and we can list some of them: systematic convergence of the algorithm due to its simplicity, possibility of managing data with a small number of individuals and a large number of variables, practical meaning of the latent variable estimates, general framework for multi-block analysis.. 1.

(2) It is often mentioned that PLS is to covariance-based SEM as PCA is to factor analysis. But the situation has seriously changed when Roderick McDonald showed in his 1996 seminal paper that he could easily carry out a PCA with a covariance-based SEM software by using the ULS (Unweighted Least Squares) criterion and cancelling the measurement error variances. Furthermore, the estimation of the latent variables proposed by McDonald is similar to using the PLS mode A and the SEM scheme (i.e. using the “theoretical” latent variables as inner LV estimates). Thus, it became possible to use a covariance-based SEM software to mimic PLS. In the first section of this paper, it is reminded how to use the ULS criterion for covariance-based SEM and the PLS way of estimating latent variables for mimicking PLS path modelling. Then, the second section is devoted to show how to carry out a PCA with a covariance-based SEM software and to comment the interest of this approach for taking into account parameter constraints and for bootstrapping. Multi-block analysis is presented in the third section as a confirmatory factor analysis. We have used AMOS 6.0 (Arbuckle, 2005) and XLSTAT-PLSPM, a module of the XLSTAT software (XLSTAT, 2007), on practical examples to illustrate the paper. Listing the pluses and minuses of ULS-SEM and PLS finally concludes the paper. I.. Using ULS and PLS estimation methods for structural equation modelling. We describe in this section the use of the ULS estimation method applied to the SEM parameter estimates and that of the PLS estimation method for computing the LV values. In a first part we remind the structural equation model following Bollen (1989). A structural equation model consists of two models: the latent variable model and the measurement model. The latent variable model Let η be a column vector consisting of m endogenous (dependent) centred latent variables, and ξ a column vector consisting of k exogenous (independent) centred latent variables. The structural model connecting the vector η to the vectors η and ξ is written as. η = Bη + Γξ + ζ. (1). where B is a zero-diagonal m × m matrix of regression coefficients, Γ a m × k matrix of regression coefficients and ζ a centred random vector of dimension m. The measurement model Each latent (unobservable) variable is described by a set of manifest (observable) variables. The column vector yj of the centred manifest variables linked to the dependent latent variable ηj can be written as a function of ηj through a simple regression with usual hypotheses. y j = λ yjη j + ε j. (2). The column vector y, obtained by concatenation of the yj’s, is written as. y = Λy η + ε. (3) m. where Λ y = ⊕ λ yj is the direct sum of λ1y ,..., λ my and ε is a column vector obtained by concatenation j=1. of the εj’s. It may be reminded that the direct sum of a set of matrices A1, A2,…, Am is a block diagonal matrix in which the blocks of the diagonal are formed by matrices A1, A2,…, Am. 2.

(3) Similarly, the column vector x of the centred manifest variables linked to the latent independent variables is written as a function of ξ: (4). x = Λ xξ + δ. Adding the usual hypothesis that the matrix I-B is non-singular, equation (1) can also be written as: η = (I - B) −1 (Γξ + ζ). (5). and consequently (3) becomes y = Λ y [(I - B)−1 (Γξ + ζ)] + ε. (6). Factorisation of the manifest variable covariance matrix. Let Φ = Cov(ξ) = E(ξξ’), ψ = Cov(ζ) = E(ζζ’), Θε = Cov(ε) = E(εε’) and Θδ = Cov(δ) = E(δδ’). Suppose that the random vectors ξ, ζ, ε and δ are independent of each other and that the covariance matrices Ψ, Θε, Θδ of the error terms are diagonal. Then, we get: Σ xx = Λ xΦΛ x ' + Θδ , Σ yy = Λ y [(I - B) −1 (ΓΦΓ' + Ψ )][(I - B) ']−1 Λ y ' + Θε ,. Σ xy = Λ xΦΓ ' [ (I - B) '] Λ y ' −1. From which we finally obtain: (7) ⎡ Σ yy Σ=⎢ ⎣ Σ xy. Σ yx ⎤ ⎡ Λ y [(I - B) −1 (ΓΦΓ' + Ψ )][(I - B) ']−1 Λ y ' + Θε =⎢ Σ xx ⎦⎥ ⎢ Λ xΦΓ ' ⎣⎡(I - B) −1 ⎦⎤ ' Λ y ' ⎣. Λ y [ (I - B) ] ΓΦΛ x '⎤ ⎥ Λ xΦΛ x ' + Θδ ⎥⎦ −1. Let θ = {Λ x , Λ y , B, Γ,Φ, Ψ, Θε , Θδ } be the set of parameters of the model and Σ(θ) the matrix (7). Model estimation using the ULS method. Let S be the empirical covariance matrix of the MV’s. The object is to seek the set of parameters ˆ ,Λ ˆ , Bˆ , Γ,Φ, ˆ ,Θ ˆ ,Θ ˆ } minimizing the criterion ˆ ˆ Ψ θˆ = {Λ x y ε δ (8). ˆ S − Σ(θ). 2. The aim is therefore to seek a factorisation of the empirical covariance matrix S as a function of the parameters of the structural model. In SEM softwares, the covariance matrix estimations n (ε) and Θ n (δ) of the residual terms are computed in such a way that the diagonal of ˆ = Cov ˆ = Cov Θ ε δ ˆ is null, even when it yields to negative variance the reconstruction error matrix E = S − Σ(θ) (Heywood case).. 3.

(4) ˆ ,Λ ˆ , Bˆ , Γ,Φ, ˆ , 0, 0) and by θˆ the i-th term ˆ ˆ Ψ Let’s denote by σˆ ii the i-th term of the diagonal of Σ( Λ x y ii ˆ ⊕Θ ˆ . From the formula: of the diagonal of Θ ε. (9). δ. sii = σˆ ii + θîi. we may conclude that σˆ ii is the part of the variance sii of the i-th MV explained by its LV (except in a Heywood case) and θîi is the estimate of the variance of the measurement error relative to this MV. As all the error terms e = s − (σˆ + θˆ ) are null, this method is not oriented towards the ii. ii. ii. ii. research of parameters explaining the MV variances. It is in fact oriented towards the reconstruction of the covariances between the MV’s, variances excluded. The McDonald approach for parameter estimation. In his 96 paper, McDonald proposes to estimate the model parameters subject to the constraints that ˆ ,Λ ˆ , Bˆ , Γ,Φ, ˆ minimizing the criterion ˆ ˆ Ψ all the θîi are null. The object is to seek the parameters Λ x y (10). ˆ ,Λ ˆ , Bˆ , Γ,Φ, ˆ , 0, 0) ˆ ˆ Ψ S − Σ( Λ x y. 2. The estimations of the variances of the residual terms ε and δ are integrated in the diagonal terms of ˆ ,Λ ˆ , Bˆ , Γ,Φ, ˆ , 0, 0) . This method is therefore oriented ˆ ˆ Ψ the reconstruction error matrix E = S − Σ( Λ x y towards the reconstruction of the full MV covariance matrix, variances included. On a second step, ˆ and Θ ˆ of the variances of the residual terms ε and δ are obtained by using again final estimation Θ ε δ formula (9). Goodness of Fit. The quality of the fit can be measured by the GFI (Goodness of Fit Index) criterion of Jöreskog & Sorbum, defined by the formula (11). GFI = 1 −. i.e. the proportion of S. 2. ˆ ,Λ ˆ , Bˆ , Γ,Φ, ˆ ,Θ ˆ ,Θ ˆ ) ˆ ˆ Ψ S − Σ( Λ x y ε δ S. 2. 2. explained by the model. By convention, the model under study is. acceptable when the GFI is greater than 0.90. ˆ ,Λ ˆ , Bˆ , Γ,Φ, ˆ ,Θ ˆ ,Θ ˆ ) ˆ ˆ Ψ The quantity S − Σ( Λ x y ε δ. 2. can be deduced from the CMIN criterion given in. AMOS: (12). CMIN =. N −1 ˆ ,Λ ˆ , Bˆ , Γ,Φ, ˆ ,Θ ˆ ,Θ ˆ ) ˆ ˆ Ψ × S − Σ( Λ x y ε δ 2. where N is the number of cases.. 4. 2.

(5) In practical applications of the McDonald approach, the difference between the GFI given by AMOS and the exact GFI computed with formula (11) will be small: GFI = 1 −. (13). ˆ ,Λ ˆ , Bˆ , Γ,Φ, ˆ ,Θ ˆ ,Θ ˆ ) ˆ ˆ Ψ S − Σ( Λ x y ε δ S. 2. 2 2. = 1−. ˆ ,Λ ˆ , Bˆ , Γ,Φ, ˆ , 0, 0) − θˆ 2 ˆ ˆ Ψ S − Σ( Λ ∑ ii x y i. S. 2. Using the McDonald approach, the GFI given by AMOS is equal to GFI = 1 −. (14). and. ∑θˆ. 2 ii. / S. 2. ˆ ,Λ ˆ , Bˆ , Γ,Φ, ˆ , 0, 0) ˆ ˆ Ψ S − Σ( Λ x y S. 2. 2. is usually small. Furthermore, the exact GFI will always be larger than the GFI. i. given by AMOS. Evaluation of the latent variables. After having estimated the parameters of the model, we now present the problem of evaluating the latent variables. Three approaches can be distinguished: the traditional SEM approach, the "McDonald" approach, and the "Fornell" approach. As it is usual in the PLS approach, we now designate one manifest variable with the letter x and one latent variable with the letter ξ, regardless of whether they are of the dependent or independent type. The total number of latent variables is n = k + m and the number of manifest variables related to the latent variable ξj is pj. The traditional SEM approach To construct an estimation ξˆj of ξ j , one proceeds by multiple regression of ξ j on the whole set of the centred manifest variables x − x ,..., x − x . In other words, if one denotes as Σˆ the 11. 11. npn. npn. xx. implied (i.e. predicted by the structural model) covariance matrix between the manifest variables, and as Σˆ xξ j the vector of the implied covariances between the manifest variables x and the latent variable ξj, one obtains an expression of ξˆj as a function of the whole set of manifest variables: (15). ξˆ j = X Σˆ −xx1Σˆ xξ. j. where X = ⎡⎣ x11 − x11 ,..., xnpn − xnpn ⎤⎦ . This method is not really usable, as it is more natural to estimate a latent variable solely as a function of its own manifest variables.. 5.

(6) The "McDonald" approach for LV evaluation Let x j1 ,..., x jp j be the manifest variables relative to the latent variable ξj. McDonald (1996) proposes evaluating the latent variable ξj with the aid of the formula (16). ξˆj ∝ ∑ w jk ( x jk − x jk ) k. n ( x , ξ ) is the implied covariance between the MV xjk and the LV ξj and where ∝ where w jk = Cov jk j means that the left term is the standardized version of the right term. The regression coefficient λjk of the latent variable ξj in the regression of the manifest variable xjk on the latent variable ξj is estimated by (17). n ( x , ξ ) / Var m (ξ ) λˆ jk = Cov jk j j. From this, we deduce that formula (16) can also be written as (18). ξˆj ∝ ∑ λˆ jk ( x jk − x jk ) k. The McDonald approach thus amounts to estimating the latent variable ξj with the aid of the first PLS component computed in the PLS regression of the latent variable ξj on the manifest variables x j1 ,..., x jp j . This approach could enter into the PLS framework. In the usual PLS approach (Wold (1985), Tenenhaus, Esposito Vinzi, Chatelin & Lauro (2005)), under mode A, the outer weights are obtained by simple regression of each variable xjk on the inner estimate zj of the latent variable ξj.. It is necessary to calculate expressly the inner estimate zj of ξj to obtain these weights. Three procedures are proposed in PLS softwares: the centroid, factorial and structural schemes. The covariance-based SEM softwares, on the other hand, give directly the weights (loadings) that for each xjk represent an estimation of the regression coefficient of the "theoretical" latent variable ξj in the regression of xjk on ξj. Consequently, instead of the regression coefficient of the inner estimate zj, the estimated regression coefficient of the "theoretical" latent variable ξj can be used. We have proposed this procedure for calculating the weights based simply on the outputs of a covariancebased SEM software in Tenenhaus, Esposito Vinzi, Chatelin & Lauro (2005). We called it the "LISREL" scheme and, without knowing it, found the choice of weights proposed by McDonald. The "Fornell" approach When all the coefficients λˆ jk relative to a latent variable ξj have the same sign and the manifest variables are of a comparable order of magnitude, Fornell proposes building up a score taking into account the level of the manifest variables xjk: (19). ξˆj = ∑ k λˆ jk x jk / ∑ k λˆ jk. This approach is standard in customer satisfaction studies.. 6.

(7) Example 1. The following data have been collected by Jérome Pagès (ENSAR-INSFA, Rennes). Six orange juices were selected from the most well-known brands in France. Three products can be stored at room temperature (Joker, Pampryl and Tropicana all at room temperature (r.t.)) and three others have to be stored in refrigerated conditions (Fruivita, Pampryl and Tropicana all refrigerated (refr.)). Table 1 provides an extract of the data. The first nine variables correspond to the physico-chemical data, the following seven to sensorial assessments and the last 96 variables represent marks of appreciation of the product given by students at ENSA, Rennes. These figures have already been used in Tenenhaus, Pagès, Ambroisine, Guinot (2005) to illustrate the use of PLS regression and PLS path modeling on very small samples. In that paper, we have shown how to select a group of homogenous judges with respect to their preferences. Only this homogenous group of judges will be used in the present paper. Table 1: Extract from the orange juice data file PAMPRYL r.t. ________ Glucose Fructose Saccharose Sweetening power pH before processing pH after centrifugation Titer Citric acid Vitamin C Smell Intensity Odor typicity Pulp Taste intensity Acidity Bitterness Sweetness Judge 1 Judge 2 Judge 3 . . . Judge 96. TROPICANA r.t. _________. FRUIVITA refr. _________. JOKER r.t. ________. TROPICANA refr. _________. PAMPRYL refr. _________. 25.32 27.36 36.45 89.95 3.59 3.55 13.98 .84 43.44 2.82 2.53 1.66 3.46 3.15 2.97 2.60 2.00 1.00 2.00. 17.33 20.00 44.15 82.55 3.89 3.84 11.14 .67 32.70 2.76 2.82 1.91 3.23 2.55 2.08 3.32 2.00 3.00 3.00. 23.65 25.65 52.12 102.22 3.85 3.81 11.51 .69 37.00 2.83 2.88 4.00 3.45 2.42 1.76 3.38 3.00 3.00 4.00. 32.42 34.54 22.92 90.71 3.60 3.58 15.75 .95 36.60 2.76 2.59 1.66 3.37 3.05 2.56 2.80 2.00 2.00 2.00. 22.70 25.32 45.80 94.87 3.82 3.78 11.80 .71 39.50 3.20 3.02 3.69 3.12 2.33 1.97 3.34 4.00 4.00 3.00. 27.16 29.48 38.94 96.51 3.68 3.66 12.21 .74 27.00 3.07 2.73 3.34 3.54 3.31 2.63 2.90 3.00 1.00 1.00. 3.00. 3.00. 4.00. 2.00. 4.00. 1.00. PLS regression makes possible to link the block Y comprising hedonic data related to the homogenous block of judges to the block X comprising physico-chemical and sensorial data. One may, however, wish to take into account the fact that there are actually two blocks of variables explaining the block Y of hedonic data: block X1 comprising physico-chemical data and block X2 comprising sensorial data. Let us assume that the sensorial variables depend on the physico-chemical variables and that the hedonic variables in turn depend on the physico-chemical variables and the sensorial variables. We can then construct the arrow diagram shown in figure 1. The ULS-SEM approach described above and the PLS approach proposed by Herman Wold (Wold, 1985, Tenenhaus, Esposito Vinzi, Chatelin & Lauro (2005)) allow this type of model to be studied. In the arrow diagram shown in figure 1, we assume that each block of manifest variables is summarized by a latent variable. The relationship between the manifest variables (observable) and the latent variables (non-observable) may be formative, i.e. the function of the latent variable is to 7.

(8) summarize the manifest variables of the block. This relationship may also be reflective: each manifest variable is then a reflection of a latent variable existing a priori, a theoretical concept one would try to outline with measures. The formative mode does not require the blocks to be onedimensional, while that is compulsory for the reflective mode. Here, we are more in a formative mode for the physico-chemical and sensorial blocks and reflective mode by construction for the hedonic block. The two modes are indicated by the direction of the arrows in Figure 1. With regard to the PLS algorithm, it is recommended that the method of calculating outer estimates of the latent variables is selected depending on the type of relationship between the manifest variables and their latent variables: Mode A for the reflective type and Mode B for the formative type (Wold, 1985). The low number of products has obliged us to use Mode A to calculate the outer estimates of the latent variables (although the mode of relationship between the manifest and latent variables is formative for the physico-chemical and sensorial blocks). The ULS-SEM approach presented here is clearly oriented towards the reflective mode. Therefore this orange juice example will be analyzed with ULS-SEM and PLS approaches using the reflective mode for the three blocks. Concerning the physico-chemical and sensorial blocks, the direction of the arrows connecting the MV’s to their LV’s shown in Figure 1 should thus be inversed. Figure 1: Theoretical model of relationships between the hedonic, physico-chemical and sensorial data Glucose Fructose Saccharose Sweetening power pH before processing pH after centrifugation Titer Citric acid VitaminC Smell intensity Odor typicity Pulp Taste intensity Acidity Bitterness Sweetness. 1.. ξ1 ξ3. Judge 2 Judge 3. # Judge 96. ξ2. Use of ULS-SEM. We now use the ULS-SEM approach on the orange juice data. Following McDonald, the measurement error variances are put to 0. The results are given in Figure 2 and in Table 2. All manifest variables have been standardized. The value 1 has been given to the path coefficients related to the manifest variables pH before centrifugation, Sweetness and Judge2.. 8.

(9) Figure 2: AMOS 6.0 software output for the orange juice data 1 judge2. .00 e7. .00. Saccharose. 1. .89. .92. Sweetening power. .00. .22. 1.00 .96 .88. PHYSICO-CHEMICAL. 1.00. 1 e3. 1. pH after centrifuga.. .00. -.88 -.06. 1. e38. HEDONIC. 1. e14. .00. .00 .00. .00. Odor typicity 1. e12. e11. e10. .00. .64. .93. SENSORIAL. .82. Acidity. 1. judge68. e31. .00 e32. 1. judge84. e33. 1. judge86. Sweetness. e34. Judge92 1. judge96. 1. judge91. e35. e36 e39. 9. 1. .00. .00. .00. 1. judge79. 1.00. .00. e30. 1. Bitterness. e29. 1. judge77. 1 e8. e28. judge63. .70 1.06. 1 e9. 1. -.97. 1. .00. .00. e27. judge60. .81. -.95. e26. 1. .97 1.12. -.57 Taste intensity. 1. judge59. .75 .86 .57. 1. .00. e25. judge55. 1.07. .66. Pulp 1. .00. .24. 1. e13. .94. d1. .00. 1. .59. .30 Smell intensity. e24. judge52. 1.06 Vitamin C. .00. 1 judge48. .90. 1. .00 e23. 1.06. 1. .00. 1 judge35. d2. Citric acid. .00. e22. judge31. .75 .03. .78. e21. 1. .76. .22. Titer. e37. .00. -.87. 1 e1. .00. 1 judge30. 1.04. e2. .00. e20. judge25. .97. 1. .00. 1. 1.05. 1.00. .00. e19. judge12. pH before centrifuga.. .00. e18. judge11. 1. e4. .00. e17. judge6. -.76. 1. e5. 1. -.77. Fructose. .00. e16. judge3. 1. e6. .00. 1. Glucose. .00. .00. e15. 1. .00. .00. .00. .00. .00.

(10) Table 2: Outputs of AMOS 6.0 Table 2.1: Regression Weights (non significant weights in bold): Parameter SENSORIAL HEDONIC HEDONIC Glucose Fructose Saccharose Sweetening power pH before centrifug. pH after centrifug. Titer Citric acid Vitamin C Smell intensity Odor typicity Pulp Taste intensity Acidity Bitterness Sweetness Judge2 Judge3 Judge6 Judge11 Judge12 Judge25 Judge30 Judge31 Judge35 Judge48 Judge52 Judge55 Judge59 Judge60 Judge63 Judge68 Judge77 Judge79 Judge84 Judge86 Judge91 Judge92 Judge96. <--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<---. PHYSICO-CHEMICAL PHYSICO-CHEMICAL SENSORIAL PHYSICO-CHEMICAL PHYSICO-CHEMICAL PHYSICO-CHEMICAL PHYSICO-CHEMICAL PHYSICO-CHEMICAL PHYSICO-CHEMICAL PHYSICO-CHEMICAL PHYSICO-CHEMICAL PHYSICO-CHEMICAL SENSORIAL SENSORIAL SENSORIAL SENSORIAL SENSORIAL SENSORIAL SENSORIAL HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC HEDONIC. Estimate .784 .216 .643 -.765 -.764 .890 .219 1.000 .998 -.869 -.877 -.064 .244 .935 .657 -.565 -.946 -.974 1.000 1.000 .956 .879 1.051 .975 1.045 .758 .747 1.063 .896 1.060 .937 .593 1.069 .747 .855 .575 .975 1.120 .809 1.058 .702 .821. Lower (90%) .642 -.522 .216 -1.228 -1.224 .438 -.699 1.000 .926 -1.221 -1.225 -.703 -.503 .676 -.154 -1.416 -1.483 -1.154 1.000 1.000 .471 .178 .490 .482 .562 .000 .000 .639 .397 .546 .485 -.236 .669 .000 .000 -.222 .482 .594 .026 .541 -.075 .225. Upper (90%) 1.149 .672 1.653 -.292 -.287 1.353 .894 1.000 1.057 -.371 -.402 1.044 1.028 1.278 1.057 .012 -.719 -.774 1.000 1.000 1.703 1.803 1.984 1.480 2.342 2.226 1.173 1.483 1.264 2.015 1.354 1.418 1.989 1.173 2.165 1.927 1.480 1.994 1.675 1.772 1.900 1.562. P .010 .412 .064 .010 .010 .010 .799 ... .010 .020 .020 .818 .737 .010 .169 .112 .010 .010 ... ... .010 .018 .010 .010 .010 .070 .061 .010 .018 .010 .018 .286 .010 .061 .068 .316 .010 .020 .050 .010 .180 .040. Comments:. 1) Confidence intervals are computed by bootstrapping on 200 bootstrap samples. 2) The percentile 5% of the empirical distribution of the 200 bootstrap estimates is given in column “Lower” and the 95% one in column “Upper”. 3) P is the p-value in the following sense: the smallest interval containing the value 0 is obtained by using a confidence interval of order 1-P. 4) The regression weights for judges 59, 77 and 92 are not significant. They appear to be the least correlated to the first principal component of the hedonic data as it can be seen in Figure 3.. 10.

(11) Tableau 2.2: Variances Parameter PHYSICO-CHEMICAL d1 d2. Estimate .921 .298 .034. Lower .429 .028 .000. Upper 1.120 .364 .044. P .020 .010 .177. Tableau 2.3: Squared Multiple Correlations Parameter SENSORIAL HEDONIC. Estimate .655 .946. Lower .529 .919. Upper .951 1.000. P .010 .020. Tableau 2.4: Model Fit Summary Model Default model. NPAR 42. CMIN 105.613. Model Default model. RMR .175. GFI .904. AGFI .898. Comment: The GFI is equal to .904 and suggests that the model is acceptable.. Figure 3: Loading plot for the PCA of judges. 11. PGFI .855.

(12) The main objective of component-based SEM is the construction of scores. Following the McDonald approach, we use the path coefficients given in Figure 2 and in Table 2.1. We obtain the following constructs: For the Physico-chemical block Score(Physico-Chemical) ∝ -.765*Glucose -.764*Fructose +.890*Saccharose +.219*(Sweetening power) + 1*(pH before centrifugation) + .998*(pH after centrifugation) -.869*Titer -.877*(Citric acid) -.064*(Vitamin C). where all the variables (score and manifest variables) are standardized. For the Sensorial block Score(Sensorial) ∝ .244*(Smell intensity) + .935*(Odor typicity) + .657*Pulp -.565*(Taste intensity) - .946*Acidity -.974*Bitterness + 1*Sweetness. with the same standardization than for the previous block. For the Hedonic block Score(Hedonic) = 1*Judge2 + .956*Judge3 + … + .821*Judge96. with the same standardization than for the previous blocks. The latent variable scores are given in Table 2.5 and their correlations in Table 2.6. The correlations between these scores and the manifest variables are given in Table 2.7. Tableau 2.5: ULS-SEM latent variable scores. Pampryl r.t. Tropicana r.t. Fruivita refr. Joker r.t. Tropicana refr. Pampryl refr.. Physico-chemical -0.72 1.05 0.81 -1.54 0.56 -0.16. Sensorial -1.26 0.43 0.87 -0.77 1.27 -0.53. Hedonic -1.10 0.66 1.17 -0.84 0.85 -0.74. Tableau 2.6: ULS-SEM latent variable score correlation matrix Physico-chemical. Sensorial. Hedonic. Physico-chemical. 1.000. .810. .867. Sensorial. .810. 1.000. .961. Hedonic. .867. .961. 1.000. 12.

(13) Tableau 2.7: Correlations between the ULS-SEM LV scores and the MV’s. Glucose Fructose Saccharose Sweetening power pH before centrifugation pH after centrifugation Titer Citric acid Vitamin C Smell intensity Odor typicity Pulp Taste intensity Acidity Bitterness Sweetness Judge2 Judge3 Judge6 Judge11 Judge12 Judge25 Judge30 Judge31 Judge35 Judge48 Judge52 Judge55 Judge59 Judge60 Judge63 Judge68 Judge77 Judge79 Judge84 Judge86 Judge91 Judge92 Judge96. Physico-chemical -0.898 -0.898 0.926 0.078 0.950 0.939 -0.973 -0.977 -0.195 0.229 0.806 0.558 -0.401 -0.745 -0.775 0.871 0.640 0.647 0.656 0.872 0.718 0.971 0.742 0.343 0.771 0.460 0.791 0.504 0.534 0.870 0.343 0.909 0.734 0.718 0.953 0.453 0.827 0.724 0.554. Sensorial -0.585 -0.575 0.755 0.288 0.896 0.904 -0.735 -0.740 -0.040 0.410 0.976 0.704 -0.646 -0.927 -0.951 0.967 0.928 0.756 0.662 0.785 0.929 0.817 0.518 0.693 0.936 0.837 0.840 0.878 0.592 0.854 0.693 0.670 0.473 0.929 0.934 0.685 0.845 0.419 0.679. Hedonic -0.673 -0.673 0.817 0.242 0.947 0.946 -0.765 -0.774 -0.001 0.174 0.893 0.625 -0.552 -0.950 -0.976 0.979 0.887 0.877 0.794 0.919 0.823 0.864 0.637 0.712 0.926 0.834 0.944 0.863 0.458 0.924 0.712 0.666 0.396 0.823 0.941 0.762 0.927 0.595 0.744. The estimate of the hedonic score, shown in Table 2.5, enables us to classify the products by order of preference: Fruivita refr. > Tropicana refr. > Tropicana r.t. > Pampryl refr. > Joker r.t. > Pampryl r.t. Using the significant regression weights of Table 2.1 and the correlations given in Table 2.7, we may conclude that the physico-chemical score is correlated negatively with the fructose, glucose, titer and citric acid characteristics and positively with the saccharose, pH before and after centrifugation. 13.

(14) characteristics. The sensorial score is correlated positively with odor typicity and sweetness and negatively with acidity and bitterness. The hedonic score related to the homogenous group of judges is correlated positively with the physico-chemical (.867) and sensorial scores (.961). Consequently, this group of judges likes products with odor typicity and sweetness (Fruivita refr., Tropicana r.t., Tropicana refr.) and rejects products with an acidic and bitter nature (Joker r.t., Pampryl refr., Pampryl r.t.). This result is verified in Table 3. Table 3: Sensorial characteristics of the products ranked according to the hedonic score odor hedonic Product sweeteness typicity acidity bitterness score _____________________________________________________________________ Fruivita refr. 3.4 2.88 2.42 1.76 1.17 Tropicana refr. 3.3 3.02 2.33 1.97 0.85 Tropicana r.t. 3.3 2.82 2.55 2.08 0.66 --------------------------------------------------------------------Pampryl refr. 2.9 2.73 3.31 2.63 -0.74 Joker r.t. 2.8 2.59 3.05 2.56 -0.84 Pampryl r.t. 2.6 2.53 3.15 2.97 -1.10. 2.. Use of PLS Path modeling. For estimating the parameters of the model, we have used the module XLSTAT-PLSPM of the XLSTAT software (XLSTAT, 2007). The variables have all been standardized. To calculate the inner estimates of the latent variables, we have used the centroid scheme recommended by Herman Wold (1985). Table 4 contains the output of this modelling of the orange juice data with comments. Figure 4 includes the regression coefficients between the latent variables of the model shown in Figure 1 and the correlation coefficients between the manifest and latent variables. Coefficient validation. Although it gives robust and stable results with the various methods used on these orange juice data (the same items appear to be significant in Tenenhaus, Pagès, Ambroisine, Guinot (2005) and in the present paper), we may think that bootstrap validation carried out on only 6 cases cannot be very reliable. One reason is the following: In this example, data structure comes from the opposition between the two groups {Fruivita refr., Tropicana r.t., Tropicana refr.} on one side and {Pampryl refr., Joker r.t., Pampryl r.t.} on the other side. If one of these groups of products is not selected in the bootstrap sampling selection, then the correlations between the latent variables disappear. Maybe, non representative samples should be eliminated. Bootstrap has been based on 200 samples and 90% confidence intervals have been asked for. Results of bootstrap validation for the inner model are shown in Table 3.7. The confidence intervals indicate the regression coefficients which are significant. We can also look to the usual Student t related to the regression coefficients. By convention, a coefficient is significant if the absolute value of t is larger than 2. In this specific example, both methods give the same results. The relationship between the hedonic data and the physico-chemical data is not significant (t = 1.522), while that between the hedonic data and the sensorial data is (t = 3.546). There is also a significant connection between the physico-chemical and the sensorial data (t = 2.864).. 14.

(15) Figure 4: XLSTAT-PLSPM software output for the orange juice data. 15.

(16) However, the strong correlation between the hedonic data and the physico-chemical data suggests that a PLS regression of the hedonic score should be carried out on the physico-chemical and sensorial scores. This PLS regression (with one component) leads to the following equation: Hedonic score = 0.49*(Physico-Chemical score) + 0.53*(Sensorial score) 2. with an R = 0.948 to be compared with R2 = 0.960 in the model shown in Figure 4. Bootstrap validation for PLS regression yields to the same significant regression coefficients as for OLS regression (see Table 4). If the PLS regression is validated by Jack-knife on the observed latent variables, both coefficients are now significant (Table 4 and Figure 5). Figure 5: XLSTAT-PLSPM software output: Validation of the PLS regression of the hedonic score on the physico-chemical and sensorial scores Hedonic / Standardized coefficients (95% conf. interval) 0.7. Standardized coefficients. 0.6 Senso rial 0.5. P hysico -chemical. 0.4. 0.3. 0.2. 0.1. 0. Variable. Table 3: XLSTAT-PLSPM outputs for the orange juice example Table 3.1: Block dimensionality Latent variable Physico-chemical. Dimensions 9. Critical value 1.800. Sensorial. 7. 1.400. Hedonic. 23. 4.600. Eigenvalues 6.213 1.410 1.046 0.317 0.013 4.744 1.333 0.820 0.084 0.019 14.655 3.663 2.199 1.837 0.646. Comment: The critical value is equal to the average eigenvalue. In this example the number of eigenvalues is equal to 5 as the number of observations (6) is smaller than the number of variables and because the variables are centered. Each block can be considered as unidimensional.. 16.

(17) Table 3.2: Checking block dimensionality (larger correlation per row is in bold) Variables/Factors correlations (Physico-chemical):. Glucose Fructose Saccharose Sweetening power pH before centrifugation pH after centrifugation Titer Citric acid.. F1 0.914 0.913 -0.912 -0.035 -0.945 -0.933 0.974 0.978. F2 0.388 0.378 0.261 0.947 0.019 0.071 -0.144 -0.136. Vitamin C. 0.212. -0.328. F3 -0.057 -0.083 0.286 0.319 -0.026 -0.069 0.070 0.049 0.916. F4 0.109 0.121 -0.127 -0.020 0.325 0.346 0.150 0.144. F5 -0.013 -0.050 0.049 0.017 0.006 -0.006 0.062 0.052. 0.080. -0.035. Variables/Factors correlations (Sensorial):. Smell intensity Odor typicity Pulp Taste intensity Acidity Bitterness Sweetness. F1. F2. F3. F4. F5. 0.460 0.985 0.722 -0.650 -0.913 -0.935 0.955. 0.754 0.134 0.617 0.429 0.348 0.188 -0.159. -0.468 -0.058 0.298 0.626 -0.021 -0.285 0.187. 0.008 0.077 -0.096 0.005 0.205 -0.028 0.161. 0.004 0.041 -0.031 0.048 -0.057 0.093 0.048. Variables/Factors correlations (Hedonic):. judge2 judge3 judge6 judge11 judge12 judge25 judge30 judge31 judge35 judge48 judge52 judge55 judge59 judge60 judge63 judge68 judge77 judge79 judge84 judge86 judge91 judge92 judge96. F1. F2. F3. F4. F5. 0.894 0.890 0.798 0.919 0.814 0.849 0.625 0.733 0.925 0.852 0.948 0.878 0.438 0.922 0.733 0.638 0.363 0.814 0.928 0.778 0.927 0.585 0.755. -0.218 -0.318 -0.039 0.051 0.221 0.422 0.399 -0.565 0.035 -0.474 -0.049 -0.398 0.475 0.051 -0.565 0.763 0.862 0.221 0.352 -0.410 0.043 0.348. 0.307 -0.310 -0.166 -0.177 0.213 -0.177 -0.631 0.321 0.329 0.207 -0.286 0.131 0.740 -0.149 0.321 -0.063 0.235 0.213 0.115 -0.372 0.069 -0.282. -0.203 -0.018 0.522 0.278 -0.429 -0.166 -0.058 0.179 0.179 -0.039 0.068 -0.164 0.176 -0.154 0.179 -0.072 -0.169 -0.429 0.032 -0.229 0.364 0.676. 0.132 0.104 -0.247 0.212 -0.243 0.205 -0.221 0.090 -0.060 -0.074 -0.115 -0.166 -0.062 0.317 0.090 -0.041 0.203 -0.243 0.012 -0.187 0.043 0.000. -0.285. -0.331. -0.430. 0.231. 17.

(18) Table 3.3: Model validation Goodness of fit index: GoF. GoF (Bootstrap). Standard error. Critical ratio (CR). Absolute Relative Outer model. 0.731 0.823 0.911. 0.732 0.801 0.852. 0.049 0.048 0.039. 14.943 17.146 23.286. Inner model. 0.903. 0.940. 0.048. 18.815. Lower bound (90%). Upper bound (90%). Minimum. 1st Quartile. Median. 3rd Quartile. Maximum. Absolute Relative Outer model. 0.645 0.707 0.784. 0.799 0.847 0.893. 0.522 0.525 0.707. 0.711 0.790 0.816. 0.738 0.811 0.863. 0.762 0.824 0.865. 0.821 0.855 0.911. Inner model. 0.885. 0.999. 0.669. 0.921. 0.941. 0.966. 1.000. Comment: Number of bootstrap samples = 200. Level of the confidence intervals: 90%. Absolute Goodness-of-Fit 1 1 Cor 2 ( x jh , ξ j ) × R 2 (ξ j ; ξi explaining ξ j ) ∑∑ ∑ Nb of endogenous LV Endogenous VL J j h. GoF =. where J = ∑ p j j. Relative Goodness-of-Fit. pj. Cor 2 ( x jh , ξ j ) ∑ R 2 (ξ j ; ξi explaining ξ j ) 1 1 h =1 × ∑ ∑ J j Nb of endogenous LV endogenous LV λj ρ j2 . . Outer model. where: -. Inner model. λj is the first eigenvalue computed from the PCA of block j ρ j is the first canonical correlation between the dependent block j and the concatenation of all the blocks i explaining the dependent block j.. 18.

(19) Table 3.4: Latent Variable validation Cross-loadings (Monofactorial manifest variables):. Glucose Fructose Saccharose Sweetening power pH before centrifugation pH after centrifugation Titer Citric acid. Vitamin C Smell intensity Odor typicity Pulp Taste intensity Acidity Bitterness Sweetness judge2 judge3 judge6 judge11 judge12 judge25 judge30 judge31 judge35 judge48 judge52 judge55 judge59 judge60 judge63 judge68 judge77 judge79 judge84 judge86 judge91 judge92 judge96. Comment: -. Physicochemical -0.889 -0.889 0.931 0.099 0.952 0.942 -0.972 -0.977 -0.194 0.236 0.814 0.574 -0.397 -0.751 -0.784 0.877 0.646 0.654 0.665 0.873 0.729 0.972 0.750 0.349 0.777 0.470 0.801 0.517 0.533 0.872 0.349 0.910 0.727 0.729 0.957 0.467 0.831 0.724 0.559. Sensorial -0.584 -0.574 0.758 0.294 0.896 0.905 -0.738 -0.743 -0.045 0.411 0.977 0.709 -0.639 -0.925 -0.952 0.968 0.925 0.755 0.667 0.785 0.930 0.817 0.524 0.690 0.936 0.835 0.843 0.876 0.593 0.853 0.690 0.673 0.474 0.930 0.935 0.685 0.846 0.424 0.677. Hedonic -0.689 -0.689 0.832 0.242 0.955 0.954 -0.789 -0.798 -0.023 0.199 0.904 0.637 -0.549 -0.942 -0.972 0.982 0.880 0.860 0.787 0.916 0.834 0.879 0.648 0.689 0.926 0.815 0.938 0.847 0.479 0.924 0.689 0.695 0.432 0.834 0.953 0.742 0.925 0.602 0.731. Sweetening power and Vitamin C are not correlated to their block. Smell intensity is not correlated to its own block Judges 59 and 77 are weakly correlated to their block.. 19.

(20) Table 3.5: Latent Variable weights (non significant weights are in bold). Latent variable. Physico-chemical. Sensorial. Hedonic. Comment: -. Manifest variables. Outer weight. Outer weight (Bootstrap). Standard error. Critical ratio (CR). Lower bound (90%). Upper bound (90%). Glucose Fructose Saccharose Sweetening power pH before centrifugation pH after centrifugation Titer Citric acid. Vitamin C Smell intensity Odor typicity Pulp Taste intensity Acidity Bitterness Sweetness judge2 judge3 judge6 judge11 judge12 judge25 judge30 judge31 judge35 judge48 judge52 judge55 judge59 judge60 judge63 judge68 judge77 judge79 judge84 judge86 judge91 judge92. -0.124 -0.123 0.154 0.052 0.180 0.180 -0.148 -0.150 -0.007 0.052 0.206 0.145 -0.113 -0.203 -0.210 0.223 0.059 0.053 0.050 0.062 0.062 0.067 0.048 0.039 0.064 0.049 0.062 0.052 0.042 0.065 0.039 0.059 0.045 0.062 0.071 0.043 0.063 0.043. -0.113 -0.111 0.140 0.038 0.159 0.159 -0.137 -0.139 -0.010 0.033 0.190 0.114 -0.107 -0.190 -0.197 0.208 0.056 0.051 0.047 0.059 0.058 0.063 0.040 0.036 0.062 0.047 0.061 0.049 0.036 0.060 0.036 0.051 0.037 0.058 0.066 0.039 0.059 0.038. 0.050 0.049 0.041 0.093 0.021 0.020 0.017 0.016 0.085 0.099 0.039 0.082 0.080 0.045 0.034 0.036 0.012 0.015 0.018 0.019 0.013 0.013 0.020 0.023 0.012 0.016 0.012 0.014 0.026 0.016 0.023 0.018 0.027 0.013 0.013 0.020 0.018 0.028. -2.491 -2.498 3.720 0.560 8.460 9.073 -8.808 -9.127 -0.077 0.527 5.255 1.759 -1.406 -4.472 -6.136 6.148 4.740 3.435 2.749 3.268 4.605 5.247 2.364 1.695 5.530 3.047 5.320 3.717 1.642 4.065 1.695 3.305 1.649 4.605 5.420 2.168 3.499 1.554. -0.147 -0.144 0.096 -0.115 0.121 0.121 -0.169 -0.171 -0.126 -0.136 0.143 -0.065 -0.196 -0.227 -0.240 0.143 0.043 0.034 0.023 0.048 0.037 0.051 0.000 0.000 0.050 0.020 0.049 0.029 -0.021 0.047 0.000 0.000 -0.021 0.037 0.055 -0.007 0.050 -0.007. -0.057 -0.057 0.180 0.151 0.184 0.185 -0.111 -0.113 0.130 0.166 0.241 0.207 0.069 -0.143 -0.143 0.267 0.064 0.065 0.071 0.080 0.083 0.078 0.065 0.066 0.075 0.064 0.082 0.061 0.066 0.074 0.066 0.072 0.065 0.083 0.084 0.057 0.074 0.081. judge96. 0.046. 0.044. 0.020. 2.345. 0.013. 0.066. Sweetening power and Vitamin C are not significant in block physico-chemical Smell intensity, Pulp and Taste intensity are not significant in block sensorial Judges 59, 77, 86 and 92 are not significant in block hedonic.. 20.

(21) Table 3.6: Correlations between MV and LV Correlations:. Latent variable. Physico-chemical. Sensorial. Hedonic. Comments: -. Manifest variables. Standardized loadings. Communalities. Redundancies. Standardized loadings (Bootstrap). Standard error. Glucose Fructose Saccharose Sweetening power pH before centrifugation pH after centrifugation Titer Citric acid. Vitamin C Smell intensity Odor typicity Pulp Taste intensity Acidity Bitterness Sweetness judge2 judge3 judge6 judge11 judge12 judge25 judge30 judge31 judge35 judge48 judge52 judge55 judge59 judge60 judge63 judge68 judge77 judge79 judge84 judge86 judge91 judge92. -0.889 -0.889 0.931 0.099. 0.790 0.790 0.867 0.010. -0.850 -0.847 0.876 0.101. 0.226 0.227 0.268 0.591. 0.952 0.942 -0.972 -0.977 -0.194 0.411 0.977 0.709 -0.639 -0.925 -0.952 0.968 0.880 0.860 0.787 0.916 0.834 0.879 0.648 0.689 0.926 0.815 0.938 0.847 0.479 0.924 0.689 0.695 0.432 0.834 0.953 0.742 0.925 0.602. 0.906 0.887 0.946 0.954 0.038 0.169 0.954 0.503 0.408 0.856 0.907 0.936 0.774 0.740 0.619 0.840 0.695 0.773 0.419 0.475 0.858 0.664 0.879 0.717 0.230 0.853 0.475 0.483 0.186 0.695 0.909 0.551 0.856 0.363. 0.113 0.641 0.338 0.274 0.575 0.609 0.629 0.743 0.710 0.594 0.806 0.667 0.742 0.403 0.456 0.824 0.638 0.844 0.688 0.221 0.820 0.456 0.464 0.179 0.667 0.873 0.529 0.822 0.348. 0.968 0.964 -0.950 -0.956 -0.203 0.285 0.940 0.612 -0.589 -0.915 -0.949 0.967 0.859 0.828 0.736 0.885 0.852 0.896 0.570 0.619 0.923 0.785 0.936 0.809 0.447 0.893 0.619 0.685 0.426 0.852 0.943 0.670 0.895 0.531. 0.078 0.073 0.066 0.063 0.435 0.497 0.110 0.382 0.404 0.206 0.087 0.069 0.182 0.236 0.254 0.230 0.115 0.159 0.284 0.340 0.147 0.233 0.130 0.215 0.406 0.214 0.340 0.258 0.404 0.115 0.137 0.308 0.215 0.411. judge96. 0.731. 0.535. 0.514. 0.709. 0.281. Standardized loading = correlation Communality = squared correlation Redundancy = Communality*R2(Dep. LV; Explanatory related LVs). 21.

(22) Table 3.6: Correlations between MV and LV (continued) Correlations:. Latent variable. Manifest variables. Glucose Fructose Saccharose Sweetening power Physico-chemical pH before centrifugation pH after centrifugation Titer Citric acid. Vitamin C Smell intensity Odor typicity Pulp Sensorial Taste intensity Acidity Bitterness Sweetness judge2 judge3 judge6 judge11 judge12 judge25 judge30 judge31 judge35 judge48 judge52 Hedonic judge55 judge59 judge60 judge63 judge68 judge77 judge79 judge84 judge86 judge91 judge92 judge96. Critical ratio (CR). Lower bound (90%). Upper bound (90%). -3.938 -3.913 3.476 0.167 12.199 12.966 -14.729 -15.413 -0.445 0.827 8.912 1.856 -1.580 -4.495 -10.967 13.953 4.832 3.642 3.102 3.993 7.258 5.541 2.277 2.030 6.293 3.493 7.234 3.942 1.179 4.315 2.030 2.694 1.069 7.258 6.934 2.411 4.301 1.465. -0.995 -0.998 0.729 -0.860 0.925 0.899 -1.000 -1.000 -0.970 -0.684 0.763 -0.174 -0.998 -1.000 -1.000 0.940 0.639 0.469 0.347 0.773 0.648 0.742 0.000 0.000 0.825 0.408 0.858 0.425 -0.426 0.716 0.000 0.000 -0.426 0.648 0.911 -0.093 0.783 -0.192. -0.569 -0.589 0.996 0.950 1.000 1.000 -0.860 -0.872 0.656 0.930 0.999 0.998 0.203 -0.752 -0.897 1.000 0.981 0.999 0.982 0.994 0.999 0.997 0.936 0.991 0.997 0.997 0.998 0.996 0.948 0.997 0.991 0.997 0.896 0.999 0.998 0.992 0.986 0.982. 2.602. 0.173. 0.997. Comment: (identical with those for weights) Sweetening power and Vitamin C are not significant in block physico-chemical Smell intensity, Pulp and Taste intensity are not significant in block sensorial Judges 59, 77, 86 and 92 are not significant in block hedonic.. 22.

(23) Table 3.7: Inner model R² (Sensorial):. R² 0.672. R²(Bootstrap) 0.791. Standard error 0.157. Critical ratio (CR) 4.276. Lower bound (90%) 0.588. Upper bound (90%) 0.999. Path coefficients (Sensorial): Latent variable Physico-chimique. Latent variable. Value. Standard error. t. Pr > |t|. Value(Bootstrap). 0.820. 0.286. 2.864. 0.046. 0.835. Standard error(Bootstrap). Critical ratio (CR). Lower bound (90%). Upper bound (90%). 0.308. 2.660. 0.757. 0.994. Physico-chemical. Comment:. The usual Student t test and the bootstrap approach give here the same results.. R² (Hedonic):. R² 0.960. R²(Bootstrap) 0.986. Standard error 0.017. Critical ratio (CR) Lower bound (90%) Upper bound (90%) 58.017 0.947 1.000. Path coefficients (Hedonic): Latent variable. Value. Standard error. t. Pr > |t|. Physico-chemical. 0.306. 0.201. 1.522. 0.225. Sensorial. 0.713. 0.201. 3.546. 0.038. Path coefficients (Hedonic):. Latent variable Physico-chemical Sensorial. Value(Bootstrap). Standard error(Bootstrap). Critical ratio (CR). Lower bound (90%). Upper bound (90%). 0.331 0.651. 0.698 0.674. 0.438 1.058. -0.642 0.000. 1.000 1.397. Comment:. The usual Student t test and the bootstrap approach give here the same results. But the nonsignificance of the physico-chemical can also be due to a multicolinearity problem. PLS regression for estimating the structural regression equations can be used and is presented in Table 4.. 23.

(24) Table 3.8: Impact and contribution of the variables to Hedonic Impact and contribution of the variables to Hedonic: Sensorial 0.964 0.713 0.688 71.612 71.612. Correlation Path coefficient Correlation * path coefficient Contribution to R² (%) Cumulative %. Physico-chemical 0.891 0.306 0.273 28.388 100.000. Impact and contribution of the variables to Hedonic 0.8. 100. 80. Path coefficients. 0.6 0.5. 60. 0.4 40. 0.3 0.2. 20. Contribution to R² (%). 0.7. 0.1 0. 0 Senso rial. P hysico -chemical. Latent variable Path coefficient. Cumulative %. Comment:. -. R 2 (Y ; X 1 ,..., X k ) = ∑ Cor (Y , X j ) * βˆ j When all the terms Cor (Y , X ) * βˆ are positive, it makes sense to compute the j. j. relative contribution of each explanatory variable Xj to the R square.. 24.

(25) Table 3.9: Model assessment. Latent variable. Type. Physico-chemical Sensorial Hedonic. Exogenous Endogenous Endogenous. Mean. R². Adjusted R². 0.000 0.000 0.000. 0.672 0.960. 0.672 0.950. Mean. Weighted Mean Communalities (AVE). Mean Redundancies. 0.687 0.676 0.634. 0.454 0.609. 0.654. 0.532. 0.816. Comments:. -. The weighted mean takes into account the number of MVs in each block. (Absolute GoF)2 = (Mean R2)*(Weighted mean communalities). Table 3.10: Correlation between the latent variables Correlations (Latent variable): Physico-chemical. Sensorial. Hedonic. Physico-chemical Sensorial. 1.000 0.820. 0.820 1.000. 0.891 0.964. Hedonic. 0.891. 0.964. 1.000. Table 3.11: Direct, indirect and total effects Direct effects (Latent variable): Physico-chemical Physico-chemical Sensorial. 0.820. Hedonic. 0.306. Sensorial. Hedonic. 0.713. Comment :. -. Sensorial = .820*Physico-chemical Hedonic = .306*Physico-chemical + .713*Sensorial. 25.

(26) Table 3.11: Direct, indirect and total effects (continued) Indirect effects (Latent variable): Physico-chemical Physico-chemical Sensorial. 0.000. Hedonic. 0.585. Sensorial. Hedonic. 0.000. Comment :. -. Hedonic = .306*Physico-chemical + .713*.820*Physico-chemical Indirect effect of Physico-chemical on Hedonic = .713*.820 = 0.585. Total effects (Latent variable):. Physico-chemical Sensorial Hedonic. Physico-chemical. Sensorial. 0.820 0.891. 0.713. Hedonic. Comment :. Hedonic = .306*Physico-chemical + .713*.820*Physico-chemical = .891*Physico-chemical. Table 3.12: Discriminant validity Discriminant validity (Squared correlations < AVE) :. Physico-chemical Sensorial Hedonic Mean Communalities (AVE). Physico-chemical 1 0.672 0.793. Sensorial 0.672 1 0.930. Hedonic 0.793 0.930 1. 0.687. 0.676. 0.634. Comment: Due to non significant MV’s, the AVE criterion is too small for the three LV’s.. 26.

(27) Table 3.13: Latent variable score Summary statistics / Latent variable scores:. Observations. Minimum. Maximum. Mean. Std. Deviation. Physico-chemical Sensorial. 6 6. -1.680 -1.381. 1.120 1.378. 0.000 0.000. 1.000 1.000. Hedonic. 6. -1.203. 1.253. 0.000. 1.000. Variable. Latent variable scores : Physico-chemical. Sensorial. Hedonic. -0.810 1.120 0.917 -1.680 0.630 -0.176. -1.381 0.462 0.964 -0.852 1.378 -0.570. -1.203 0.742 1.253 -0.991 0.946 -0.747. pampryl r. t. tropicana r. t. fruvita refr. joker r. t. tropicana refr. pampryl refr.. Table 4: PLS regression of Hedonic score on Physico-chemical and sensorial scores Goodness of fit statistics (Variable Hedonic): R². 0.948. Bootstrap validation Path coefficients (Hedonic):. Latent variable. Value. Value (Bootstrap). Standard error (Bootstrap). Critical ratio (CR). Lower bound (90%). Upper bound (90%). Physico-chemical Sensorial. 0.490 0.531. 0.267 0.744. 0.408 0.402. 1.201 1.320. -0.422 0.103. 0.893 1.397. Jack-knife validation on the observed latent variables Standardized coefficients (Variable Hedonic):. Variable Physico-chemical Sensorial. Coefficient 0.490. Std. deviation 0.021. Lower bound (95%) 0.449. Upper bound (95%) 0.531. 0.531. 0.022. 0.488. 0.573. 27.

(28) 3.. Comparison between PLS, ULS-SEM and PCA. Comparison between weights. When we compare the weight confidence intervals computed with PLS (Table 3.5) with those coming from ULS-SEM (Table 2.1), we find that both methods yield to the same non significant weights with only one exception for Judge 86 (non significant for PLS and significant for ULSSEM). These weights are compared in Figure 6. Figure 6: Comparison between the PLS and ULS-SEM weights. 28.

(29) Comparison between PLS and ULS-SEM scores. The scores coming from PLS and ULS-SEM are compared in Figure 7. They are highly correlated. This confirms our previous findings and a general remark of Noonan and Wold (1982) on the fact that the final outer LV estimates depend very little on the selected scheme of calculation of the inner LV estimates. Figure 7: Comparison between the PLS and ULS-SEM scores. 29.

(30) Comparison between the PLS and ULS-SEM scores and the block principal components. The correlations between the PLS and USL-SEM scores with the block principal components are given in Table 5. Table 5: Correlation between the PLS and ULS-SEM scores and the block principal components. ULS-SEM scores .999 .998 .999. st. Physico-chemical 1 PC Sensorial 1st PC Hedonic 1st PC. PLS scores .997 .998 .997. We may conclude that ULS-SEM, PLS and principal component analysis are giving practically the same scores on this orange juice example. II.. Exploratory factor analysis, ULS-SEM and PCA. If the structural model is limited to one standardized latent variable (or common factor) ξ described by a vector x composed of p centred manifest variables, one gets the decomposition (20). x = λ xξ + δ. It is usual to add the following hypotheses:. (21). E (δ) = 0, Θδ = Cov(δ) = E ( δδ ') is diagonal Cov(ξ , δ) = 0. Under these hypotheses, the covariance matrix Σ of the random vector x is written as (22). Σ = E (xx ') = λ x λ 'x + Θδ. The parameters λx and Θδ in model (22) can now be estimated using the ULS method. This means ˆ , minimizing the criterion searching for the parameters λˆ x and Θ δ (23). ˆ ) S − (λˆ x λˆ 'x + Θ δ. 2. where S is the matrix of empirical covariances. To remove the indetermination on the global sign of the vector λˆ x (if λˆ x is a solution, then - λˆ x is also a solution), the solution can be chosen to make the sum of the coordinates positive. This is the option chosen in AMOS 6.0. The advantage of the ULS method over the other more frequently used GLS (Generalized Least Squares) or ML (Maximum Likelihood) methods lies in its ability to function with a singular covariance matrix S, particularly in situations where the number of observations is less than the number of variables.. 30.

(31) The quality of the fit is measured by the GFI written here as. GFI = 1 −. (24). 2. ˆ ) S − (λˆ x λˆ 'x + Θ δ. S. 2. Principal component analysis (PCA) is found again if one imposes the additional condition ˆ =0 Θ δ. (25). In this case, one seeks to minimise the criterion S − λˆ x λˆ 'x. (26). The vector λˆ x is now equal to. 2. λ1 u1 , where u1 is the normed eigenvector of the covariance matrix. S associated with the largest eigenvalue λ1. For each MV xj, the explained variance (or communality) is therefore σˆ jj = λ1u12j . The residual variance (or specificity) θj is then estimated by θˆ j = s jj − λ1u12j . The quality of the fit can still be measured by the GFI: 2 S - λˆ x λˆ 'x − ∑ ( s jj − σˆ jj ) 2. GFI = 1 −. (27). S. j 2. The square of the norm of S is equal to the sum of the squared eigenvalues λh of S. In PCA, 2 S - λˆ λˆ ' is equal to the sum of the squares of the p-1 last eigenvalues of S. Consequently, in PCA x. x. one obtains. (28). GFI =. λ12 + ∑ ( s jj − λ1u12j ). 2. j. p. ∑λ h =1. 2 h. Moreover, SEM softwares allow the computation of confidence intervals for parameters by bootstrapping. They also allow criterion (26) to be minimised by imposing value constraints or equality constraints on the coordinates of the vector λˆ x . We can continue to use criterion (27) to measure the quality of the model.. 31.

(32) Link between ULS-SEM, Factor Analysis, PLS and Principal Component Analysis. A central point in PLS path modelling concerns the relation between the MV’s related to one LV and this LV. Reflective mode The reflective mode is common to PLS and SEM. In this mode, each MV is related to its LV by a simple regression: (29). x j = λ jξ + δ j. This model corresponds to the usual one-dimension factor analysis (FA) model. Minimization of criterion (23) allows the estimation of the parameters of this model. As the diagonal terms of the ˆ ) are automatically null, the path coefficient λj are computed with the residual matrix S − (λˆ x λˆ 'x + Θ δ objective of reconstruction of the covariance matrix terms outside the diagonal. The average variance extracted (AVE), defined by ∑ σˆ jj / ∑ s jj , measures the summary power of the LV. It is j. j. not the first objective in this approach. It is an a posteriori value of the model. In a one block of variables situation, it is natural to estimate the LV ξ using the first principal component of the MV’s. The minimization of criterion (26) yields to this solution. Furthermore, the diagonal terms of the residual S − λˆ x λˆ 'x are now taken into account in the minimization. The path coefficients λj are now computed with the objective of reconstruction of the whole covariance matrix terms, diagonal included. The AVE still measures the summary power of the LV. But it is now a part of the objective in this approach. Consequently, in the ULS-SEM context, PCA can be obtained by considering the FA model (22) and then by cancelling in a first step the residual measurement variances. In PLS path modelling softwares, the one block situation has been implemented. In this situation, the outer estimate of the block LV is also taken as the inner estimate. Therefore, Mode A yields to the following equation: (30). ξˆ ∝ ∑ Cov( x j , ξˆ) x j j. The PLS algorithm will converge to the first principal component of the block of MV’s, solution of equation (30). Formative mode. The formative mode is easy to implement in PLS. In this mode, each LV is related to its MV’s by a multiple regression: (31). ξ = ∑ β jxj +δ j. But in a one block situation, it is an indeterminate problem.. 32.

(33) Conclusion. The residual sum of squares (RESS), defined by. ∑(s. ij. i< j. − σˆ ij ) , is smaller for FA than for PCA. On 2. the other hand the AVE is larger for PCA than for FA. Example 2. We use data on the cubic capacity, power output, speed, weight, width and length of 24 car models in production in 2004 given in Tenenhaus (2007). We compare on these data FA and PCA with respect to the RESS and AVE criterions. The analyses are carried out on standardized variables. The correlation matrix is given in Table 6. Table 6: Car example: Correlation matrix Capacity Power Speed Weight Width Length. Capacity Power 1 0.954 1. Speed Weight Width Length 0.885 0.692 0.706 0.664 0.934 0.529 0.730 0.527 1 0.466 0.619 0.578 1 0.477 0.795 1 0.591 1. The path models for one-dimension FA and PCA are given in Figure 8. The common factor is denominated as F1. The implied covariance matrices and the residual matrices produced by AMOS are given in Table 7. FA. PCA .02 1. Capacity. .00 1. Capacity. e1. e1 .00. .14 1. Power. .99. F1. 1. Speed. .87. e3 .53. Weight. 1. F1. e4. e2 .00. .92 1.00. .68. Power. .96. .25. .93 1.00. e2. 1. 1. Speed. .89. e3 .00. .76. Weight. 1. e4. .80. .74. .00. .45 .73. Width. 1. .80. e5. Width. Length. e5 .00. .47 1. 1. Length. e6. Figure 8 : Path model for FA and PCA. 33. 1. e6.

(34) Table 7: Car example: Implied covariance matrices and residuals produced by AMOS. FA. Capacity Power Implied Speed correlations Weight Width Length. Residuals. PCA. Capacity Power Speed Weight Width Length. Capacity Power Implied Speed correlations Weight Width Length. Residuals. Capacity Power Speed Weight Width Length. Capacity 1. Power .918 1. Speed .860 .804 1. Weight .678 .633 .593 1. Width .737 .689 .645 .508 1. Capacity 0. Power 0.036 0. Speed 0.025 0.130 0. Weight 0.014 -0.104 -0.127 0. Width -0.031 0.041 -0.026 -0.031 0. Capacity .926. Power .889 .853. Speed .853 .818 .785. Weight .738 .699 .671 .573. Width .771 .740 .710 .606 .642. Capacity 0.074 0 0 0 0 0. Power 0.065 0.147 0 0 0 0. Speed 0.032 0.116 0.215 0 0 0. Weight -0.046 -0.170 -0.205 0.427 0 0. Width -0.065 -0.010 -0.091 -0.129 0.358 0. Length .722 .674 .632 .498 .541 1 Length -0.058 -0.147 -0.054 0.297 0.050 0 Length .765 .734 .705 .602 .637 .632 Length -0.101 -0.207 -0.127 0.193 -0.046 0.368. The comparison between FA and PCA results is shown in Table 8. Table 8: Comparison between FA and PCA approaches. RESS .169 .230. FA PCA. AVE .690 .735. GFI .983 .978. For PCA, the GFI produced by AMOS has to be modified according to formula (27). The usual PCA of standardized data results in the following eigenvalues: 4.4113, .8534, .4357, .2359, .0514 ˆ is therefore measured by the and .0124. The quality of the approximation of S by λ1u1u1' + Θ δ following value of the GFI:. (32). GFI =. λ12 + ∑ ( s jj − λ1u12j ) j. q. ∑λ h =1. 2. =. 2 h. 34. 19.459 + .519 = .978 20.436.

(35) We then used AMOS 6.0 to carry out a first-order PCA of these standardized data under the hypothesis of equality of weights for the engine variables "cubic capacity, power, speed" and similarly equality of weight for the passenger compartment variables "weight, width, length". Figure 9 shows the results of this estimation and Table 9 the 90% bootstrap confidence intervals. The bootstrap intervals contain values greater than 1 because the bootstrap samples no longer consist of standardized variables. .00 1. Capacity. e1 .00. 1. Power. .92. e2. .92. .00 1. Speed. .92. e3. 1.00 .00. F1. .78. Weight. 1. e4. .78. .00. .78. Width. 1. e5 .00. Length. 1. e6. Figure 9 : PCA under constraints on the "Auto 2004" data (AMOS 6.0 output) Table 9: PCA under constraints for the "Auto 2004" data (AMOS 6.0 output) Estimation and bootstrap confidence interval for the coordinates of λx Parameter Capacity Å- F1 Power Å- F1 Speed Å- F1 Weight Å- F1 Width Å- F1 Length Å- F1. Estimate .924 .924 .924 .784 .784 .784. Inf (90%) .542 .542 .542 .555 .555 .555. Sup (90%) 1.195 1.195 1.195 1.003 1.003 1.003. The GFI for the model with constraints has the following value provided by AMOS: (33). GFI * = 1 −. S - λˆ x λˆ 'x S. 2. 2. = .9505. 35.

(36) Using the modified formula yields to: GFI = GFI * +. (34). ∑(s. jj. − λˆx2j. j. S. ). 2. 2. = .9505 +. .509 = .975 20.436. The very slight reduction of the GFI (.975 vs .978) means that one can accept the model with constraints. In this example, we obtain the component ξˆ as the "McDonald" estimation of the factor ξ, calculated as follows:. ξˆ ∝ .924(capacity* + power* + speed*) + .784(weight* + length* + width*) where the asterisk means that the variable is standardized.. III.. Confirmatory factor analysis, ULS-SEM and analysis of multi-block data. We assume now that the random column vector x breaks down into J blocks of random vectors x j = ( x j1 ,..., x jp j ) ' . A specific model with one standardized latent variable (and usual hypotheses) is constructed for each block xj: (35). x j = λ jξ j + δ j , j = 1,..., J J. This model is similar to model (4) with Λ x = ⊕ λ j . For each block j we have j=1. (36). Σ x j = λ j λ 'j + Θ δ j. and for two blocks j and k we get (37). Σ x j xk = ϕ jk λ j λ 'k. where ϕ jk = Cor (ξ j , ξ k ) . Decomposition (7) thus becomes (38). Σ = ⎡⎣⊕λ j ⎤⎦ Φ ⎡⎣⊕ λ j ⎤⎦ ' + Θδ. The parameters λ1,…, λJ, Φ and Θδ in model (38) can now be estimated by using the ULS method. ˆ minimizing the criterion ˆ and Θ This means seeking the parameters λˆ 1 ,..., λˆ J , Φ δ (39). ˆ ⎤ ˆ (⊕λˆ ) '+ Θ S − ⎡⎣(⊕λˆ j )Φ j δ⎦. 2. Adding constraint (25) gives a new criterion to be minimized: (40). ˆ (⊕λˆ ) ' S − (⊕ λˆ j )Φ j. 2. 36.

(37) This results in a new factorisation of the covariance matrix allowing an estimation to be made of both the loadings and also the correlations between the factors. The quality of the fit is still measured by the GFI criterion.. Example 3 In this example we are going to study data about wine tasting described in detail in Pagès, Asselin, Morlat & Robichet (1987). Description of the data. A collected of 21 red wines of Bourgueil, Chinon and Saumur appellations is described by a set of 27 taste variables divided into 4 blocks:. X1 = Smell at rest Rest1 = smell intensity at rest, Rest2 = aromatic quality at rest, Rest3 = fruity note at rest, Rest4 = floral note at rest, Rest5 = spicy note at rest. X2 = View View1 = visual intensity, View2 = shading (from orange to purple), View3 = surface impression. X3 = Smell after shaking Shaking1 = smell intensity, Shaking2 = smell quality, Shaking3 = fruity note, Shaking4 = floral note, Shaking5 = spicy note, Shaking6 = vegetable note, Shaking7 = phenolic note, Shaking8 = aromatic intensity in mouth, Shaking9 = aromatic persistence in mouth, Shaking10 = aromatic quality in mouth. X4 = Tasting Tasting1 = intensity of attack, Tasting2 = acidity, Tasting3 = astringency, Tasting4 = alcohol, Tasting5 = balance (acidity, astringency, alcohol), Tasting6 = mellowness, Tasting7 = bitterness, Tasting8 = ending intensity in mouth, Tasting9 = harmony These data have already been analysed using PLS in Tenenhaus & Esposito Vinzi (2005) and in Tenenhaus & Hanafi (2007). We present here the ULS-SEM solution on the standardized variables with cancellation of the residual measurement variances. First of all, we present the PCA for each separate block in Table 10.. 37.

(38) Table 10: Principal component analysis of each block for the "Wine" data Smell at rest. Smell intensity at rest Aromatic quality at rest Fruity note at rest Floral note at rest Spicy note at rest. Component 1 2 .741 .551 .915 -.144 .854 -.191 .345 -.537 .077 .933. View 1 Visual intensity Shading (from orange to purple) Surface impression. Component 2 .986 -.146 .983 -.163 .947 .320. Smell after shaking. Smell intensity Smell quality Fruity note Floral note Spicy note Vegetable note Phelonic note Aromatic intensity in mouth Aromatic persistence in mouth Aromatic quality in mouth. Component 1 2 .472 .743 .881 -.180 .819 -.176 .328 -.500 .089 .746 -.635 .593 .370 .633 .895 .277 .888 .307 .882 -.372. Tasting. Intensity of attack Acidity Astringency Alcohol Balance (acidity, astringency, alcohol) Mellowness Bitterness Ending intensiry in mouth Harmony. 38. Component 1 2 .937 .082 -.257 .691 .775 .427 .774 .378 .844 -.423 .901 -.380 .377 .760 .967 .117 .958 -.233.

(39) Use of ULS-SEM for the analysis of multi-block data. All the variables are standardized: S = R. The correlation matrix R is now approximated using criterion (40), with the aid of the following factorisation formula:. ⎡ λ1 ⎢0 R (λ1 ,..., λ 4 , Φ ) = ⎢ ⎢0 ⎢ ⎣⎢ 0. 0 λ2 0 0. 0 0 λ3 0. 0 ⎤ ⎡ 1 ϕ12 ϕ13 ϕ14 ⎤ ⎡ λ1' ⎢ 0 ⎥⎥ ⎢⎢ϕ 21 1 ϕ 23 ϕ 24 ⎥⎥ ⎢ 0 0 ⎥ ⎢ϕ31 ϕ 32 1 ϕ 34 ⎥ ⎢ 0 ⎥⎢ ⎥⎢ λ 4 ⎦⎥ ⎣⎢ϕ 41 ϕ 42 ϕ 43 1 ⎦⎥ ⎣⎢ 0. 0 λ '2 0 0. 0 0 λ 3' 0. 0⎤ ⎥ 0⎥ 0⎥ ⎥ λ '4 ⎦⎥. ⎡ λ1λ1' ϕ12 λ1λ '2 ϕ13 λ1λ 3' ϕ14 λ1λ 3' ⎤ ⎢ ⎥ ϕ 21λ 2 λ1' ϕ 23λ 2 λ 3' ϕ 24 λ 2 λ '4 ⎥ λ 2 λ '2 ⎢ = ⎢ϕ31λ 3 λ1' ϕ 32 λ 3λ '2 ϕ34 λ 3λ '4 ⎥ λ 3 λ 3' ⎢ ⎥ ' ' ' λ 4 λ '4 ⎥⎦ ⎢⎣ϕ 41λ 4 λ1 ϕ 42 λ 4 λ 2 ϕ 43λ 4 λ 3 In this way, confirmatory factor analysis (or perhaps rather confirmatory PCA), in the context described here, allows the best first order reconstruction of the intra- and inter-block correlations. First analysis. Using AMOS 6.0, we obtained Table 11 and the diagram shown in Figure 10. Confirmatory factor analysis of the four blocks echoes in essence the results of the first principal components of the separate PCA’s for each block. We have put the significant loadings in bold in Table 11. They well correspond to the strongest variable×PC1 correlations given in Table 10. There are two exceptions: Smell intensity at rest in block 1 and Astringency in block 4. It should be noted that these variables have fairly high correlations with the second principal components. The GFI is less than 0.9. This is due to the existence of the second dimensions for blocks 1, 3 and 4. Second analysis. In order to better identify the first dimension of the phenomenon under study, it is usual in confirmatory factor analysis to "purify" the scales: the analysis is repeated, omitting the nonsignificant variables. This is where Table 12 and Figure 11 come from. All the correlations between the manifest and latent variables of the corresponding block, and the correlations between the latent variables, are now strongly positive. All the correlations are significant. The first dimension of the phenomenon under study has therefore been perfectly identified. The GFI of 0.983 is excellent and well shows the unidimensionality of the selected variables.. 39.

(40) Table 11: Confirmatory factor analysis of the "Wine" data (AMOS 6.0 output) (Significant coefficient in bold, non-significant in italic). Parameter. Estimate. Inf (95%). Sup (95%). P. Smell intensity at rest. <---. Rest 1. 0.623. -0.768. 0.993. 0.170. Aromatic quality at rest. <---. Rest 1. 0.944. 0.176. 1.19. 0.018. Fruity note at rest. <---. Rest 1. 0.808. 0.162. 1.128. 0.010. Floral note at rest. <---. Rest 1. 0.529. -0.416. 0.935. 0.108. Spicy note at rest. <---. Rest 1. -0.003. -1.064. 0.716. .... Visual intensity. <---. View. 0.951. 0.424. 1.319. 0.046. Shading. <---. View. 0.931. 0.486. 1.256. 0.045. Surface impression. <---. View. 1.028. 0.266. 1.375. 0.036. Smell intensity. <---. Shaking 1. 0.614. -0.612. 1.083. 0.183. Smell quality. <---. Shaking 1. 0.828. 0.347. 1.062. 0.017. Fruity note. <---. Shaking 1. 0.752. 0.146. 1.042. 0.018. Floral note. <---. Shaking 1. 0.240. -0.658. 0.849. 0.353. Spicy note. <---. Shaking 1. 0.264. -0.655. 0.864. 0.253. Vegetable note. <---. Shaking 1. -0.570. -0.999. 0.114. 0.096. Phenolic note. <---. Shaking 1. 0.392. -0.219. 0.741. 0.200. Aromatic intensity in mouth. <---. Shaking 1. 0.928. 0.176. 1.237. 0.041. Aromatic persistence in mouth. <---. Shaking 1. 0.955. 0.105. 1.299. 0.024. Aromatic quality in mouth. <---. Shaking1. 0.801. 0.175. 1.040. 0.020. Intensity of attack. <---. Tasting 1. 0.897. 0.105. 1.334. 0.016. Acidity. <---. Tasting 1. -0.208. -1.012. 0.546. 0.595. Astringency. <---. Tasting 1. 0.808. -0.276. 1.155. 0.076. Alcohol. <---. Tasting 1. 0.809. 0.104. 1.251. 0.040. Balance. <---. Tasting 1. 0.841. 0.129. 1.168. 0.026. Mellowness. <---. Tasting 1. 0.893. 0.224. 1.207. 0.021. Bitterness. <---. Tasting 1. 0.373. -0.789. 0.851. 0.326. Ending intensity in mouth. <---. Tasting 1. 0.969. 0.195. 1.336. 0.018. Harmony. <---. Tasting 1. 0.958. 0.295. 1.312. 0.014. Estimate. Inf (95%). Sup (95%). P. Rest 1. Parameter <-->. View. 0.724. 0.069. 0.866. 0.027. Rest 1. <-->. Shaking 1. 0.866. 0.725. 0.960. 0.010. Rest 1. <-->. Tasting 1. 0.736. 0.578. 0.863. 0.010. View. <-->. Shaking 1. 0.827. 0.218. 0.950. 0.026. View. <-->. Tasting 1. 0.887. 0.201. 0.962. 0.020. Shaking 1. <-->. Tasting 1. 0.916. 0.764. 0.968. 0.010. GFI. .849. 40.

(41) .00. .00. e1 1. .00 e20 1. e3. e2 1. .00. .81.94. rest4. .53. .00. View .87 shaking1 .83. 1 shaking2. .61 .83. 1 shaking3. e9 .00 1 e8 .00 1 e21 .00. shaking4. shaking6. .89. .80. .96. e24 1. shaking9. e23. 1.00. tasting7. .84. .00 e25 .00. e14 .00. 1 e15. tasting4 1 .00. 1 e18. e26. 1. 1 tasting6. tasting5 tasting3. .00. .37. .81. .81. tasting2 1 .00. tasting8. .89 -.21. e19. 1. .97. Tasting 1. .90. tasting1 1 .00. e27. .96. shaking8 .00. 1 tasting9. 1 .00. 1. .00. .92. shaking10. shaking7. e22. 1.00 Shaking 1. .26 -.57 .39 .93. shaking5. .74. .75 .24. 1. e10 .00 1. .93 .95 1.00. 1.03. .72. 1. e12 .00 e11. view1. view2. Rest 1. e13 .00. .00. view3. 1.00. .00. rest5. 1. 1. rest1. .62. e7. e6 1. 1. rset2. .00. .00. e5. e4. 1. rest3. .00. .00. .00. e16. e17. Figure 10 : Confirmatory factor analysis of the "Wine" data (AMOS 6.0 output). 41.

(42) Table 12: Confirmatory factor analysis of the "Wine" data on the significant variables (p-value < .05) of Table 11 (AMOS 6.0 output). Parameter Aromatic quality at rest Fruity note at rest Visual intensity Shading Surface impression Smell quality Fruity note Aromatic intensity in mouth Aromatic persistence in mouth Aromatic quality in mouth Intensity of attack Alcohol Balance Mellowness Ending intensity in mouth Harmony Parameter Rest 1 <--> View Rest 1 <--> Shaking 1 Rest 1 <--> Tasting 1 View <--> Tasting 1 View <--> Shaking 1 Shaking 1 <--> Tasting 1. <--<--<--<--<--<--<--<--<--<--<--<--<--<--<--<---. Rest 1 Rest 1 View View View Shaking 1 Shaking 1 Shaking 1 Shaking 1 Shaking 1 Tasting 1 Tasting 1 Tasting 1 Tasting 1 Tasting 1 Tasting 1 Estimate 0.645 0.870 0.690 0.837 0.792 0.897. Estimate 0.992 0.885 0.942 0.924 1.039 0.885 0.822 0.917 0.923 0.854 0.891 0.795 0.885 0.930 0.960 0.975 Inf (95%) 0.303 0.694 0.302 0.411 0.329 0.752 .983. GFI. 42. Inf (95%) Sup (95%) 0.649 1.197 0.510 1.181 0.539 1.311 0.623 1.243 0.536 1.386 0.569 1.090 0.417 1.076 0.385 1.246 0.363 1.288 0.581 1.065 0.346 1.337 0.247 1.232 0.432 1.173 0.496 1.222 0.438 1.327 0.497 1.323 Sup (95%) 0.806 0.948 0.842 0.948 0.921 0.957. P 0.01 0.01 0.01 0.01 0.01 0.01. P 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.02 0.01 0.01 0.01 0.01.

(43) .00. .00. .00. e5. e3. e2. .89. .99. 1. 1. view3. rest2. rest3. e7. e6. 1. 1. 1. .00. .00. view1. view2 .94. 1.04 .92. .65. 1.00. 1.00. View. Rest 1 .87 .00. .79. 1. e12. shaking2 .88. .00. 1 shaking3. e11. .69 1.00. .82. Shaking 1 .00. .92 1 shaking8. e22. .92. .84 .85. .00 1. e23. .90 shaking9. .00 e24. 1. 1.00. shaking10. .97 Tasting 1. e27. .96. .89 .93. .00. 1. tasting8. e26. .89. .79 tasting1. tasting6. 1. 1. .00 e19. tasting5 1 .00. tasting4 1 .00. .00 e14. e15. e16. Figure 11: Confirmatory factor analysis of the "Wine" data on the significant variables of Table 9 (AMOS 6.0 output). 43. .00. 1. tasting9.

(44) Third analysis. As the four LV’s appearing in Figure 11 are highly correlated, it is natural to summarize these LV’s through a second order confirmatory factor analysis. This yields to Figure 12. The regression coefficient of one MV of each block has been put to 1. The second order LV “Score 1” is similar to the standardized first principal component of the first order LV’s as the error variances have been put to zero. The first order LV’s are evaluated by using the McDonald approach. For example, using the path coefficients shown in Figure 12, we get: Score(Rest 1) ∝ 1× rest2* + .88 × rest3*. view1. view2. view3. 1.00. 1.12 .98. 1.00. .88. 1. 1. 1. rest2. rest3. e7. e6. e5. 1. 1. .00 1. .00. d1. Rest 1. .00. .00. e3. e2. .00. .00. .00. .00. 1. View. .83. d2. .84. 1. e12. shaking2 1.00. 1.04 .00. 1. .96. shaking3. e11. Score 1. .82 Shaking 1. .00. 1.08 1 shaking8. e22. 1. 1.09. .00. 1.00. .00. d3. 1. e23. .94. shaking9 .00. e24. 1. shaking10 1. .00. 1.00. 1 Tasting 1. d4. tasting9 .99. .91 .95. 1. tasting8. .00 e26. .91. .82. .00 e27. tasting1 tasting6. 1. 1. .00 e19. tasting5 1 .00. tasting4 1 .00. .00 e14. e15. e16. Figure 12: Second order confirmatory factor analysis of the "Wine" data on the significant variables of Table 9 (AMOS 6.0 output). 44.

No results found