Technology Society
3.5 Confirmatory Research Phase
3.5.4 Explanatory Stage Data Preparation and Analysis
3.5.4.1 Structural Equation Modelling
Lomax and Schumacker (2004) describe structural equation modelling as a technique which uses various types of models to depict relationships among observed variables, with the same basic goal of providing a quantitative test of a theoretical model hypothesized by a researcher.
Hubona (2010) describes a two-level concept of empirical research comprising an observational plane (the measurement model, that is, the items which were measured) and a theoretical plane, which comprises the structural model, as depicted in Figure 3.4.
Figure 3.4 Two level concept of empirical research as it relates to a structural model
(Figure 3.4 labels: Item 1; Measurement model, observational plane; Structural model, theoretical plane. Simplified model for illustration purposes.)
The two-level concept of empirical research (Hubona, 2010) illustrates the form of a structural equation model well. The measurements of the latent variables lie within the observational plane, whereas the relationships within the model are on the theoretical plane. That is, the items measuring the latent variables (either exogenous or endogenous) are observed, or measured, whereas the relationships within the model are those proposed or theorized by the researcher, through mapping hypotheses to these relationships. While there may be relationships between the measured items, the relationship between the latent variables is what the researcher is striving to quantify. The researcher thus tries to draw conclusions about relationships within the plane of theory, based on observations which are made in the plane of observation. This relates to PLS path modelling in that the observed items and corresponding latent variables, for both endogenous and exogenous variables, make up the outer model, whereas the inner model comprises the relationships between the latent variables (Hubona, 2010). The outer model relates the latent variables to the measurement items (those on the observational plane), whereas the inner model represents the theoretical plane. The measurement items associated with a latent variable are termed a block of items.
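The distinction between outer and inner model can be made concrete with a small data structure. The following is a minimal sketch only: the class name PathModel, the construct names (Trust, Usefulness, Intention) and the item labels are hypothetical and are not taken from the research model of this study. Each latent variable is mapped to its block of items (outer model), and each hypothesized relationship is stored as a directed path between latent variables (inner model).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class PathModel:
    # Outer (measurement) model: latent variable -> its block of observed items.
    blocks: Dict[str, List[str]]
    # Inner (structural) model: hypothesized paths as (predictor, outcome) pairs.
    paths: List[Tuple[str, str]]

    def endogenous(self) -> List[str]:
        """Latent variables that are the target of at least one path."""
        targets = {dst for _, dst in self.paths}
        return [lv for lv in self.blocks if lv in targets]

    def exogenous(self) -> List[str]:
        """Latent variables that no path points to."""
        targets = {dst for _, dst in self.paths}
        return [lv for lv in self.blocks if lv not in targets]

    def predictors_of(self, latent: str) -> List[str]:
        """Latent variables with a hypothesized path into `latent`."""
        return [src for src, dst in self.paths if dst == latent]


# Hypothetical example: three latent variables, each with a block of items.
model = PathModel(
    blocks={
        "Trust": ["tr1", "tr2"],
        "Usefulness": ["use1", "use2", "use3"],
        "Intention": ["int1", "int2", "int3"],
    },
    paths=[
        ("Trust", "Usefulness"),
        ("Trust", "Intention"),
        ("Usefulness", "Intention"),
    ],
)

print("Exogenous:", model.exogenous())
print("Endogenous:", model.endogenous())
print("Predictors of Intention:", model.predictors_of("Intention"))
```

In this representation the hypotheses of a study correspond to entries in paths, whereas the questionnaire items appear only in blocks.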
A distinction is drawn between first and second generation analysis techniques. Bagozzi and Fornell (1982) describe LISREL and Partial Least Squares (PLS) as second generation analysis techniques which can be used to test the extent to which research meets standards for high quality statistical analysis, or statistical conclusion validity (Cook and Campbell, 1979). In contrast to first generation statistical analysis tools such as regression, Gefen et al. (2000) and Gerbing and Anderson (1988) write that SEM enables researchers to answer a set of interrelated research questions in a single, systematic and comprehensive analysis by modelling relationships among multiple independent and dependent constructs simultaneously. This differentiates SEM from first generation techniques such as linear regression, ANOVA and MANOVA, which enable the analysis of only a single level of linkages between independent and dependent variables at one time. Gefen et al. (2000) continue that SEM, in contrast to first generation regression analysis methods, assesses both the structural model and the measurement model, and that this combined analysis permits measurement errors of the observed variables to be analysed as an integral part of the model, and further enables factor analysis to be combined with hypothesis testing in the same operation. Gefen et al. (2000) thus hold that SEM techniques are better able than regression techniques to provide information about how well the research model is supported by the data. As such, the use of SmartPLS as a second generation SEM analysis tool seemed to offer a better analysis method and was chosen by the researcher.
As early as the 1980s, Kenny and Judd (1984) wrote that the use of structural models with latent or unmeasured variables was increasing in the social sciences, as such modelling allows researchers to estimate coefficients of linear models whilst controlling for measurement error. More recently, Lomax and Schumacker (2004) indicate that there are four major reasons for the popularity of SEM. First, the complexity of models is increasing, as modelling phenomena with multiple observed variables enables researchers to better understand their area of inquiry. Secondly, structural equation modelling takes into account measurement error, which had previously been treated separately from the statistical analysis of data. Additionally, using advanced SEM techniques, interaction terms can be included, which enables moderating effects to be measured more accurately. Lomax and Schumacker (2004) also note that SEM software programs have become more user friendly in recent years, eliminating the need for researchers to understand complex programming techniques. The researcher took these points into consideration in the choice of SmartPLS as an analysis tool.
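Why measurement error matters can be illustrated with a small simulation. The sketch below is an illustration under stated assumptions and is not part of the analysis performed in this study: the sample size, the true coefficient of 0.5 and the error variances are chosen arbitrarily. It regresses an outcome first on an error-free latent score and then on a single noisy indicator of that score. With an indicator reliability of 0.5, the classical attenuation result suggests that the second slope should be roughly half of the first.

```python
import random


def ols_slope(x, y):
    """Slope of a simple ordinary least squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


random.seed(42)  # fixed seed so the illustration is repeatable
n = 20000
true_beta = 0.5  # assumed structural coefficient (illustrative value)

# Error-free latent scores (variance 1) and an outcome that depends on them.
latent = [random.gauss(0, 1) for _ in range(n)]
outcome = [true_beta * t + random.gauss(0, 0.5) for t in latent]

# A single observed indicator: latent score plus measurement error (variance 1),
# giving a reliability of 1 / (1 + 1) = 0.5.
observed = [t + random.gauss(0, 1) for t in latent]

slope_latent = ols_slope(latent, outcome)
slope_observed = ols_slope(observed, outcome)

print(f"Slope using the latent score:     {slope_latent:.3f}")
print(f"Slope using the noisy indicator:  {slope_observed:.3f}")
```

A first generation regression on the noisy indicator therefore understates the relationship, which is the problem that modelling measurement error within SEM is intended to address.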
A comparison of structural equation modelling techniques as discussed by Gefen et al. (2000) is shown in Table 3.2:
Issue LISREL PLS Linear Regression
Objective of overall number of items in the most complex
Table 3.2 Comparison of Structural Equation Modelling Techniques
Hubona (2010) summarizes the comparison of approaches between PLS and covariance based structural equation modelling in Table 3.3:
Basis of Comparison | PLS Based SEM | Covariance Based SEM
Objective | Prediction oriented | Theory oriented: parameter oriented
Approach | Variance based | Covariance based
Assumptions | Predictor specific (non-parametric) |
Implications | Optimal for prediction accuracy | Optimal for parameter accuracy
Model Complexity | Large complexity (e.g. 100 constructs, 1000 indicators) | Small to moderate complexity (e.g. fewer than 100 indicators)
Sample Size | Power analysis based on the portion of the model with the highest number of predictors | Minimum number of observations from 200 to 800
Table 3.3 Summary of differences between PLS Modelling and Covariance Based approaches
Henseler et al. (2009) state that PLS modelling, specifically, provides four genuine advantages. First, it can be applied when distributions are highly skewed (Bagozzi and Yi, 1994), which differentiates it from covariance based techniques, which carry more stringent distribution requirements. Secondly, PLS can be used to estimate relationships between latent variables when the number of observations is small, as PLS uses separate ordinary least squares (OLS) regressions for each subpart of the research model, and as a result the complexity of the model hardly influences sample size requirements. In addition to the ease of use of modern PLS software, Henseler et al. (2009) finally note that PLS holds advantages in model analysis when improper or non-convergent results may occur, such as in more complex models, or where the number of latent variables is high in relation to the number of observations and the number of indicators per latent variable is low. Tenenhaus and Hanafi (2005) reiterate these advantages, stating that PLS also has some advantages over covariance based SEM, namely systematic convergence of the algorithm due to its simplicity, the possibility of managing data with a small number of individuals and a large number of variables, and a general framework for multi-block analysis. These aspects, especially the ability to analyse more complex models, or models where the number of latent variables is high in relation to the number of observations, were important in the researcher’s choice of SmartPLS as an analysis tool. As the researcher had been forewarned that it would be difficult to gain access to a large number of potential data sources, a tool with lower sample size requirements was deemed important. Additionally, the lack of parametric requirements was important.
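The statement that PLS uses separate OLS regressions for each subpart of the model can be sketched as follows. This is a simplified illustration and not the SmartPLS implementation: it assumes that latent variable scores are already available (in PLS path modelling they result from an iterative estimation of the outer weights, which is omitted here), and the construct names, simulated data and function names are hypothetical. Each endogenous latent variable is regressed only on its own predictors, so every regression involves only a small part of the whole model.

```python
import random


def standardize(values):
    """Centre to mean 0 and scale to standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(predictors, outcome):
    """OLS coefficients for standardized variables (no intercept required)."""
    k = len(predictors)
    xtx = [[sum(p * q for p, q in zip(predictors[i], predictors[j]))
            for j in range(k)] for i in range(k)]
    xty = [sum(p * y for p, y in zip(predictors[i], outcome)) for i in range(k)]
    return solve(xtx, xty)


def estimate_paths(scores, paths):
    """Run one separate OLS regression per endogenous latent variable."""
    z = {name: standardize(values) for name, values in scores.items()}
    estimates = {}
    for target in sorted({dst for _, dst in paths}):
        preds = [src for src, dst in paths if dst == target]
        coefs = ols([z[p] for p in preds], z[target])
        estimates[target] = dict(zip(preds, coefs))
    return estimates


# Simulated latent variable scores (assumed given; hypothetical constructs).
random.seed(7)
n = 2000
trust = [random.gauss(0, 1) for _ in range(n)]
usefulness = [0.6 * t + random.gauss(0, 0.8) for t in trust]
intention = [0.3 * t + 0.5 * u + random.gauss(0, 0.7)
             for t, u in zip(trust, usefulness)]

scores = {"Trust": trust, "Usefulness": usefulness, "Intention": intention}
paths = [("Trust", "Usefulness"), ("Trust", "Intention"),
         ("Usefulness", "Intention")]

path_estimates = estimate_paths(scores, paths)
for target, coefs in path_estimates.items():
    print(target, {k: round(v, 3) for k, v in coefs.items()})
```

Because the largest regression here has only two predictors, adding further constructs elsewhere in the model would not change the number of parameters estimated in any single regression, which is the intuition behind the claim that model complexity hardly influences sample size requirements.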
In summary, and supported by other published applications of PLS as summarized by Ringle et al. (2012), the researcher primarily chose PLS as the preferred analysis method because:
a tool with lower sample size requirements was important;
the lack of parametric requirements was important, given the potentially small sample size;
PLS can be applied when distributions are highly skewed (Bagozzi and Yi, 1994), which differentiates it from covariance based techniques, which carry more stringent distribution requirements;
the complexity of the model hardly influences sample size requirements;
the software implementation (the SmartPLS program) is simple to use.