CHAPTER 3. METHODOLOGY
3.7 Syntactic Complexity Measures Investigated
The investigation of syntactic complexity has been a long-standing concern in first and second language research. However, because of the multitude of indices used to measure complexity (see Ortega, 2003; Wolfe-Quintero et al., 1998), there is no consensus on “what the construct is and what measures are appropriate” (Yang et al., 2015). Some studies have described complexity as a multi-dimensional construct and called for research that uses measures capturing this multi-dimensionality (e.g., Bulté & Housen, 2012; Norris & Ortega, 2009). Norris and Ortega (2009) recommend employing both global complexity measures and more local measures that tap subdomains of language such as coordination, subordination, and phrasal structures. The most recent approach that addresses these recommendations was developed by Lu (2010, 2011) and incorporates both global and local measures, capturing the multi-dimensional nature of syntactic complexity. A growing body of research has employed these syntactic complexity measures in recent years (e.g., Mancilla et al., 2017; Mazgutova & Kormos, 2015; Vyatkina, 2013; Yang et al., 2015). Based on the model proposed by Norris and Ortega (2009) and the multidimensional operationalization of complexity in recent years, a number of recommendations were taken into consideration to select the most appropriate measures in this study:
1. One measure was used to measure subordination, following Norris and Ortega (2009), who claim that subordination “metrics are all equivalent, regardless of the denominator of choice, in that they all tap complexification as a phenomenon of subordination exclusively” (p. 560). In this way, multicollinearity among the dependent variables was avoided, as suggested by Tabachnick and Fidell (2013). To check for multicollinearity among the subordination measures, a Pearson correlation was run; dependent clauses per clause and dependent clauses per T-unit were very highly correlated (r = .983). Therefore, to avoid redundancy and multicollinearity, dependent clauses per clause was used in this study.
2. The clause was considered as the unit of measurement where appropriate rather than the T-unit, as Lu (2010, 2011) found that the clause is a better discriminator than the T-unit across the levels.
3. As per Norris and Ortega’s (2009) call for employing measurement practices conforming to the “construct reality of multidimensionality,” the first research question was addressed by metrics tapping global or general complexity, complexity for clausal subordination, and complexity for phrasal elaboration. In a similar vein, the second research question was addressed by finer-grained individual linguistic features to capture both phrasal and clausal complexity.
4. In addition to more global measures, a more specific measure was used for phrasal complexity, which is not captured well by general measures, as suggested by previous studies (e.g., Byrnes, 2009; Lu, 2011; Vyatkina, 2013). Specifically, complex nominals per clause was used to capture nominal complexity, as academic writing is highly characterized by nominal discourse. Similarly, verb phrases per T-unit was used to capture nonfinite verb constructions, as these are not counted as clauses in regular T-unit calculations in mainstream SLA studies (see Lu, 2010, and Vyatkina, 2013, for details).
5. Length-based measures (e.g., mean length of T-unit, mean length of sentence, mean length of clause) mostly target more global complexity. However, mean length of clause differs from the other length-based measures in that it captures “a more narrowly defined subclausal complexity at the phrasal level” (Norris & Ortega, 2009, p. 561). For this reason, mean length of clause was employed to capture the complexification arising from phrasal elaboration at the clause level.
6. To capture an overall picture of complexity, a length-based measure listed in previous studies (e.g., Lu, 2010, 2011; Ortega, 2003) was targeted. There were three options: mean length of T-unit, mean length of sentence, and mean length of clause. As the last of these is more suitable for subclausal complexity, as indicated in #5 above, I examined the correlation between mean length of T-unit and mean length of sentence. The two measures were very highly correlated (r = .958). Therefore, I left one out to avoid redundancy and selected mean length of T-unit, considering that it is one of the most commonly used measures in the literature (Yang et al., 2015). Additionally, the findings of Yang et al. (2015) corroborate using mean length of T-unit as a general complexity measure.
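The redundancy checks described in points 1 and 6 can be illustrated with a minimal sketch. It computes Pearson's r for each pair of candidate measures and flags pairs whose absolute correlation reaches a cutoff. The function names, the 0.9 cutoff, and the per-text scores are all assumptions made for illustration; they are not the data or the exact procedure used in this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cross / math.sqrt(ss_x * ss_y)


def redundant_pairs(measures, threshold=0.9):
    """Return (name_a, name_b, r) for each pair with |r| >= threshold.

    The default threshold is an illustrative assumption, not a value
    taken from the study.
    """
    names = list(measures)
    flagged = []
    for i, name_a in enumerate(names):
        for name_b in names[i + 1:]:
            r = pearson_r(measures[name_a], measures[name_b])
            if abs(r) >= threshold:
                flagged.append((name_a, name_b, r))
    return flagged


# Invented per-text scores, for illustration only.
scores = {
    "DC/C": [0.30, 0.35, 0.40, 0.45],
    "DC/T": [0.50, 0.60, 0.70, 0.80],
    "CP/C": [0.20, 0.10, 0.25, 0.12],
}
flagged = redundant_pairs(scores)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b} are highly correlated (r = {r:.3f})")
```

When a pair is flagged, only one of its two measures would be retained, as was done here for the subordination and length-based measures.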
Based on the points elucidated above, the following figure describes the multi-dimensional representation of syntactic complexity operationalized in the present study.
Figure 3.4 Multi-dimensional representation of syntactic complexity (Adapted from Yang et al., 2015)
To sum up, in line with the recommendations from prior studies explained in detail above, the current study employed the complexity measures listed in Table 3.4.
Table 3.4 Syntactic Complexity Measures Investigated in the Study
Sub-construct                Measure                              Definition
Global complexity            Mean length of T-unit (MLT)          Number of words divided by number of T-units
Elaboration at clause level  Mean length of clause (MLC)          Number of words divided by number of clauses
Clausal subordination        Dependent clauses per clause (DCC)   Number of dependent clauses divided by number of clauses
Phrasal coordination         Coordinate phrases per clause (CPC)  Number of coordinate phrases divided by number of clauses
Noun phrase complexity       Complex nominals per clause (CNC)    Number of complex nominals divided by number of clauses
Nonfinite verb complexity    Verb phrases per T-unit (VPT)        Number of nonfinite verb phrases divided by number of T-units
Note: Adapted from Yang et al. (2015)
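Because each measure in Table 3.4 is a simple ratio of two counts, the calculations can be sketched in a few lines. The sketch below is illustrative only: the class and function names are hypothetical, the counts are invented, and in practice the counts would come from an automated analyzer or manual coding. DCC is computed here with clauses as the denominator, following the name of the measure (dependent clauses per clause).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextCounts:
    """Raw frequency counts for one text (hypothetical container)."""
    words: int
    clauses: int
    t_units: int
    dependent_clauses: int
    coordinate_phrases: int
    complex_nominals: int
    verb_phrases: int


def complexity_indices(c):
    """Compute the six ratio measures from raw counts."""
    if c.clauses <= 0 or c.t_units <= 0:
        raise ValueError("clause and T-unit counts must be positive")
    return {
        "MLT": c.words / c.t_units,               # mean length of T-unit
        "MLC": c.words / c.clauses,               # mean length of clause
        "DCC": c.dependent_clauses / c.clauses,   # dependent clauses per clause
        "CPC": c.coordinate_phrases / c.clauses,  # coordinate phrases per clause
        "CNC": c.complex_nominals / c.clauses,    # complex nominals per clause
        "VPT": c.verb_phrases / c.t_units,        # verb phrases per T-unit
    }


# Invented counts for a single text, for illustration only.
sample = TextCounts(
    words=300,
    clauses=30,
    t_units=20,
    dependent_clauses=12,
    coordinate_phrases=9,
    complex_nominals=36,
    verb_phrases=44,
)
indices = complexity_indices(sample)
for name, value in indices.items():
    print(f"{name}: {value:.2f}")
```

Keeping the clause-based and T-unit-based denominators explicit in one place makes it easy to see which measures share a unit of analysis.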
[Figure 3.4 labels: Global complexity: Mean length of T-unit (MLT); Elaboration at clause level: Mean length of clause (MLC); Clausal subordination: Dependent clauses per clause (DCC); Phrasal coordination: Coordinate phrases per clause (CPC); Noun phrase complexity: Complex nominals per clause (CNC); Nonfinite verb complexity: Verb phrases per T-unit (VPT)]