

2. THEORETICAL BACKGROUND AND REVIEW OF THE LITERATURE

2.6 DIFFERENT METHODS OF MEASURING COGNITIVE LOAD

2.6.1 Frequently Used Methods for Measuring Cognitive Load

The most frequently used methods for measuring cognitive load are based on the analysis of learning outcome measures or on self-reported invested mental effort, both classified by Brünken et al. (2003) as indirect indicators of cognitive load. One example of analyzing learning outcome measures is the use of transfer performance as an indicator of invested germane load. In a recently published study, DeLeeuw et al. (2008) compared different measures of cognitive load and reported evidence for separable measures of intrinsic, extraneous and germane load. As germane load was not induced by varying the instructional design in this study, but only inferred post hoc from transfer performance, this experimental design is not robust against confounds of cognitive load with other factors. Powerful factors that can moderate performance in multimedia learning include prior knowledge, memory skills and spatial ability. In such a post hoc design no manipulation check is possible, and it is questionable whether this measure can really indicate the level of germane load invested by the learner.

An example of the use of self-reported invested mental effort is a study by van Gog, Paas and van Merriënboer (2006), which examined the effects of process-oriented worked examples on troubleshooting transfer performance. The authors expected to confirm that learning with worked examples is more effective, as indicated by higher transfer performance, less self-reported invested mental effort and less time-on-task. Moreover, an interaction effect was hypothesized: presenting process information would increase the investment of mental effort in both conditions, the worked example and the problem-solving condition, but would result in different levels of transfer performance. According to this hypothesis, the higher mental effort in the worked example condition results in higher transfer performance, indicating germane cognitive load, an interpretation that is only possible because extraneous load is low in this condition. In contrast, in the conventional problem-solving condition with its high extraneous load, the increase in mental effort results in reduced transfer performance, indicating in this case high extraneous load caused by the problem-solving activity. With the indirect measure of overall mental effort, a frequently used method in cognitive load research that has been shown to be highly reliable (Paas et al., 2003), the hypotheses of this study could not be adequately tested, because this type of cognitive load measure does not differentiate between load sources. The three sources of cognitive load in Cognitive Load Theory, intrinsic, extraneous and germane load, form a dynamic framework, and therefore the combination of achieved transfer performance and self-reported overall mental effort does not provide enough information for an interpretation of the dynamically interacting load types. A lack of learning benefits, for instance, could stem from compensatory effects between germane and extraneous cognitive load, or between germane and intrinsic cognitive load, or even from an overall cognitive overload, in which the sum of intrinsic, extraneous and germane cognitive load exceeds the overall working memory capacity. This is only one example of the limitations of cognitive load measures that Moreno (2006) analyzed and discussed in more detail; it shows that this type of cognitive load measure is not adequate for a discriminating analysis of cognitive load effects. It is nevertheless a useful overall indicator of total cognitive load.
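The overload case described above can be written compactly. The following is only a sketch under the additive reading of the three load types used in this section; the symbols are assumed here for illustration and are not taken from the cited studies.

```latex
% Assumed notation (illustrative only):
%   L_I, L_E, L_G : intrinsic, extraneous and germane cognitive load
%   C_WM          : overall working memory capacity
\[
  L_{\mathrm{total}} = L_I + L_E + L_G ,
  \qquad
  \mathrm{overload} \iff L_{\mathrm{total}} > C_{\mathrm{WM}}
\]
```

Under this reading, a single overall effort rating reflects at most the total load, so different combinations of the three load types that add up to the same sum cannot be told apart by the rating alone.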

When introducing this mental effort scale, one should be aware of the subjectivity of the measured cognitive load. Besides the well-known methodological problems of subjective ratings, such as the central tendency error, social desirability, or response behavior oriented toward the expectations of the research team (Sassenberg & Kreutz, 1999), it is in general questionable whether learners are able to estimate their invested mental effort, as they do not feel when they are cognitively overloaded (Gimino, 2000). However, this ability is assumed by Paas and van Merriënboer, who refer to findings of Gopher and Braune (1984) that subjects are able to introspect on their cognitive processes and have “no difficulty in assigning numerical values to the imposed mental load” (Paas & van Merriënboer, 1994, p. 126). This is a main assumption when using the subjective mental load scale developed by Paas (1992). In addition, even if learners were able to estimate their invested mental effort in a post-treatment questionnaire, it would still remain unclear how this rated effort relates to the actual cognitive load during the treatment. A low rating of invested mental effort could result, on the one hand, from low cognitive load or, on the other hand, from a load so high that the learner reduced the mental effort invested in comprehending the materials (Reed, Burton, & Kelly, 1985). A further aspect to consider is that a within-subject design is generally to be recommended, especially for indirect and subjectively rated cognitive load, as it is questionable whether between-subjects differences can be interpreted on the same dimension as within-subject differences.

These two frequently used methods of measuring cognitive load, the analysis of learning outcome measures and self-reported invested mental effort, differ in that performance measures are classified as an objective method, whereas self-reported invested mental effort is classified as a subjective measure. Another comparable subjective measure is the self-reported difficulty of materials, which is associated with a direct measure of cognitive load. This could overcome the lack of information of other subjective measures mentioned by Brünken et al. (2003). However, all subjective measures mentioned here may be confounded with each other, as a rating of mental effort could be influenced by the participants’ perceived level of stress or arousal. We do not really know what participants estimate when they answer a single item asking for the level of invested mental effort, stress or difficulty of material. This is the main argument against using any of the subjective methods of measuring cognitive load. The introduction of these methods should be considered very carefully, and conclusions drawn from the overall estimation of cognitive load during learning should concern only this overall measured cognitive load. Specific hypotheses about different load types cannot be tested by such operationalizations, which include only a general self-reported cognitive load measure. In sum, the internal validity of the subjective methods summarized here is questionable, and post hoc interpretation of an objective learning outcome measure is no solution, since powerful moderating variables are hardly controllable in the respective experimental design.

More recently, possibilities for measuring cognitive load directly have come into focus, and methods have been developed that have empirical support. One example is the dual-task paradigm, which is normally used in research on executive functions to measure the capacity to switch between two comparably demanding tasks. This dual-task method is appropriate for examining general human information processing, on the assumption that the management of two comparable tasks depends on cognitive organization, especially in working memory. In this part of human cognitive architecture, the rules of the two given tasks have to be kept activated, and the activation and inhibition processes relevant for successful switching between the two tasks are assumed to be located. In such a dual-task paradigm, the above-mentioned function of long-term working memory is also assumed to play an important role. The complete rule of one task has to be retrieved from long-term memory at the right time, which could be organized more efficiently by a retrieval cue in long-term working memory. This dual-task paradigm has been adapted for complex learning situations in which a primary task, the learning process, is accompanied by a less demanding secondary task that is sensitive to the level of cognitive load required by the primary task. Numerous operationalizations have been devised for a secondary task that is slightly demanding but does not fully disrupt the execution of learning processes. For instance, the task of responding as quickly as possible to a cue presented in the learning material, such as the color change of a letter presented at the margin of the screen or a simple auditory stimulus, was introduced in studies by Brünken and colleagues (Brünken et al., 2002; Brünken et al., 2004). This type of method has also been used by other research groups (Chandler et al., 1996; Marcus et al., 1996). Yet it is often criticized for its attention-attracting function: it splits the attention of the learner, who at the time of the cue presentation is reading a text or inspecting a picture in a multimedia presentation and is thus actually disrupted in the primary task.

This clear disruption could be reduced by using changes in the background color of the learning material (DeLeeuw et al., 2008), where no real disruption of the primary task is needed because the color changes slightly over the complete viewing area. However, in this operationalization the learner is also visually distracted at the moment of the color change. These methods cannot really be classified as continuous, as described by Brünken et al. (2002), because the learners’ attention is split: the visual secondary task demands resources from the visual primary task when learning from a multimedia instruction that includes text and pictures, or only pictures if the audio channel is used for text presentation. Thus, some switching between the two tasks is necessary to cope with the demands of executing both. Nevertheless, the registered reaction times in the secondary task correlated highly with indicators of knowledge acquisition. Learners who processed information from an instructional design assumed to be superior for learning produced shorter reaction times in the secondary task, indicating a lower level of cognitive load. Even so, these objective methods are problematic in that only specific aspects of cognitive load are measured.
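How secondary-task reaction times might be analyzed can be illustrated with a short sketch. It is not the analysis used in the cited studies: the condition labels, the data values, and the helper names `pearson_r` and `mean_rt` are hypothetical and were invented purely for illustration. The sketch compares mean reaction times between two instructional conditions and correlates reaction time with a knowledge test score.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data, invented for illustration only:
# (condition, mean secondary-task reaction time in ms, knowledge test score)
learners = [
    ("audiovisual", 410, 17),
    ("audiovisual", 450, 15),
    ("audiovisual", 430, 16),
    ("visual_only", 560, 11),
    ("visual_only", 600, 9),
    ("visual_only", 580, 12),
]


def mean_rt(condition):
    """Mean secondary-task reaction time (ms) of all learners in one condition."""
    return mean(rt for cond, rt, _ in learners if cond == condition)


# Shorter reaction times are read as more free capacity, i.e. lower load.
rts = [rt for _, rt, _ in learners]
scores = [score for _, _, score in learners]
r = pearson_r(rts, scores)

print("mean RT audiovisual:", mean_rt("audiovisual"))
print("mean RT visual only:", mean_rt("visual_only"))
print("correlation RT vs. score:", round(r, 2))
```

A negative correlation in such data would match the pattern described above, in which shorter reaction times go along with better knowledge acquisition. It would not show which load type is responsible, since the reaction time reflects only the modality-specific share of the load.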

This disadvantage is at the same time fruitful, as the modality-specific construction of the secondary task makes it possible to determine which aspect it measures. These tasks therefore register neither the overall invested mental effort nor the overall extraneous load, but a part of cognitive load that can be traced directly back to modality, that is, the cognitive load due to visual or acoustic processing in working memory (Brünken et al., 2004). These modality-specific cognitive load methods were introduced and tested in the above-mentioned studies by Brünken et al. (2002), using visual cues (color change of a letter), and Brünken et al. (2004), using acoustic cues (tone), in a secondary task of detecting the color change or the presentation of a tone. In both studies the modality effect could be confirmed in the patterns of the modality-specific secondary task performance and in the primary learning task. Participants who learned from audiovisual material had more free capacity for processing the visual secondary task than those working with the visual-only material (Brünken et al., 2002). In contrast, participants who worked with audiovisual material had less free capacity for processing an acoustic secondary task than those working with visual-only material (Brünken et al., 2004).

In summary, the frequently used methods for measuring cognitive load described above are useful instruments, but they do not yet validly differentiate between the three components of cognitive load, namely intrinsic, extraneous and germane load. Thus, the theoretical construct that is to be measured in cognitive load research should be carefully described. In addition, operationalizations should be clear and coherent with the introduced theoretical construct. One attempt to differentiate between the three cognitive load types was recently made by DeLeeuw et al. (2008), who showed that some methods for measuring cognitive load are more adequate for one of the three load types than for the others. The results of a combination of three different methods indicated that the response time measure is most sensitive to manipulations of extraneous processing, ratings of invested mental effort are most sensitive to manipulations of intrinsic processing, and material difficulty ratings are most sensitive to indications of germane processing. However, this study has some methodological limitations. DeLeeuw et al. (2008) estimated the invested germane cognitive load from the observable transfer performance, which is not the best indicator of cognitive load, as discussed above. Moreover, it cannot clearly be concluded from a low correlation between the methods used, namely dual-task, self-reported invested mental effort and performance measures, that overall cognitive load therefore consists of different components. Thus, the resulting conclusion that different methods are needed for these different components is very plausible, but it still has to be confirmed in empirical studies that actually induce the different load types. The only conclusion that can already be clearly drawn from this study is that some methods are more informative for certain components of cognitive load. This is due to the specific construction of the respective instrument, as has been concluded by Brünken et al. (2004) and Ayres (2006).