3. Methodological foundation

3.2 Methods for studying dream affect

3.2.2 Typical methods for measuring dream affect

3.2.2.1 External ratings of dream affect

In dream research, external ratings (ER)—or third-person ratings—have been the traditional method for measuring dream affect. With this method, narrative (written or oral) dream reports are collected and content analysed by ‘blind’ external judges, using a particular scale.

Typically, ER are conducted using a detailed content analytic scale: a nominal scale with which the judges identify and classify every occurrence (or presence) of an affective state (emotions and moods are not distinguished) explicitly mentioned in the dream report into one of the discrete affect categories. Among the existing scales, one of the most widely used is the Hall and Van de Castle (1966) Emotions scale, which consists of five categories of discrete affect: anger, apprehension (e.g., fear, anxiety), sadness, happiness, and confusion. Using this scale, Hall and Van de Castle collected 1,000 dream reports from students (five each from 100 women and 100 men) in the US between 1947 and 1950. Results based on these reports are often considered the ‘norms’ with which to compare findings from other studies. However, this scale is problematic because it has more negative affect categories (anger, apprehension, sadness) than positive affect categories (happiness). Also, confusion has been classified as a negative affect category, which is questionable, since it contains affective states of both positive (e.g., amazed, awestruck) and undetermined (e.g., surprised) valence. As a result, ratings of dream affect made with this scale may be biased towards a more negative valence.
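
The effect of the category structure on the resulting valence balance can be illustrated with a minimal sketch. The five category names come from the scale as described above; the function names, the grouping variable, and the example codes are hypothetical and serve only as illustration, not as the scoring procedure of any particular study.

```python
from collections import Counter

# The five Emotions categories named in the text (Hall & Van de Castle, 1966).
CATEGORIES = ("anger", "apprehension", "sadness", "happiness", "confusion")

# Valence grouping as described in the text: confusion counted as negative.
NEGATIVE = frozenset({"anger", "apprehension", "sadness", "confusion"})


def tally(codes):
    """Count how often each category was coded in a set of dream reports."""
    unknown = set(codes) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"codes outside the scale: {sorted(unknown)}")
    counts = Counter(codes)
    return {category: counts.get(category, 0) for category in CATEGORIES}


def negative_share(codes, negative=NEGATIVE):
    """Proportion of coded affective states that fall in the negative group."""
    if not codes:
        raise ValueError("no coded affective states")
    return sum(code in negative for code in codes) / len(codes)


# Hypothetical codes assigned by a judge; not data from any study.
codes = ["apprehension", "confusion", "happiness", "apprehension"]

share_with_confusion = negative_share(codes)
share_without_confusion = negative_share(codes, NEGATIVE - {"confusion"})
print(tally(codes))
print(share_with_confusion, share_without_confusion)
```

In this invented example, the share of negative affect depends on whether confusion is grouped with the negative categories, which is the classification decision questioned above.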

In other studies, ER are conducted using global rating scales: ordinal (Likert) scales with which judges rate the whole dream report on some particular affective dimensions, such as the intensity of positive affect, the intensity of negative affect (Schredl & Doll, 1998), or overall affective intensity (e.g., De Gennaro et al., 2011). Whereas with detailed content analysis only the presence of explicitly expressed affective states is coded, with global dimensional rating scales the implicitly expressed affective states and their intensity are also rated. However, some researchers question the reliability and validity of intensity ratings performed by judges (Domhoff, 1996) and urge caution in inferring affective states from the reports (Windt, 2015).

ER are often considered ‘objective’ because the use of narrative reports makes it possible for other researchers to reproduce the results (Domhoff, 1996; Schredl, 2018). The use of at least two different judges enables the reduction of possible experimenter effects and the calculation of inter-rater reliability. However, even if high inter-rater reliability ratings are achieved, this does not necessarily mean that the ratings are also valid. For example, if no affective states are explicitly expressed in the dream report (e.g., “we played with the child”; “the beast was following me”), it is not possible to know whether this is because the dreamer did not experience any affective states in the dream or because the person simply failed to report these states.
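
As an illustration of how inter-rater reliability might be computed for nominal codes, the following minimal sketch implements Cohen's kappa for two judges. The choice of kappa is an assumption made for illustration, since the studies cited here may use other agreement statistics; the function name and the example codes are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two judges on nominal codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both judges must code the same non-empty set of reports")
    n = len(rater_a)
    # Observed agreement: share of reports given the same code by both judges.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each judge's marginal code frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a.keys() | freq_b.keys()) / n**2
    if expected == 1.0:
        # Both judges used a single identical category: kappa is undefined.
        return float("nan")
    return (observed - expected) / (1 - expected)


# Hypothetical codes for four reports; "none" means no affect was explicitly mentioned.
judge_a = ["none", "none", "apprehension", "happiness"]
judge_b = ["none", "none", "apprehension", "apprehension"]

kappa = cohens_kappa(judge_a, judge_b)
print(round(kappa, 2))
```

In this invented example, much of the agreement comes from both judges coding "none". The statistic shows that the judges read the reports alike, but it cannot show whether the dreamer actually experienced no affect, which is the validity problem described above.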

In narrative dream reports people describe their dream experiences as they remember them happening, using freely chosen words that they naturally use, and they are not confined to the items preselected by the researcher. However, such reports depend on the reporting and language skills of the participants (Kahan, 2012). It can be challenging to express complex experiences in words. Moreover, narrative reports follow a story-schema: participants are more likely to report some features of the content (e.g., what happened, where, who was present) than others (e.g., how they felt) (Kahan, 1994; Kahan & Horton, 2012; Kahan & LaBerge, 1996, 2011; Merritt, Stickgold, Pace-Schott, Williams, & Hobson, 1994). This means that narrative reports may specifically underrepresent affective dream experiences (Kahan & Claudatos, 2016; Kahn & Hobson, 2002; Strauch & Meier, 1996). Even if participants are explicitly instructed to describe their affective experiences, they may have difficulty labelling them.

3.2.2.2 Self-ratings of dream affect

Self-ratings (SR), or first-person ratings, involve the collection of targeted (or affirmative) probes (Hobson & Stickgold, 1994; Nielsen, 2010). Targeted probes can take the form of affect items, affect rating scales, or specific questions about dream affect, which the participants (i.e., the dreamers) are asked to rate or answer. Typically, participants are asked to provide global ratings of the preceding dream experience, that is, the extent to which they experienced certain affective states in the dream as a whole (e.g., Blagrove, Farmer, & Williams, 2004; Schredl & Reinhard, 2009-2010). Less frequently, they are asked to rate each line (e.g., Fosse et al., 2001; Merritt et al., 1994) or scene (Nielsen, Deslauriers, & Baylor, 1991) in the corresponding narrative dream report.

A wide variety of rating scales are used for global SR. Discrete affect scales include lists of different affect items, and the participants are asked to rate the presence and intensity of each item in the preceding dream (e.g., Fosse et al., 2001; Kahn & Hobson, 2002; Kahn, Pace-Schott, & Hobson, 2002; Merritt et al., 1994; Nielsen et al., 1991). Dimensional scales range from those encompassing a single bipolar scale, with which participants must rate the overall affective tone or valence of the dream (e.g., Blagrove et al., 2004), to those including several unipolar scales that separately assess the intensity of positive affect and negative affect (e.g., Schredl, 2018; Schredl & Reinhard, 2009-2010).

The wide variety of scales used for SR makes it difficult to compare results from different studies. For example, some discrete affective states (e.g., awe) may appear relatively infrequently across different studies not because they are seldom experienced, but because they are seldom measured. Also, the choice of a particular scale is rarely justified. Surprisingly, empirically validated affect rating scales, such as those used in emotion research, have seldom been used (e.g., Stairs & Blick, 1979). Instead, studies use different lists of ad hoc affect items, which poses a threat to validity. As with ER, the scales often consist of an unequal number of positive and negative affect items—with the number of negative affect items typically exceeding the number of positive affect items—which may inadvertently bias the results towards increased negativity by way of participants rating the items simply because they are presented.

On the one hand, the use of targeted probes ensures that participants report experiences that may otherwise be left out of the narrative report, particularly affect experienced in the dream. On the other hand, the particular questions or affect items used may bias the results. For example, participants may choose an item they might not originally have chosen, they might want to choose an item that cannot be found on the list/scale, or the label used by the researcher and how the participant understands it may differ (Scherer, 2005). Furthermore, the ratings may be influenced by waking cognition at the time of rating: participants may incorrectly assume that they felt a certain way based on the dream content (Domhoff, 2005; Foulkes, Sullivan, Kerr, & Brown, 1988; Kahn & Hobson, 2002; Zadra & Domhoff, 2017). When Likert-type scales are used, systematic response biases, such as the tendency to only select extreme endpoints of the scale or to agree with all items, can influence the results (Baumgartner & Steenkamp, 2001; Paulhus, 1991; Paulhus & Vazire, 2007).
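
The two response biases mentioned above can be made concrete with simple descriptive indices. The sketch below is an assumption for illustration only and is not the procedure used in the cited work: it assumes a 1 to 5 Likert-type scale, and the function names, scale bounds, and example ratings are hypothetical.

```python
def extreme_response_share(ratings, low=1, high=5):
    """Share of ratings placed on either endpoint of the scale."""
    if not ratings:
        raise ValueError("no ratings given")
    return sum(r in (low, high) for r in ratings) / len(ratings)


def acquiescence_share(ratings, midpoint=3):
    """Share of ratings above the scale midpoint (agreement with the item)."""
    if not ratings:
        raise ValueError("no ratings given")
    return sum(r > midpoint for r in ratings) / len(ratings)


# Hypothetical ratings by one participant across six affect items.
ratings = [1, 5, 5, 1, 5, 3]

extreme = extreme_response_share(ratings)
agreeing = acquiescence_share(ratings)
print(round(extreme, 2), round(agreeing, 2))
```

A participant with a high share on either index across many unrelated items may be showing a response style rather than reporting the affect experienced in the dream, although such indices alone cannot separate the two.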