Rater Affect - Reference reports : a meta analytic review of predictive validity and an experim

Results from the regression analysis provide ample evidence for the influence of rater affect. Positive rater affect was associated with higher ratings, and negative affect was associated with lower ratings. Raters who expressed liking for the ratee were more lenient in their ratings compared to raters who expressed neutral or antagonistic feelings toward the ratee. These results are all the more compelling given the meagre information available to raters. Descriptions of the hypothetical ratee were brief, and contained mostly factual statements related to job performance. Yet, even under such artificial conditions, raters formed affective responses, and these responses were systematically associated with rating outcomes. It is reasonable to surmise that the influence of rater affect might be even stronger in situations where the ratee was known to the rater, and the relationship between the two was well established.

The results from the present investigation lend support to those from previous studies which have documented similar effects for rater affect (e.g., Judge & Ferris, 1993; Tsui & Barry, 1986). It should not be surprising that feelings influence our judgements, for according to Zajonc (1980), affective reactions are "the major currency in which social intercourse is transacted" (p. 153). He submits that there are very few of our perceptions and thoughts that do not contain a significant affective component. Although an idealised view of the rating process might see it as cold and objective, it is moot whether one can avoid the "primary, basic, inescapable, and irrevocable nature of affective reactions" (Zajonc, 1980). Evidence from the present study suggests that the appraisal of others would not be exempt.

Although it can be argued, as Zajonc (1980) has done, that affective reactions are ubiquitous, the questions of how they influence rater evaluations, and of the full extent of their influence, remain unclear. Studies have shown that other factors in the rating situation may moderate the power of affect. For example, Salvemini et al. (1993) found that the accuracy of raters was significantly improved by the provision of contingent rewards for accurate rating. Mero and Motowidlo (1995) have shown that raters who are held accountable for their evaluations are more accurate than those who are not accountable. These studies suggest that the motivational context in which ratings are collected is important, and may assist raters to separate objective judgements from subjective feelings. This notion is consistent with assumptions underlying the goal-directed model of rating behaviour, outlined by Murphy and Cleveland (1995). Forces that comprise the motivational context, such as incentives or accountability, represent rewards or punishments that influence rating outcomes.

Emphasis on the motivational context of the rating process has interesting implications in relation to reference reports. Referees do not normally have any special obligation to the organisation requesting an evaluation of an applicant. They are not accountable to that organisation for their ratings, and there are unlikely to be any personal repercussions if they are inaccurate. In other words, there are very few extraneous forces encouraging accuracy on the part of referees, and because of this, reference reports may be particularly vulnerable to the influence of rater affect. In the absence of external pressures, raters may allow themselves to be swayed by personal feelings. Cognitive dissonance theory (Festinger, 1954) offers one explanation for why raters may be susceptible to affective reactions in such circumstances. Cognitive dissonance theory suggests that when our attitudes, values or beliefs are in conflict with our actions we experience psychological tension or dissonance. This tension is unpleasant; therefore, we seek to minimise or eliminate it. Research has shown (Festinger & Carlsmith, 1959) that dissonance is reduced if there are obvious explanations to account for the discrepancy between our beliefs and actions. If such explanations are not available, then we may have to change our behaviour or our attitudes in order to reduce tension. For referees, dissonance may arise when they like the individual they are evaluating, but must rate them poorly if they are to be accurate. Likewise, dissonance may be present when referees who dislike the ratee are obliged to provide positive ratings if they are to be accurate. In the absence of strong justifications for accurate ratings, referees may modify their evaluations so that they are consistent with personal feelings, and thereby reduce tension. The possibility that the influence of rater affect on rating outcomes is mediated by cognitive dissonance is purely speculative, but merits further attention.

The variable of rater affect may account for disparate findings from studies that have evaluated the Leniency Scale. Results from the present study indicate that scores on the Leniency Scale are associated with lenient responding by raters. Participants who scored highly on the Leniency Scale tended to be much more lenient in their ratings. These findings are consistent with results reported by Schriesheim et al. (1979) and Bannister et al. (1987). They used the Leniency Scale to control statistically for response bias in leader descriptions (Schriesheim et al.) and performance evaluations (Bannister et al.). However, Highhouse (1992) has questioned the validity of the scale. He found that the scale had poor test-retest reliability, and that responses depended upon the specific target of the scale. Highhouse argued that if the scale measured a lenient disposition, responses should show a greater degree of consistency, and should be relatively independent of the rating target.

A parsimonious explanation for these conflicting findings is that the Leniency Scale represents an alternative measure of rater affect. The Leniency Scale consists of items derived from an instrument that assessed raters' tendencies to respond in a socially desirable manner. Respondents are obliged to answer true or false to items such as "No matter who he's talking to, he's always a good listener" and "She has never shown intense dislike for anyone" (see

affectively neutral. Social desirability, by its very nature, must contain an affective component. Furthermore, the way in which the items are worded requires participants to make judgements about a particular stimulus person. In the present study this was the hypothetical lecturer described in the vignette. In other studies, judgements have been based on "real life" managers, supervisors, and university instructors. In all of these studies, participants have associated their ratings with particular individuals. This represents an emotional reaction to a specific stimulus, what Murphy and Cleveland (1995) have called directed affect. The end result is that the Leniency Scale may be doing nothing more than assessing raters' affective reactions toward specific ratees. Therefore, the gains in accuracy reported by Schriesheim et al. (1979) and by Bannister et al. (1987) may actually be due to the statistical control of rater affect. The significant correlation between the Leniency Scale and the measure of affect used in the current study lends further support to such a notion. The Leniency Scale as a measure of affect would also explain the results obtained by Highhouse (1992). If rater affect is being measured by the Leniency Scale, then one would expect scores to vary depending upon the selection of the target ratee.

A straightforward test of this proposition would be to ask respondents to complete the scale, rating individuals towards whom they have strong positive or negative affective reactions. If the Leniency Scale is assessing rater affect, then one would expect to see systematic changes in scores associated with the nature, and strength, of any affective reactions. A more rigorous test would

participants directed towards a particular stimulus. Baseline leniency scores could be collected and then be examined for changes following the manipulation of rater affect. If leniency scores for the same rating stimulus change along with affective reactions, then this would be convincing evidence that the scale does not measure a stable predisposition on the part of raters to respond leniently, but is instead a measure of rater affect.

The regression analyses conducted as part of the present study showed that scores on the Leniency Scale were related to rating outcomes. However, interpretation of the changes in R² from the hierarchical analyses suggests that there is considerable overlap in the variance accounted for by the Leniency Scale and ratings of Likability. This is consistent with the view of the Leniency Scale as an alternative measure of rater affect. However, the Leniency Scale continued to contribute to the prediction of leniency, even after controlling for raters' liking of the ratee. This suggests that the scores from the Leniency Scale are not identical to ratings of Likability and that the scale may measure slightly different aspects of rater affect. Unfortunately, there has been very little research on how different components of affect might influence rating outcomes. Tsui and Barry (1986) used a measure of affect that consisted of three elements: admiration, respect, and liking. However, because they used an overall summary measure it was not possible to draw any conclusions about the relative contribution of each element. Further complexity is introduced by Murphy and Cleveland (1995) who have argued that undirected affective reactions, such as mood and temperament, may also influence rating outcomes. If it is acknowledged that affective reactions can be complex and multifaceted, then future research on rater affect should be directed toward identifying critical components of affect, and determining how they interact to influence rating outcomes.
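The hierarchical logic discussed above, in which the Leniency Scale overlaps with Likability yet still adds to the prediction of ratings, can be illustrated with a minimal sketch. The data below are simulated and entirely hypothetical: the variable names, sample size, and effect sizes are assumptions chosen for illustration and are not the values obtained in the present study.

```python
import numpy as np

def r_squared(predictors, outcome):
    """Proportion of outcome variance explained by an OLS fit with an intercept."""
    X = np.column_stack([np.ones(len(outcome)), predictors])
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    residuals = outcome - X @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((outcome - outcome.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Simulated (hypothetical) data: the leniency scale shares variance with
# likability but also carries a unique component.
rng = np.random.default_rng(0)
n = 500
likability = rng.normal(size=n)
leniency_scale = 0.6 * likability + 0.8 * rng.normal(size=n)
rating = 0.5 * likability + 0.3 * leniency_scale + rng.normal(size=n)

# Step 1: likability only. Step 2: likability plus the leniency scale.
r2_likability = r_squared(likability, rating)
r2_both = r_squared(np.column_stack([likability, leniency_scale]), rating)
delta_r2 = r2_both - r2_likability  # increment unique to the leniency scale

# Entered alone, the leniency scale explains more variance than its
# increment, which is the signature of overlap with likability.
r2_leniency_alone = r_squared(leniency_scale, rating)

print(f"R2 likability only:     {r2_likability:.3f}")
print(f"R2 leniency scale only: {r2_leniency_alone:.3f}")
print(f"R2 both predictors:     {r2_both:.3f}")
print(f"Change in R2 at step 2: {delta_r2:.3f}")
```

A change in R² at step 2 that is positive but much smaller than the variance explained by the Leniency Scale on its own corresponds to the pattern described: substantial shared variance with Likability, plus a modest unique contribution.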

In document Reference reports : a meta analytic review of predictive validity and an experimental study of rating accuracy : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Psychology at Massey University (Page 165-171)