The question of how characteristics of a ratee may influence ratings has been researched extensively. Factors such as age (Cleveland & Landy, 1981; Waldman & Avolio, 1986), education (Tsui & O'Reilly, 1989), sex (Maurer & Taylor, 1994; Pazy, 1986), and race (Pulakos, White, Oppler, & Borman, 1989; Turban & Jones, 1988) have all been examined for their potential to introduce bias into the rating process. Although a range of demographic variables have been explored, researchers addressing the issue of subgroup bias in evaluations have been particularly interested in potential race and sex effects. These factors are amongst those most commonly identified in human rights and employment legislation as a basis for nondiscrimination (Human Rights Act, 1993; U.S. Equal Employment Opportunity Commission, U.S. Civil Service Commission, U.S. Department of Labor, & U.S. Department of Justice, 1978) and are the variables most relevant to the present study.
Race
Early studies investigating the effects of race on evaluations reported mixed results. Some studies documented a bias against blacks (e.g., Hamner, Kim, Baird, & Bigoness, 1974; Parsons & Liden, 1984), others a bias in favour of blacks (e.g., Schmitt & Lappin, 1980), and yet others no effects whatsoever (e.g., Schmidt & Johnson, 1973). A meta-analytic review of ratee race effects conducted by Kraiger and Ford (1985) shed more light on the issue and went some way toward reconciling the inconsistent findings recounted in the literature. They reported corrected mean correlations of .18 and -.22 between ratee race and ratings from white and black raters respectively. These results indicated the presence of a same-race bias in performance ratings. That is, there was a clear tendency for white raters and black raters to assign higher ratings to members of their own racial group. Kraiger and Ford went on to report that these results were moderated by the setting in which ratings were collected and the saliency of blacks in the sample. Race effects were found to be most likely in field settings (as opposed to laboratory-based studies) and when blacks comprised a small percentage of the workforce.
A potential limitation of the Kraiger and Ford (1985) review was their inability to disentangle the influences of performance and race, a point made by Oppler, Campbell, Pulakos, and Borman (1992) in their discussion of methodology. They identify various approaches that have been utilised for the assessment of subgroup bias in performance evaluations. The first of these, the total association approach, is characteristic of many field studies exploring bias in ratings. Typically, researchers attempt to estimate the amount of criterion variance accounted for by subgroup membership by comparing ratings given to white ratees with those given to black ratees. Unfortunately, such studies cannot distinguish between rater bias and true performance differences. Differences between subgroups can be attributed to real differences in performance, or to criterion contamination. Evidence supporting the former explanation has been provided by Ford, Kraiger, and Schechtman (1986). They argue that the uniformity in effect sizes for both objective and subjective criteria found in their meta-analytic review implies that the race effects "found in subjective ratings cannot be solely attributed to rater bias" (p. 334).
The second approach identified by Oppler et al. (1992) for the assessment of subgroup bias is the direct effects approach. According to Oppler et al., researchers using this approach have attempted to eliminate real differences in performance between members of different subgroups prior to any assessment of rater bias. Any remaining differences in ratings are then more clearly attributable to rater bias. Laboratory studies (which are prevalent in this category) have done this by controlling performance levels. The performance of ratees is usually held constant, or varied independently of race (e.g., Schmitt & Lappin, 1980). In field studies, true performance differences have been controlled by statistical methods. The influence of nonrating factors is removed from the ratings before comparisons are made between subgroups (e.g., Oppler et al., 1992; Pulakos et al., 1989).
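The contrast between the total association and direct effects approaches can be illustrated with a minimal sketch on simulated data. It is not the procedure used by Oppler et al. (1992) or Pulakos et al. (1989). The function names, the sample size, and the assumption that a single measure of true performance is available are all hypothetical. In the simulation, subgroups differ in true performance but raters are unbiased, so a subgroup difference in raw ratings should largely disappear once performance is partialled out.

```python
import numpy as np


def point_biserial(group, scores):
    """Correlation between a 0/1 subgroup indicator and a continuous score."""
    return float(np.corrcoef(group, scores)[0, 1])


def residualise(ratings, covariates):
    """Remove the linear influence of the covariates from the ratings
    using ordinary least squares, returning the residual ratings."""
    X = np.column_stack([np.ones(len(ratings)), covariates])
    beta, *_ = np.linalg.lstsq(X, ratings, rcond=None)
    return ratings - X @ beta


def simulate(n=2000, seed=0):
    """Hypothetical data: subgroups differ in true performance,
    and ratings contain random error but no rater bias."""
    rng = np.random.default_rng(seed)
    group = rng.integers(0, 2, size=n)              # 0/1 subgroup indicator
    performance = rng.normal(0.5 * group, 1.0)      # true performance differs by subgroup
    ratings = performance + rng.normal(0.0, 0.5, size=n)  # unbiased but noisy ratings
    return group, performance, ratings


group, performance, ratings = simulate()

# Total association: subgroup membership against raw ratings.
total_r = point_biserial(group, ratings)

# Direct effects: subgroup membership against ratings with performance removed.
direct_r = point_biserial(group, residualise(ratings, performance))

print(f"total association r = {total_r:.3f}")
print(f"direct effects r    = {direct_r:.3f}")
```

Under these assumptions the total association estimate reflects the real performance difference, whereas the direct effects estimate should sit close to zero. A nonzero value after residualising would be the quantity more plausibly attributable to rater bias.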
Interestingly, the results from studies employing these methodologies indicate that the effects of ratee race on performance evaluations may have been overstated. Kraiger and Ford (1985) estimated the corrected correlation between race and performance ratings for the laboratory studies they reviewed to be only .03. Pulakos and colleagues (Pulakos et al., 1989; Oppler et al., 1992) have found consistent rater and ratee race effects in the large army samples they have analysed. However, they estimate such effects account for less than 2% of the total criterion variance. There appears to be some consensus in the literature that race can influence performance evaluations, but that, in general, the magnitude of such effects is small.
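To put the reported correlations on the same footing as the "variance accounted for" figures, a correlation can be squared to give the proportion of shared variance. The short sketch below applies this standard conversion to the three Kraiger and Ford (1985) values quoted in this section. The labels and the helper name are illustrative only, and the conversion is offered as a rough guide rather than as an analysis those authors reported.

```python
def variance_explained(r):
    """Proportion of variance shared between two variables with correlation r."""
    return r ** 2


# Corrected correlations quoted in this section (Kraiger & Ford, 1985).
reported = {
    "white raters": 0.18,
    "black raters": -0.22,
    "laboratory studies": 0.03,
}

for label, r in reported.items():
    print(f"{label}: r = {r:+.2f}, variance explained = {variance_explained(r):.2%}")
```

On this reading, even the largest of the three correlations corresponds to under 5% of the variance in ratings.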
Sex
The literature on gender-related bias in ratings has produced inconsistent results (Nieva & Gutek, 1980). Some studies have reported an evaluation bias in favour of females (e.g., Hamner et al., 1974; Mobley, 1982; Norton, Gustafson, & Foster, 1977). Other studies have reported no differences in ratings as a function of ratee sex (e.g., Cascio & Phillips, 1979; Pulakos & Wexley, 1983; Thompson & Thompson, 1985), while yet further studies have reported a pro-male bias in ratings (Dipboye, Arvey, & Terpstra, 1977; Pazy, 1986).
These contradictory and ambiguous research findings make it difficult to draw any firm conclusions about effects in this area. However, some tentative statements do appear warranted. Firstly, unlike studies that have investigated the effects of race, those inquiring into gender typically have found no interactions. That is, there is very little evidence for any kind of same-sex rater-ratee bias (Izraeli & Izraeli, 1985; Mobley, 1982; Pulakos & Wexley, 1983). These results are complicated, however, by findings from a study conducted by Tsui and O'Reilly (1989), who found that the performance ratings of subordinates were affected by the degree of "relational demography" evident in superior-subordinate dyads. Increasing dissimilarity in six superior-subordinate demographic factors (of which sex was one) was associated with poorer performance ratings. It must be emphasised, though, that the effect sizes reported in their study were minimal and that sex was only one of the six factors they considered. Overall, relational demography appeared to account for only a small proportion of the variance in ratings. These findings are consistent with the results of a study conducted by Pulakos et al. (1989), who found that ratings of army personnel were influenced by the sex of the ratee, but that the amount of variance accounted for by ratee sex was less than 2%. Although a pro-male bias in ratings was evident, in practical terms the effects appeared negligible.
It has been noted that many of the studies in which sex differences in ratings have been documented were conducted in laboratory settings, and that results from studies conducted in the field have been much less definitive (Dobbins, Cardy, & Truxillo, 1988; Maurer & Taylor, 1994; Pulakos et al., 1989). This has led some researchers to abandon simple sex effects to consider other gender-related factors such as relational demography (Tsui & O'Reilly, 1989), sex-related stereotypes (Dobbins et al., 1988), gender-related occupational stereotypes (Bartol & Butterfield, 1976), and perceived masculinity and femininity (Maurer & Taylor, 1994).
Summary
Recent meta-analytic reviews (Kraiger & Ford, 1985; Ford et al., 1986) and studies using large samples (Pulakos et al., 1989; Oppler et al., 1992) have confirmed that race and sex do influence ratings, but report that the magnitude of such effects is small. In contrast, other studies have continued to emphasise the significant consequences of sex (Pazy, 1986) and race (Turban & Jones, 1988) for performance evaluations. Previous research has been criticised on methodological grounds (Dipboye, 1985; Oppler et al., 1992; Pazy, 1986) and for the paucity of field studies (Dobbins et al., 1988; Oppler et al., 1992). However, questions remain regarding the generalisability of more recent studies and, in particular, the large-scale analysis of army ratings conducted by Pulakos and associates (Oppler et al., 1992; Pulakos et al., 1989). The small effects for race and sex they report may be due to specific characteristics of the sample and the context in which ratings were collected. More specifically, the raters in their studies had undergone extensive training and were provided with well-constructed, behaviourally anchored rating scales. Two further considerations are also relevant and would appear, at least prima facie, to offer potential explanations for the minimal effects that have been observed. The first is the sizeable representation of ethnic minorities in the U.S. army. If, as Kraiger and Ford (1985) suggest, racial saliency is a factor in biased ratings, then we would expect bias to be reduced in situations where ethnic minorities are prominent. The second is that the United States has in place strict legal guidelines in relation to race and sex discrimination in employment situations. Employers risk stiff legal penalties should these be contravened in any way (U.S. Equal Employment Opportunity Commission et al., 1978). All of these factors would appear to militate against bias in ratings. In situations where these external constraints are not present, bias in ratings might easily arise.
Recent surveys of New Zealand managers (McGregor, Thomson, & Dewe, 1994) and women directors (Shilton, McGregor, & Tremaine, 1996) suggest that women are under-represented in management and senior management positions. Other authors (Chen, 1993) have highlighted anti-Asian feeling and discrimination in New Zealand. Such research alerts us to the ever-present possibility of bias and discrimination in employment settings here in New Zealand. Therefore, before ruling out the prospect of sex and/or race effects in evaluations, further research in a New Zealand context is required.