CHAPTER 2: GENERAL METHODOLOGY

2.7 Inter-observer reliability

2.7.1 Secondary observer

Inter-observer reliability for identification of gesture form, content, and intentionality was measured by having a second rater code a subset of the video clips using a modified coding spreadsheet. This rater, Cat Hobaiter, was familiar with the gestural communication of gorillas and had coded many videos of captive gorilla interactions. She was therefore an ideal second rater: she was well versed in the theory and methodology but unfamiliar with the study species, and so had no preconceptions about orangutan communication.

2.7.2 Design

The coding sheet used for reliability tests was modified from that used for the original coding of gestures to accommodate the rater’s unfamiliarity with both the individual orangutans included in the study and the behaviour of the species. The reliability coding aimed to focus on the elements of the videotaped actions essential to determining their status as gestures as well as their supposed meaning. The coding of the second rater, therefore, consisted primarily of variables used to determine intentionality, the supposed goal of the action, and the response of the recipient. A full list of variables coded is given in Table 4 and an example of an entry in the spreadsheet is included in Appendix III.

Appendices

Table 4: LIST OF CODED VARIABLES USED IN ANALYSIS OF INTER-OBSERVER RELIABILITY.

Variable                       Type                      Possible values

Mechanical effectiveness       Scale 1-4                 1) Effective 2) Possibly effective 3) Likely non-effective 4) Definitely non-effective

Directedness                   Scale 1-4                 1) No recipient 2) Several potential recipients 3) Several potential recipients but directed to one 4) One definite recipient

Goal                           Categorical               • Unknown • Affiliation • Attention • Play • Share food/object (acquire object or info) • Look at object/body part (direct attention) • Stop behaviour ("no") • Move back • Leave • Follow • Climb on • Pick up • Mate

Signaller’s visual attention   Out of view & Scale 1-4   • Out of view 1) Can’t see recipient 2) Can potentially see 3) Looking towards 4) Looking at the face or eyes

Recipient’s visual attention   Out of view & Scale 1-4   • Out of view 1) Can’t see signaller 2) Can potentially see 3) Looking towards 4) Looking at the face

Modality match                 Categorical               • Not detectable • Detectable but not necessary • Detectable and necessary

Response waiting               Scale 1-4                 1) None 2) Pause 3) Wait until response 4) Wait >2 sec

Response                       Categorical               • No response • Negative (look away, move away, aggression) • Acknowledge but carry on with prior behaviour • Pay attention (look or move towards) • Positive interaction (affiliate, play, give)

Goal met?                      Categorical               • No • Yes • Unclear

Persistence                    Categorical               • None • Repeat/elaborate • Same modality • Change modality

Sequence goal                  Categorical               • Different • Same • Unclear

Outcome                        Categorical               • None • Affiliation • Attention • Play • Share food or object • Look at object or body part • Stop behaviour • Move back • Leave • Follow • Climb on • Pick up • Mate

Intentionality rating          Scale 1-4                 1) Not intentional 2) Unclear/needs more evidence 3) Consistent with intentional interpretation 4) Support for intentional interpretation

2.7.3 Procedure

The second rater (CH) was trained to use the spreadsheet on a set of 15 pre-selected video clips. The primary rater (EC) analysed the 15 clips alongside the second rater, discussing why each judgment was made and staying with each clip until both raters agreed on every rating. The second rater was then given free access to all video clips and asked to code as many clips as possible within a limited period of time (two afternoon sessions). She was given no other instructions or limits except that she should include some video clips from each of the three zoos.

2.7.4 Analysis

Reliability between the two observers was tested using Cohen’s kappa. This statistic measures the agreement between independent observers while correcting for the agreement that would be expected by chance.
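As an illustration of the statistic, Cohen’s kappa is defined as (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the proportion expected by chance from each rater’s marginal frequencies. The sketch below is a minimal implementation under that definition; the function name and the example ratings are hypothetical and are not taken from the study data or from the software actually used for the analysis.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: share of items given the same code by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the two raters' marginal proportions,
    # summed over every code used by either rater.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    codes = set(count_a) | set(count_b)
    p_e = sum(count_a[c] * count_b[c] for c in codes) / (n * n)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when both raters use a single code")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical example: eight clips coded "yes"/"no" by each rater.
primary = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
second = ["yes", "yes", "no", "yes", "yes", "no", "no", "no"]
kappa = cohens_kappa(primary, second)
```

In this made-up example the raters agree on six of eight clips (p_o = 0.75) while chance alone would give p_e = 0.5, so kappa is 0.5.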

Some of the variables were combined into more general categories to reflect the overall nature of the interactions rather than highly specific distinctions between contexts, which may require either familiarity with orangutan behaviour or the ability to contextualise the subset of clips within the full dataset. Thus reactions that involved non-aggressive social interactions were grouped together as “positive” responses, and responses that involved leaving or actively rejecting the signaller (e.g. pushing away) were combined into “negative” responses. I grouped ratings that were coded on a scale into two categories of high and low values for analysis. The combining of specific values for each variable is reported below.

Values for the variable mechanical effectiveness were combined into either “effective” (previously, “effective” and “probably effective”) or “non-effective” (previously “likely non-effective” and “definitely non-effective”). I condensed the category directedness by combining “one potential recipient” and “one certain recipient” into “one recipient.”

The variable goal was condensed into more general categories that reflected either attraction or repulsion of the recipient. The values “leave,” “stop,” and “move back” were combined into “stop/move away.” The values that reflected the goal of positive interaction (“affiliation,” “attention,” “play,” and “look at body part”) were combined into “attention/play.”

The measures of gaze direction for both the signaller and recipient (signaller visual attention and recipient visual attention) were collapsed within each variable so that all values that indicated one individual could see the other became “looking towards.” Thus measures of visual attention had values of either “looking” or “not looking.”

Response waiting was initially divided into four categories to obtain a finer-grained measure of whether the signaller was demonstrating her expectation of a response from the recipient. For the purposes of this analysis, only the most extreme measure of waiting for a response (waiting for more than 2 seconds) was counted as response waiting. All values indicative of pauses shorter than 2 seconds were condensed into “no response waiting.”

For analysis of the variable response, the value “acknowledge but carry on with prior behaviour” was merged into the value “pay attention to.” The variable outcome was condensed using the same combination of categories as was used for goal.

The rating for intentionality was condensed so that both of the values that suggested intentionality (“consistent with intentional interpretation” and “support for intentional interpretation”) were merged into the single value “likely intentional.” This was done to reflect the inclusion of both values in building the dataset of intentional gestures.
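The condensing rules described in this section amount to lookup tables applied to each rater’s codes before agreement is computed. The sketch below illustrates this for two of the variables only (mechanical effectiveness and goal); the dictionary names, the `collapse` helper, and the lower-case value labels are hypothetical conveniences, not the format of the original coding spreadsheet.

```python
# Hypothetical lookup tables built from the merging rules stated in the text.
# Scale points 1-2 become "effective", points 3-4 become "non-effective".
MECHANICAL_EFFECTIVENESS = {
    1: "effective",
    2: "effective",
    3: "non-effective",
    4: "non-effective",
}

# Goal values are merged into repulsion ("stop/move away") or
# positive interaction ("attention/play"); other goals are left unchanged.
GOAL = {
    "leave": "stop/move away",
    "stop": "stop/move away",
    "move back": "stop/move away",
    "affiliation": "attention/play",
    "attention": "attention/play",
    "play": "attention/play",
    "look at body part": "attention/play",
}


def collapse(value, mapping):
    """Return the condensed category, or the value itself if it is not merged."""
    return mapping.get(value, value)


# Hypothetical codes from one rater, condensed before computing agreement.
raw_goals = ["leave", "play", "follow", "move back"]
condensed_goals = [collapse(g, GOAL) for g in raw_goals]
```

Applying the same mapping to both raters’ codes before comparison ensures that disagreements within a merged category (for example “leave” versus “move back”) do not count against agreement.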

2.7.5 Results

The second rater coded 64 video clips, yielding a total of 108 potential gestures. Nineteen of the potential gestures had to be discarded due to incomplete coding. This left 89 potential gestures (5.8% of all gestures) to use for comparison of the two raters. The kappa values for concordance between the two raters are reported in Table 5.

Table 5: MEASURES OF AGREEMENT (COHEN’S KAPPA) BETWEEN THE TWO OBSERVERS FOR EACH OF THE 13 VARIABLES MEASURED.

Also listed is the type (scalar or categorical) for each variable. The strength of agreement signified by each kappa value (Landis and Koch 1977) is given in the right hand column.

Variable                       Type of variable   Kappa value   Strength of agreement

Mechanical effectiveness       Scalar             .88           Almost perfect

Directedness                   Scalar             .94           Almost perfect

Goal                           Categorical        .63           Substantial

Signaller’s visual attention   Scalar             .91           Almost perfect

Recipient’s visual attention   Scalar             .89           Almost perfect

Modality match                 Categorical        .78           Substantial

Response waiting               Scalar             .79           Substantial

Response                       Categorical        .64           Substantial

Goal met?                      Categorical        .48           Moderate

Persistence                    Categorical        .80           Substantial

Sequence goal                  Categorical        .83           Almost perfect

Outcome                        Categorical        .56           Moderate

Intentionality rating          Scalar             .68           Substantial

Although the values for two variables indicated only “moderate” levels of agreement, the mean kappa value across all 13 variables was 0.75, signifying a “substantial” strength of agreement between the two raters.
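The strength-of-agreement labels and the mean kappa can be checked with a short sketch. It uses the kappa values reported in Table 5 and the Landis and Koch (1977) bands (below 0 poor, 0 to .20 slight, .21 to .40 fair, .41 to .60 moderate, .61 to .80 substantial, .81 to 1.00 almost perfect); the function and variable names are hypothetical.

```python
from statistics import mean

# Kappa values as reported in Table 5.
KAPPAS = {
    "Mechanical effectiveness": 0.88,
    "Directedness": 0.94,
    "Goal": 0.63,
    "Signaller's visual attention": 0.91,
    "Recipient's visual attention": 0.89,
    "Modality match": 0.78,
    "Response waiting": 0.79,
    "Response": 0.64,
    "Goal met?": 0.48,
    "Persistence": 0.80,
    "Sequence goal": 0.83,
    "Outcome": 0.56,
    "Intentionality rating": 0.68,
}


def landis_koch(kappa):
    """Map a kappa value to its Landis and Koch (1977) agreement label."""
    if kappa < 0:
        return "Poor"
    if kappa <= 0.20:
        return "Slight"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Substantial"
    return "Almost perfect"


labels = {name: landis_koch(k) for name, k in KAPPAS.items()}
mean_kappa = mean(KAPPAS.values())
```

Rounded to two decimal places, the mean of the 13 reported values is 0.75, which falls in the “substantial” band.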

Chapter 3: Gesture form and function

In document Gestural communication in orangutans (Pongo pygmaeus and Pongo abelii) : a cognitive approach (Page 69-75)