"There is no general agreement as to what extent – if at all – the psychological make- ups of the two sexes are different by nature, but there is no doubt that gender is represented through a person's choices of lexical and functional items" (Dam, 2014: 87- 88). In determining the role of personal pronouns in the construction of female identity, Dam (2014) highlights the use of pronouns ‘we’, a deictic or dependent upon context pronoun, ‘you,' and ‘your,' as contributing to identity.
Hancock et al. (2014) suggest that recent research on gender differences in language presents a diminishing picture. In their study of whether language use can predict perceptions of gender and femininity, 10 male, 12 female, and 13 transgendered (male-to-female) speakers were rated for femininity. The study highlighted that 4 of the 14 variables differentiated males from females using T-units. A T-unit, or minimal terminable unit, “is the shortest units into which a piece of discourse can be cut without leaving any sentence fragments and would contain one independent clause and its dependent clauses” (Hunt, 1965: p.189); Hancock et al. (2014) focussed on dependent clauses and personal pronouns. They found that the transgendered women tended to be more distinct from the females than the men were. In the second part of the study, which compared the use of language, the transgendered women's scores were not significantly different from the men's, but none of the 14 individual variables alone was a robust predictor of perceived gender or femininity.
In a gender study, Lenard (2014) analysed 204 random speeches from the 113th United States Congress, split equally by gender, with the LIWC (Linguistic Inquiry and Word Count) tool from Pennebaker et al. (2001). Across the 70 language categories analysed, women used personal pronouns more than men did and used the word ‘we’ more. Men used nouns, articles, and numbers more and tended to use the pronoun ‘I’ more.
What is clear from a literature search for further authorship identification techniques in the area of gender is that pronouns are a crucial part of speech in determining it.
2.2.1 Key Pronoun Studies
Flekova and Gurevych (2013) identified people by age and gender on social media websites using a combination of features, which included pronoun ratio for age but ignored pronouns for gender, and achieved a gender match of 0.58. Rangel and Rosso (2013) drew on the work of Pennebaker (2011) and used a Support Vector Machine to categorise parts of speech (POS), but this performed better at identifying age than gender. They achieved a gender match of 0.57 when including singular and plural pronouns in their POS list. Argamon et al. (2009) used a combination of pronouns (‘I,’ ‘me,’ ‘him,’ and ‘my’) to identify females, but achieved 5-10% better success rates when they also considered content words about technology (males) and personal life and experiences (females). They achieved a gender match of 0.76 when looking at style and content feature sets. Argamon et al. (2007), a study of how writing topic and style vary with the age and gender of the blogger, reported gender results similar to those of the Argamon et al. (2003) study: articles and prepositions are used significantly more by male bloggers, while personal pronouns, conjunctions, and auxiliary verbs are used significantly more by female bloggers.
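The studies above share a common first step: measuring how often particular pronouns occur relative to the length of a text. The following is a minimal sketch of that step only, not a reproduction of any cited study's feature set or classifier. The function name `pronoun_rates`, the pronoun list, and the simple regular-expression tokeniser are all assumptions made for illustration.

```python
import re
from collections import Counter

# Illustrative pronoun list (an assumption); not the feature set of any cited study.
PRONOUNS = ("i", "me", "my", "we", "you", "your", "him", "her", "its", "them")


def pronoun_rates(text, pronouns=PRONOUNS):
    """Return each pronoun's share of all word tokens in `text`."""
    # Naive tokeniser (an assumption): lower-case runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        # Empty input: every rate is zero rather than a division error.
        return {p: 0.0 for p in pronouns}
    counts = Counter(tokens)
    return {p: counts[p] / len(tokens) for p in pronouns}


sample = "I told him my plan, and we agreed that my idea suited me."
rates = pronoun_rates(sample)
for pronoun, rate in rates.items():
    if rate > 0:
        print(f"{pronoun}: {rate:.3f}")
```

Rates of this kind could then be passed to whichever classifier or scoring equation a given study uses; that later step is deliberately left out here because it differs between the studies.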
In the literature, much of the work that includes gender has focussed on determining age and on author emotion. In all cases, the aim of determining gender was to categorise people as male or female, and this was done using many different techniques with varying success. The most closely aligned work was that of Argamon et al. (2009), the study with the highest gender-matching success rate, at 76%, from female use of the pronouns ‘I,’ ‘me,’ ‘him,’ and ‘my.’ While this is not as high as the 80% reached by Argamon et al. (2003), it demonstrates that the set of key contributing pronouns for gender identification can be reduced, and it shows the power of personal pronouns. In the Kernot (2013) study, a gender success rate of 90% was reached using three pronouns, ‘my,’ ‘her,’ and ‘its.’ A score of 93.3% was achieved using five pronouns, ‘my,’ ‘her,’ ‘its,’ ‘themselves,’ and ‘them,’ but the underlying statistical results were not as significant or reliable. Cheng et al. (2011) distinguish biological sex (male and female) from a person's gender-related language, a socially constructed aspect in which not all men are masculine and not all women are feminine; internally, a person can be other than their biological sex.
2.2.2 Summary
What is important here is that, beyond the categorisation of people as purely male or female, the existing work in Kernot (2013) provides an opportunity to describe a person on a continuum between 0 and 1 from their use of personal pronouns for identification. At one end they are likely to be female, and at the other male, but the scores from the gender equation (see Equation 2) provide an opportunity to identify the internal, socially constructed gender of a person. Another important point is the power of a pronoun to relate closely to the self; this point is researched further and discussed next in Referential Activity.