Chapter 4. Overall Methodology
4.4 Ethical Considerations
This section discusses the ethical considerations surrounding this research. It first considers how to collect and handle the data ethically while protecting participants, the researcher and any other related parties. It then addresses the wider ethical implications of Native Language Identification (NLID) research. The Linguistic Society of America divides ethical responsibility into five main areas: responsibility to individual research participants, to communities, to students and colleagues, to scholarship and, finally, to the public (Linguistic Society of America, 2009). This research seeks to honour its responsibility in each of these areas.
Ethical considerations will be discussed with reference to the blog corpora used for Study One and Study Two, and the questionnaire data used for Study Three. Some overarching approaches apply throughout this research: all data have been collected and handled in accordance with the Aston University School of Languages and Social Sciences Policy on Research Ethics (Aston University Ethics Committee, 2007) and under guidance from my supervisor. The specific ethical considerations for the disguise data are discussed in Chapter 8. The overarching ethical principles remain the same as outlined here; however, the practical aspects are discussed more thoroughly in Section 8.2.
Ethical considerations are considerably more complicated for online data than for data from the offline world (Gao, Kong, & Sar, 2010); the increased use of the internet in research has made this an even greater area of concern (Bassett & Riordan, 2002). All of the internet data used in this project have been taken from public forums. While some online forums require registration for access, the data for this project have been taken only from blogs that were open access at the time of collection. In other words, all the data come from a fully public sphere and could have been accessed by anyone, without registering in any way. It was decided that a requirement to register for a site, to any extent, could indicate joining the community at some level, and that texts published within the community are intended for its members. Registering in order to access data could be construed by some members as breaking the code of the social group; it would therefore be appropriate to seek permission and informed consent, which would considerably complicate the methodology of data collection.
The Association of Internet Researchers (AoIR) state that “the greater the vulnerability of the author – the greater the obligation of the research to protect the author.” (AoIR ethic working committee & Ess, 2002, p. 5). They also raise the question of webpages created by minors, and how this affects informed consent. In relation to this research, it is difficult to know the age of the authors, as they rarely state it. The content of the blogs suggests that the majority of the participants are adults. One could question whether, in an online context, vulnerability could include limited technological awareness, i.e. a lack of understanding that something is publicly accessible. However, it could be argued that a degree of technological awareness is required to set up a blog, and that the concept of a blog as a form of publication is more transparent than it is in a mixed-mode context such as Facebook, Myspace or Twitter.
Bassett and Riordan (2002) proposed an alternative view of the ethics of internet research, arguing that the commonly employed human subjects research model is not completely applicable in the online sphere and that we should respect the cultural phenomenon of online texts. They highlight that there “are issues and rights at stake in these debates other than those of privacy and safety. The internet user is also entitled to a degree of representation and publication in the public domain” (Bassett & Riordan, 2002, p. 244). This is very relevant to the current data, as all of it was publicly accessible at the time of collection. The authors have chosen to create blogs that are publicly available, and while it is important to protect research participants, perhaps that protection includes respecting the authors’ desire to be represented publicly. It should be noted that this research does not intend to comment on the authors’ views, beliefs or the opinions expressed within the texts. The only focus of this research is the language employed by the authors and their linguistic backgrounds.
The forensic perspective of this research means that there are ethical considerations not just in the collection of data, but also in the application of this research. Sorell (2011) warns that “university based researchers interested in radicalization may appear to some members of some communities to belong as much to the establishment apparatus as judges” (Sorell, 2011). This is particularly relevant because of the potential intelligence applications of Native Language Identification. NLID could potentially be misinterpreted as a tool for persecution or judgement; recognising and acknowledging the limitations of this research, together with a strenuous focus on unbiased “objective scientific evidence” (Linguistic Society of America Executive Committee, 2011, p. 1), should mitigate this. Conley and Peterson highlighted that a social scientist and expert consultant should be aware “that expertise may have grave consequences for one or more of the litigants, and may also have a significant effect on society itself” (Conley & Peterson, 1996). The methodology for NLID set out within this research is intended to be considered in relation to the potential consequences and effects. All texts and future reports are (and will be) treated with as much ethical consideration as possible. All stages of the project adhere to the Aston University code of ethical conduct, guidance from my supervisor, the Linguistic Society of America’s Code of Ethics for Linguistics in Forensic Linguistics Consulting, and my own moral and ethical code. It is impossible to control completely how the findings of this research will be used. It is feasible that a person or organisation might try to use the methodology or findings discussed in this research to persecute individuals or minorities; however, this would entail a wilful misinterpretation of the methods, potential conclusions and limitations of this study.