Chapter 3 RESEARCH DESIGN
3.6 Data reliability issues
Reliability in qualitative research is a contentious topic (Richards, 2005, pp. 98-99) and is frequently debated among the community of qualitative scholars (Thomas & Magilvy, 2011). Code reliability, rigor, replicability, level of generalization, and standardization (terms used here synonymously) are ways to establish trust and confidence in the data collection, the data analysis, and the interpretation and findings of a research study. In other words, reliability establishes the consistency of the study method over time and facilitates the replication of the study using a different research sample (Thomas & Magilvy, 2011, p. 151).
Building on the work of Lincoln & Guba (1985) on how to establish trustworthiness in qualitative research, Thomas & Magilvy (2011, pp. 152-154) address four components of trustworthiness relevant to qualitative research:
1. Truth-value (credibility): credibility occurs when the study presents an accurate interpretation of human experiences that is also shared among other people. To check for credibility a common strategy is member checking or informant feedback. The researcher returns to the source of the data and seeks confirmation that her interpretations of categories and themes are recognised by the informant as accurate representations of her experiences.
2. Transferability: this component signifies the ability to transfer the research findings to other contexts or to other subjects or participants.
3. Dependability: this component ensures that another researcher can follow the decision trail of the researcher with regard to the purpose of the study, how the case study was selected, how data were collected, how data were reduced or transformed for analysis, and how they were interpreted. One strategy to check for dependability is to provide a detailed description of the research methods.
4. Confirmability: this component exists when the previous three components have
During the analysis and coding phase I used my own judgment to code the data against the various concepts. I examined every sentence and every paragraph and tried to identify keywords that would describe one or more of the concepts. I also looked at the context of a specific sentence or block of sentences and coded it accordingly. Despite the use of keywords and of the context of the data read, this analytic process can introduce subjectivity and errors into the coding. There are a number of reasons for this: sometimes the understanding of the categories changes over time, or colleagues see the data differently, especially if they come from different disciplines (Richards, 2005, p. 99). Creswell (2003, p. 195) also states that qualitative researchers can use reliability only in a limited way, to check for consistent patterns or theme development among several investigators. This issue, well known among academics as ‘intercoder reliability’ (a measure of agreement among multiple coders on how they apply codes to text data), also produced ambiguous results when two independent researchers were asked to code samples of my data.
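To illustrate how agreement between two coders can be quantified, the following minimal sketch computes Cohen's kappa, one common intercoder agreement statistic for two coders. It is an illustration only: the study does not state which statistic was used, and the function name, the code labels and the sample data are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one code per text segment.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of
    agreement and p_e is the agreement expected by chance, given how often
    each coder used each code.
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must code the same non-empty set of segments")

    n = len(coder_a)

    # Observed agreement: share of segments given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement: for each code, the product of the two coders'
    # marginal proportions, summed over all codes.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    all_codes = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in all_codes) / (n * n)

    # If both coders used a single identical code throughout, chance
    # agreement is 1 and kappa is undefined; report full agreement.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical codes applied by two coders to the same four segments.
    coder_1 = ["trust", "trust", "conflict", "conflict"]
    coder_2 = ["trust", "trust", "conflict", "trust"]
    print(cohens_kappa(coder_1, coder_2))  # 0.5
```

In this hypothetical example the coders agree on three of four segments (75%), but half of that agreement would be expected by chance, so kappa is 0.5. Values near 1 indicate strong agreement and values near 0 indicate agreement no better than chance.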
Considering all these challenges, Lincoln & Guba (1985, p. 290) pose the following question:
‘How can an inquirer persuade his or her audiences (including self) that the findings of an inquiry are worth paying attention to, worth taking account of?’
Standardization of the research situation and making it independent from the single researcher, who observed, interviewed or did the experiment in the first place, is then the via regia to research quality (Flick, 2007, p. 61).
Another way of increasing validity in qualitative studies is through the use of a triangulation approach, as demonstrated by Jonsen and Jehn (2009). Their focus is on the triangulation of the data analysis rather than the data gathering stage. Coding levels were used to categorise the data; however, the transition between coding levels raises questions such as how or when authors move from one coding level to the next. According to the authors, the answer lies in the “drugless trip”, a period in the coding process where the researcher moves among a maze of codes and tries to make sense of the data. Although one can analyse each code to death, the development of a set of distinct yet related core categories is a mind-trip, very hard to share (Jonsen & Jehn, 2009, p. 128). This is why the authors call for the use of complementary triangulation methods to capture the ‘right’ concepts and develop the ‘right’ model. A triangulation technique mentioned previously is the use of member checking (Flick, 2007, p. 66) or informant feedback to triangulate the framework. The informant thus acts as a ‘judge’ evaluating the major findings of the study. However, even this strategy must be used with caution (Lofland et al., 2006, p. 94), as the ‘facts of social life are socially embedded artefacts, and the researcher’s understanding of the data requires that they be accurately placed within the subjective and intersubjective contexts that make them meaningful. One of the great advantages of participant observation is that it enables the researcher to contextualize observations since they are witnessed in close proximity to informant’s experiences ... and the “facts” take on significantly different meanings when placed in different social contexts’ (p. 94).
Creswell (2003, p. 196) proposes eight strategies available to check the accuracy of the findings: (1) triangulate, (2) use member-checking, (3) use rich, thick descriptions, (4) clarify the researcher’s bias, (5) present negative or discrepant information, (6) spend prolonged time in the field, (7) use peer debriefing, and (8) use an external auditor. The present study is in consonance with the above researchers and validates the findings through the use of the following strategies:
Triangulation of different data sources of information to build a coherent justification of themes. For example, data were collected from direct observations during meetings, from unstructured interviews and from document analysis.
Member-checking to determine the validity of the qualitative findings. This is when data, analytic categories, interpretations and conclusions are tested with members of those groups from whom the data were originally obtained. To do that, the present author used a WYM informant (John) to check the findings and determine whether they are accurate. That way the participants understand better what the researcher is trying to do and have the chance to correct any errors or challenge interpretations, which can often result in additional information being recorded.
Use rich thick descriptions to convey the findings. The findings were conveyed through a number of illustrative examples from the workplace which strengthen their accuracy. Thick description is described by Lincoln & Guba (1985) as a way of achieving a type of external validity. By describing a phenomenon in sufficient detail one can begin to evaluate the extent to which the conclusions drawn are transferable to other times, settings, situations and people.
Spend prolonged time in the field. As previously mentioned, the present author became embedded and treated the data collection phase as a ‘full time’ job, which gives the sense of having been there and of having a clear picture of the events as they unfolded.
Whilst I had undertaken all of these measures in my fieldwork and analysis, questions remained about whether I was introducing bias through NVivo (which is essentially a sophisticated indexing system that aids the researcher but cannot control bias). Because of this uncertainty, and following a lengthy debate with my supervisors and other colleagues, I took advice from an independent expert in qualitative data analysis (Prof Louise Young, University of Western Sydney) and decided to revisit the analysis phase and use a different approach that would largely remove bias and increase the reliability of the findings.
The structure and presentation of the findings from the two systems are also markedly different. NVivo essentially allows the researcher to index data and infer relationships amongst concepts identified by the researcher – that is the researcher “speaks” for the data, helped along by tools that interrogate the database. On the other hand, Leximancer allows the data to “speak” purely based on patterns that the software itself finds or based upon theoretically grounded guidance from the researcher (which was my choice). Leximancer also generates much more detail on the relationships between the concepts that it generates, and allows the researcher to vary the level of detail that is reported. In short, Leximancer produces analysis that is systematic, comprehensive and unbiased.