6 Discussion
6.1 Validity and reliability
First of all, it needs to be stressed that with such a small group, the study cannot offer definitive results on which areas of language learning games affect. Nor should it; the role of case studies is to pilot novel approaches and to raise questions for longer-term research. As such, the results can act as a starting point for a discussion on the topic and on whether this kind of self-reported survey approach is a feasible method for evaluating what competences are trained. For a case study such as the present thesis, discussion of validity and reliability is essential.
Validity is, by definition, a measure of how well the methods measure, or the data represent, what they are supposed to measure or represent (Newman & Benz 1998, 32). It can be further broken down³ into internal and external validity. Internal validity measures how well the researcher is able to forge a causal link between the data and the conclusions drawn from them. External validity, on the other hand, is the extent to which the results of the study can be generalised to other contexts.
³ There are other categorisations, such as test validity and face validity. For the purposes of the current study, the most important factors can be addressed under internal and external validity.
When it comes to evaluating the internal validity of the current study, the validity of the survey items needs to be addressed first. The survey items are the first step of abstraction between the respondents and the language competences that are being measured. How well do the items represent the competences they are meant to represent? The items were primarily formulated based on the descriptions of the competences in the Common European Framework of Reference, but also on other sources that sought to interpret the document for practitioners (Bailly et al. 2002). As such, they aim to express the competences in layman's terms that are intelligible to a high school student. Some of the descriptions were multifaceted or covered very broad concepts, and consequently some compromises were made. Thus, instead of encompassing all aspects of a competence, the items aim to represent an instance of the competence in the specific context of the study. In retrospect, some of the statements might have been oriented differently: for example, the item asking students whether they learned new grammatical structures could instead have asked whether they practised existing ones. Regardless, both formulations measure different sides of the same competence and provide valuable information as such.
Another aspect to consider is the subjectivity of the answers; the participants may not recognise when they are practising a competence despite the simple statements in the survey. This was partly addressed by observation that sought to provide an additional point of view alongside the participants’ self-reported learning.
Secondly, any inference made from the data (i.e. anything not directly observed) threatens the internal validity of the study. Both the current chapter and the analysis chapter above have sought to understand and explain the results in their context. This threat is inherent to any study that seeks to explain phenomena and is best countered by addressing rival explanations (Yin 2013).
The external validity of a case study is an interesting question. Denscombe grants that a case study can be generalised to an extent, given that it is conducted properly (2010, 322). Obviously, any generalisations need to be made cautiously; a study that is conducted with a small sample in a specific context can offer a perspective on the research questions but hardly conclusive evidence.
In general, the choice of method can create validity issues as well. Surveys, while productive and efficient, lack the dynamic interactivity of, say, interviews. If there are interesting trends or discrepancies in the responses, there is no way to investigate them further. This point was acknowledged when designing the study setting, and observation data was collected to facilitate interpretation of the survey data. In hindsight, more structured observation (as opposed to freeform note-taking) may have improved the accuracy of the study. However, observation did prove useful and provided necessary context for many of the survey items. Methods such as individual interviews might have provided broader data, but given the specificity of the topic, survey items were deemed more practical. Indeed, when researching informal learning, which is often unconscious, a more open approach might have yielded less accurate data. Still, given the opportunity, follow-up interviews could have clarified some of the questions left open in the analysis.
Reliability is a measure of how accurate and replicable the study is. This is sometimes considered problematic for case studies, since a case study is deeply rooted in practice, in a given context and at a given time. It need not be: considering reliability from the point of view of replicability, if we can provide full descriptions of the procedure and the steps taken, the research setting can be repeated in other similar settings (Yin 2013). Assuming case studies are granted any level of generalisability, the research settings should be transferable to other settings as well. Granted, no two settings will ever be identical when it comes to social studies, but as Riege points out, “possible differences also can provide a valuable additional source of information about cases investigated” (2003, 81).