Validity and reliability within the mixed methods design

Mertens (2005) urges researchers to establish ways to evaluate the quality of their research in terms of its credibility and trustworthiness. Because this is a mixed methods study, the standards of reliability, validity and objectivity relevant to quantitative methods need to be considered together with the standards of dependability, credibility and confirmability that are based on the interpretive paradigm. In addition, Creswell and Plano Clark (2007) identify potential threats to the validity of sequential designs in mixed methods research.

Firstly, issues connected with the reliability, validity and objectivity of the quantitative data collection will be examined. This will be followed by a discussion of parallel issues in the qualitative design. Finally, these issues will be considered in relation to the mixed methods design.

Quantitative issues

Reliability

Reliability refers to the consistency of the instrument and is used to evaluate unsystematic errors that can arise from within the participant, from the conditions of the administration, or from changes in the measurement instrument (Mertens, 2005). In order to give a measure of reliability, the descriptive statistics of mean and standard deviation have been calculated for each item. For each scale, a Cronbach's coefficient alpha statistic has been calculated. Mertens (2005) indicates that an alpha ranging from 0.75 to 0.95 is acceptable and encourages researchers to discuss possible sources of error. Most of the scales fell within this range. In the adapted R-SPQ-2f (Biggs et al., 2001), the values of the alpha coefficient ranged from 0.79 to 0.86 for the 10-item approaches to learning scales and from 0.59 to 0.77 for the five-item motive and strategy subscales. The values for the adapted PALS were lower, ranging from 0.57 for three items to 0.81 for five items. This compares with the alpha coefficients for the original PALS, which ranged from 0.70 to 0.89.
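As an illustration of how a coefficient alpha of this kind is obtained, the following is a minimal sketch of the standard Cronbach's alpha calculation. It is not the procedure used in this study, where the statistics were produced in SPSS. The function name `cronbach_alpha` and the small response matrix are hypothetical and are included only to show the arithmetic.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Return Cronbach's alpha for one scale.

    responses: list of rows, one row per participant, each row holding
    that participant's scores on the k items of the scale.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if len(responses) < 2:
        raise ValueError("alpha needs at least two participants")

    # Sample variance of each item (column) across participants.
    item_variances = [variance(column) for column in zip(*responses)]

    # Sample variance of each participant's total score on the scale.
    total_variance = variance([sum(row) for row in responses])

    # alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: three participants answering a two-item scale.
example = [[1, 2], [2, 4], [3, 6]]
print(round(cronbach_alpha(example), 3))  # approximately 0.889
```

Because the formula depends on the number of items, scales with only three or five items tend to give lower alpha values than longer scales with similar inter-item correlations, which is worth bearing in mind when reading the subscale values reported above.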

In this case, possible sources of error could have arisen from two areas. One is the participants' understanding of the items. Although the items were based on an existing survey and were reviewed and trialled, it is possible that misunderstandings of what was being asked could have occurred because the items were in English, which is not the participants' first language. Participants may also have had different individual understandings of what was meant by an essay question. In addition, some of the participants noted in correspondence after the survey that some of the items seemed repetitive. Although the whole survey did not take more than 20 minutes to complete when trialled, it had a large number of items and some participants may have found these tedious. This may have affected the accuracy and reliability of their responses.

Another possible source of error may have arisen from the length of time between the first participants completing the survey and the final participants completing it. The survey questions were concerned with attitudes to motivation and assessment. Because the survey was administered in the period leading up to and during the examination period, there may be some variation in the way students viewed motivation and assessment. If the same survey is administered to a student in the period leading up to an examination rather than in the period immediately following it, feelings such as anticipation or relief might blur self-reports of study habits and motivation.

Validity

Confirmatory factor analysis can provide evidence for construct validity (Mertens, 2005). While the sample in this study was too small to successfully perform confirmatory factor analysis, both of the survey instruments were based on existing, validated instruments. Hence the procedure for validating each instrument will be discussed.

In the case of the section of the survey based on PALS, Midgley et al. (2000) reported that the scales were based on a framework arising from previous research (Ames, 1992; Elliot & Harackiewicz, 1996; Skaalvik, 1997, cited in Midgley et al., 2000, p. 2). Validation of an earlier version of PALS was conducted on the subscales of personal goal structure and classroom goal structure using confirmatory factor analysis (Middleton & Midgley, 1996), and further validation was undertaken of the version used here (Midgley et al., 2000).

In the case of the second part of the survey, items were based on the R-SPQ-2f (Biggs et al., 2001), a revision of the SPQ initially developed by Biggs (1987). The questionnaire was tested and refined with 229 Hong Kong university students and then tested on 495 undergraduate students in Hong Kong using confirmatory factor analysis, which revealed “that the final version of the testing had very good psychometric properties” (Biggs et al., 2001, p. 145).

While the validation of the original instruments provides some support for the validity of the adapted questionnaires used in the survey, it is important to recognise that the lack of validation of the adapted questionnaires contributes to the limitations of this study. However, Professor Kember's comment supports the view that this version of the SPQ (Biggs et al., 2001) is suitable for this group of participants and supports the argument for the validity of the survey:

It is ideal for what you want. It would be better than the old version of the SPQ or instruments based on the ASI because the R-SPQ-2f was modified to take into account what has been learnt about the Chinese learner and approaches to learning in the past few years. (D. Kember, personal communication, March 9, 2007)

Objectivity

Mertens (2005) states that objectivity is determined by the judgement of the person who administers, scores and interprets a test. In this case, the survey was administered from a website, which allowed a uniform means of administration. The set-up of the website remained unchanged for the duration of the survey. Items were set up in such a way as to generate ordinal data that fed directly into SPSS through an Excel spreadsheet. All other parts of the survey, with the exception of the open-ended questions, generated categories through the use of drop-down menus.

Qualitative issues

Credibility

Mertens (2005) lists means by which the credibility of qualitative research may be evaluated. The first of these is prolonged and substantial engagement with the data, the participants and the context. In this study, the qualitative data were gathered over a calendar year. The length of this period enabled cycles of data gathering to be interspersed with coding and memoing. I have sought ways of engaging with the community of Chinese students at the university through social functions and more formal structures such as the Chinese Students' Association. During this period I have paid close attention to background issues in their home countries by reading print and electronic media. An example of such an issue was graduate unemployment, which may provide insights for the data analysis. Peer debriefing is another means listed by Mertens (2005) as a way of improving the credibility of qualitative research data. Discussions of the data with my cultural advisors strengthened the credibility of my interpretations.

Confirmability

Data were stored, coded and analysed using NVivo 8. This allowed data to be easily traced through codes and categories to the original sources. The use of a case book gave ready access to background information about each participant, and the query function made it possible to sort and interrogate the data.

Member checking was carried out when the data were returned to the interview participants for comment. Each participant received his or her interview transcript as an attachment to an email. In my email I thanked the participants for their time and invited them to review their transcript and make any changes that they wished. I also asked further questions and checked emerging themes with them. An example of how this was done is shown below:

Hi XXX,

Thank you for your time last week. I have written the transcript of our conversation (with a few typos too maybe!) for you to have a look at. If you do not get back to me by next Thursday, I will assume that it is okay. I was really interested in your wide definition of “achievement” and it was very helpful for me to see how study is part of your life. Your story of coming to NZ through Malaysia is rather unique and interesting too. I have one further question for you. Do you do much interacting with your friends on line or do you usually meet with them face to face (or both ways)? I hope you are having a good week and you had a nice break over Easter. Carolyn

(email 27/03/08)

Four of the participants continued to correspond with me during the process of data analysis.

While member checking is recommended for qualitative data (O'Neil et al., 2007), there are issues in this process that pertain to this study. Schwandt (2007) debates how member checking contributes to the confirmability of a qualitative study. He argues that member checking does not necessarily minimise researcher effects or enable the researcher to stand apart from the process of gathering the data. While I have reflected on my positionality during the data gathering process, this issue is especially relevant during the process of member checking. My position as a lecturer in the Faculty of Education was highlighted for the participants during member checking by symbols such as my staff email address and the automatic formal signature on my emails. The status that provided me with access to the participants may, at the same time, have compromised the participants' responses to member checks. In addition, my discourse reflected my age, education, position and language background. These are particularly relevant when the participants come from a culture where age and education give status. The action of one participant who subsequently asked me to provide feedback on one of her academic essays could be seen as evidence of positionality. However, despite the limitations of my positionality, member checking enabled this research to be “a more participative and dialogic undertaking” (Schwandt, 2007, p. 188).

Mixed methods designs

Validity or inference quality in a mixed methods study is defined by Creswell and Plano Clark (2007) as “the ability of the researcher to draw meaningful and accurate conclusions from all the data in the study” (p. 146). They identify specific threats in the data collection and data analysis stages of triangulation studies using mixed methods. Each of these will be discussed next to evaluate triangulation validity in this study.

In the area of data collection, mixed methods researchers should consider issues relating to the population that the two types of data come from, relative sample sizes, the role of contradictory results, and the introduction of potential biases (Creswell & Plano Clark, 2007). To reduce threats to validity, the quantitative and qualitative samples should be drawn from the same population. In this study, the participants in the qualitative sample were volunteers selected from those who completed the survey and so came from the same population. Creswell and Plano Clark recommend that the quantitative sample be large and the qualitative sample be small in a sequential mixed methods study. In this study, the qualitative participants were selected using principles of theoretical sampling. This process enabled sampling to be progressively refined for developing the dimensions of categories, as required by the principles of grounded theory. However, the qualitative data that came from the open-ended questions in the survey came from the same number of participants as the quantitative data. It is possible that there is potential bias through the data collection techniques in the case of the survey. Participation was voluntary, and although all students within the sampling frame had the potential to participate, it is possible that those who decided to participate were motivated to do so because they held strong views. While every effort has been made to minimise potential threats to validity in the data collection process, it is necessary to be transparent about the method and the extent to which this has been possible in this study.

In the area of data analysis for sequential mixed methods studies, Creswell and Plano Clark (2007) recommend choosing significant results to follow up and addressing issues of qualitative and quantitative validity separately. In this study, qualitative data from the interviews were analysed to produce categories that captured significant themes. These categories and their relationships were represented in the form of a diagram. Qualitative data from the survey were transformed into quantitative data by coding and were integrated into the analysis. The qualitative data from the interviews enabled terms such as “study strategies” to be constructed by the participants as they reported on their own actions.

Ethics

Ethical approval for the research was obtained from the Victoria University of Wellington Faculty of Education Ethics Committee before the study commenced. No deception was involved and informed consent was obtained from each participant. In the case of the on-line survey, the opening page was an information sheet where completion of the survey indicated consent (Appendix E). Therefore, those students who did not participate in all parts of the survey, including providing background information, were not included in the final group of participants. The survey was confidential, with an option for participants to provide their email addresses for feedback. All participants were allocated a code to preserve confidentiality. Some participants volunteered for interviews and those selected were given an additional information sheet. The purpose of the study was explained to them and they had the opportunity to ask questions. The participants had access to the interview questions at this stage. All participants who were interviewed signed a separate consent form. Participants were also assured that their decision to participate and the information that they provided would not affect their grades. Although I am a staff member of the university, I was not involved in teaching or assessing any of the participants, nor had I had any contact with the participants as a staff member prior to commencing the study. The data were stored in a password-protected file on my computer. The files on the digital recorder were deleted immediately after transcription. The cultural advisors also signed confidentiality agreements.

Chapter summary

In this chapter, I have discussed the issues that flowed from the process of identifying the beliefs that underlay the choice of methodology. A pragmatic approach allowed for mixed methods to be used, although both methods are underpinned by constructivist beliefs. Within this view, the data from both the grounded theory study and the survey revealed the process of studying in New Zealand as this experience is seen by the Chinese students. I examined how my own life experience and that of my cultural advisors had impacted on the study. A rationale for the selection of a sequential explanatory mixed methods study was presented. Each method was then considered in detail, including selection of the instruments. The processes of the grounded theory study were described and supported with extracts from memos and my research diary. The issues that arose during the integration of the methods were described. Finally, matters relating to validity, reliability and ethics were considered. This leads to the findings, which are presented in the next chapter.

