Reliability and validity

CHAPTER 2: STUDY DESIGN & METHODS

2.5. Reliability and validity

Validity in qualitative research is less a statistical process than a process of accurately capturing reality as perceived by participants. I therefore pursued validity by adopting practices that maximized my ability to perceive that reality. I adopted a checklist by Maxwell (2005) of strategies to maximize validity in qualitative work. This required me to be intensively involved in my field sites; I spent many hours interacting with the sites and the educators outside of the programs. I also collected “rich” data by transcribing many of my tapes, solicited respondent validation from key informants and educators, focused on discrepant evidence and negative cases, triangulated data on the institution, produced quasi-statistics (a simple way to detect prevalence), and, of course, compared sites. I also followed practices outlined by Yin to maximize validity in case study work: using multiple sources of evidence to create a chain of evidence (construct validity), using multiple cases as replication logic (external validity), and ensuring replicability by documenting and systematizing my process as much as possible (Yin 2009). Qualitative research calls for a sufficiently detailed description of the methods, data, and biases inherent to the research so that validity can be assured (GAO 1996).

Post-normal science also acknowledges the researcher as a participant in the research; thus, I assume that my presence influenced the study system. Throughout my study, I remained aware of and actively disclosed my biases, personal limitations, and investment in the subject matter (Dwyer and Limb 2001). I did this by keeping a journal about my own reactions to the data collection and analysis process and by disclosing my own experience as an environmental educator and CTR animal care volunteer during interviews and LAIEs in which the topic arose.

For the survey, I engaged in several practices to maximize reliability and validity. First, I aimed for 100% participation in the survey during my study period in order to minimize sampling error. To achieve high levels of participation, I incentivized learners with a raffle for a membership at each site; this helped capture both the participants who were interested in research and those with different motivations. The comparative nature of my study also helped correct for sampling biases, as theoretically the same challenges existed at all three sites.

I considered potential method biases when designing the survey. These included common rater effects (artificial covariance between two questions because the respondent is the same person), item characteristic effects (when people select an item because of its properties, such as its social desirability), item context effects (interpreting an item based on the other items present on the survey), and measurement context effects (asking questions at the same time, in the same location, or under the same conditions) (Podsakoff et al. 2003). I corrected for these issues as best I could by using validated scales from the literature, as they have been tested on larger populations for construct validity. I also used scales from different parts of the literature for each VBN variable, because previous work has shown that, although earlier research has used the compatibility principle (i.e., where the subject for each section is identical) to create a logical chain along VBN, common rater effects and measurement context effects bias those studies (Kaiser et al. 2007). This means that using diverse constructs representing different aspects of the VBN chain is preferable. Most importantly, using a comparative lens helps correct for these issues, as they will be similar across sites. Shadish et al. (2002) provide an exhaustive list of threats to internal, external, construct, and conclusion validity, and I used that list as well as the list by Podsakoff et al. (2003) in the design of my survey instrument.

After the preliminary research, I conducted two focus groups in order to better understand respondent interpretations of each question. The focus groups were conducted mostly to work through the survey content, but we also discussed environmental values and beliefs generally, as small groups provide a useful medium in which to explore values (Burgess, Limb and Harrison 1988a). These focus groups helped me better understand how the survey metrics align with participant reality. Through the groups, I found that there is enough nuance in an individual’s values, beliefs, and norms that the scales must be considered useful but imperfect measures of VBN. Between the first and second focus groups, I also ran a pre-test of my instrument at two of the three sites (n=31) to ensure that the survey was capturing variation and that it was feasible to complete before programs at each site.

Interviews and surveys together represented the ideal way for me to detect emergent themes and to understand the distribution of visitor characteristics across three sites. Because interviews allow subjects to recall and construct their own history and meaning (Madison 2005), interviews uncover the nuances and depth of the visitors’ environmental histories. By administering the same survey at all three field sites (see Appendix E), I can systematically compare visitor populations across sites for both demographic characteristics and for environmental values, beliefs, and norms using metrics from the literature. While these two types of data collection reinforce each other, I limit comparisons across them, as they each help me understand different aspects of the learners at each site.


In document Caplow_unc_0153D_14881.pdf (Page 57-61)
