CHAPTER 3 THEORETICAL BACKGROUND AND METHODS
3.2 Methods
3.2.3 Data Analysis
Qualitative research focuses on systematic inquiry instead of mathematical abstraction. Qualitative data are collected through open-ended questions and interview protocols that give participants the opportunity to freely articulate their ideas and beliefs. In analysing data generated in this format, responses are not grouped according to pre-defined categories; rather, salient categories of meaning and relationships between categories are derived from the data itself through a process of inductive reasoning. The constant comparative method offers the means whereby the researcher may access and analyse these articulated perspectives so that they may be integrated in a model that seeks to explain the processes under study, in this case student perception of a QMC.
The constant comparative method involves breaking down the data into discrete units and coding them to categories. Categories arising from this method generally take two forms: those that are derived from the participants' language, and those that the researcher identifies as significant to the project's focus of inquiry [95]. Language-based categories are used to reconstruct the perspective used by subjects to conceptualize their own experiences. The goal of the researcher-identified categories is to assist the researcher in developing theoretical insights into the processes operative in the site under study. Ultimately, the process of constant comparison leads to both descriptive and explanatory categories. This is an iterative process: categories undergo content and definition changes as units and incidents are compared and categorized, and as understandings of the properties of categories and the relationships between categories are developed and refined. By continually comparing specific incidents in the data, the researcher refines these categories, identifies their properties, explores their relationships to one another, and integrates them into a coherent explanatory model.
All the interviews were divided into statements, each consisting of a single sentence or idea. Groups of statements were assigned a topic based on the subject of the discussion in which they were located. We consider the tone of a comment to be its expression of meaning, feeling, or spirit. The codes for comment tone are as follows:
+ for a comment that is positive with respect to the subject.
N for a comment that is neutral, mixed positive and negative with respect to the subject, or off-topic.
- for a comment that is negative with respect to the subject.
Interviews were also divided by interviewee group, and a separate coding was performed based on the content of responses to interview questions. Statements that had similar language in response to a question or statement were built into categories. All participants had their names and identifying information removed from the transcripts. A colleague who is unaffiliated with the research performed identical coding on a sample of the transcript data in an effort to assess the consistency of the codes generated. A meeting was arranged to compare individual coding of the sample data, and differences were discussed and mitigated. When consistency reached 95% of the sample content, these codes were considered adequate and applied to the entire body of transcript data.
The first step of qualitative analysis is the implementation of the Constant Comparison method. The Key-Words-in-Context (KWIC) method of category development was followed for qualitative analysis of the interview data [96, 97]. All the interviews were divided into statements, each consisting of a single sentence or idea. Groups of statements were assigned a topic based on the subject of the discussion in which they were located. Interview codes were developed for both interviewee groups at the same time based on the content of responses to interview questions. Statements that had similar language in response to a question or statement were built into categories.
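The text does not describe how the KWIC step was carried out in practice. As a minimal sketch, assuming statements are available as plain strings, the following Python shows the general idea of listing each occurrence of a key word together with a few words of surrounding context. The function name `kwic`, the `window` parameter, and the example statements are hypothetical and are not taken from the transcript data.

```python
import re


def kwic(statements, keyword, window=4):
    """Return (left context, key word, right context) for each occurrence."""
    hits = []
    for statement in statements:
        # Split the statement into words, dropping punctuation.
        words = re.findall(r"[A-Za-z']+", statement)
        for i, word in enumerate(words):
            if word.lower() == keyword.lower():
                left = " ".join(words[max(0, i - window):i])
                right = " ".join(words[i + 1:i + 1 + window])
                hits.append((left, word, right))
    return hits


# Hypothetical statements, invented for illustration only.
statements = [
    "Linear algebra was necessary before taking the course.",
    "I think differential equations is a good idea but not necessary.",
]

for left, word, right in kwic(statements, "necessary"):
    print(f"{left:>30} [{word}] {right}")
```

Reading the key word in its context makes it easier to tell apart statements that share language but differ in meaning, such as "necessary" and "not necessary", before they are built into categories.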
The categories are straightforward for prerequisite material: necessary, a “good idea”, and not necessary. For conceptual questions, all responses were collectively grouped into common responses as codes. Identifying information was removed from the transcripts, and pseudonyms were assigned. A colleague who is unaffiliated with the research performed identical coding on a sample of the transcript data in an effort to assess the consistency of the codes generated. Individual codings of the sample data were compared, and differences were discussed and mitigated. When consistency reached 95% of the sample content, these codes were considered adequate and applied to the entire body of transcript data.
Data from the survey is organized by topics in physics, topics in mathematics, and QMC content. In order to compare the survey data to interview data, the topics needed to be categorized by the course where they would most likely be seen. This categorization was conducted with the assistance of a different unaffiliated colleague who was more familiar with the content of the undergraduate mathematics courses. When categories were in at least 95% agreement, the categorization was considered valid. This allows the courses mentioned by faculty and students in the interview data to be compared in a meaningful way to the survey data taken after students have completed the QMC. For the course content data, topics were listed by the number of times the topic was suggested to a given category.
All the interviews were divided into statements, each consisting of a single sentence or idea. Groups of statements were assigned a topic based on the subject of the discussion in which they were located. Interview codes were developed for both interviewee groups at the same time based on the content of responses to interview questions. Statements that had similar language in response to a question or statement were built into categories. For questions of preparation and confidence, all responses were collectively grouped into common responses as codes. All participants have had their names and identifying information removed from the transcripts. A colleague who is unaffiliated with the research performed identical coding on a sample of the transcript data in an effort to assess the consistency of the codes generated. Any differences were discussed and mitigated. When consistency reached 95% of the sample content, these codes were considered adequate and applied to the entire body of transcript data.
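The 95% consistency criterion can be illustrated with a short calculation. The text does not state how consistency was computed; the sketch below assumes simple percent agreement, that is, the fraction of sampled statements to which both coders assigned the same code. The function names and the example codes are hypothetical; only the 0.95 threshold comes from the text.

```python
THRESHOLD = 0.95  # consistency level reported in the text


def percent_agreement(codes_a, codes_b):
    """Fraction of statements that two coders coded identically."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty sample")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def disagreements(codes_a, codes_b):
    """Indices of statements to bring to the reconciliation meeting."""
    return [i for i, (a, b) in enumerate(zip(codes_a, codes_b)) if a != b]


# Hypothetical tone codes for a sample of 20 statements.
researcher = ["+", "N", "-", "+", "+", "N", "-", "-", "+", "N"] * 2
colleague = list(researcher)
colleague[7] = "N"  # a single disagreement, invented for illustration

agreement = percent_agreement(researcher, colleague)
print(f"agreement = {agreement:.2f}")
print("statements to discuss:", disagreements(researcher, colleague))
print("codes adequate:", agreement >= THRESHOLD)
```

Listing the indices of the disagreeing statements gives the two coders a concrete agenda for discussing and mitigating differences before the codes are applied to the full set of transcripts.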
Using the student responses and a correct solution to each problem, rubrics were generated from mathematics and physics content. Each problem was also rated using a revised version of Bloom’s Taxonomy [88, 90]. Bloom’s Taxonomy was meant to be used as a common categorization to determine the cognitive goals of an exercise, curriculum, or sequence of courses [89]. The rubrics allow analysis of the frequency of individual rubric points and of the mathematics- and physics-oriented scores on each problem.
Student notes were also collected in an effort to determine the amount of coverage various topics had over the course of the semester. Anecdotal evidence suggested that the course material had not changed appreciably in the past 5 years. To validate this claim, three sets of notes were compared for pacing and content; the content was identical for 89% of the topics in the notes. The pacing was the same within one week for all three sets of notes. This was deemed acceptable continuity between instances of the QMC.
Research associated with the QMCS suggests that it is best used at a junior level and that there is little conceptual knowledge gained in senior-level or graduate QMCs [66, 86]. This motivated the use of the QMCS as a pre-instruction diagnostic for the QMC and a post-instruction diagnostic for the modern physics course at the LRSU in the southeast. Other such tools have been created, but are too specific in content or ask questions at a level that is not expected of students entering the QMC [67]. Classical test theory (CTT) was used to determine the reliability and validity of using the diagnostic to evaluate student conceptual knowledge after taking Modern Physics I and before Quantum Mechanics. The QMCS was considered a strong diagnostic tool for the population of students that had completed junior-level quantum mechanics material, but had not yet begun a senior-level quantum mechanics course.
There are four metrics commonly used in PER and other fields to determine the validity and reliability of a diagnostic tool: item difficulty, item discrimination, point biserial coefficient, and Ferguson’s delta. The expressions for these metrics are shown in equations 3.1–3.4.
\[ P = \frac{N_1}{N} \tag{3.1} \]
\[ D = \frac{N_H - N_L}{N/4} \tag{3.2} \]
\[ r_{pbi} = \frac{\bar{X}_1 - \bar{X}_0}{\sigma_X} \sqrt{P(1 - P)} \tag{3.3} \]
\[ \delta = \frac{N^2 - \sum f_i^2}{N^2 - \frac{N^2}{K+1}} \tag{3.4} \]
The difficulty (P) of a question should practically range between 0.3 and 0.9, depending on what fraction of the students tested are expected to answer correctly [98]. Discrimination (D) is the measure of how well a question discriminates a high-achieving student from a low-achieving student. It is computed as the difference between the number of correct responses in the upper and lower quartiles of the population, divided by a quarter of the population. The point biserial coefficient (rpbi) is a measure of individual item reliability. This can be interpreted as how well success on the item will predict success on the entire test. Finally, Ferguson’s delta (δ) is a measure of the reliability of the entire test. It is determined by how broadly scores are distributed over the range of possible scores on the test [98].
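The four metrics can be computed directly from a table of scored responses. The following Python is a minimal sketch under stated assumptions, not the analysis code used in this study. It reads the symbols in equations 3.1–3.4 in the usual CTT sense: N is the number of students, N_1 the number answering the item correctly, N_H and N_L the numbers of correct responses in the top and bottom quartiles by total score, X̄_1 and X̄_0 the mean total scores of students who answered the item correctly and incorrectly, σ_X the standard deviation of total scores (taken here as the population value), f_i the number of students with total score i, and K the number of items. The response matrix is hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def item_difficulty(item):
    """Eq. 3.1: P = N_1 / N, the fraction answering the item correctly."""
    return sum(item) / len(item)


def item_discrimination(item, totals):
    """Eq. 3.2: D = (N_H - N_L) / (N / 4)."""
    n = len(item)
    quarter = n // 4  # assumes N is a multiple of 4 with no ties at the cut-offs
    ranked = sorted(range(n), key=lambda s: totals[s])
    n_low = sum(item[s] for s in ranked[:quarter])
    n_high = sum(item[s] for s in ranked[-quarter:])
    return (n_high - n_low) / (n / 4)


def point_biserial(item, totals):
    """Eq. 3.3; needs at least one correct and one incorrect response."""
    p = item_difficulty(item)
    mean_1 = mean(t for x, t in zip(item, totals) if x == 1)
    mean_0 = mean(t for x, t in zip(item, totals) if x == 0)
    return (mean_1 - mean_0) / pstdev(totals) * math.sqrt(p * (1 - p))


def fergusons_delta(totals, n_items):
    """Eq. 3.4, with f_i the number of students at each total score."""
    n = len(totals)
    sum_f_sq = sum(f * f for f in Counter(totals).values())
    return (n * n - sum_f_sq) / (n * n - n * n / (n_items + 1))


# Hypothetical scored responses: rows are students, columns are items,
# 1 = correct and 0 = incorrect. Invented for illustration only.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
totals = [sum(row) for row in responses]
n_items = len(responses[0])

for k in range(n_items):
    item = [row[k] for row in responses]
    print(k + 1, item_difficulty(item), item_discrimination(item, totals),
          round(point_biserial(item, totals), 3))
print("delta:", round(fergusons_delta(totals, n_items), 3))
```

With real data, ties in total score at the quartile boundaries would need an explicit rule for assigning students to the upper and lower groups; the sketch sidesteps this by using an example without such ties.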