The data collected for this study required both quantitative and qualitative analysis. Because teacher candidates recorded their responses on Scantron© answer sheets, each Knowledge Inventory assessment was scored by the University of Pittsburgh’s Office of Measurement and Evaluation. Descriptive data, such as question means, ranges and frequency distributions, were included in the analysis. Statistical comparisons were made between the MAT and PY teacher candidate groups as a whole.
Portions of the Survey of Perceptions were analyzed quantitatively as well. On various items within the survey, teacher candidates were asked to respond by circling the most appropriate answer, such as yes/no or always/sometimes/never. In these instances, the answers were recorded and analyzed quantitatively; descriptive data were compiled and statistical comparisons were made in a manner similar to that described above for the Knowledge Inventory assessment.
The rest of the Survey of Perceptions was analyzed qualitatively. Categories of responses had been established when the survey was constructed; teacher candidate responses were analyzed and sorted into themes. Analyses of the emerging themes within each category were conducted and reported. Miles and Huberman (1984) divide data analysis into three components: data reduction, which refers to the “process of selecting, focusing, simplifying, abstracting and transforming the ‘raw’ data”; data display, which is “an organized assembly of information that permits conclusion drawing and action taking”; and conclusion drawing or verification (p. 22).
3.5.1 Role of the Researcher
My role in this research study was to develop and administer assessment materials and to collect, analyze and describe the findings in order to answer the four guiding research questions. As a doctoral student in the Department of Instruction and Learning, I had access to many of the faculty members who worked with the MAT and PY teacher candidates; however, I was in no way directly affiliated with either program. Therefore, I was a ‘neutral’ researcher and posed no threat to the teacher candidates.
Interaction with the subjects was limited. Due to geographic constraints, I was not available to administer the Stage I assessments personally. Two doctoral students served as proctors after being trained in the administration of both assessment measures. As a result, there was little personal interaction with either group of teacher candidates during the administration of the Knowledge Inventory, Survey of Perceptions and follow-up survey; the phone interviews and ongoing interviews with faculty and staff required the most interaction between me and others.
3.5.2 Validity and Reliability
Huck (2004) asserts that “a researcher’s data are valid to the extent that the results of the measurement process are accurate” (p. 88). In other words, an assessment is valid if it measures what it purports to measure. In this study, the Knowledge Inventory was originally designed by researchers at the Florida Center for Reading Research to assess specific and general knowledge and skills about early reading instruction that were taught to K-3 Reading First teachers during 4-day Just Read, Florida! Teacher Academies. The Knowledge Inventory is a valid assessment because it measures what it set out to measure: knowledge and skills about early reading instruction. Similarly, the Survey of Perceptions was designed to draw responses from the teacher candidates that indicated, based on their experiences at the University, their perceived readiness to teach reading. There were no truly correct or incorrect answers, as this assessment was meant to be reflective. Overall, however, it is a valid assessment measure because, by its very design, it measures what it set out to measure.
Reliability is closely tied to consistency. If an assessment measure is reliable, it can be used repeatedly and the results will be consistent across administrations. Huck (2004, p. 76) points out two variations on questions of reliability that researchers often use:
o To what extent do the individual items that go together to make up a test or inventory consistently measure the same underlying characteristic?
The Knowledge Inventory and Survey of Perceptions are reliable to the extent that the individual items on each measure the same underlying characteristic. The Knowledge Inventory was developed to assess the knowledge base of kindergarten through third grade Reading First teachers. The questions were derived from research into the crucial elements of literacy instruction; questions spanning the areas of phonemic awareness, phonics, fluency, vocabulary, comprehension, assessment and instructional strategies emerged. The Survey of Perceptions was designed by this researcher and a panel of three reading experts around research on the critical elements of teacher preparation programs. Three areas were identified – coursework, field work and supportive interaction with others – and closed- and open-ended questions were developed to elicit information from teacher candidates regarding their perceptions of each.
Huck’s second statement can fit within this study if it is rephrased as “How much consistency exists among the responses provided by a group of test takers?” Technical information regarding the Knowledge Inventory collected by researchers at the Florida Center for Reading Research indicates that this assessment measure has a high degree of reliability. A total of 105 pre-tests and 119 post-tests were administered to K-3 teachers at the Just Read, Florida! Teacher Academies; researchers were able to successfully match pre- and post-tests for 70 of these teachers (67%). The alpha reliability of the Knowledge Inventory was .80 for teachers taking it before the academy and .79 after the academy. A paired-samples t-test was conducted to compare pre- and post-tests for the 70 matched exams. Results of the t-test showed a “significant increase in scores after the Academy, t(69) = 15.02, p < .001”22. Teachers increased their scores from the pre-test by nearly eight points (30% of their pre-test score). The range of scores on the pre-test was 10-40, while the range on the post-test was 20-25.
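For readers less familiar with the two statistics reported above, the following is a minimal sketch of how an alpha reliability coefficient and a paired-samples t statistic are conventionally computed. It assumes the standard Cronbach's alpha formula and the standard paired t formula; the source does not state which software or procedure the Florida Center for Reading Research used. The function names and all data values are hypothetical and serve only as illustration; they are not the Knowledge Inventory data.

```python
from math import sqrt
from statistics import mean, stdev, variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(item_scores[0])                    # number of items on the measure
    columns = list(zip(*item_scores))          # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


def paired_t(pre, post):
    """Paired-samples t statistic and degrees of freedom for matched scores."""
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1                       # df = number of matched pairs - 1


# Hypothetical data: 5 respondents answering 4 items scored 0 (wrong) or 1 (right).
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(responses)

# Hypothetical matched pre- and post-test totals for 5 test takers.
pre_scores = [20, 25, 30, 22, 28]
post_scores = [28, 33, 37, 31, 35]
t_stat, df = paired_t(pre_scores, post_scores)

print(f"alpha = {alpha:.2f}")
print(f"t({df}) = {t_stat:.2f}")
```

The degrees of freedom equal the number of matched pairs minus one, which is why 70 matched exams are reported as t(69).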