PROCESS-ORIENTED AND PRODUCT-ORIENTED ASSESSMENT OF EXPERIMENTAL SKILLS IN PHYSICS: A COMPARISON


Nico Schreiber¹, Heike Theyßen¹ and Horst Schecker²

¹ University of Duisburg-Essen

² University of Bremen

Abstract: The acquisition of experimental skills is widely regarded as an important part of science education. Models describing experimental skills usually distinguish between three dimensions of an experiment: preparation, performance and data evaluation. Valid assessment procedures for experimental skills have to consider all three dimensions. Hands-on tests can especially account for the performance dimension. However, in large-scale assessments the analysis of students’ performances is usually based only on the products of the experiments. But does this test format sufficiently account for a student’s ability to carry out experiments? A process-oriented analysis that considers the quality of students’ actions, e.g. while setting up an experiment or measuring, provides a broader basis for assessments. At the same time it is more time-consuming. In our study we compared a process-oriented and a product-oriented analysis of hands-on tests. Results show medium correlations between both analysis methods in the performance dimension and rather high correlations in the dimensions preparation and data evaluation.

Keywords: experimental skills, assessment, process-oriented analysis, product-oriented analysis, science education

BACKGROUND AND FRAMEWORK

The acquisition of experimental skills is widely regarded as an important part of science education (e.g. AAAS 1993, NRC 2012). Thus, there is a demand for assessment tools that allow for a valid measurement of experimental skills. In our study we compared a process-oriented and a product-oriented analysis of students’ performances in hands-on tests. The test instrument refers to a specific model of experimental skills.

Modelling experimental skills

In the literature there is a broad consensus concerning typical experimental skills, like “create an experimental design”, “set up an experiment”, “observe / measure” and “interpret the results” (e.g. NRC 2012, DEE 1999). These skills can be assigned to three dimensions of an experimental investigation: “preparation”, “performance” and “data evaluation”. Most models of experimental skills are structured along these three dimensions with different accentuations (Klahr & Dunbar 1988, Hammann 2004, Mayer 2007, Emden & Sumfleth 2012).

Our model uses the three dimensions, too. In contrast to other models, it accentuates the performance dimension (Figure 1, adapted from Schreiber, Theyßen, & Schecker 2009).


At school, an experimental question is usually given to the students and not developed by them. So students have to interpret and clarify the given task. In non-cookbook types of situations, students have to create the experimental design themselves. During the performance, students set up the experiment, measure and document data. During data evaluation, students process data, give a solution and interpret the results. This description might suggest a linear order of steps. However, this is not intended. The steps can occur in different orders and loops.

Measuring experimental skills

Written tests are established instruments to assess experimental skills. But especially with regard to the performance dimension, their validity is in question (e.g. Shavelson, Ruiz-Primo, & Wiley 1999). Other approaches for the assessment of performance skills seem to be necessary (Ruiz-Primo & Shavelson 1996, Stebler, Reusser & Ramseier 1998, Garden 1999). Here, hands-on tests show their potential. However, in large-scale assessments the analysis of students’ performance in experimental tasks is usually based only on the products of experimenting, mostly documented in lab sheets (e.g. Stebler, Reusser, & Ramseier 1998, Ramseier, Labudde & Adamina 2011, Gut 2012). The processes of experimentation, like setting up the apparatus and making measurements, are considered only indirectly, insofar as they affect the products.

On the one hand, it is in question whether a product-oriented analysis, which neglects process aspects of experimenting, yields adequate ratings compared to a process-oriented analysis. On the other hand, a process-oriented analysis that considers the quality of students’ actions in the performance dimension (e.g. Neumann 2004) is very resource-consuming. In order to justify the additional effort for a process-based analysis, it has to be shown that ratings from a product-based analysis are insufficient predictors for ratings from a process-based analysis of the same hands-on test, at least for the performance phase.

Figure 1: Model of experimental skills: 3 dimensions, 6 components (adapted from Schreiber, Theyßen & Schecker 2009).


RATIONALE AND METHODS

Hypotheses

In our study we investigate correlations between ratings from a product-based and a process-based analysis of students’ performances in hands-on tests. In the performance dimension students get direct feedback from the experimental setup. A non-functional electric circuit, for example, may result in a series of changes until the setup finally works. The lab sheet will only show the final result. Similar processes may occur during measurement. A product-based analysis only evaluates the documented (i.e. usually the final) results, while a process-based approach looks at the chain of students’ actions. Our first hypothesis is:

H1: Concerning the performance dimension, ratings from a product-oriented analysis are not highly correlated with scores from a process-oriented analysis.

Students prepare the experiment and evaluate their data mostly in written form, without handling experimental devices. We assume a close relationship between what they do and what they document. Thus, our second hypothesis is:

H2: Concerning the preparation and evaluation of the experiment, ratings from a product-oriented analysis are highly correlated with scores from a process-oriented analysis.

We define a high correlation as a correlation above 0.7 (Kendall’s tau-b).
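To make this criterion concrete, Kendall’s tau-b for two lists of ordinal ratings can be computed as in the following minimal Python sketch. It implements the standard tau-b formula with tie correction; the function names and the example ratings are hypothetical and serve only as an illustration, they are not data from the study.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

HIGH_CORRELATION = 0.7  # threshold for a "high" correlation as stated in the text


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long sequences of ordinal ratings."""
    if len(x) != len(y):
        raise ValueError("both rating lists must have the same length")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # pairs tied in x or in y count as neither concordant nor discordant
    n_pairs = len(x) * (len(x) - 1) // 2
    tied_x = sum(t * (t - 1) // 2 for t in Counter(x).values())
    tied_y = sum(t * (t - 1) // 2 for t in Counter(y).values())
    denominator = sqrt((n_pairs - tied_x) * (n_pairs - tied_y))
    if denominator == 0:
        return float("nan")  # undefined if one of the rating lists is constant
    return (concordant - discordant) / denominator


def is_high(tau):
    """Apply the 0.7 criterion for a high correlation."""
    return tau > HIGH_CORRELATION


# Hypothetical ordinal ratings of six students (illustration only, not study data).
product_ratings = [1, 2, 5, 5, 1, 2]
process_ratings = [1, 3, 5, 4, 2, 2]
tau = kendall_tau_b(product_ratings, process_ratings)
print(f"tau-b = {tau:.3f}, high correlation: {is_high(tau)}")
```

Tau-b is chosen over the uncorrected tau-a here because ordinal ratings with few levels inevitably produce many tied pairs.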

Methods

Tasks

For the comparison of the product- and process-based analysis we developed two experimental tasks for the domain of electric circuits in secondary school curricula (Schreiber et al., 2012). The first task is “Here are three bulbs. Find out the one with the highest power at 6 V”. In the second task the students get a set of wires and have to find the best conductor from three metals. Students have a set of apparatus and a pre-structured lab sheet at their disposal. The lab sheet is structured along our model of experimental skills, asking students to plan a suitable experiment, assemble the experimental setup, perform measurements, evaluate the data and draw conclusions. Both tasks are open-ended and the students have to structure their paths towards the solutions on their own. They are only assisted by written information on the necessary physics content knowledge.

Design

Table 1 shows the design of the study. It was embedded in a more extensive study concerning the comparison of different assessment tools for experimental skills (Schreiber 2012).


Table 1

Design of the study

phase            content                                                              duration
pre-test         cognitive skills, content knowledge, self-concept                    45 min
training         introducing the hands-on test                                        20 min
hands-on tests   group 1: task 1 (highest power); group 2: task 2 (best conductor)    30 min

138 upper secondary students, aged about 16 to 17, took part in this study. In a pre-test we measured personal variables that are supposed to have an influence on students’ test performances: cognitive skills, self-concept concerning physics and experimenting in physics, and the content knowledge in the field of electricity. Established tests and questionnaires were adapted for this pre-test (Heller & Perleth 2000, Engelhardt & Beichner 2004, von Rhöneck 1988, Brell 2008). In the hands-on test the students worked on one of the two tasks described above. The use of two different tasks was due to the design of the more extensive project into which this study was embedded. The students were assigned to the two groups based on their pre-test results in such a way that a sufficient and similar variance of the personal variables was realized in both groups. In a training session the students were introduced to the hands-on test (structures of the tasks and handling of the devices). The training task was also taken from the domain of electric circuits (measuring the current-voltage characteristic of a bulb).

In the hands-on test, students worked with a set of electric devices and a pre-structured lab sheet (Figure 2, task 1). In the situation shown in Figure 2, the student documents his (inadequate) setup with two multimeters, a battery and a bulb in the lab sheet. The pre-structured lab sheet asks students to clarify the question, to document the setup, to perform measurements and to interpret the results. Students can choose when and in which order they fill in the sheet. The lab sheet does not specify a particular solution or approach. Students’ actions were videotaped and the lab sheets were collected.

Figure 2: A student performing task 1 in the hands-on test.

Process-oriented analysis

The videos and the lab sheets were analysed according to the components of experimenting shown in Figure 1. The process-oriented analysis leads to a quality index for each student in each of these assessment categories. In a first step students’ actions in the videotape are assigned to one of the six components. A second step of analysis codes the qualities of intermediate stages (e.g. whether an experimental setup is correct, imperfect or wrong) and the development (e.g. whether an imperfect setup is detected and improved) (cf. Schreiber, Theyßen & Schecker 2012, Theyßen et al. 2013). The flow chart in Figure 3 illustrates an example of how the rating decisions are made. The result is a quality index on an ordinal scale with five levels. To secure validity and reliability of this analysis, several studies with high-inference expert ratings and interviews were conducted (details: Dickmann 2009, Holländer 2009, Fichtner 2011, Dickmann, Schreiber & Theyßen 2012, Schreiber 2012). The evaluation of double coding yields a high objectivity of the ratings (Cohen’s Kappa .67).
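Cohen’s kappa, the agreement measure reported for the double coding, can be computed from two coders’ ratings as in the following minimal sketch. It assumes the unweighted form of kappa, which the text does not specify, and it uses hypothetical codes rather than data from the study; the function name is likewise an illustrative choice.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Unweighted Cohen's kappa for two coders rating the same cases."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of cases")
    n = len(coder_a)
    # Observed agreement: share of cases with identical codes.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: derived from each coder's marginal code frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n ** 2
    if expected == 1:
        return float("nan")  # undefined if both coders use a single identical code
    return (observed - expected) / (1 - expected)


# Hypothetical quality levels assigned by two coders to eight students
# (illustration only, not study data).
coder_1 = [1, 2, 5, 4, 3, 5, 1, 2]
coder_2 = [1, 2, 5, 5, 3, 5, 2, 2]
print(f"kappa = {cohens_kappa(coder_1, coder_2):.2f}")
```

Because the quality index is ordinal, a weighted kappa that penalises near-misses less than distant disagreements would be a plausible alternative; the unweighted form is used here only because it is the simplest.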

Product-oriented analysis

For the product-oriented analysis only the students’ documentation in the lab sheets was analysed with regard to the same six model components (skills in Figure 1). Each entry in the lab sheet is directly associated with an assessment category. The single criterion is the correctness of the entry. A development cannot be assessed since in most cases only one result is documented in the sheets. Thus, using the formal analysis scheme (Figure 3), in the product-oriented analysis only the levels 1, 2, and 5 can be scored. Again the objectivity in each assessment category is satisfactory (Cohen’s Kappa > .62).

RESULTS

To test the hypotheses, rank correlations (Kendall’s tau-b, τ) between the quality parameters from the product-oriented and the process-oriented analysis were calculated for each category (Tab. 2). In all four assessment categories that can be assigned to the preparation and the evaluation dimensions, the correlations are high (τ > .7). For the components of the performance dimension, we found only medium or low correlations (τ < .5). Thus, both hypotheses can be confirmed. The high correlations in the planning and data evaluation dimensions can be explained by the data basis: in these dimensions the process-oriented analysis also refers mainly to the documentation in the lab sheets. Only in a few cases did the videos provide further information concerning developments. Thus, regardless of the method of analysis, the scores 1, 2 and 5 dominate in the dimensions of planning and data evaluation. In contrast, in both assessment categories belonging to the performance dimension the process-oriented analysis largely profits from the videos. The videos provide relevant information about the developments while the students work on the setup and make measurements. The low correlations between the process-oriented and the product-oriented analysis are obviously caused by an information gap between the documented setups and measurements on the one hand and the actual setup and measurement processes in the video on the other hand.
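The check of both hypotheses against the 0.7 criterion can be retraced from the values reported in Table 2. The following sketch does this per dimension; the correlations and sample sizes are taken from Table 2, while the variable and function names are illustrative assumptions.

```python
HIGH_CORRELATION = 0.7  # criterion for a high correlation (Kendall's tau-b)

# (dimension, assessment category, tau-b, n) as reported in Table 2
TABLE_2 = [
    ("preparation", "apprehend the task", 0.877, 138),
    ("preparation", "create the experimental design", 0.728, 138),
    ("performance", "set up the experiment", 0.499, 130),
    ("performance", "perform & document measurements", 0.221, 122),
    ("data evaluation", "process the data & give a solution", 0.960, 122),
    ("data evaluation", "interpret the results", 0.775, 117),
]


def high_by_dimension(rows, threshold=HIGH_CORRELATION):
    """Return, per dimension, whether all of its categories exceed the threshold."""
    flags = {}
    for dimension, _category, tau, _n in rows:
        flags.setdefault(dimension, []).append(tau > threshold)
    return {dimension: all(values) for dimension, values in flags.items()}


result = high_by_dimension(TABLE_2)
for dimension, all_high in result.items():
    print(f"{dimension}: all correlations high = {all_high}")
```

H1 corresponds to the performance dimension not meeting the criterion, H2 to preparation and data evaluation meeting it.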

A further result can be derived from Table 2: the sample size per category decreases over the course of the experiment. Whereas 138 students clarified the questions and created an experimental design in the beginning, only 117 students interpreted the results in the end. This is a noticeable dropout of about 15 %. The reason is the use of an open task format (Fig. 2). The students had to structure the approach on their own and without any assistance. Students who, for example, did not complete the setup were subsequently unable to measure and to document data.
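The dropout figure follows directly from the sample sizes in Table 2. As a brief worked example, the sketch below computes the cumulative dropout after each component; the sample sizes are those reported in Table 2, and the names used in the code are illustrative assumptions.

```python
# Sample sizes per assessment category, in the order of Table 2.
SAMPLE_SIZES = [
    ("apprehend the task", 138),
    ("create the experimental design", 138),
    ("set up the experiment", 130),
    ("perform & document measurements", 122),
    ("process the data & give a solution", 122),
    ("interpret the results", 117),
]


def dropout_percent(n_start, n_now):
    """Share of the initial sample that is no longer rated, in percent."""
    return 100 * (n_start - n_now) / n_start


n_start = SAMPLE_SIZES[0][1]
for category, n in SAMPLE_SIZES:
    print(f"{category}: n = {n}, cumulative dropout = {dropout_percent(n_start, n):.1f} %")
```

For the last category this gives (138 − 117) / 138, i.e. roughly the 15 % stated in the text.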

CONCLUSIONS

We draw two conclusions from our results:

1. Comparison of a process-oriented and a product-oriented analysis

A product-oriented analysis seems to be sufficient to analyse students’ skills of preparing an experiment and evaluating data. But in order to account for performance skills adequately, hands-on tests with a process-oriented analysis of students’ actions seem to be necessary. These findings should be considered for the development of more valid assessment procedures.

2. Open task format and sample size

The use of an open task format in testing experimental skills (“Find out …”) causes a noticeable dropout of students during the test. For assessing the full range of experimental skills, we suggest a guided test with non-interdependent sub-tasks. Each sub-task should refer to a specific experimental skill. To allow for a non-interdependent assessment, the item should present a sample solution of the preceding step, e.g. a measurement item should provide a complete experimental setup. We have started to work on such a test format.

Table 2

Correlations (Kendall’s tau-b, τ) between a product-oriented and a process-oriented analysis. The correlations are highly significant (**) or significant (*). The assessment categories are assigned to the three experimental dimensions: preparation, performance and data evaluation. n: sample size.

dimension         assessment category                    τ        n
preparation       apprehend the task                     .877**   138
preparation       create the experimental design         .728**   138
performance       set up the experiment                  .499**   130
performance       perform & document measurements        .221**   122
data evaluation   process the data & give a solution     .960**   122
data evaluation   interpret the results                  .775**   117


REFERENCES

American Association for the Advancement of Science (AAAS) (Ed.) (1993). Benchmarks for Science Literacy. New York: Oxford University Press.

Brell, C. (2008). Lernmedien und Lernerfolg - reale und virtuelle Materialien im Physikunterricht. Empirische Untersuchungen in achten Klassen an Gymnasien zum Laboreinsatz mit Simulationen und IBE. In H. Niedderer, H. Fischler & E. Sumfleth (Eds.), Studien zum Physik- und Chemielernen, Vol. 74. Berlin: Logos.

Department for Education and Employment (DEE) (Ed.) (1999). Science - The National Curriculum for England. London: Department for Education and Employment.

Dickmann, M. (2009). Validierung eines computergestützten Experimentaltests zur Diagnostik experimenteller Kompetenz (unpublished bachelor thesis). Dortmund: Technische Universität Dortmund.

Dickmann, M., Schreiber, N. & Theyßen, H. (2012). Vergleich prozessorientierter Auswertungsverfahren für Experimentaltests. In S. Bernholt (Ed.), Konzepte fachdidaktischer Strukturierung für den Unterricht (pp. 449–451). Münster: LIT.

Emden, M. & Sumfleth, E. (2012). Prozessorientierte Leistungsbewertung des experimentellen Arbeitens. Zur Eignung einer Protokollmethode zur Bewertung von Experimentierprozessen. Der mathematische und naturwissenschaftliche Unterricht (MNU), 65 (2), 68-75.

Engelhardt, P. V. & Beichner, R. J. (2004). Students’ understanding of direct current resistive electrical circuits. American Journal of Physics 72 (1), 98–115.

Fichtner, A. (2011). Validierung eines schriftlichen Tests zur Experimentierfähigkeit von Schülern (unpublished master thesis). Bremen: Universität Bremen.

Garden, R. (1999). Development of TIMSS Performance Assessment Tasks. Studies in Educational Evaluation 25(3), 217–241.

Gut, C. (2012). Modellierung und Messung experimenteller Kompetenz. Analyse eines large-scale Experimentiertests. In H. Niedderer, H. Fischler & E. Sumfleth (Eds.), Studien zum Physik- und Chemielernen, Vol. 134. Berlin: Logos.

Hammann, M. (2004). Kompetenzentwicklungsmodelle: Merkmale und ihre Bedeutung - dargestellt anhand von Kompetenzen beim Experimentieren. Der mathematische und naturwissenschaftliche Unterricht 57(4), 196–203.

Heller, K. A. & Perleth, C. (2000). Kognitiver Fähigkeitstest für 4.-12. Klassen, Revision (KFT 4-12+ R). Göttingen: Hogrefe.

Holländer, L. K. (2009). Validierung eines Experimentaltests mit Realexperimenten zur Diagnostik experimenteller Kompetenz (unpublished bachelor thesis). Dortmund: Technische Universität Dortmund.

Klahr, D. & Dunbar, K. (1988). Dual Space Search During Scientific Reasoning. Cognitive Science 12, 1–48.

Neumann, K. (2004). Didaktische Rekonstruktion eines physikalischen Praktikums für Physiker. In H. Niedderer, H. Fischler & E. Sumfleth (Eds.), Studien zum Physik- und Chemielernen, Vol. 38. Berlin: Logos.


Mayer, J. (2007). Erkenntnisgewinnung als wissenschaftliches Problemlösen. In D. Krüger & H. Vogt (Eds.), Theorien in der biologiedidaktischen Forschung (pp. 177-186). Berlin, Heidelberg: Springer.

National Research Council (NRC) (Ed.) (2012). A Framework for K-12 Science Education: Practices, Crosscutting Concepts, and Core Ideas. Washington, DC: The National Academies Press.

Ramseier, E., Labudde, P. & Adamina, M. (2011). Validierung des Kompetenzmodells HarmoS Naturwissenschaften: Fazite und Defizite. Zeitschrift für Didaktik der Naturwissenschaften, 17, 7–33.

Ruiz-Primo, M. A. & Shavelson, R. J. (1996). Rhetoric and Reality in Science Performance Assessments: An Update. Journal of Research in Science Teaching 33 (10), 1045–1063.

Schreiber, N. (2012). Diagnostik experimenteller Kompetenz - Validierung technologiegestützter Testverfahren im Rahmen eines Kompetenzstrukturmodells. In H. Niedderer, H. Fischler & E. Sumfleth (Eds.), Studien zum Physik- und Chemielernen, Vol. 139. Berlin: Logos.

Schreiber, N., Theyßen, H. & Schecker, H. (2009). Experimentelle Kompetenz messen?! Physik und Didaktik in Schule und Hochschule, 8 (3), 92-101.

Schreiber, N., Theyßen, H. & Schecker, H. (2012). Experimental Competencies In Science: A Comparison Of Assessment Tools. In C. Bruguière, A. Tiberghien & P. Clément (Eds.), E-Book Proceedings of the ESERA 2011 Conference, Lyon France. Retrieved from: http://www.esera.org/media/ebook/strand10/ebook-esera2011_SCHREIBER-10.pdf (29.11.2013).

Shavelson, R. J., Ruiz-Primo, M. A. & Wiley, E. W. (1999). Note on Sources of Sampling Variability in Science Performance Assessments. Journal of Educational Measurement, 36 (1), 61-71.

Stebler, R., Reusser, K. & Ramseier, E. (1998). Praktische Anwendungsaufgaben zur integrierten Förderung formaler und materialer Kompetenzen - Erträge aus dem TIMSS-Experimentiertest. Bildungsforschung und Bildungspraxis 20 (1), 28–54.

Theyßen, H., Schecker, H., Gut, C., Hopf, M., Kuhn, J., Labudde, P., Müller, A., Schreiber, N. & Vogt, P. (2013). Modelling and Assessing Experimental Competencies in Physics. In C. Bruguière, A. Tiberghien & P. Clément (Eds.), 9th ESERA Conference Contributions: Topics and trends in current science education - Contributions from Science Education Research (pp. 321–337). Dordrecht: Springer.

von Rhöneck, C. (1988). Aufgaben zum Spannungsbegriff. Naturwissenschaften im Unterricht - Physik/Chemie, 36(31), 38–41.