
4 Design and Implementation

5.2 Classroom evaluation study 1

An experiment was designed and conducted to evaluate the effectiveness of KERMIT and its contribution to learning. The study involved students learning database modelling, and compared their learning using the fully functional KERMIT against a control group that used a cut-down version of KERMIT. The system used by the control group provided no feedback except for the complete solution, which would be similar to a classroom situation where students only get to see the ideal solution. The study also assessed the students’ perception of the two systems.

The evaluation took place at Victoria University, Wellington (VUW). Twenty-eight volunteers from among the students enrolled in the Database Systems course (COMP302), offered by the School of Mathematical and Computer Sciences at VUW, participated. The course is offered as a third-year computer science paper, and teaches ER modelling as defined by Elmasri and Navathe [Elmasri & Navathe, 1994]. The students who participated in the study had previously learnt database modelling concepts in the lectures and labs of the course.

During the design phase of the study, we considered other options for the participant population, such as mature students from a continuing education course. Although they can be viewed as a suitable population for the evaluation study since they are generally highly motivated, they typically have limited experience using computers. They might therefore struggle with the interface of KERMIT, which would distract them from the goal of learning ER modelling. Moreover, since they are not typical students learning ER modelling, the results could not be generalised to typical students.

5.2.1 Process

The study was conducted in the computer laboratories of the School of Mathematical and Computer Sciences at VUW. The experiment involved two versions of KERMIT:

• Experimental group - KERMIT with comprehensive feedback capabilities;

• Control group - a trimmed down version of KERMIT that only offered the complete solution as feedback (named ER-Tutor).

The interfaces of both systems were similar, except that ER-Tutor lacked the option of selecting feedback and the feedback textbox. The study involved two one-hour sessions in succession. As participants randomly chose a session, the students in the first session were treated as the experimental group and those in the second session as the control group.

Although the study was planned for a two-hour session, each group was only given an hour due to miscommunication between us and the lecturer of COMP302, and resource shortages at VUW. Moreover, the lab allocated to the evaluation study contained only 18 terminals, which was insufficient for a population of 28 students.

Each session proceeded in four distinct phases:

• Pre-test

• Interacting with the system (KERMIT/ER-Tutor)

• Post-test

• Questionnaire

Initially each student was given a document that contained a brief overview of the study and a consent form. After signing the consent form, the students sat a pre-test. The time allowed for completing the tests was not strictly supervised: as soon as the students completed their pre-test, they started interacting with the system. Students were asked to stop interacting with the system approximately 45 minutes into the study, as the study had to be concluded within an hour. At the end of the interaction, participants were given a post-test and a questionnaire.

The following subsections explain each of the four phases of the experiment in depth, discussing the design decisions made.

5.2.2 Pre- and post-test

A pre-test and a post-test were included in the study to evaluate the students’ knowledge of ER modelling before and after interacting with the system. The pre-test verifies that the experimental and control groups have similar knowledge of ER modelling prior to the study, and hence that the two groups are comparable. Students’ learning during the session can be quantified by comparing the results of both tests. If students, on average, score higher in the post-test, this suggests that they acquired knowledge by interacting with the system. The increases or decreases in score of the two groups (experimental and control) can be compared to approximate the difference in effectiveness of KERMIT and the control system. In other words, if the experimental group shows a greater improvement in the post-test than the control group, it suggests that students learn more interacting with KERMIT than with ER-Tutor (the control system).
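The comparison described above can be sketched in a few lines of Python. This is only an illustration of the gain-score idea, not the analysis used in the study: the function names are hypothetical and the scores are made-up placeholders.

```python
from statistics import mean


def gains(pre, post):
    """Per-student gain: post-test score minus pre-test score."""
    if len(pre) != len(post):
        raise ValueError("pre and post must list the same students")
    return [after - before for before, after in zip(pre, post)]


def mean_gain_difference(exp_pre, exp_post, ctl_pre, ctl_post):
    """Mean gain of the experimental group minus mean gain of the control group."""
    return mean(gains(exp_pre, exp_post)) - mean(gains(ctl_pre, ctl_post))


# Placeholder scores (hypothetical, NOT data from the study).
exp_pre, exp_post = [4, 5, 6], [7, 7, 8]
ctl_pre, ctl_post = [5, 4, 6], [6, 5, 6]

diff = mean_gain_difference(exp_pre, exp_post, ctl_pre, ctl_post)
print(f"difference in mean gain: {diff:.2f}")
```

A positive difference would point in the direction of the experimental system being more effective; whether such a difference is statistically meaningful is a separate question that this sketch does not address.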

Since the results of the pre- and post-test are compared to assess knowledge acquisition, both the pre-test and the post-test should either be identical or of a similar complexity. Although ideally the pre- and post-tests should be identical for comparison, students may recall test questions in their second attempt. To minimise any prior learning effects we designed two tests (A and B) that contained different questions, but of approximately the same complexity. In order to reduce any bias on either test A or B, the first half of each group was given test A as the pre-test and the remainder were given B as the pre-test. The students who had test A as their pre-test were given test B as their post-test and vice versa. Therefore the effect of bias from a particular test was reduced.
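The counterbalancing scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and student identifiers are hypothetical, and the handling of odd-sized groups (the smaller half receives test A first) is an assumption, since the text does not say how such groups were split.

```python
def assign_tests(group):
    """Counterbalance tests A and B within one group.

    The first half of the group sits test A as the pre-test and test B as
    the post-test; the remainder sit B first and A second.
    Assumption: for an odd-sized group the first "half" is rounded down.
    """
    half = len(group) // 2
    plan = {}
    for index, student in enumerate(group):
        pre, post = ("A", "B") if index < half else ("B", "A")
        plan[student] = {"pre": pre, "post": post}
    return plan


# Hypothetical student identifiers for one group.
plan = assign_tests(["s1", "s2", "s3", "s4"])
for student, tests in plan.items():
    print(student, tests["pre"], tests["post"])
```

Because every student sees each test exactly once, any difference in difficulty between A and B affects pre-test and post-test averages roughly equally within a group.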

Designing tests A and B was complex due to the limited time that was allocated for each test and the goal of evaluating students’ knowledge in ER modelling. The tests were designed to be completed in less than ten minutes. Each contained two questions: a multiple choice question that asked students to choose the ER schema that correctly depicted a given scenario, and a design question that asked students to design a database satisfying a given set of requirements. The pre- and post-tests used are given in Appendix B.

The multiple choice question requires less effort from students and demands less time. Since guessing may be involved in making a selection, a single multiple choice question does not give a good evaluation of the student’s knowledge. The rationale behind adding a design question was to evaluate the student’s knowledge more comprehensively. Moreover, as the question is close to a database design task in the real world, sound assumptions about the student’s performance in designing databases in the real world can be made. This type of question is also well suited to evaluating the effectiveness of KERMIT in supporting students learning to design databases, as KERMIT offers problems of a similar kind.

5.2.3 Interaction with the system

All the participants interacted with either KERMIT or ER-Tutor, composing ER diagrams that satisfied the given set of requirements. They worked individually, solving problems at their own pace. The set of problems and the order in which they were presented were identical for the experimental and control groups. The students who were using KERMIT were required to complete the current problem before moving on to the next one, whereas the students in the control group were free to skip problems as they pleased.

A total of six problems were ordered in increasing complexity (see Appendix D for the list of problems and their ideal solutions). The first three in the list were introductory level problems. It was expected that an average student would spend approximately ten minutes on each of these. The fourth problem involved constructing a database model of moderate complexity. We expected students to spend about half an hour on this. The two final problems were challenging and called for a considerable amount of work. They were aimed at more able students, anticipating that they would complete the initial problems quickly.

5.2.4 System assessment

The system assessment questionnaire recorded the student’s perception of the system. The questionnaire contained a total of fourteen questions. Initially students were questioned on previous experience in ER modelling and in using CASE tools. Most questions asked the participants to rate their perception of various issues on a five-point Likert scale ranging from very good (5) to very poor (1); these issues included the amount they learnt about ER modelling by interacting with the system and the enjoyment experienced. The students were also allowed to give free-form responses. Finally, students were asked for suggestions on enhancing the system, through questions such as “what did you find frustrating about the system and any suggestions for improving the system”.

The questionnaire is included in Appendix C.
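Responses on the five-point scale described above could be summarised along the following lines. This is a minimal sketch, not the procedure used in the study: the function name is hypothetical and the responses are made-up placeholders.

```python
from collections import Counter
from statistics import mean

SCALE = range(1, 6)  # 1 = very poor ... 5 = very good


def summarise(responses):
    """Return the mean rating and the number of responses at each scale point."""
    for rating in responses:
        if rating not in SCALE:
            raise ValueError(f"rating outside the 1-5 scale: {rating}")
    counts = Counter(responses)
    return {
        "mean": mean(responses),
        "counts": {point: counts.get(point, 0) for point in SCALE},
    }


# Placeholder ratings for one question (hypothetical, NOT data from the study).
summary = summarise([5, 4, 4, 2])
print(summary["mean"], summary["counts"])
```

Reporting the per-point counts alongside the mean keeps the distribution visible, which matters for ordinal ratings where a mean alone can hide a split between high and low responses.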

In document An intelligent teaching system for database modeling. (Page 82-86)