Validity and Correctness - Problem-solving in undergraduate mathematics and computer aided asse

When a student submits a response in CAA, it is their mathematics that we want to judge, not their ability to write correct syntax. For this reason, and others discussed below, submitting answers in STACK is a two-step process. In the first step, the system checks that the student’s answer is valid, and does not penalise a student whose answer fails to be adjudged as such. In the second step, the system checks the answer’s correctness by establishing suitable equivalence with the teacher’s answer.

8.2.1 Validity

The first job in deciding an answer’s correctness is to determine whether it can be parsed into a valid mathematical expression. A mathematical expression is defined to be either an atom, e.g. a number or variable, or an operator with the correct number of operands. This definition is recursive: an operand may be another atom, or an operator itself. For example, 3x³ + ln(x) = 0 is an expression, consisting of the operands 3x³ + ln(x) and 0, and the operator =. Its first operand is another operator, +, with the operands 3x³ and ln(x); its second operand is the number 0. 3x³ and ln(x) are also operators with operands.

In Maxima, an expression is either a function (operator) or an atom (number or variable), where the arguments of any function are themselves expressions. Thus, the expression 3x³ + ln(x) = 0 can be represented with the following tree structure:

    =
    ├── +
    │   ├── ×
    │   │   ├── 3
    │   │   └── ^
    │   │       ├── x
    │   │       └── 3
    │   └── ln
    │       └── x
    └── 0

Figure 8.2: A Maxima expression tree

In this example, we have the unary function ln, the binary functions = and ^, and the n-ary functions + and ×. Were any function to have too many or too few arguments, for instance 3x =, we would not have a valid expression. The first task in determining whether a student’s answer is valid is to determine its syntactic validity: is it a valid expression in Maxima? This is a complicated process, but includes checking that each function has the correct number of arguments and that brackets match. Once this is complete, the system checks that an answer will not compromise the system when input to Maxima, and that it does not circumvent the question by employing Maxima functions that it should not, e.g. int in an integration question.
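The recursive definition of an expression, together with the arity check described above, can be sketched in a few lines of Python. This is only an illustration and not STACK's or Maxima's implementation: the nested-tuple encoding, the ARITY table and the function name is_valid are all assumptions made for the example.

```python
# Hypothetical arity table; None marks an n-ary operator (two or more operands).
ARITY = {"=": 2, "^": 2, "ln": 1, "+": None, "*": None}


def is_valid(expr):
    """Return True if expr is an atom or an operator with a valid operand count."""
    # Atoms: numbers and variable names (operator names are not atoms).
    if isinstance(expr, (int, float)):
        return True
    if isinstance(expr, str):
        return expr not in ARITY
    # Operators are encoded as tuples: (operator, operand, operand, ...).
    if not isinstance(expr, tuple) or not expr or expr[0] not in ARITY:
        return False
    op, operands = expr[0], expr[1:]
    expected = ARITY[op]
    if expected is None:
        arity_ok = len(operands) >= 2
    else:
        arity_ok = len(operands) == expected
    # The definition is recursive: every operand must itself be valid.
    return arity_ok and all(is_valid(arg) for arg in operands)


# 3x^3 + ln(x) = 0, encoded as the tree in Figure 8.2.
equation = ("=", ("+", ("*", 3, ("^", "x", 3)), ("ln", "x")), 0)
# 3x = , where the equals sign is missing its second operand.
broken = ("=", ("*", 3, "x"))

print(is_valid(equation))  # True
print(is_valid(broken))    # False
```

A real parser must also deal with bracket matching and operator precedence before such a tree exists; the sketch assumes that parsing has already been done.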

Checks for syntactic validity and malicious code are performed on all answers input to STACK. Restrictions on functions are determined per question. Besides forbidding students from using certain functions, a teacher may choose to impose further restrictions on students’ answers.

Primary among these is checking that a student’s answer is of the correct type: that is to say, is it the right kind of mathematical object? Two other validity checks relate to numbers in answers. A teacher may choose to forbid floating-point numbers in an answer, requiring students to use only fractions; further, they may require that all fractions be in lowest terms.
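To make these per-question restrictions concrete, the following sketch checks an input string for a syntax error, a forbidden function, floating-point numbers, and fractions not in lowest terms. It is an illustration under stated assumptions: Python's own parser stands in for Maxima's (so ^ is accepted only because Python happens to parse it as an operator), and the function name validity_problems and its messages are hypothetical rather than part of STACK.

```python
import ast
from math import gcd


def validity_problems(answer, forbidden=frozenset({"int"})):
    """Return a list of reasons the answer is invalid (empty if it passes)."""
    try:
        # Python's parser is used here as a stand-in for Maxima's.
        tree = ast.parse(answer, mode="eval")
    except SyntaxError:
        return ["syntax error"]
    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Per-question restriction on function names.
            if node.func.id in forbidden:
                problems.append("forbidden function: " + node.func.id)
        elif isinstance(node, ast.Constant) and isinstance(node.value, float):
            problems.append("floating-point number")
        elif isinstance(node, ast.BinOp) and isinstance(node.op, ast.Div):
            left, right = node.left, node.right
            # Only literal integer fractions such as 2/4 are examined.
            if (isinstance(left, ast.Constant) and isinstance(right, ast.Constant)
                    and isinstance(left.value, int) and isinstance(right.value, int)
                    and gcd(left.value, right.value) > 1):
                problems.append("fraction not in lowest terms")
    return problems


print(validity_problems("3*x^3 + log(x)"))
print(validity_problems("int(x^2, x) + 0.5 + 2/4"))
```

Returning a list of reasons, rather than a single true or false, mirrors the idea that an invalid answer is reported back to the student without penalty. The sketch ignores cases such as negative numerators, which a fuller check would need to handle.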

When a student submits their answer to STACK, the required validity tests are run on their input and, if their answer passes all the tests, it is presented to them as typeset mathematics with the message “Your answer was interpreted as:”. This allows the student to check that they have entered what they intended to (answers are submitted in Maxima syntax but presented back to students as typeset mathematics) and that any ambiguities in their answer have been correctly interpreted. A student may alter their answer and resubmit it for validation as many times as they like, without losing marks for an invalid answer.

8.2.2 Correctness

Once a student’s answer has passed the relevant validity tests and they are happy with it, they submit it again for marking. In Section 9.1 we discuss the path that their answer takes through the system, but for now we assume that the student and teacher each have a single answer, and that these two answers are to be compared. This comparison is done using an answer test.

An answer test is a predicate function that returns true if the student’s and teacher’s answers are ‘the same’ to some suitable degree, and false if they are not. The prototype test is one of mathematical equivalence, taking the two answers SA (from the student) and TA (from the teacher) and determining whether

simplify(SA - TA)=0.

This test only works in the case where the answers submitted are mathematical expressions without equals signs, and despite its simplicity this test has its pitfalls. Even in cases where the expressions used are elementary functions of a single real variable, establishing equivalence with zero is formally undecidable (i.e., equivalent to the halting problem) (Richardson, 1968).

Formal undecidability notwithstanding, for any answer that a student is ever likely to submit, the equivalence or otherwise of SA - TA to zero is established.

Besides either true or false, answer tests return two further pieces of information. The first is feedback, which can be displayed to the student at the teacher’s discretion. The second is a note, which records the properties of the student’s answer identified by the system. The note is stored in the system should the teacher wish to identify trends in the answering of questions.

So, besides deciding whether a student’s answer is correct or not, it is also preferable if an answer test is able to return worthwhile information on an answer that is incorrect, much as a teacher may when marking work.
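The shape of such an answer test, returning a result, feedback and a note, can be sketched as follows. The technique used here is deliberately swapped: instead of symbolic simplification in Maxima, the sketch evaluates SA - TA exactly at a few rational sample points, which gives evidence of equivalence rather than a proof. The names answer_test and SAMPLE_POINTS, the note strings, and the choice of sample points are all assumptions for illustration.

```python
from fractions import Fraction

# Hypothetical sample points; exact rationals avoid floating-point rounding.
SAMPLE_POINTS = [Fraction(n, 7) for n in range(1, 12)]


def answer_test(sa, ta, points=SAMPLE_POINTS):
    """Compare student answer sa with teacher answer ta.

    Both are functions of one variable. Returns (result, feedback, note).
    """
    # Stand-in for simplify(SA - TA) = 0: check the difference at each point.
    if all(sa(x) - ta(x) == 0 for x in points):
        return True, "", "equivalent"
    # An incorrect answer may still have an identifiable property.
    if all(sa(x) + ta(x) == 0 for x in points):
        return False, "Check the sign of your answer.", "sign_error"
    return False, "", "not_equivalent"


def teacher(x):
    return (x + 1) ** 2


print(answer_test(lambda x: x**2 + 2*x + 1, teacher))
print(answer_test(lambda x: -(x + 1) ** 2, teacher))
```

The sign-error branch shows how a test can say something useful about an incorrect answer, and the note string is what a teacher might later aggregate to spot trends across a cohort.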

In document Problem-solving in undergraduate mathematics and computer aided assessment (Page 177-181)