2.4 Software Measurement for Functional Programming Languages
2.4.2 Validation
A common criticism of software measurement in the past has been the lack of rigorous validation. Van den Berg presents a case study demonstrating the experimental validation process he used. For the case study he used structure metrics for Miranda type expressions. The structure of Miranda type expressions was represented by the grammar:
    typeexp ::= Num | Bool | Char
              | Var num
              | L typeexp
              | T [typeexp]
              | F typeexp typeexp
With this grammar a large class of Miranda types, excluding algebraic and abstract types, can be represented. For instance, the type

    (* -> bool) -> [*] -> ([*],[*])

would be represented in the grammar as

    (F (F (Var 1) Bool)
       (F (L (Var 1))
          T [(L (Var 1)),(L (Var 1))]))
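The grammar maps directly onto a small recursive data structure. The following Python sketch is illustrative only and is not taken from Van den Berg's work: the class names mirror the grammar's constructors, while the field names and the `show` helper are assumptions introduced here. It builds the example type above and renders it back in Miranda-like syntax.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Base:
    name: str                      # one of "Num", "Bool", "Char"


@dataclass(frozen=True)
class Var:
    num: int                       # type variable: Var 1 is *, Var 2 is **, ...


@dataclass(frozen=True)
class L:
    elem: "TypeExp"                # list type [t]


@dataclass(frozen=True)
class T:
    elems: Tuple["TypeExp", ...]   # tuple type (t1, ..., tn)


@dataclass(frozen=True)
class F:
    arg: "TypeExp"                 # function type arg -> res
    res: "TypeExp"


TypeExp = Union[Base, Var, L, T, F]


def show(t: TypeExp) -> str:
    """Render a type expression in Miranda-like concrete syntax."""
    if isinstance(t, Base):
        return t.name.lower()
    if isinstance(t, Var):
        return "*" * t.num
    if isinstance(t, L):
        return "[" + show(t.elem) + "]"
    if isinstance(t, T):
        return "(" + ",".join(show(e) for e in t.elems) + ")"
    if isinstance(t, F):
        left = show(t.arg)
        if isinstance(t.arg, F):   # -> associates to the right
            left = "(" + left + ")"
        return left + " -> " + show(t.res)
    raise TypeError(f"not a type expression: {t!r}")


# The example type from the text.
example = F(F(Var(1), Base("Bool")),
            F(L(Var(1)), T((L(Var(1)), L(Var(1))))))

print(show(example))
```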
From the grammar for type expressions Van den Berg derived the following set of axioms that a structure metric on types must fulfil.
\begin{align}
m(L\,t) &> m(t) \tag{3}\\
m(T\,[t_1,\ldots,t_n]) &> \max(m(t_1),\ldots,m(t_n)) \tag{4}\\
m(T\,[t_1,\ldots,t_n]) &= m(T\,(\mathrm{perm}\,[t_1,\ldots,t_n])) \tag{5}\\
m(F\,t_1\,t_2) &= m(F\,t_2\,t_1) \tag{6}\\
m(T\,[t_1,\ldots,t_{n+1}]) &> m(T\,[t_1,\ldots,t_n]) \tag{7}\\
m(T\,[t_1,\ldots,t_n]) &> m(L\,t_i), \quad i = 1,\ldots,n \tag{8}\\
m(F\,t_1\,t_2) &> m(T\,[t_2,t_1]) \tag{9}
\end{align}
where $n \geq 1$.
Axiom 3 states that the metric value for a list of elements of type t should be higher than that for a single element of that type, and thus makes the assumption that lists add to the complexity of a type expression.
Axiom 4 states that the metric value for a tuple should be greater than any of the metric values for the types contained in the tuple. This makes the assumption that a tuple adds to the complexity of a group of types.
Axiom 5 states that the metric values for a tuple should not be affected by the order of the elements of a tuple.
Axiom 6 states that the metric values for a function type should not be affected by the ordering of the element types in the function type.
Axiom 7 states that the metric value for a tuple should increase as the number of elements in the tuple increases.
Axiom 8 states that the complexity of a tuple, e.g. (Bool,Char), is greater than the complexity of a list of any of the component types, e.g. [Bool] and [Char]. The reason for this assertion is that to understand the tuple it is necessary to understand two types and the tuple constructor, while to understand a list it is only necessary to understand one type and the list constructor.
Axiom 9 states that the metric value for a function type should be greater than the metric value for a tuple type containing the same element types.
For the experiment a simple sum metric that conforms to the above axioms was defined in the following manner:
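\begin{align*}
m(\mathrm{Num}) &= C_N\\
m(\mathrm{Char}) &= C_C\\
m(\mathrm{Bool}) &= C_B\\
m(\mathrm{Var}\ n) &= C_V(n)\\
m(L\,t) &= C_L + m(t)\\
m(F\,t_1\,t_2) &= C_F + m(t_1) + m(t_2)\\
m(T\,[t_1,\ldots,t_n]) &= C_T + m(t_1) + \cdots + m(t_n)
\end{align*}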
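For an additive metric of this form, the axioms reduce to simple constraints on the constants: Axiom 9 holds exactly when the function constant exceeds the tuple constant, Axiom 8 with n = 1 requires the tuple constant to exceed the list constant, and Axioms 3, 4 and 7 hold when all constants are positive. The Python sketch below illustrates this. The coefficient values are hypothetical placeholders chosen to satisfy those constraints, not the values fitted in the experiment, and the class and field names are assumptions made for the illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Base:
    name: str                  # "Num", "Bool" or "Char"


@dataclass(frozen=True)
class Var:
    num: int


@dataclass(frozen=True)
class L:
    elem: object


@dataclass(frozen=True)
class T:
    elems: Tuple[object, ...]


@dataclass(frozen=True)
class F:
    arg: object
    res: object


# Hypothetical coefficients (NOT the fitted values from the experiment).
# They are chosen so that C_F > C_T > C_L > 0.
C_BASE = {"Num": 1.0, "Bool": 1.0, "Char": 1.0}
C_L, C_T, C_F = 1.0, 2.0, 3.0


def c_var(n: int) -> float:
    """Hypothetical constant cost for a type variable."""
    return 1.0


def m(t) -> float:
    """Sum metric over a type expression."""
    if isinstance(t, Base):
        return C_BASE[t.name]
    if isinstance(t, Var):
        return c_var(t.num)
    if isinstance(t, L):
        return C_L + m(t.elem)
    if isinstance(t, T):
        return C_T + sum(m(e) for e in t.elems)
    if isinstance(t, F):
        return C_F + m(t.arg) + m(t.res)
    raise TypeError(f"not a type expression: {t!r}")


# Spot-check each axiom on one small instance (a check, not a proof).
t1, t2, t3 = Base("Bool"), Base("Char"), Base("Num")
pair = T((t1, t2))
axioms = {
    3: m(L(t1)) > m(t1),
    4: m(pair) > max(m(t1), m(t2)),
    5: m(T((t1, t2))) == m(T((t2, t1))),
    6: m(F(t1, t2)) == m(F(t2, t1)),
    7: m(T((t1, t2, t3))) > m(pair),
    8: all(m(pair) > m(L(t)) for t in (t1, t2)),
    9: m(F(t1, t2)) > m(T((t2, t1))),
}

example = F(F(Var(1), Base("Bool")),
            F(L(Var(1)), T((L(Var(1)), L(Var(1))))))

print(axioms)
print(m(example))
```

Swapping the values of `C_F` and `C_T` makes the check for Axiom 9 fail, which shows that the axioms constrain the coefficients rather than hold for every choice of constants.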
    Axiom     Relation     tLHS (sec)   tRHS (sec)
    Eqn. 3    LHS > RHS       19.0          8.0
    Eqn. 4    LHS > RHS       21.6         10.6
    Eqn. 5    LHS = RHS       33.8         29.7
    Eqn. 6    LHS = RHS       15.2         20.7
    Eqn. 7    LHS > RHS       25.6         20.5
    Eqn. 8    LHS > RHS       24.6         12.7
    Eqn. 9    LHS > RHS       19.7         12.7

Table 1: Results from the validation of metrics for Miranda type expressions.
The experiment consisted of presenting a type expression to a subject who was then requested to produce a function with that type. The function need not produce any sensible output, merely conform to the type signature. The time was measured from the instant the subject was shown the type expression until the instant they completed the task. The subjects were 16 first year undergraduate students, each answering 40 questions. Each subject was shown the same questions, but in a random order. Showing individual type expressions to the subjects in a random order avoids the results being biased by a “learning effect”, whereby answering one question trains the subject for a later question, resulting in reduced time to answer the later question.
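The randomised presentation can be sketched as an independent shuffle of the same question set for each subject. The function name, the seeding and the use of question indices are assumptions made for illustration; the source does not describe how the orderings were actually generated.

```python
import random


def presentation_orders(n_subjects=16, n_questions=40, seed=0):
    """Return one independently shuffled question order per subject.

    Every subject sees the same questions (indices 0..n_questions-1);
    only the order differs. The seed is a hypothetical parameter that
    makes the orderings reproducible.
    """
    rng = random.Random(seed)
    orders = []
    for _ in range(n_subjects):
        order = list(range(n_questions))
        rng.shuffle(order)          # in-place uniform shuffle
        orders.append(order)
    return orders


orders = presentation_orders()
```

Because each question appears at varying positions across subjects, any training effect from earlier questions is spread over all questions rather than concentrated on those that would otherwise always come last.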
The type expressions used in the experiment were devised so that they would fit the axioms described above. For instance, the two type expressions [char -> bool] and char -> bool could be used to test Axiom 3.
Only those type expressions that were correctly answered were considered, and those with extreme time values were discarded. The average times taken to correctly complete the type expression questions were then used to test the axioms. The results from this experiment are shown in Table 1.
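As a rough reading aid, the mean times in Table 1 can be compared against the direction each axiom predicts. The sketch below only compares the reported means; the source does not state which statistical test, if any, was applied, so the equality axioms (5 and 6) are flagged rather than judged. The function and variable names are assumptions made for the illustration.

```python
# Mean completion times in seconds, copied from Table 1:
# axiom number -> (expected relation, mean time LHS, mean time RHS)
TABLE_1 = {
    3: (">", 19.0, 8.0),
    4: (">", 21.6, 10.6),
    5: ("=", 33.8, 29.7),
    6: ("=", 15.2, 20.7),
    7: (">", 25.6, 20.5),
    8: (">", 24.6, 12.7),
    9: (">", 19.7, 12.7),
}


def summarise(table):
    """Compare mean times with the direction predicted by each axiom."""
    rows = []
    for axiom, (relation, lhs, rhs) in sorted(table.items()):
        diff = round(lhs - rhs, 1)
        if relation == ">":
            verdict = "consistent" if lhs > rhs else "inconsistent"
        else:
            # Equality of means cannot be judged without variance data.
            verdict = "needs a significance test"
        rows.append((axiom, relation, diff, verdict))
    return rows


summary = summarise(TABLE_1)
for row in summary:
    print(row)
```

On these figures, all five strict inequalities point in the predicted direction, while the two equality axioms show differences of a few seconds whose significance cannot be assessed from the means alone.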
These timing results were then used to calculate coefficients, e.g. C_N, to be inserted in the metric described above. This resulted in a metric that can be used to assess the complexity of type expressions in Miranda programs.
Program structure is often thought to be an important aspect of good program construction. Van den Berg derived metrics for program structure using control flowgraphs. He then performed an experiment to determine programmers' performance on structured versus unstructured function definitions of varying sizes.
The notion of structured and unstructured used in this work was based on Fenton’s [32] control flow work. Van den Berg classifies flowgraphs for Miranda as structured or unstructured by the paths through the flowgraphs. A path through a flowgraph is a sequence of consecutive nodes from the start node to the stop node. Van den Berg defines a D-structured path as a sequence of pattern matching nodes followed by a sequence of guard nodes, and possibly an expression node and finally a stop node. He further defines a path that is not D-structured as X-structured. He then classifies a function as structured if all paths through its flowgraph are D-structured, otherwise the function is classified as unstructured.
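The classification can be illustrated with a short Python sketch. It assumes each path is given as a list of node kinds using a single-letter encoding introduced here (P for a pattern matching node, G for a guard node, E for an expression node, S for the stop node), that the start node is not listed separately, and that either sequence may be empty. None of these encoding choices come from Van den Berg's definition.

```python
import re

# Assumed encoding: zero or more pattern matching nodes, then zero or
# more guard nodes, then an optional expression node, then the stop node.
D_STRUCTURED = re.compile(r"P*G*E?S")


def is_d_structured(path):
    """True if the path (list of 'P', 'G', 'E', 'S') is D-structured."""
    return D_STRUCTURED.fullmatch("".join(path)) is not None


def classify(paths):
    """A function is structured only if every path is D-structured."""
    if all(is_d_structured(p) for p in paths):
        return "structured"
    return "unstructured"


# Hypothetical flowgraph paths for two function definitions.
tidy = [["P", "G", "E", "S"], ["P", "P", "E", "S"]]
tangled = [["P", "G", "E", "S"], ["G", "P", "E", "S"]]  # guard before pattern

print(classify(tidy))
print(classify(tangled))
```

In this encoding a path in which a guard precedes a pattern match is X-structured, so a single such path is enough to make the whole function unstructured.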
The experiment was performed in a similar manner to the experiment described earlier for type expressions. The following conclusions were drawn from the experiment.
1. Subjects need significantly less time to obtain an answer to structured functional definitions than to unstructured functional definitions.
2. Subjects give correct answers to somewhat larger structured functional definitions significantly more often than they do to unstructured definitions of comparable size.
3. Subjects need significantly more time to obtain an answer to larger function definitions than to smaller ones.
4. Subjects give correct answers for larger structured function definitions significantly more often than they do for smaller ones.
The most interesting conclusion here is 4, which appears to show that subjects in the experiment were more careful in their answers to larger problems than in their answers to smaller problems. This is also suggested by conclusion 3, which shows that the subjects spent less time answering the smaller problems than they did the larger problems.
2.4.3 Summary
Van den Berg concluded that it was not possible to give a general answer to the question ‘Do students who learn functional programming write better programs?’ on objective grounds, particularly as there was little agreement between experts on what constituted a readable Miranda program. However, he did note that students who learnt functional programming tended to use more functions in imperative languages than those students who had not. No other work relating software measurement to functional programming is known to the author.
In general there has been little activity in the field of software engineering for functional programming. There has been a little work on design paradigms for functional programming by Russell [84] and Wakeling [99], but the majority of the software engineering research has been focused on tool support, such as support for Haskell in integrated development environments such as Eclipse [38] and Visual Studio [65].
By far the largest body of work in the area of software engineering for functional programming has been in the study of debugging and tracing tools, which are typified by work such as that of Runciman and his co-workers [20] on tracing program execution and that of Reinke [81] on visualising and animating such traces.
Most recently, work has also been started by Li, Reinke and Thompson [60] in the area of tool support for refactoring functional programs. The application of software metrics to refactoring is examined in Section 2.7.