The final step of implementing the approach is integrating it into the ACM framework to arrive at a semi-automatic trace link creation approach. The integration can be achieved using a wide variety of machine learning libraries, including Weka, Encog¹³, PyBrain¹⁴ and scikit¹⁵. For the current implementation the Weka API was selected, since the framework is written in Java. The aim of the Traceability creation stage of the framework is to assist users with their inter trace link creation tasks as part of setting up Framework Data. The input to inter trace link creation is supplied by users in the form of candidate links stored in a .xmllinks file, which is discussed in Chapter 6. Once the selected classifier has been trained, the framework runs the model on test data obtained from the .xmllinks file and classifies each data instance. The user is presented with a final .xmllinks file, which contains the trace links returned by the selected model.
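The classification step described above can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the framework's Java and Weka code. The XML element and attribute names (`links`, `link`, `source`, `target`, `similarity`) and the threshold-based `classify` stand-in are assumptions made for this example only; the actual .xmllinks schema is the one described in Chapter 6, and the real model is a trained classifier.

```python
import xml.etree.ElementTree as ET

# Hypothetical candidate-link file; the real .xmllinks schema may differ.
CANDIDATES = """<links>
  <link source="Req1" target="ClassA" similarity="0.82"/>
  <link source="Req1" target="ClassB" similarity="0.10"/>
  <link source="Req2" target="ClassB" similarity="0.67"/>
</links>"""


def classify(features):
    """Stand-in for a trained classifier (assumption): accept a link
    if its hypothetical similarity feature is at least 0.5."""
    return features["similarity"] >= 0.5


def filter_links(xml_text, model=classify):
    """Classify every candidate link and return a new XML document
    containing only the links the model accepts."""
    root = ET.fromstring(xml_text)
    result = ET.Element("links")
    for link in root.findall("link"):
        features = {"similarity": float(link.get("similarity"))}
        if model(features):
            # Copy the accepted candidate into the output document.
            ET.SubElement(result, "link", dict(link.attrib))
    return ET.tostring(result, encoding="unicode")


predicted = filter_links(CANDIDATES)
kept = [(l.get("source"), l.get("target")) for l in ET.fromstring(predicted)]
print(kept)
```

Passing the model in as a function keeps the file handling independent of the classifier, so any trained model that maps a feature dictionary to an accept/reject decision could be substituted for the stand-in.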
8.10 Conclusions
The poor performance of the linear model, the perceptron, empirically supports the widely held belief that this problem is complex. Using the multilayer perceptron, the experiments show a prediction accuracy of 85.7% in cross-validation, closely followed by the J48 algorithm at 85.2%. The differences in the accuracy values obtained by the models in cross-validation can be explained by the diversity of the modelled systems. Nevertheless, the accuracy results indicate that using machine learning to aid the automation of trace link creation is a viable approach that is worth investigating further. Beyond the benefits of applying the approach in this complex problem domain, which does not assume the use of specific artefacts or development conventions, the shortcomings that machine learning approaches may present remain to be identified.
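For readers unfamiliar with how a cross-validated accuracy figure is obtained, the procedure can be sketched as below. This is a simplified illustration in Python and not the Weka routine used in the experiments: the function names, the toy data and the majority-class baseline learner are assumptions for this example, and the folds here are not stratified.

```python
import random


def k_fold_accuracy(instances, labels, train, k=10, seed=0):
    """Estimate accuracy by k-fold cross-validation.

    `train(xs, ys)` must return a function mapping an instance to a label.
    """
    indices = list(range(len(instances)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds

    correct = 0
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in indices if i not in held]
        # Train on k-1 folds, then test on the fold that was held out.
        model = train([instances[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        correct += sum(model(instances[i]) == labels[i] for i in held_out)
    # Every instance is tested exactly once across the k rounds.
    return correct / len(instances)


def train_majority(xs, ys):
    """Baseline learner (assumption): always predict the most frequent label."""
    majority = max(set(ys), key=ys.count)
    return lambda x: majority


# Toy data: 70 true links and 30 non-links; features are unused by the baseline.
xs = list(range(100))
ys = [True] * 70 + [False] * 30
accuracy = k_fold_accuracy(xs, ys, train_majority, k=10)
print(f"{accuracy:.3f}")
```

A majority-class baseline of this kind gives a useful reference point: a classifier is only informative if its cross-validated accuracy clearly exceeds the proportion of the most frequent class.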
Potential improvements to the current implementation include extending the breadth of programs considered by utilising different systems that provide further artefact types. Further experiments are required to analyse the correlation between systems and features. The solution also forms a basis for further work in various other areas, such as visualising results and allowing users to edit the trace links predicted by the model. Since each classifier is trained with default settings, tuning the classifier parameters may yield further gains in classification accuracy. Specifically, since the J48 classifier shows promising results, pruning methods could be utilised [216].
¹³ http://www.heatonresearch.com/encog
¹⁴ http://pybrain.org
CHAPTER NINE
EVALUATION
This chapter describes the evaluation strategy used to analyse the applicability of the framework in software development scenarios and to test the functionality of its individual components. It starts by summarising the hypotheses presented in Chapter 1 and the requirements of the proposed approach discussed in Chapter 4. On this basis, the objectives of the evaluation are established and the evaluation questions are formulated. Subsequently, appropriate research methods are selected for each question. This provides the basis for evaluating the proposed approach and its implementation, the ACM framework. Table 9.1 shows the relationships between hypotheses and requirements, and the process of deriving the corresponding evaluation questions and research methods. The Evaluation Aim column highlights the evaluation concerns. Next, the evaluation process is designed and implemented. These steps involve data collection, data analysis and testing the framework against the collected data. Finally, the results are analysed and conclusions are drawn, taking validity considerations into account.
9.1 Evaluation Objectives
The aim of the evaluation is threefold. The main objective is to verify the hypotheses using empirical software engineering methods, critically evaluating the degree to which the proposed solution addresses the problem at hand. The evaluation reveals whether the stated hypotheses are realistic and identifies strengths, weaknesses and areas of potential improvement. Secondly, the correctness of the results achieved at each stage of the consistency management process is tested using software engineering validation and verification methods. Lastly, the performance of the solution is analysed. A solution is suitable for wider adoption if, besides other criteria, it scales under varying workloads. Performance is measured using appropriate metrics. Table 9.1 also highlights that H1 is not mapped to a requirement or an evaluation question. The reason is that the feasibility of the proposed approach expressed in H1 is investigated through the design and implementation of the ACM framework.

| Hypotheses | Requirements    | Evaluation Aim                   | Evaluation Questions | Evaluation Methods                                        |
|------------|-----------------|----------------------------------|----------------------|-----------------------------------------------------------|
| H1         | -               | Hypothesis                       | -                    | Design & implementation of proposed approach              |
| H2         | R3, R4          | Hypothesis                       | Q1                   | Case study                                                |
| H3         | R1              | Hypothesis                       | Q2                   | Design & implementation of proposed approach, case study  |
| H4         | R2, R4          | Hypothesis                       | Q3                   | Case study                                                |
| -          | R6              | Performance                      | Q4                   | Performance metrics                                       |
| -          | Func. Req. & R4 | Correctness                      | Q5                   | Validation & verification                                 |
| -          | R5              | Outside the scope of this thesis | -                    | -                                                         |

Table 9.1: Derivation of evaluation questions and methods from the hypotheses and requirements.