CHAPTER 8: CONCLUSION
8.1 Contributions
In this dissertation, we presented these problems in a study of our industrial partner’s quality modelling approach and proposed ways to improve the state of the art in quality modelling. We explored three dimensions along which quality modelling practices can be improved: subjectivity, evolution, and composition.
During the course of this dissertation, we made the following contributions:
8.1.1 Handling Subjectivity
We believe that part of the problem with using thresholds and hard rules is that different people have varying opinions on what is good code. We proposed a method to identify high-level, subjective quality indicators and tested it on the problem of efficiently detecting the presence of anti-patterns. Existing state-of-the-art techniques are based on detection rules written by experts and classify code as either “clean” or “unclean”. This binary classification was shown to be inadequate because there were only a few cases where a class was unanimously considered an anti-pattern in the corpus studied.
Instead, we built Bayesian models to predict the probability that a quality engineer would consider the design of a class to be a defect. In these models, subjectivity is encoded using Bayes’ theorem.
The models built using our method produced results equal or superior to those of existing rule-based models and allowed flexible browsing of the candidate classes. Additionally, we showed that returning ranked results improves the efficiency of the manual validation of the set of candidates.
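To make the idea of a probabilistic, ranked detector concrete, here is a minimal sketch using a naive Bayes classifier. It is an illustration under stated assumptions, not the model structure used in the dissertation: the symptom names (`large`, `low_cohesion`), the function names, and the choice to treat each engineer's vote on a class as one training example are all hypothetical.

```python
from collections import defaultdict


def train(examples, alpha=1.0):
    """Fit a naive Bayes model from (symptoms, is_defect) pairs.

    Assumption: each pair is one engineer's opinion of one class, so
    disagreement between engineers shows up as label frequencies.
    `alpha` is a Laplace smoothing constant.
    """
    counts = {True: 0, False: 0}
    present_counts = {True: defaultdict(int), False: defaultdict(int)}
    symptoms = set()
    for feats, label in examples:
        counts[label] += 1
        for name, present in feats.items():
            symptoms.add(name)
            if present:
                present_counts[label][name] += 1
    total = counts[True] + counts[False]
    prior = {c: (counts[c] + alpha) / (total + 2 * alpha) for c in (True, False)}
    likelihood = {
        c: {s: (present_counts[c][s] + alpha) / (counts[c] + 2 * alpha)
            for s in symptoms}
        for c in (True, False)
    }
    return prior, likelihood


def defect_probability(model, feats):
    """Posterior P(defect | symptoms) obtained with Bayes' theorem."""
    prior, likelihood = model
    joint = {}
    for c in (True, False):
        p = prior[c]
        for name, present in feats.items():
            if name not in likelihood[c]:
                continue  # symptom never seen during training: ignore it
            q = likelihood[c][name]
            p *= q if present else (1.0 - q)
        joint[c] = p
    return joint[True] / (joint[True] + joint[False])


def rank(model, candidates):
    """Order candidate classes from most to least likely to be a defect."""
    return sorted(candidates,
                  key=lambda name: defect_probability(model, candidates[name]),
                  reverse=True)
```

Because the output is a probability rather than a clean/unclean label, a reviewer can inspect candidates in ranked order and stop wherever the remaining probabilities become too low to justify the effort.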
8.1.2 Supporting Evolution
Suggesting improvements is a key aspect of what IV&V teams do. It is therefore essential to get a better understanding of what types of quality evolution patterns exist in software systems. Using an anti-pattern detection model, we presented a technique to track the quality of evolving software entities. The vast majority of existing quality models evaluate the quality of a snapshot of a system. There are many cases when a team wants a dynamic view of the system under development in order to determine whether development is improving or degrading its quality. For example, our partner is interested in identifying whether the quality of certain parts of a system has degraded, for contractual reasons, or in evaluating whether a development team is adequately correcting previously identified problems. We proposed a technique that uses a quality model to find interesting patterns in the evolution of the quality of classes. We treated sequences of quality scores as signals and applied signal-based data-mining techniques. We were able to find groups of classes sharing common evolution habits.
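The idea of treating per-version quality scores as signals and grouping classes with similar histories can be sketched as follows. This summary does not say which signal-based techniques were used, so the sketch swaps in dynamic time warping (DTW) with a greedy threshold grouping purely as an assumption; the function names and the threshold value are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two score sequences.

    DTW tolerates histories of different lengths, which helps when
    classes were introduced in different versions.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # stretch b
                                    cost[i][j - 1],      # stretch a
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def group_by_evolution(histories, threshold):
    """Greedily group classes whose score histories are within `threshold`.

    `histories` maps a class name to its sequence of quality scores,
    one score per version. Each group is represented by its first member.
    """
    groups = []  # list of (representative name, member names)
    for name, series in histories.items():
        for representative, members in groups:
            if dtw_distance(histories[representative], series) <= threshold:
                members.append(name)
                break
        else:
            groups.append((name, [name]))
    return [members for _, members in groups]
```

For example, classes whose anti-pattern probability is high from their first version would fall into one group, and classes whose probability climbs gradually would fall into another.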
In a study of two open-source systems, we found that the vast majority of classes exhibiting the symptoms of Blobs are added to the program with these symptoms already present. This type of anti-pattern is consequently not the result of a slow degradation, as is sometimes expressed in the literature. Furthermore, many of these classes that seem to implement too many responsibilities play roles in design patterns, a structure that is typically thought to be a sign of good quality. In the systems studied, we also found that code correction is rare; more often than not, bad classes are replaced.
8.1.3 Composing Quality Judgements
Software is developed using different levels of abstraction, yet quality models generally do not explicitly consider these abstractions. Our partner, however, expressed the need for quality models at various levels. In the company, IV&V team members want detailed information concerning which parts of the system need additional testing/inspection, yet managers want to use a model to get an aggregate view of quality. In order to estimate a higher-level view of quality, we performed a detailed study of different aggregation strategies that combine quality information from one level of granularity to another.
We performed a study to identify high-change classes using both method-level and class-level metrics. We tested both traditional aggregation techniques and a new importance-based approach. We found that building a simple model relying only on size (measured by the number of statements) produces the best global inspection efficiency. This model is, however, outperformed by our importance-based approach when trying to identify the most changed classes. Since an IV&V team does not inspect the totality of results and instead focuses on the riskiest classes, we showed that our approach, which uses the PageRank algorithm on statically generated call-graphs, outperforms existing aggregation techniques. We would like to note that our use of statically constructed call-graphs to approximate runtime behaviour in quality models is new.
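A rough sketch of importance-based aggregation is given below: PageRank is computed on a static call-graph, and each method's metric value is weighted by its rank before being summed per class. The weighting scheme, the function names, and the data layout are assumptions made for illustration; the exact formulation evaluated in the dissertation may differ.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over a call-graph.

    `graph` maps each caller to the list of methods it calls. Methods
    with no outgoing calls spread their rank uniformly over all methods.
    """
    nodes = set(graph)
    for callees in graph.values():
        nodes.update(callees)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not graph.get(v))
        nxt = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for caller, callees in graph.items():
            if callees:
                share = damping * rank[caller] / len(callees)
                for callee in callees:
                    nxt[callee] += share
        rank = nxt
    return rank


def class_scores(graph, method_metric, owner):
    """Aggregate a method-level metric to class level, weighted by rank.

    `method_metric` maps a method to its metric value and `owner` maps
    a method to the class that declares it (both hypothetical inputs).
    """
    rank = pagerank(graph)
    scores = {}
    for method, value in method_metric.items():
        cls = owner[method]
        scores[cls] = scores.get(cls, 0.0) + rank.get(method, 0.0) * value
    return scores
```

Sorting classes by these scores gives a ranked inspection list in which frequently reached methods contribute more than rarely reached ones, in contrast to a plain sum or average of the method-level values.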
8.1.4 Application to Web Applications
Finally, we applied our importance-based approach to the assessment of the navigability of web sites. Many modern applications are now web-based, and thus we tested our importance models to evaluate whether or not a web site can be easily navigated by a user. We adapted an existing, low-level quality model and used an aggregation strategy to evaluate the site as a whole. Our technique could successfully differentiate between random sites and good sites (winners of Webby awards). This work is one of the few empirically validated studies in the field. It also served to connect all three previous contributions: the mechanisms used support subjectivity, we showed how the model can be used to recommend improvements, and our aggregation model produces managerial-level quality estimates from code-level data.