
While the negative impact of cloning on program correctness has been stated qualitatively many times, its quantitative impact in practice, and thus its significance, remained unclear. Furthermore, while cloning in source code had been studied intensely, little was known about its extent and consequences in other software artifacts.

The following sections summarize our empirical results on the impact of cloning on program correctness and the extent of cloning in requirements specifications and Matlab/Simulink models. Then, we summarize the cost model that quantifies the impact of cloning on maintenance efforts.

10.1.1 Impact on Program Correctness

We investigated four research questions to quantify the impact of code cloning on program correctness:

RQ 1: Are clones changed independently?

Yes. About half the clone groups in the analyzed systems were type-3 clone groups and thus had differences beyond variable names and literal values. Changes to cloned code that are not performed equally to all clones hence frequently occur in practice.
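To make the distinction concrete, the following minimal sketch classifies a pair of fragments that a clone detector has already reported. It assumes the common convention that type-1 clones are identical apart from layout, type-2 clones differ only in identifiers and literal values, and type-3 clones differ beyond that. The tokenizer, the keyword set and the function names are illustrative assumptions, not the detection approach used in the studies.

```python
import re

# Simplified tokenizer (assumption): identifiers, numbers, string literals,
# and any other single non-whitespace character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\"[^\"]*\"|\S")
KEYWORDS = {"if", "else", "for", "while", "return"}  # illustrative subset


def normalize(code):
    """Replace identifiers and literals by placeholders, keep everything else."""
    normalized = []
    for token in TOKEN.findall(code):
        if token in KEYWORDS:
            normalized.append(token)
        elif re.fullmatch(r"[A-Za-z_]\w*", token):
            normalized.append("ID")
        elif token[0].isdigit() or token[0] == '"':
            normalized.append("LIT")
        else:
            normalized.append(token)
    return normalized


def clone_type(fragment_a, fragment_b):
    """Classify two fragments that are already known to be clones."""
    if TOKEN.findall(fragment_a) == TOKEN.findall(fragment_b):
        return 1  # identical token sequences, layout may differ
    if normalize(fragment_a) == normalize(fragment_b):
        return 2  # only identifiers and literal values differ
    return 3      # differences beyond identifiers and literals
```

A pair such as `x = a + 1;` and `x = a + 1; log(x);` is classified as type 3 under these assumptions, because one fragment contains an additional statement.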

RQ 2: Are type-3 clones created unintentionally?

Yes. A substantial part of the differences between the analyzed clones was unintentional. Many of the developers were thus not aware of all the existing clones when modifying code. However, the ratio of intentional to unintentional differences varied strongly between the analyzed systems, indicating differences in the amount of cloning awareness.

RQ 3: Can type-3 clones be indicators for faults?

Yes. Analysis of type-3 clones uncovered 107 faults in productive software. The ratio of type-3 clones that indicated faults, however, varied between the analyzed systems. Software with more unintentionally inconsistent changes also contained more type-3 clones that indicated faults.

RQ 4: Do unintentional differences between type-3 clones indicate faults?

10 Conclusion

Yes. About every second unintentional difference between type-3 clones indicated a fault. Lack of awareness of cloning during maintenance thus significantly impacts program correctness.

Summary The study results show that a lack of awareness of cloning is a threat to program correctness. While the analyzed systems varied in their share of unintentional differences—and thus the amount of cloning awareness among their developers—the negative impact of unintentionally inconsistent changes was uniform: about every second unintentionally inconsistent change had a direct impact on program correctness. These results thus give strong indication that awareness of cloning is crucial during software maintenance.

In addition, the study showed that awareness of cloning varies between projects—it thus cannot be taken for granted in industrial software engineering. Clone control is required to achieve and maintain awareness of cloning to alleviate the negative impact of existing clones.

10.1.2 Extent of Cloning

Besides source code, further software artifacts are created and maintained during the lifecycle of a software system: requirements specifications play a pivotal role in communication between customers, requirements engineers, developers and testers; Matlab/Simulink models are replacing code as the primary implementation artifact in embedded software systems. However, cloning has not previously been studied in these artifacts. We investigated five research questions to shed light on the extent and impact of cloning in requirements specifications and Matlab/Simulink models.

RQ 5: How accurately can clone detection discover cloning in requirements specifications?

Our clone detector ConQAT achieved high precision values for the 28 analyzed industrial requirements specifications: 85% in the worst case, 99% on average. Tailoring is, however, required to achieve such high precision. These results show that clone detection is suitable to detect cloning in requirements specifications.
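As a reminder of how such figures can be derived, the sketch below computes a per-specification precision as the share of manually assessed clone groups that were judged relevant, and then takes the worst case and the average across specifications. The specification names and counts are hypothetical illustration values, not data from the study, and the sampling procedure is an assumption.

```python
from statistics import mean


def precision(relevant, assessed):
    """Share of assessed clone groups that were judged relevant."""
    if assessed == 0:
        raise ValueError("no clone groups were assessed")
    return relevant / assessed


# Hypothetical assessment results: (relevant groups, assessed groups).
samples = {"spec_a": (17, 20), "spec_b": (20, 20), "spec_c": (19, 20)}

per_spec = {name: precision(r, n) for name, (r, n) in samples.items()}
worst = min(per_spec.values())      # worst-case precision across specifications
average = mean(per_spec.values())   # unweighted average across specifications
```

Note that the average here is unweighted: each specification contributes equally, regardless of how many clone groups it contains.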

RQ 6: How much cloning do real-world requirements specifications contain?

The amount of cloning varied substantially across the analyzed specifications. While some contained no cloning at all, others exhibited a size increase of over 100% due to cloning. The two highest clone coverage values were 51.1% and 71.6%.
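The two measures can be illustrated with a small sketch. It assumes that clone coverage is the share of a document's units (for example words or lines) that lie inside at least one clone instance, and that the size increase compares the actual size with a hypothetical redundancy-free version in which each clone group is kept only once. It further assumes that clone groups do not overlap and that all instances of a group have the same length. The function names and example data are illustrative.

```python
def clone_coverage(total_units, clone_groups):
    """Fraction of units that lie inside at least one clone instance."""
    covered = set()
    for group in clone_groups:
        for start, length in group:
            covered.update(range(start, start + length))
    return len(covered) / total_units


def blow_up(total_units, clone_groups):
    """Relative size increase over a redundancy-free version.

    Assumes non-overlapping groups whose instances share one length;
    one instance per group is kept, the others count as redundant.
    """
    redundant = 0
    for group in clone_groups:
        length = group[0][1]
        redundant += (len(group) - 1) * length
    return total_units / (total_units - redundant) - 1


# Hypothetical document of 100 units with one clone group of three
# instances, each 10 units long, given as (start, length) pairs.
groups = [[(0, 10), (30, 10), (60, 10)]]
coverage = clone_coverage(100, groups)  # 30 of 100 units are cloned
increase = blow_up(100, groups)         # 100 units versus 80 without redundancy
```

Under these assumptions, a size increase of over 100% means that more than half of the specification text is redundant.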

RQ 7: What kind of information is cloned in requirements specifications?

We discovered a broad range of different information categories present in cloned specification fragments—cloning is not limited to a specific kind of information. Consequently, clone control cannot be limited to specific categories of requirement information.

RQ 8: Which impact does cloning in requirements specifications have?

Inspections are an important quality assurance technique for requirements specifications. The cloning-induced size blow-up increases the effort required for inspections, in the worst case by an estimated 13 person-days for one of the analyzed specifications. Cloning thus increases quality assurance effort for requirements specifications.

10.1 Significance of Cloning

In addition, we saw evidence that requirement cloning can result in redundancy in the implementation. Besides corresponding source code clones, we found cases in which cloned specification fragments had been implemented independently of each other. Besides increased implementation effort, this causes behaviorally similar code that is not the result of source code copy & paste.

RQ 9: How much cloning do real-world Matlab/Simulink models contain?

The analyzed industrial Matlab/Simulink models contained a substantial amount of cloning. While the detection approach produced false positives, the developers agreed that awareness of many of the detected clones is relevant for software maintenance. Cloning thus occurs in Matlab/Simulink models and needs to be controlled during maintenance, as well.

Summary Cloning is not limited to source code, and neither is its negative impact. Cloning abounds in requirements specifications and Matlab/Simulink models; it hence needs to be controlled in them, too, to reduce the negative impact of cloning on engineering efforts.

Clone control measures are likely to differ for requirements specifications and Matlab/Simulink models, however. Limitations of the existing abstraction mechanisms are a root cause for cloning in Matlab/Simulink models. Since corresponding clones cannot easily be removed without changes to the Matlab/Simulink environment, clone control needs to focus on their consistent evolution. In contrast, for requirements specifications, no abstraction mechanism limitations hinder the clone consolidation: many of the analyzed specifications did not contain any cloning at all. Consequently, clone control for them can put more emphasis on the avoidance and removal of cloning.

10.1.3 Clone Cost Model

Besides the empirical studies, we have presented an analytical cost model that quantifies the economic effect of cloning on maintenance efforts and field faults. It can be used as a basis for assessment and trade-off decisions. The model produces a result relative to a system without cloning and thus requires substantially fewer parameters, and less instantiation effort, than general-purpose cost models that produce absolute results.

Instantiation of the cost model on 11 industrial systems indicates that the cloning-induced impact varies significantly between systems and is substantial for some. Based on the results, some projects can achieve considerable savings by performing active clone control.

Summary The cost model complements the empirical studies in two ways. First, it completes our understanding of the impact of cloning: instead of focusing on isolated aspects or activities, it quantifies its impact on all maintenance activities and thus on maintenance efforts and faults as a whole. Second, it makes our observations, speculation and assumptions explicit. This explicitness offers an objective basis for scientific discourse about the consequences of cloning.
