
This section first introduces clone coupling as an explicit criterion to evaluate relevance of clones for software maintenance. Based thereon, it introduces clone detection tailoring as a procedure to achieve accurate clone detection results. Its goal is to remove false positives—clone candidates that are irrelevant to software maintenance due to a very low coupling—from the detection results, while keeping relevant clones, to improve accuracy.

8.2.1 Clone Coupling

The fundamental characteristic of relevant clones causing problems for software maintenance is their change coupling, i. e., the fact that changes to one clone may also need to be performed to its siblings. This change coupling is the root cause for increased modification effort and for the risk of introducing bugs due to inconsistent changes to cloned code, requirements specifications or models during software maintenance.

8.2 Clone Detection Tailoring

The coupling between clone candidates has a direct impact on software maintenance efforts. If clone candidates are coupled, each change to one also needs to be performed on its siblings. Each time one clone candidate is changed, effort is required for location, consistent modification and testing of the other clone candidate(s). If the others are not modified, an inconsistency is introduced into the system. If the change was a bug fix, the unchanged clones still contain the bug. If, on the other hand, clone candidates are not coupled, a change to one never affects its siblings, requiring no additional effort for location, modification and testing.

This impact of cloning on modification effort is largely independent of other characteristics of clone candidates such as, e. g., their removability. Consequently, due to its implications for maintenance efforts, we propose to employ clone coupling as a criterion to evaluate the relevance of clone candidates for software maintenance.

8.2.2 Determining Clone Coupling

To use clone coupling as a relevance criterion, we need a procedure to determine it on real-world software systems. To be useful in practice, this procedure needs to be broadly applicable. We propose to employ developer assessments of clone candidate groups to estimate coupling, since they are not restricted to a specific system type, programming language, or analysis infrastructure. More specifically, assessors have to answer the following question:

Relevance Question 1: If you modify a clone candidate during maintenance, do you want to be informed about its siblings to be able to modify them accordingly?

This way, developers estimate whether they get a positive return on their effort to inspect the siblings when performing a modification to a clone candidate. The question partitions assessed clone candidate groups into two classes: relevant clone groups, whose expected coupling is high enough to impede software maintenance, and groups whose expected coupling is so low that they are irrelevant to software maintenance.
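The partitioning induced by Relevance Question 1 can be sketched as follows. This is a minimal illustration, not part of the method's tooling: the `CandidateGroup` type, the `answers` mapping (group id to the developer's yes/no answer) and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateGroup:
    """A clone candidate group as reported by a detector (hypothetical type)."""
    group_id: str
    files: tuple


def partition_by_relevance(groups, answers):
    """Split assessed groups by the developers' answer to Relevance Question 1.

    `answers` maps a group id to True if the developer wants to be informed
    about siblings when modifying a clone candidate of that group.
    """
    relevant = [g for g in groups if answers[g.group_id]]
    irrelevant = [g for g in groups if not answers[g.group_id]]
    return relevant, irrelevant


def sample_precision(relevant, irrelevant):
    """Fraction of assessed groups judged relevant; None for an empty sample."""
    total = len(relevant) + len(irrelevant)
    return len(relevant) / total if total else None
```

The share of relevant groups among all assessed groups then serves as an estimate of the precision of the detection results.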

8.2.3 Tailoring Procedure

The steps of the tailoring procedure are depicted in Figure 8.1. First, the quality engineer executes the clone detector with a tolerant initial configuration that aims to maximize recall. Second, developers assess coupling of the detected clone group candidates to identify false positives. Coupling is assessed on a sample of the candidate clone groups, since assessment of all clones is typically too expensive¹. All candidate clone groups classified as uncoupled are treated as false positives. If no false positives are found, clone detection tailoring is complete.

If false positives are found, the clone detector configuration needs to be adapted to reduce the number of false positives in the detection results. Which strategy is used for this typically depends on the detected false positives. The clone detector is then executed with the adapted configuration.

8 Method for Clone Assessment and Control

[Flowchart: Run clone detector → Assess clone candidates → False positives? If no: Done. If yes: Re-configure clone detector → Re-run clone detector → Compare before and after → Accuracy? If yes: Done. If no: Re-configure clone detector.]

Figure 8.1: Steps of the tailoring method

To determine the effect of the re-configuration on result quality, the quality engineer compares results before and after re-configuration. More specifically, the quality engineer inspects whether the clone groups considered relevant are still contained in the new detection results, and whether the irrelevant candidate clone groups have been removed from them. If the improvement of result accuracy is not satisfactory, re-configuration and result evaluation are repeated. If tailoring does not succeed in achieving both perfect precision and recall on the sampled candidate clones, one may be forced to make trade-offs on either precision or recall. From our experience, however, precision can be increased substantially without damaging recall, as the case study presented in Section 8.7 confirms.
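The before/after comparison amounts to simple set arithmetic over the assessed sample. The sketch below assumes that clone groups can be matched across detector runs by a stable identifier (in practice, matching would more likely rely on clone locations); the function name and the returned dictionary keys are hypothetical.

```python
def compare_runs(relevant_ids, irrelevant_ids, new_result_ids):
    """Compare an assessed sample against the results of a re-configured run.

    relevant_ids / irrelevant_ids: assessed groups from the earlier run.
    new_result_ids: groups reported after re-configuration.
    Precision and recall are computed on the assessed sample only.
    """
    relevant = set(relevant_ids)
    irrelevant = set(irrelevant_ids)
    new = set(new_result_ids)

    kept_relevant = relevant & new            # relevant groups still detected
    lost_relevant = relevant - new            # recall damage
    remaining_false_pos = irrelevant & new    # false positives not yet removed

    assessed_in_new = kept_relevant | remaining_false_pos
    precision = len(kept_relevant) / len(assessed_in_new) if assessed_in_new else None
    recall = len(kept_relevant) / len(relevant) if relevant else None
    return {
        "lost_relevant": lost_relevant,
        "remaining_false_positives": remaining_false_pos,
        "precision": precision,
        "recall": recall,
    }
```

Groups that appear in the new results but were never assessed are deliberately ignored here, since nothing is known about their relevance.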

In some cases, the majority of the candidate clone groups in the assessed sample are false positives, e. g., if the analyzed system contains a large amount of generated code. Even if they can successfully be removed in a single tailoring step, a further tailoring round may be required, since the original sample contained too few relevant clones to conclusively estimate precision. In this case, tailoring continues with another assessment (and possibly re-configuration, ...) step.
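The control flow of the tailoring procedure can be summarized in a short sketch. It is only an outline under simplifying assumptions: the detector, the developer assessment and the re-configuration are passed in as the hypothetical callables `run_detector`, `assess_sample` and `reconfigure`, results are modeled as sets of group identifiers, and `max_rounds` is an invented safeguard against endless iteration. In practice, assessment and re-configuration are manual steps.

```python
def tailor(run_detector, assess_sample, reconfigure, config, max_rounds=10):
    """Iterate re-configuration until no sampled false positive is reported.

    run_detector(config)      -> set of candidate group ids
    assess_sample(candidates) -> (relevant ids, false-positive ids) for a sample
    reconfigure(config, fps)  -> adapted configuration
    Returns (final config, lost relevant ids, remaining false positives, rounds).
    """
    candidates = run_detector(config)
    relevant, false_positives = assess_sample(candidates)
    lost_relevant = set()
    rounds = 0
    while false_positives and rounds < max_rounds:
        config = reconfigure(config, false_positives)
        candidates = run_detector(config)   # re-run with adapted configuration
        rounds += 1
        # Compare before and after on the assessed sample.
        lost_relevant = relevant - candidates
        false_positives = false_positives & candidates
    return config, lost_relevant, false_positives, rounds
```

A non-empty set of lost relevant groups signals that the re-configuration damaged recall and that a trade-off or a different strategy may be needed. The sketch does not model the additional assessment round described above for samples dominated by false positives.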

8.2.4 Taxonomy of False Positives

We give a short taxonomy of false positives based on the experiences gathered during clone detection tailoring in several industrial projects. It provides the basis for characterizing false positives, which is the prerequisite for clone detector reconfiguration.

No conceptual relationship. The clone candidates are not implementations of a common concept, so no concept change can give rise to update anomalies. Hence, no coupled changes can occur that could result in inconsistencies.


Inconsistent manual modification impossible. Although a common concept can exist in this case, consistency of coupled changes is enforced by some means. For example, clone candidates in generated code are, upon change, regenerated consistently; a compiler enforces consistency between an interface and a NullObject implementation. Hence, no inconsistencies can be introduced through manual maintenance.

Artifacts that contain clone candidates are irrelevant. If code, specifications or models are no longer used, potential inconsistencies cannot do harm, at least as long as the artifact in question remains out of use.

While the likelihood of their appearance probably differs, these classes of false positives are not limited to a specific artifact type: overly tolerant detection can find clone candidates in code, models and requirements specifications that lack similar concepts; generators are not limited to source code or models, but are also employed to generate requirements specification documents from requirements management tools, possibly replicating information.

Importantly, the categories of the above taxonomy are orthogonal to the categorization of clone types for code or models that classify them based on the syntactic nature of their differences [86, 140]: type-1 clone candidates are no more likely to be relevant than type-3 clone candidates, if the file that contains them is no longer used. The crucial information, namely that the file is no longer used, is independent of the syntactic features of the clone candidate. Consequently, we cannot expect the problem of imperfect precision to be solved through the development of better detection algorithms that improve detection for certain syntactic classes. Instead, we need to identify other features to characterize false positives to exclude them.

8.2.5 Characterizing False Positives

Successful tailoring requires the identification of features that are characteristic of (a certain set of) false positives. Once they are known, the clone detector can be configured to handle artifact fragments that exhibit these features specially. Any attributes of source code, requirements specifications or models can, in principle, be candidates for such features. Examples include: the location in the namespace or directory structure; filename or file extension patterns; implemented interfaces or super types; occurrence of specific patterns in the source code, e. g., "This code was generated by a tool"; characteristic ways of structuring, e. g., sequences of constant declarations; identifiers of methods or types; location or role in the architecture.

There is no single, canonical way to determine characteristic features. However, we found that the reasons why developers consider candidate clones irrelevant often yield clues. We give examples for code clones in the following:

Code is unused: it will not be maintained. How can such dead code be recognized? Does it carry, e. g., Obsolete annotations as commonly encountered for .NET systems, or do affected types reside in a special namespace? If not, can developers produce a list of files, directories, types or namespaces that contain unused code?

Code is not maintained by hand since it is generated and regenerated upon change. Is generated code in a special folder or does it use a special file name or extension? Does it contain a signature string of the generator? If not, can it be made to do so?


Code has no conceptual relationship: maintenance is independent. This is typically encountered if the clone detector performs overly aggressive normalization, effectively removing all traces of the implemented concepts. Code then appears similar to the detector, despite the lack of a conceptual relationship that causes change coupling. Typical examples are regions of Java getters and setters or C# properties. Which language- or system-specific patterns can be used to recognize such code regions?

Compiler prevents inconsistent modifications. Examples are interfaces and NullObject² pattern implementations of the interfaces. Both interface and NullObject contain the same methods, down to identifiers and types. However, a developer is notified by the compiler that a change to the interface must be performed on the NullObject as well. The fact that the NullObject implements the interface can be a suitable characteristic.

Similar characteristics can often be found for irrelevant clone candidates contained in requirements specifications or models. As detailed in the tailoring case study for cloning in requirements specifications presented in Chapter 5, false positives could be recognized by patterns matching their content or their surrounding text.
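Some of the characteristics discussed above for code can be expressed as simple pattern checks. The following sketch is illustrative only: the `SourceFile` type and the function name are hypothetical, and the path patterns (a `generated/` directory, a `.designer.cs` suffix) are assumptions about one possible project layout. Only the generator signature string and the `Obsolete` annotation are taken from the examples in the text.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceFile:
    path: str   # path with forward slashes (assumption)
    text: str   # file content


# Assumed project conventions; adapt to the system under analysis.
GENERATED_PATH = re.compile(r"(^|/)generated/|\.designer\.cs$", re.IGNORECASE)
# Signature string written by a generator, as in the example above.
GENERATOR_SIGNATURE = re.compile(r"This code was generated by a tool", re.IGNORECASE)
# C# attribute syntax, e.g. [Obsolete] or [Obsolete("...")].
OBSOLETE_ANNOTATION = re.compile(r"\[\s*Obsolete\b")


def false_positive_reason(source_file):
    """Return a reason why clones in this file are likely irrelevant, or None."""
    if GENERATED_PATH.search(source_file.path):
        return "generated"
    if GENERATOR_SIGNATURE.search(source_file.text):
        return "generated"
    if OBSOLETE_ANNOTATION.search(source_file.text):
        return "unused"
    return None
```

Such checks only approximate the developers' reasoning. Whether a given pattern really separates false positives from relevant clones has to be validated against the assessed sample.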

8.2.6 Clone Detector Configuration

Clone detector reconfiguration determines the success of clone detection tailoring—accuracy is only increased, if reconfigurations are well conceived. Although automation is desirable, reconfiguration is currently a manual process.

Clone detector configuration incorporates characteristics of false positives into the detection process to remove them from the results. We outline configuration strategies applicable to our clone detector ConQAT (cf., Chapter 7). Again, we give the examples for source code. Similar strategies can be applied, however, to clone detector configuration for requirements or models.

Minimum clone length prevents the detection of clone candidates that are too short to be meaningful. It has a strong impact on the results. While one-token clone candidates are not very useful, overly large values can significantly threaten recall. Still, excluding very short clone candidates is an effective strategy to increase precision without damaging recall.
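The effect of the minimum clone length on precision and recall can be illustrated on an assessed sample. The sketch below does not use ConQAT's configuration interface; the `CloneCandidate` type, the length unit (statements) and both function names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloneCandidate:
    candidate_id: str
    length: int      # clone length, here assumed to be counted in statements
    relevant: bool   # result of the developer assessment


def apply_min_length(candidates, min_length):
    """Keep only candidates that reach the minimum clone length."""
    return [c for c in candidates if c.length >= min_length]


def precision_and_recall(kept, all_candidates):
    """Precision and recall of `kept` relative to the assessed candidates."""
    relevant_total = sum(c.relevant for c in all_candidates)
    relevant_kept = sum(c.relevant for c in kept)
    precision = relevant_kept / len(kept) if kept else None
    recall = relevant_kept / relevant_total if relevant_total else None
    return precision, recall
```

Evaluating several thresholds on the same sample shows where short false positives disappear and from which value onward relevant clones start to be lost.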

Code exclusion removes source code from the detection, and thus prevents detection of clone candidates for certain code areas. ConQAT supports file exclusion based on name or content patterns. It also supports exclusion of code regions, which is crucial in environments where some regions of files are generated, whereas the remainder is hand maintained. This is, e. g., found in .NET development, where the GUI builder generated code is contained in a specific method in otherwise manually-maintained files.
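The idea behind file and region exclusion can be sketched independently of any particular detector. The code below is not ConQAT's implementation or API; the function names, the glob-style name patterns and the line-based region markers are assumptions chosen for illustration.

```python
import fnmatch
import re


def is_excluded_file(path, text, name_patterns, content_patterns):
    """Exclude a file if its path matches a glob or its content matches a regex."""
    if any(fnmatch.fnmatch(path, pattern) for pattern in name_patterns):
        return True
    return any(re.search(pattern, text) for pattern in content_patterns)


def strip_regions(text, start_pattern, end_pattern):
    """Drop all lines from a start-marker line to the next end-marker line.

    Marker lines themselves are dropped as well. Everything outside such
    regions is kept, so hand-maintained code in the same file is still analyzed.
    """
    start_re = re.compile(start_pattern)
    end_re = re.compile(end_pattern)
    kept, inside = [], False
    for line in text.splitlines():
        if not inside and start_re.search(line):
            inside = True       # region begins on this line
            continue
        if inside:
            if end_re.search(line):
                inside = False  # region ends on this line
            continue
        kept.append(line)
    return "\n".join(kept)
```

For instance, `strip_regions(text, r"#region .*generated code", r"#endregion")` would remove a marked designer region from a C# file, assuming the generated code is delimited by such markers. Note that dropping lines shifts line numbers; a real detector would need to preserve original positions to report clone locations correctly.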

Context sensitive normalization allows different notions of similarity to be applied to different code regions. This way, equal identifiers and literal values can, e. g., be required for clone candidates in stereotype or repetitive code such as variable declaration sequences, getters and setters, or select/case cascades, while at the same time differences in literals and identifiers are tolerated for clone