Change Impact Analysis - Information Needs for Change Characterization


3.5.3 Change Impact Analysis

Integrating a change into a system requires a prior assessment of the potential impact of that change on the system’s behavior. This assessment is particularly needed when a change made in one branch is merged into another branch that differs in purpose or design, because the behavior of the target branch can then be greatly affected.

Impact analysis is among the major issues related to software change management. Understanding the impact of changes has been considered in areas including software maintenance, program refactoring [Mens 2004] and test prioritization [Elbaum 2000].

Program slicing approaches. Program slicing [Weiser 1981] provides an in-depth analysis of the impact of changed code. Several approaches, such as CodeSurfer [Anderson 2001], use program slicing to understand which parts of a program are impacted by a variable, and which parts of a program can impact a particular variable or, more generally, a given source-code element [Bohner 1996, Tip 1995, Weiser 1984]. In the context of software maintenance, such program slicing techniques have been widely adopted [Gallagher 1991].

Conceptual Framework. Ryder and Tip [Ryder 2001] proposed a conceptual change impact analysis for object-oriented programs in terms of affected regression or unit tests [Elbaum 2003]. The authors introduce techniques that determine the tests affected by a set of changes, and the subset of changes responsible for the failure of each affected test. They propose to perform the change analysis on (coarse-grained) atomic changes extracted from source code edits (e.g., added empty class, added empty method) and on the syntactic dependencies between them, which impose an order on the atomic changes (e.g., a method m can be added to class X only if that class exists).
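The ordering constraint between atomic changes can be illustrated with a minimal sketch. The change names and the dictionary layout below are assumptions made for illustration only; they are not the notation of [Ryder 2001]. Each atomic change lists the changes it syntactically depends on, and a topological sort yields an order in which the changes can be applied.

```python
from graphlib import TopologicalSorter

# Hypothetical atomic changes (illustrative names, not the authors' notation).
# Each key maps to the set of changes that must be applied before it.
dependencies = {
    "add empty class X": set(),
    "add empty method X.m": {"add empty class X"},
    "change body of X.m": {"add empty method X.m"},
}

def application_order(deps):
    """Return the atomic changes in an order that respects their prerequisites."""
    return list(TopologicalSorter(deps).static_order())

order = application_order(dependencies)
print(order)
```

A cyclic set of dependencies would make `static_order()` raise `graphlib.CycleError`, which is a convenient way to detect an ill-formed set of atomic changes in such a model.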

Chianti. Ren et al. [Ren 2004] proposed Chianti as the implementation and extension of the proposal of Ryder and Tip [Ryder 2001]. Chianti is a (semantic) change impact analysis tool that decomposes the difference between two versions of a Java program into a set of extended atomic changes (e.g., changed definition of an instance initializer) and their interdependences. The decomposition uses a pairwise comparison of the abstract syntax trees of the classes in both versions. Change impact analysis is performed on the (dynamic) call graphs derived from a set of regression or unit tests applied to both versions. Chianti determines the affected tests, whose behavior has been modified by the applied changes, and the affecting changes for each affected test. Chianti is one of a large category of approaches related to test prioritization [Elbaum 2000], and serves as the foundation for several approaches that we explain below.
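The two notions of affected tests and affecting changes can be sketched as follows. This is a simplified illustration and not Chianti's algorithm: call graphs are reduced to the set of methods each test reaches, every atomic change is attached to a single method, and all names (`affected_tests`, `affecting_changes`, the sample data) are hypothetical.

```python
# Hypothetical atomic changes, each attached to the method it touches.
changes = {
    "CM(A.foo)": "A.foo",   # changed method body
    "AM(A.bar)": "A.bar",   # added method
    "CM(B.baz)": "B.baz",   # changed method body
}

# Simplified call graphs: test name -> set of methods reached by the test.
old_call_graphs = {"test1": {"A.foo"}, "test2": {"C.qux"}}
new_call_graphs = {"test1": {"A.foo", "A.bar"}, "test2": {"C.qux"}}

def affected_tests(old_graphs, changes):
    """Tests that reach, in the original version, a method touched by a change."""
    touched = set(changes.values())
    return {test for test, methods in old_graphs.items() if methods & touched}

def affecting_changes(test, new_graphs, changes):
    """Changes attached to methods that the test reaches in the edited version."""
    reached = new_graphs[test]
    return {name for name, method in changes.items() if method in reached}

affected = affected_tests(old_call_graphs, changes)
affecting = {t: affecting_changes(t, new_call_graphs, changes) for t in affected}
print(affected, affecting)
```

In this toy data, `test2` reaches no touched method and is therefore not affected, while `CM(B.baz)` affects no test at all.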

Crisp. Chesley et al. [Chesley 2005] presented Crisp, a tool that assists developers in isolating the relevant subsets of changes that directly cause the failure of a regression test. Crisp leverages and augments Chianti [Ryder 2001] to detect failure-inducing changes between two versions of a Java program. It lets developers build compilable intermediate versions of a program by applying subsets of the changes to the original version, in order to locate the exact cause of the failure. This approach was improved in [Ren 2006], where the original dependence relationships were refined into three kinds of dependences between atomic changes that capture syntactic and partially semantic dependences: (a) structural dependences capture the necessary sequences that occur when new elements are added or deleted in a program; (b) declaration dependences capture all the element declarations that are required to create a valid intermediate version; and (c) mapping dependences are implicit dependences introduced by atomic changes, such as method overloading, that may affect the behavior of a method even though no textual change occurs within that method.
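The idea of building a valid intermediate version from a partial set of changes can be sketched as a closure over the dependences. This is a minimal sketch under stated assumptions: the change names, the `DEPENDENCES` table and the function `intermediate_version` are hypothetical, and the dependence kinds are only carried as labels.

```python
# Hypothetical dependence table: change -> list of (kind, prerequisite change).
DEPENDENCES = {
    "add class A": [],
    "add method A.helper": [("structural", "add class A")],
    "change body of B.run": [("declaration", "add method A.helper")],
    "add field B.count": [],
}

def intermediate_version(selected, dependences):
    """Close the selected changes under their prerequisites.

    The result is the smallest superset of `selected` in which every
    change is accompanied by all the changes it depends on.
    """
    result, pending = set(), list(selected)
    while pending:
        change = pending.pop()
        if change in result:
            continue
        result.add(change)
        pending.extend(prereq for _kind, prereq in dependences.get(change, []))
    return result

version = intermediate_version({"change body of B.run"}, DEPENDENCES)
print(sorted(version))
```

Selecting only the body change of `B.run` pulls in the method and class it relies on, while the unrelated field addition stays out of the intermediate version.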

JUnit/CIA. Stoerzer et al. [Stoerzer 2006] proposed a change classification tool, JUnit/CIA, that helps programmers find failure-inducing changes (between two versions of a Java program) according to the tests that the changes affect. They rely on the change impact analysis tool Chianti [Ryder 2001] to extract atomic changes, affected tests and affecting changes. Then, they classify changes as Red (changes that are highly likely to be the reason for the test failures), Yellow (changes that are possibly problematic), or Green (changes that are correlated with successful tests) according to five classifiers. These classifiers are based on the JUnit test result model (pass, fail, crash). To account for coverage gaps, their change classification techniques also classify changes that do not affect any test as Grey.
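The colour scheme can be illustrated with a single, deliberately simple rule. This rule is an assumption made for illustration and is not one of the five classifiers of [Stoerzer 2006]: a change is Green if every test it affects passes, Red if none of them passes, Yellow if the outcomes are mixed, and Grey if it affects no test.

```python
def classify(change, affected_tests, results):
    """Classify a change from the outcomes of the tests it affects.

    `affected_tests` maps a change to the tests it affects, and `results`
    maps a test to "pass", "fail" or "crash" (hypothetical data layout).
    """
    outcomes = {results[test] for test in affected_tests.get(change, ())}
    if not outcomes:
        return "Grey"      # no test exercises this change
    if outcomes == {"pass"}:
        return "Green"     # only correlated with successful tests
    if "pass" not in outcomes:
        return "Red"       # every affected test fails or crashes
    return "Yellow"        # mixed outcomes

affected_tests = {"c1": ["t1", "t2"], "c2": ["t3"], "c3": ["t1", "t3"], "c4": []}
results = {"t1": "fail", "t2": "crash", "t3": "pass"}
labels = {c: classify(c, affected_tests, results) for c in affected_tests}
print(labels)
```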

Celadon. Zhang et al. [Zhang 2008] proposed the tool Celadon, a change impact analysis framework that uses an atomic change representation to capture the semantic differences between two versions of an AspectJ program. They built a change impact model based on static AspectJ call graphs to determine the affected program fragments, affected tests and their responsible affecting changes. The change impact model relies on the computation of atomic changes and their inter-dependence relationship. The authors defined a catalogue of 21 types of atomic changes (e.g., changed pointcut body) for AspectJ programs. They extended the concept of atomic changes proposed in [Ryder 2001] to aspects.

Safe-commits. Wloka et al. [Wloka 2009] presented Safe-commits, an analysis-based algorithm that identifies committable changes, i.e., changes that can be submitted early without causing existing tests in the repository to fail, even when failing tests exist in a developer’s local code base. The idea is to decrease the time interval between commits by establishing three commit policies (Restrictive, Moderate, and Permissive) that depend on the test result model and enable developers to release their changes often. This approach relies on the data generated by Chianti [Ren 2004]. Safe-commits takes into account all atomic changes, the affected tests (according to a commit policy), the changes exercised by each test (changes used by a test), and the changes covered by each test (the changes exercised by a test together with their dependencies). The output of the algorithm is a set of changes that do not break existing tests and can be committed.

Reuse Contracts. Lucas and Steyaert [Lucas 1997, Steyaert 1996] introduced reuse contracts as an object-oriented methodology to assist software engineers in understanding how a component can be reused, adapting components to particular needs, and estimating the impact of changes. With reuse contracts, a component is reused on the basis of an explicit contract between the provider of the component and a reuser that modifies this component. The provider documents how the component can be reused, and the reuser documents how the component is reused or how the component evolves. The contract clauses make it possible to detect the impact of changes, and the actions the reuser must undertake to upgrade when a component has evolved. Reuse contracts help in keeping the model of the provider consistent with the model of the reuser. They were used at the implementation level to express reuse in evolvable class inheritance hierarchies [Steyaert 1996], and the reuse and evolution of collaborating classes [Lucas 1997].

Other approaches

Kung [Kung 1994] categorized various types of changes in object-oriented systems and proposed a formal model for capturing and inferring the impact of class library changes in order to identify affected components. Several techniques aim at identifying the so-called fragile base class problem [Mikhajlov 1998], which arises when changes in a framework have unexpected impacts on framework instantiations. For example, changing the visibility of a method from protected to private may break classes that extend the framework.

Han [Han 1997] proposed an approach to support impact analysis and change propagation in software engineering environments. This approach focuses on how the system reacts to a change. It uses the environment’s representation of the artifacts (variables, methods, classes) that can be impacted and of their dependencies (association, aggregation, inheritance, and invocation). Later, Abdi et al. [Abdi 2006] reused this work for their change impact analysis. Chaumun et al. [Chaumun 2002] defined a class-based change impact model. Their approach is similar to the previous ones, but the model is more complete and systematic. The impact of a change is calculated to ensure that the system will still run correctly after the change is implemented.

Alam et al. [Alam 2009] used change dependency graphs [German 2009] to examine how changes build on each other over time and to determine the impact of these changes on the quality of a project. The authors showed that time dependences vary across projects and throughout the lifetime of each project. They also found that changes built on top of new code (instead of on stable code) are more defect-prone.

Herzig [Herzig 2010] introduced the concept of transaction dependency graph based on the notion of change genealogies defined by Brudaru and Zeller [Brudaru 2008]. This approach is similar to the change impact graphs of German et al. [German 2009], but differs in that it considers version control transactions instead of atomic changes to define multiple dependency metrics on these change genealogy graphs.

Law and Rothermel [Law 2003] defined a new technique – PathImpact – for impact analysis based on dynamic information obtained through simple program instrumentation. They execute a program with a set of inputs, collecting compressed traces for those inputs, and using the traces to predict impact sets. PathImpact does not rely on availability of program source code and does not require static dependency calculations.
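The trace-based idea can be sketched on a plain, uncompressed trace. This is a simplified illustration and not the published PathImpact algorithm, which operates on compressed traces: the trace encoding (method names for entries, `"r"` for returns) and the function `path_impact` are assumptions. A changed method is taken to impact the methods entered after it executes, and the methods on the call stack that execution returns into.

```python
def path_impact(trace, changed):
    """Estimate the methods dynamically impacted by `changed` in one trace.

    `trace` is a list of events: a method name marks a method entry and
    the string "r" marks a return (hypothetical encoding).
    """
    impacted, stack = set(), []
    seen_change = False
    for event in trace:
        if event == "r":
            if stack:
                stack.pop()
            continue
        stack.append(event)
        if event == changed:
            seen_change = True
            impacted.update(stack)  # the change itself and the callers it returns into
        elif seen_change:
            impacted.add(event)     # methods entered after the change executed
    return impacted

trace = ["M", "B", "r", "A", "C", "D", "r", "E", "r", "r", "r"]
impact = path_impact(trace, "A")
print(sorted(impact))
```

Here `B` returns before `A` is ever entered, so it is left out of the estimated impact set of `A`.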

Badri et al. [Badri 2005] propose a change impact analysis based on a call graph for making impact predictions. This technique restricts the scope of the analysis by only considering methods within the reachable paths of the call graph. Abdi et al. [Abdi 2009] propose a technique based on a probabilistic model, where a Bayesian network is used to analyze the impact of a given scenario.
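Restricting the analysis to paths of the call graph can be sketched as a reachability computation. This is a minimal sketch under assumptions, not the technique of [Badri 2005] itself: the call graph is a hypothetical dictionary from caller to callees, and the predicted impact set of a changed method is taken to be the method itself plus every method that can transitively call it.

```python
from collections import deque

def impacted_callers(call_graph, changed):
    """Return `changed` plus all methods that can transitively call it."""
    # Invert the caller -> callees map into callee -> callers.
    callers = {}
    for caller, callees in call_graph.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    impacted, queue = {changed}, deque([changed])
    while queue:
        method = queue.popleft()
        for caller in callers.get(method, ()):
            if caller not in impacted:
                impacted.add(caller)
                queue.append(caller)
    return impacted

# Hypothetical call graph: caller -> list of callees.
call_graph = {
    "main": ["parse", "render"],
    "parse": ["tokenize"],
    "render": ["layout"],
    "layout": [],
    "tokenize": [],
}
predicted = impacted_callers(call_graph, "tokenize")
print(sorted(predicted))
```

Methods such as `render` and `layout`, which lie on no path to the changed method, are excluded from the prediction.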

Discussion

Several approaches based on the conceptual framework presented by Ryder et al. [Ryder 2001] and extending Chianti [Ren 2004] have proposed change impact analysis using an atomic representation of changes and the notion of dependencies between these changes. They have been used to identify failure-inducing changes, to capture semantic differences in the context of aspect-oriented programming, and to identify changes that can be safely committed in order to decrease the time interval between commits. They make use of a change representation and of dependencies, as we intend to do. However, they can only compare two versions of a program: they support neither the history nor streams of changes in the presence of branches.

Reuse Contracts have been proposed for understanding how components can be reused, adapting components and estimating the impact of changes. Other approaches have been applied in several contexts: to infer the impact of class library changes, to calculate the impact of changes and support change propagation, to examine how changes build on each other and determine the impact in terms of quality, to perform dynamic impact analysis, and to predict the impact of changing methods using call graphs.

While extensive and varied work exists on change impact analysis, none of these approaches is dedicated to assessing the impact of changes as a means of supporting integrators in deciding which changes should be integrated, in particular when cherry picking changes between multiple branches that may have substantially evolved apart. These approaches are complementary to our work and form a good foundation for providing change impact analysis to assist the integration process.

In document Supporting Integration Activities in Object-Oriented Applications (Page 70-74)