The existing slicing techniques based on system dependence graphs [119, 123, 154, 216] have considered C++ programs, which are only partially object-oriented in nature. In object-oriented programs, the programming complexity shifts from method interaction to object relations and communication among objects. The different dependences present in an object-oriented program need to be considered to locate the erroneous parts and to aid program comprehension, as they affect the behavior of other components of the program. In this context, it is essential to analyze thoroughly the dependences between different programming constructs and to detect the critical parts of the program. To identify these dependences among the program parts, the program must be modeled with a suitable graphical representation. This motivates us to consider Java, which is regarded as a true object-oriented programming language, for our work.
However, the existing slicing techniques cannot be applied to Java programs because of the presence of many new features that increase the dependences among the components of a Java program [44, 85, 116, 130, 198, 215]. Features such as packages, super, dynamic method dispatch, interfaces, exception handling, and multi-threading add to the list of dependences and thus make maintenance even more difficult. Their effects on the maintenance of programs need to be considered separately.
Apart from this, many methods depend on the type of data they operate upon. For each type of data, there is a different function. It is essential for the intermediate graphical representation to exhibit all such dependences for an accurate comprehension of the program. Therefore, using existing slicing techniques to slice the current system dependence graph (SDG) of Java programs does not seem suitable for regression testing. So, there is a need for a suitable graphical representation of Java programs and a new slicing algorithm that can correctly reflect the ripple effect of changes.
It is essential to validate the modifications and ensure that no other parts of the program have been affected by the change. Incremental regression testing [2, 207] is a probable solution for validating the changes. Some simple observations related to incremental regression testing are as follows: (1) If a statement is not executed under a test case, it cannot affect the program output for that test case. (2) Not all statements in the program are executed under all test cases. (3) Even if a statement is executed under a test case, it does not necessarily affect the program output for that test case. (4) Not every statement affects every part of the program output. We can apply these observations to Java programs at different levels, such as packages, classes, methods, and statements, for efficient selective regression testing. Instead of exhausting all the test cases to validate every change made to the program, it is wise to select a subset of the test cases that actually cover the affected program parts. Therefore, an efficient approach for change-based test case selection is highly necessary for the testers to build the same confidence as they would have with the retest-all approach.
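Observations (1) and (2) can be illustrated with a minimal sketch. It assumes that a coverage map from test cases to the program parts they execute is already available, and that the set of affected parts has been computed elsewhere. The class name `ChangeBasedSelector`, the method `select`, and the sample identifiers are hypothetical. The sketch is a plain coverage-intersection filter, not the slicing-based technique motivated in this work.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: names and data layout are illustrative assumptions.
final class ChangeBasedSelector {

    // Returns the test cases whose coverage intersects the affected program parts.
    static List<String> select(Map<String, Set<String>> coverage, Set<String> affected) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : coverage.entrySet()) {
            // Observation (1): a test case that executes no affected part
            // cannot have its output altered by the change, so it is skipped.
            if (!Collections.disjoint(entry.getValue(), affected)) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Assumed sample data: test case -> methods it executes.
        Map<String, Set<String>> coverage = new LinkedHashMap<>();
        coverage.put("t1", Set.of("A.foo", "A.bar"));
        coverage.put("t2", Set.of("B.baz"));
        coverage.put("t3", Set.of("A.bar", "B.baz"));
        System.out.println(select(coverage, Set.of("B.baz")));
    }
}
```

The same filter can be applied at the package, class, method, or statement level by changing what the strings in the coverage sets denote. Observations (3) and (4) are not captured here, since they require dependence information rather than coverage alone.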
For large programs, even the selected test suite can be too large for the testers to execute in full. Executing every selected test case for every change may still lead to missed project deadlines and incur huge cost. This requires further minimization of the test suite. Test suite minimization techniques [91, 134, 146] therefore aim to remove redundant and obsolete test cases from the regression test suite such that the coverage achieved after reduction is still the same as that of the initial test suite. Previous work on test suite minimization [28, 57, 105, 131, 208] aimed at developing heuristics for the minimization problem. According to the survey in [210], no single heuristic is better than the others, because a test case selected by one heuristic may be redundant under another. Thus, finding a heuristic that is more relevant to the change in the program is essential to save the cost of regression testing. Cohesion of the affected components can be computed as an effective indicator of change(fault)-proneness [8, 9, 55, 213], thus making it a suitable heuristic for minimizing the test suite. Therefore, the minimization problem with respect to regression testing should aim to find the essential test cases concerning the change.
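The coverage-preserving goal of minimization can be sketched with the classic greedy set-cover heuristic. This is a generic heuristic chosen for illustration, not the cohesion-based heuristic motivated above. The class name `GreedyMinimizer`, the method `minimize`, and the requirement identifiers are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of greedy test suite minimization (set-cover heuristic).
final class GreedyMinimizer {

    // Returns a subset of test cases covering every requirement covered by the full suite.
    static List<String> minimize(Map<String, Set<String>> coverage) {
        // Start with every requirement that the initial suite covers.
        Set<String> uncovered = new HashSet<>();
        for (Set<String> requirements : coverage.values()) {
            uncovered.addAll(requirements);
        }
        List<String> kept = new ArrayList<>();
        while (!uncovered.isEmpty()) {
            String best = null;
            int bestGain = 0;
            for (Map.Entry<String, Set<String>> entry : coverage.entrySet()) {
                int gain = 0;
                for (String requirement : entry.getValue()) {
                    if (uncovered.contains(requirement)) {
                        gain++;
                    }
                }
                // Strict comparison: ties go to the test case seen first.
                if (gain > bestGain) {
                    bestGain = gain;
                    best = entry.getKey();
                }
            }
            // best is never null here: each uncovered requirement came from some test case.
            kept.add(best);
            uncovered.removeAll(coverage.get(best));
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed sample data: test case -> coverage requirements it satisfies.
        Map<String, Set<String>> coverage = new LinkedHashMap<>();
        coverage.put("t1", Set.of("r1", "r2"));
        coverage.put("t2", Set.of("r2", "r3"));
        coverage.put("t3", Set.of("r1", "r2", "r3"));
        coverage.put("t4", Set.of("r4"));
        System.out.println(minimize(coverage));
    }
}
```

The greedy choice does not guarantee a minimum-size suite, which is one reason different heuristics can disagree on which test cases are redundant.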
The empirical studies in [61, 67, 175, 176, 210] suggest that the order in which test cases are executed plays a vital role in detecting faults early in the testing process. Early feedback on the presence of faults enables the testers to locate the bugs early. It also indicates to the testers which test cases should be exercised first in case testing has to be halted prematurely. Thus, test case prioritization [66, 69, 104, 108, 113, 148, 162, 163, 175, 177, 184, 214] finds a schedule for the test cases such that, if executed in that sequence, the test suite maximizes its effectiveness in meeting some performance goals. Performance goals are the criteria set by the testers based on their expertise and intuition. For example, some performance goals can be to maximize code coverage, branch coverage, MC/DC [78], frequency of features coverage, etc. One of the popular performance goals is the rate of fault detection. However, the fault detection ability of the test cases cannot be known a priori. Therefore, testers rely on surrogates to overcome the difficulty of knowing which test case has a higher ability to detect faults [67]. The assumption is that early maximization of the surrogate property will enhance the likelihood of fault detection. In many empirical studies [10, 15, 29, 35, 36, 47, 65, 117, 150], coupling measures have been shown to have a strong correlation with fault-proneness. But none of the empirical studies on prioritization approaches [34, 58, 61, 67, 74, 79, 151, 176, 184, 210, 213] reports the use of coupling measures to prioritize the test cases.
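Surrogate-based prioritization can be sketched as a simple ordering by score. The sketch assumes that each component carries a numeric weight standing in for a fault-proneness surrogate (a coupling value, for instance) and scores a test case by the sum of the weights of the components it covers. The weighting scheme, the class name `SurrogatePrioritizer`, and the sample values are hypothetical and are not taken from the cited studies.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: orders test cases by a surrogate score, highest first.
final class SurrogatePrioritizer {

    static List<String> prioritize(Map<String, Set<String>> coverage,
                                   Map<String, Double> weight) {
        // Score of a test case = sum of the surrogate weights of the components it covers.
        Map<String, Double> score = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : coverage.entrySet()) {
            double sum = 0.0;
            for (String component : entry.getValue()) {
                sum += weight.getOrDefault(component, 0.0);
            }
            score.put(entry.getKey(), sum);
        }
        List<String> order = new ArrayList<>(coverage.keySet());
        // Descending by score; List.sort is stable, so ties keep the map's iteration order.
        order.sort((a, b) -> Double.compare(score.get(b), score.get(a)));
        return order;
    }

    public static void main(String[] args) {
        // Assumed sample data: test case -> covered components, component -> surrogate weight.
        Map<String, Set<String>> coverage = new LinkedHashMap<>();
        coverage.put("t1", Set.of("C"));
        coverage.put("t2", Set.of("A", "B"));
        coverage.put("t3", Set.of("B"));
        Map<String, Double> weight = Map.of("A", 3.0, "B", 1.0, "C", 0.5);
        System.out.println(prioritize(coverage, weight));
    }
}
```

If testing has to stop early, the test cases at the front of the returned list are the ones that exercise the components the surrogate considers most fault-prone.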
Although testing is carried out to discover as many faults as possible and thereby confirm the quality of the software, testers are sometimes conditioned to fail. The testers may not have the liberty of exhaustively retesting every change made to the program under looming time and cost constraints (due to project deadlines, customer impatience, market pressure, etc.). Therefore, the tester needs to test less without sacrificing quality [98]. Under such circumstances, the tester needs to decide whether it is always possible and necessary to validate every change through an equal amount of retesting. The answer follows intuitively from the Pareto principle, which suggests that not all changes require the same amount of retesting. Thus, the tester has to decide what to test and what not to test, what to test more and what to test less, and also in what order to test. The questions of what to test, what not to test, and in what order to test are answered through regression test case selection and prioritization. But the answer to what to test more and what to test less can come only through some mechanism that can quantify the effect of a change. This requires metrics to be proposed and defined with respect to the changes that are made to the program. To the best of our knowledge, no such metrics have been defined or proposed in the existing literature [1, 17, 76, 77, 96, 121, 126, 168, 169, 178, 180, 192] on change impact analysis.
Thus, the motivations behind our research work on program slicing-based change impact analysis and its application to regression testing are summarized below:
• The features of a Java program add more complexity by inducing many more dependences among its program parts [198]. Thus, the existing graph-based regression test case selection techniques are not suitable for Java programs. So, there is a pressing need to develop suitable techniques for regression testing of Java programs.
• Many of the regression testing techniques are unsafe, imprecise, and computationally expensive [27, 172]. Very few techniques considering Java programs focus on safe regression test case selection [90, 188]. A safe technique selects those test cases that have high probability of revealing faults. Therefore, it is essential to develop a safe regression test case selection technique concerning Java programs.
• The empirical studies [8, 10, 16, 34, 55, 56, 79, 99, 117, 150, 151, 213] demonstrate the correlation of cohesion measures [7, 14, 40, 46, 82, 118, 149, 218, 219] with the fault-proneness of the components, making cohesion a suitable candidate for a test suite minimization heuristic. So, a new minimization approach based on the changes and their impact, using the cohesiveness of the affected components, is desirable to give a concrete solution.
• The existing coupling measures [15, 36] proposed for object-oriented programs are not concerned with the change and its impact. Therefore, a change-based coupling measure, if used for prioritizing the test cases, promises a better likelihood of fault detection.
• It is necessary to test less without sacrificing quality [98]. Unfortunately, no metrics have been suggested that can quantify the effect of a change to help the tester decide what to test more and what to test less. The presence of dependence communities [83] among the changes made to a program can help the testers save regression testing time.