Combining Static and Dynamic Impact Analysis for
Large-scale Enterprise Systems
The 15th International Conference on Product-Focused Software Process Improvement, Helsinki, Finland.
Wen Chen, Alan Wassyng, Tom Maibaum
McMaster Centre for Software Certification (McSCert) Department of Computing and Software
McMaster University Hamilton, Ontario, Canada
Outline
1 Large-scale Enterprise Systems
Introduction
Characteristics
Changes Are Inevitable
2 Conventional Impact Analysis
Introduction
Static Analysis
Dynamic Analysis
3 Combining Static and Dynamic Analysis
4 The Approach at a Glance
Large-scale Enterprise Systems
Introduction
Enterprise systems (ES) are large-scale application software packages that support business processes, information flows, reporting, and data analytics in complex organizations.
Types of ES include, but are not limited to:
Enterprise Resource Planning (ERP) Systems
Customer Relationship Management (CRM) Systems
Supply Chain Management (SCM) Systems
Large-scale Enterprise Systems
Characteristics
Scalable. Complex. Critical. Costly.
Example
Total number of MODULES in SAP ERP: 241.
Total number of CLASSES in Oracle E-Biz: 230 thousand.
Total number of METHODS in Oracle E-Biz: 4.6 million.
Large-scale Enterprise Systems
Changes Are Inevitable
System upgrade
User requirement change
Environment change
Performance issue
Other customized changes
The latest IT Key Metrics Data from Gartner (Gartner, 2011) report that in 2011 some 16% of application support activity was devoted to technical upgrades, rising to 24% in banking and financial services.
A well-defined change impact analysis is required to:
reduce risks of unintended changes
reduce costs
minimize human effort
focus testing
Software Change Impact Analysis
Introduction
Software Change: Operations {add, modify, delete, ...} on software entities {function, field, logic, module, database objects, ...}
Change Impact Analysis: Estimates WHAT will be affected in software and related documentation if a proposed software change is made (Bohner, 1996).
Software Change Impact Analysis
Static Analysis
Static analysis aims to
identify a subset of affected elements of the program by analysing the code
abstract all possible software behaviors as graphs (call graph, dependency graph, ...) or other static representations
Static analysis is safe and complete, but it often produces impact sets that are too large because of overly conservative assumptions: the actual dependencies may turn out to be considerably smaller than the possible ones. Additionally, it usually requires a long execution time.
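As a rough illustration of why static impact sets are safe but large, the sketch below runs a reverse search over a toy call graph. The graph, the function names, and the helper `static_impact_set` are hypothetical and are not taken from the authors' tool; the point is only that every function with some path to a changed function is reported, whether or not that path is ever executed.

```python
from collections import deque


def static_impact_set(call_graph, changed):
    """Return every function that can reach a changed function."""
    # Invert the caller -> callees map into callee -> callers.
    callers = {}
    for caller, callees in call_graph.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    # Breadth-first search backwards from the changed functions.
    impacted = set(changed)
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for caller in callers.get(node, ()):
            if caller not in impacted:
                impacted.add(caller)
                queue.append(caller)
    return impacted


# Hypothetical call graph (caller -> callees).
call_graph = {
    "main": ["checkout", "report"],
    "checkout": ["price", "tax"],
    "report": ["format"],
    "price": [],
    "tax": ["round_half_up"],
    "format": [],
    "round_half_up": [],
}

impacted = static_impact_set(call_graph, {"round_half_up"})
print(sorted(impacted))  # ['checkout', 'main', 'round_half_up', 'tax']
```

Here `checkout` and `main` are reported even if the `tax` branch never runs in practice, which is the over-estimation described above.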
Software Change Impact Analysis
Dynamic Analysis
Dynamic analysis aims to
identify a subset of affected elements of the program by analysing runtime information
collect dynamic information such as event traces, test coverage, and executions in the field
Dynamic analysis is precise and efficient, but its results are often incomplete because it under-estimates impacts: code that was not executed while data was collected is never reported.
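The under-estimation can be seen in a small sketch that derives an impact set from a recorded execution trace. The trace format (`enter`/`exit` events) and the helper `dynamic_impact_set` are assumptions made for illustration, not the format used by the authors' instrumentor.

```python
def dynamic_impact_set(trace, changed):
    """Collect functions that were on the call stack when a changed function ran."""
    stack = []
    impacted = set()
    for event, func in trace:
        if event == "enter":
            stack.append(func)
            if func in changed:
                # Every active caller actually led to the changed code.
                impacted.update(stack)
        elif event == "exit":
            stack.pop()  # assumes the trace is well nested
    return impacted


# Hypothetical trace of one test run; no tax-related code is exercised.
trace = [
    ("enter", "main"),
    ("enter", "checkout"),
    ("enter", "price"),
    ("exit", "price"),
    ("exit", "checkout"),
    ("enter", "report"),
    ("exit", "report"),
    ("exit", "main"),
]

observed = dynamic_impact_set(trace, {"price"})
missed = dynamic_impact_set(trace, {"round_half_up"})
print(sorted(observed))  # ['checkout', 'main', 'price']
print(sorted(missed))    # []
```

The second result is empty only because the run never reached `round_half_up`; a different workload could still be affected by a change to it.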
Combining Static and Dynamic Analysis
Aspect-oriented programming (AOP)
“The hierarchical modularity mechanisms in object-oriented languages are extremely powerful, but they are inherently unable to modularize all concerns of interest in complex systems.” (Kiczales et al., 2001)
“Aspect-oriented programming (AOP) does for concerns that are naturally crosscutting what OOP does for concerns that are naturally hierarchical, it provides language mechanisms that explicitly capture crosscutting
Combining Static and Dynamic Analysis
AspectJ
AspectJ adds to Java a new concept, the join point, and some constructs:
Pointcuts pick out certain join points in the program flow;
After pointcuts pick out join points, we use advice to implement crosscutting behaviour. Advice brings together a pointcut (to pick out join points) and a body of code (to run at each of those join points);
Inter-type declarations in AspectJ are declarations that cut across classes and their hierarchies. They may declare members that cut across multiple classes, or change the inheritance relationship between classes;
Aspects are defined much like classes: they wrap up pointcuts, advice, and inter-type declarations in a modular unit of crosscutting implementation.
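For readers without an AspectJ toolchain at hand, the pointcut-plus-advice idea can be approximated in plain Python. This is only an analogue: the names `before_advice`, `weave`, and `Invoice` are hypothetical, and real AspectJ weaves advice into Java bytecode rather than patching objects at run time.

```python
import functools

TRACE = []  # trace records collected by the advice


def before_advice(func):
    """Wrap func so that a trace record is written before each call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        TRACE.append(func.__qualname__)
        return func(*args, **kwargs)
    return wrapper


def weave(cls, pointcut):
    """Apply the advice to every method of cls whose name the pointcut selects."""
    for name, attr in list(vars(cls).items()):
        if callable(attr) and pointcut(name):
            setattr(cls, name, before_advice(attr))
    return cls


class Invoice:
    def total(self, amounts):
        return sum(amounts)

    def _audit(self):
        return "internal"


# The "pointcut" here selects public methods only.
weave(Invoice, pointcut=lambda name: not name.startswith("_"))

invoice = Invoice()
result = invoice.total([10, 20])
invoice._audit()
print(result, TRACE)  # 30 ['Invoice.total']
```

The class body itself is untouched; the tracing concern lives entirely in the wrapper, which mirrors how an aspect keeps crosscutting behaviour out of the application code.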
Combining Static and Dynamic Analysis
AspectJ Example Output Sample
Combining Static and Dynamic Analysis
Benefits
integrates with our safe static analysis (Chen et al., 2013);
provides precise estimation of impacts;
works at bytecode level;
does not alter system behaviour in any way;
saves effort in learning the application logic.
The Approach at a Glance
Analysis Overview
[Figure: analysis workflow. Labels: Enterprise System; Atomic Changes (AC); Change Analysis; Changes (C); Static Analysis; Access Dependency Graph; Reverse Search; Static Impacts (S); Dynamic Analysis; Dynamic Impacts (D); Potential False-Positives (PO); Reachability Analysis; Alias Analysis; Impact Set (I); operations: subtract, union; legend: input, output]
Steps in our approach include (Chen et al., 2013; Chen, Wassyng, & Maibaum, 2014):
(i) Static analysis to abstract a representation of the target program P. A full dependency graph G is built for the system at the function level.
(ii) Change analysis to identify direct and indirect changes. The identification of indirect changes may require string analysis.
(iii) A graph search algorithm is employed to extract a static impact set S. The static impact set S is conservative but safe; the later steps remove false positives from within this set.
(iv) Instrumenting the program P to collect a dynamic impact set D. The dynamic impact set D contains real execution information, so its elements should be kept in the static impact set S.
(v) Reachability analysis to filter out other unidentified paths in dynamic analysis that are false positives. Paths taken into account in this step are those that haven’t been executed in dynamic analysis but have the potential of reaching a direct/indirect change. Paths filtered out in this analysis are considered as infeasible paths (mis-matched calls and returns).
(vi) Pointer/aliasing analysis to further filter out unidentified paths. If there isn’t any variable along a particular path aliased to any variable within a changed method, this path can be regarded as a false positive. Different from the infeasible paths identified in reachability
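One way to read how the sets fit together is sketched below. The exact wiring is an assumption inferred from the step descriptions and the labels in the overview figure (S, D, PO, subtract, union), and the two predicates are hypothetical stand-ins for the reachability and alias analyses, which are far more involved in practice.

```python
def combine_impacts(static_impacts, dynamic_impacts, is_infeasible, has_no_alias):
    """Combine the static set S and the dynamic set D into a final impact set.

    is_infeasible and has_no_alias are placeholder predicates standing in for
    the reachability analysis (step v) and the alias analysis (step vi).
    """
    # Elements seen at run time are kept unconditionally.
    confirmed = static_impacts & dynamic_impacts
    # Potential false positives: statically impacted but never observed.
    potential_fp = static_impacts - dynamic_impacts
    # Drop candidates that either filter marks as a false positive.
    survivors = {
        element
        for element in potential_fp
        if not is_infeasible(element) and not has_no_alias(element)
    }
    return confirmed | survivors


# Hypothetical data for illustration only.
S = {"main", "checkout", "tax", "report", "batch_job"}
D = {"main", "checkout"}
infeasible = {"report"}      # would be found by reachability analysis
unaliased = {"batch_job"}    # would be found by alias analysis

impact_set = combine_impacts(
    S, D,
    is_infeasible=lambda e: e in infeasible,
    has_no_alias=lambda e: e in unaliased,
)
print(sorted(impact_set))  # ['checkout', 'main', 'tax']
```

In this reading `tax` stays in the result even though it was never executed, because neither filter could rule it out; that is what preserves the safety of the static analysis.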
Empirical Study
Target system: Oracle E-Business Suite Version 11.5
Source of changes: Oracle patch # 5565583, 10107418, 14321241
Objective: identify the impact set of the patches
Physical environment: Quad core 3.2GHz CPU, 32GB RAM, 64-bit Red Hat Linux Enterprise version
Empirical Study
Cont’d
Oracle E-Business Suite V11.5:
Number of classes: 195’999
Number of entities (functions and fields): 3’157’947
The patches affect both the application tier and the database tier.
Patches             Size     Number of direct changes
Patch # 5565583     212MB    52’870
Patch # 10107418    10KB     0
Patch # 14321241    99MB     25’114
Empirical Study
Empirical Results
Oracle E-Biz                                                             Numbers
Classes                                                                  195’999
Entities                                                                 3’157’947
Static dependencies                                                      18’387’466
Dynamic dependencies                                                     8’200
Reduced dependencies after reachability analysis and aliasing analysis
Empirical Study
Empirical Results
Patches                     5565583    10107418   14321241
Size                        212MB      10KB       99MB
Number of direct changes    52’870     0          25’114
Affected functions          699’534    0          230’209
Affected functions %        22%        0%         7.3%
Affected top functions      160’800    0          69’971
Affected top functions %    5.1%       0%         2.2%
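The percentage rows appear to be taken relative to the total number of entities (3’157’947). That denominator is an inference from the numbers, not something the slides state; the short check below recomputes the ratios under that assumption.

```python
ENTITIES = 3_157_947  # entities (functions and fields) in Oracle E-Business Suite V11.5

# patch number -> (affected functions, affected top functions), as reported in the table
affected = {
    "5565583": (699_534, 160_800),
    "10107418": (0, 0),
    "14321241": (230_209, 69_971),
}


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


for patch, (functions, top_functions) in affected.items():
    print(
        f"patch {patch}: "
        f"{pct(functions, ENTITIES):.1f}% affected functions, "
        f"{pct(top_functions, ENTITIES):.1f}% affected top functions"
    )
```

Under this assumption the recomputed values round to the figures in the table (22%, 7.3%, 5.1%, and 2.2%).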
[Figure: timeline of analysis running time in hours, with axis marks at 0, 16.3, 66.3, and 88.3; phases shown: Static Analysis, Dynamic Analysis, Reachability and Alias Analysis]
Summary
Achievements
We have developed a multi-tasking, aspect-oriented instrumentor to adequately instrument large-scale systems and collect traces at bytecode level.
We have successfully combined static analysis and dynamic analysis. Static analysis was used as the input to dynamic analysis, providing a safety guarantee whenever full potential impacts are needed.
We have empirically demonstrated the practical applicability of the improved approach on a very large enterprise system involving hundreds of thousands of classes. Such systems are perhaps 2 orders of magnitude larger than the systems analyzed by other approaches.
Summary
Future Work
Running time still needs to be improved;
The impacts identified by dynamic analysis were only a small portion of the static impacts (0.015%), though they were executed hundreds of thousands of times.
Bibliography
Bohner, S. A. (1996). Software change impact analysis. In Proceedings of the 27th Annual NASA Goddard/IEEE Software Engineering Workshop (SEW-27’02).
Chen, W., Iqbal, A., Abdrakhmanov, A., Parlar, J., George, C., Lawford, M., . . . Wassyng, A. (2013). Large-scale enterprise systems: Changes and impacts. In Enterprise information systems (Vol. 141, p. 274-290). Springer Berlin Heidelberg.
Chen, W., Wassyng, A., & Maibaum, T. (2014). Impact analysis via reachability and alias analysis. In U. Frank, P. Loucopoulos, Ó. Pastor, & I. Petrounias (Eds.), The practice of enterprise modeling (Vol. 197, p. 261-270). Springer Berlin Heidelberg. Retrieved from http://dx.doi.org/10.1007/978-3-662-45501-2_19 doi: 10.1007/978-3-662-45501-2_19