7.4 Program comprehension and tools
7.4.1 Static analysis techniques
Static analysis techniques commonly involve the parsing of the application's source code to generate a wide range of data, including call graphs, data and control flows, structure charts, and cross-reference information. These techniques can also be used to assist in code improvement activities such as pretty printing and anomaly detection, and test path, impact, and complexity analysis.
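As a rough illustration of how parsing yields this kind of data, the following minimal sketch derives a call graph and a cross-reference listing from a small program using Python's standard `ast` module. The sample program and the helper names `build_call_graph` and `build_cross_reference` are assumptions made for illustration; they are not taken from any of the tools discussed in this section.

```python
import ast
from collections import defaultdict

# Hypothetical program to analyze; any parsable Python source would do.
SOURCE = '''
def read_input(path):
    return open(path).read()

def parse(text):
    return text.split()

def main(path):
    words = parse(read_input(path))
    print(len(words))
'''

def build_call_graph(source):
    """Map each top-level function to the sorted names it calls directly.

    Only calls through a plain name are recorded; method calls such as
    text.split() are ignored to keep the sketch short.
    """
    graph = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            callees = set()
            for child in ast.walk(node):
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    callees.add(child.func.id)
            graph[node.name] = sorted(callees)
    return graph

def build_cross_reference(source):
    """Map each identifier to the sorted line numbers where it is used as a name."""
    xref = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            xref[node.id].add(node.lineno)
    return {name: sorted(lines) for name, lines in xref.items()}

call_graph = build_call_graph(SOURCE)
cross_reference = build_cross_reference(SOURCE)
```

Here `call_graph["main"]` lists the functions that `main` invokes, and `cross_reference["words"]` lists the lines on which the variable `words` appears. A production tool would also have to resolve method calls, imports, and indirect calls, which this sketch deliberately leaves out.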
Commercial static analysis tools include Verilog's Logiscope and McCabe and Associates' ACT, BAT, and CodeBreaker.
Most commercially available maintenance tools provide a subset of the mentioned capabilities.
The offerings vary not only in the specific user interface and features provided, but also according to the internal representation used for information derived from the source code.
In general, the range of functions performed by the tool is related to the internal data representation utilized; the more functions provided, the more complex the underlying data storage mechanism. Tools utilizing simpler databases commonly perform fewer functions and provide a lesser capability for navigating between the views supported by the various static analysis functions.
Some tools that offer static analysis techniques also perform source code editing and debugging, and store information derived from parsing in a complex repository. These tools are often marketed as maintenance workbenches and generally provide a wide variety of capabilities. In addition, they provide sophisticated browsing capabilities that allow the programmer to switch between the various views provided by the tool. Maintenance workbenches are considered elsewhere.
7.4.1.1 Support for program comprehension
The value of the individual techniques (e.g., call graphs, data flows, and cross-references) based on static parsing of the code has not been studied extensively by researchers. The few such studies that were undertaken have demonstrated improved program comprehension through use of particular static analysis techniques. These studies tend to be carried out using small code samples that are not representative of the larger systems commonly being maintained. For example, [Ryder 79,87] found improved program comprehension by professional programmers working with a relatively small code segment. This improved understanding was attributed to use of statically generated call graphs.
CHAPTER 7. A MAINTENANCE PERSPECTIVE
Many of the research efforts in this area assume that the various capabilities are of value, and focus on the development of enhanced support. Little experimental verification of this original assumption of value is available. Other research efforts, such as [Linos 94], provide a "grab bag" of capabilities and then measure the effect on maintenance. These studies commonly show improvement in program comprehension among both experts and novices when presented with this "grab bag," but there is little evidence to suggest which capabilities are particularly useful.
The majority of these studies suffer from what can be considered a severe methodological flaw. Commonly, the studies measure the performance of maintainers using the "favored" tools against the performance of maintainers without tool support. However, this is an unrealistic situation. Maintainers commonly have support available from simple tools like string search engines (e.g., "grep") and compiler or editor-centered cross-referencers. No studies could be found where researchers compared the performance of experts (or novices) using "specialized" support to a similar group using common tools and methods.
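To make the two kinds of support concrete, the following sketch contrasts a grep-style whole-word search with a parse-aware lookup of call sites on the same small program. The sample program and the function names `text_search` and `call_sites` are hypothetical and serve only as an illustration. The sketch shows how the outputs differ, not which one helps maintainers more; that comparison is exactly what the studies above leave open.

```python
import ast
import re

# Hypothetical program in which the word "total" appears in several roles.
SOURCE = '''
def total(items):
    # total is recomputed on every call
    return sum(items)

def report(items):
    print("total:", total(items))
'''

def text_search(source, word):
    """grep-like baseline: line numbers whose text contains `word` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if pattern.search(line)
    ]

def call_sites(source, name):
    """Parse-aware lookup: line numbers where `name` is called through a plain name."""
    return sorted(
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    )

text_hits = text_search(SOURCE, "total")
call_hits = call_sites(SOURCE, "total")
```

The text search reports the definition, the comment, and the line containing both the string literal and the call, while the parse-aware lookup reports only the line with the actual call. Whether that extra precision outweighs the simplicity and speed of the text search for a given maintainer remains an empirical question.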
In spite of the lack of supporting data, it is the impression of this author that information garnered from static analysis (regardless of whether that information is provided by simple or complex tools) can provide useful support for facilitating program comprehension. General benefits derived from the data produced by static analysis tools may include:
direct support for constructing the program model;
focusing of maintainer attention on potential \beacons" within the source code;
indirect support for constructing situation and top-down models.
Static analysis tools may be particularly important in assisting inexperienced maintainers or maintainers unfamiliar with the software to develop a program model of the system. Many researchers have suggested that the program model is the first model developed when a maintainer is presented with an unfamiliar system. Static analysis tools are likely to assist in construction of the program model.
Static analysis tools can help expose underlying structure, algorithms, and significant data items by highlighting these items while hiding program detail. This may in turn decrease the memory demands placed on maintainers, and increase the likelihood that higher level structures, algorithms (programming plans), and domain constructs will be recognized within the code.
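One simple way to highlight structure while hiding detail is an outline view that keeps each routine's name, parameters, and summary line and drops its body. The sketch below is a minimal example of that idea built on Python's `ast` module; the sample program and the `outline` helper are assumptions for illustration only, not a description of any particular tool.

```python
import ast

# Hypothetical program whose function bodies contain detail the outline hides.
SOURCE = '''
def load(path):
    """Read the raw records from disk."""
    with open(path) as handle:
        return [line.strip() for line in handle]

def bubble_sort(values):
    """Sort values in place."""
    for i in range(len(values)):
        for j in range(len(values) - 1 - i):
            if values[j] > values[j + 1]:
                values[j], values[j + 1] = values[j + 1], values[j]
    return values
'''

def outline(source):
    """Return one summary line per top-level function, omitting the bodies."""
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            params = ", ".join(arg.arg for arg in node.args.args)
            doc = ast.get_docstring(node) or ""
            summary = doc.splitlines()[0] if doc else "(no docstring)"
            lines.append(f"{node.name}({params}): {summary}")
    return lines

summary_lines = outline(SOURCE)
```

The thirteen lines of source collapse to two summary lines, one per routine, which is the kind of reduction in detail the paragraph above describes.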
However, it is possible that experienced maintainers benefit less from the wide range of functions provided by static analysis of source code. [von Mayrhauser 93] suggests that a top-down approach is primarily active when the code or type of code (the domain) is familiar. If this is the case, we would expect experienced maintainers to be less interested in the algorithmic and data flow detail provided by static analysis tools.
[Joiner, Tsai, Chen, Subramanian, Sun, and Gandamaneni, 94] provide indirect support for this view. The authors interviewed experts from various organizations maintaining millions of lines of source code. The experts indicated that they preferred that a tool provide coarse highlighting of relevant program units rather than detailed data and control flow information. Even when statement-level highlighting of control flow was available, the experts preferred to manually inspect the code within suspect modules in order to understand the context of highlighted statements or variables.
Thus, while static analysis information may be important to both experts and novices, experts may prefer less-detailed models over more detailed models. This suggests that experts may find little value in the fine-grained capabilities provided by static analysis tools.
Static analysis tools may also assist the maintainer by highlighting particularly important variables, routines, and algorithms within the source code. These "beacons" can identify common tasks (such as a pattern of data indicating a sorting algorithm). They can provide important information about the maintainer's currently active hypotheses, or lead to the formation of new hypotheses.
Beacons can also help the maintainer in switching between the program, situation, and top-down (domain) models by highlighting identifiers and other program constructs that serve to "jog" the maintainer's memory.
In general, the value of a particular tool in highlighting beacons that assist the maintainer will depend on two closely related factors:
1. the ability of the tool to distinguish information that can potentially serve as beacons;
2. the program, situation, and domain knowledge of the maintainer.
The view expressed by a specific tool function (e.g., control flow or data flow) is in itself one attempt to isolate critical characteristics (or beacons) in the program. However, for even moderate-sized applications, the volume of information displayed by these tool functions can be quite large.
Vendors have recognized this problem and are attempting to overcome it by providing user control over the focus of information displayed by the tool.
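One plausible form of such user control is to restrict a call graph to the routines within a chosen number of calls of a routine the maintainer selects. The sketch below assumes the call graph is already available as a dictionary from caller to callees; the graph contents and the `focus` function are hypothetical and are not taken from any vendor's product.

```python
from collections import deque

# Hypothetical call graph: each caller maps to the routines it calls.
CALL_GRAPH = {
    "main": ["load_config", "run"],
    "run": ["read_input", "process", "write_output"],
    "process": ["validate", "transform"],
    "transform": ["normalize"],
    "load_config": [],
    "read_input": [],
    "write_output": [],
    "validate": [],
    "normalize": [],
}

def focus(graph, root, max_depth):
    """Return the subgraph reachable from `root` in at most `max_depth` calls."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        caller = queue.popleft()
        if depth[caller] == max_depth:
            continue  # do not expand beyond the requested depth
        for callee in graph.get(caller, []):
            if callee not in depth:
                depth[callee] = depth[caller] + 1
                queue.append(callee)
    # Keep only edges whose both ends fall inside the focused region.
    return {name: [c for c in graph.get(name, []) if c in depth] for name in depth}

focused = focus(CALL_GRAPH, "run", 1)
```

With a depth of 1 around `run`, the view shrinks from nine routines to four: `run` and its three direct callees. Raising the depth widens the view again, so the maintainer rather than the tool decides how much information is displayed.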
The significance of the hypothesis-testing strategy employed when comprehending source code is hard to overstate. Regardless of the tool's ability to focus on specific information, the impact of the user's current knowledge is overwhelming. The particular information that will serve as a beacon to a given maintainer in a given situation depends on program, domain, and situation knowledge, the preferred style of the maintainer, and the software under study.
The implication for tool support is that no tool or technique will be correct for all situations and maintainers. The best users can hope for is a tool that is sufficiently flexible to support ad hoc queries that address some of their needs. Alternatively, they can pursue special-purpose tools that perform a single function well. Thus, tools like performance profilers and memory leak detectors, which provide only a narrow range of support addressing a specific need, can be extremely valuable in certain circumstances.