Conclusions
8.1 Tool Architecture
8.1.1 Language Model
Since reverse engineering techniques span a wide spectrum, depending on the kind of high-level information being recovered, it is quite important to design a general language model that supports all of the alternative algorithms. In turn, each algorithm may have an internal representation of the source code, different from the language model itself. However, the main requirement on the language model is that all the information necessary for the reverse engineering algorithms to work and (possibly) build their own internal data structures must be available in the language model. Thus, the language model plays a critical, central role in the architecture described above and should be designed very carefully. An example of such a model is given in Fig. 8.2 for the Java language. Only the most important entities are shown (for space reasons), with no indication of their properties.
A Java source file contains the definition of classes within a name space called package. In turn, packages can be nested. Thus, the topmost entity in the language model for Java (see Fig. 8.2, left) is the package, and a self-containment relationship on the package entity represents nesting. Ultimately, packages contain classes (containment from package to class in Fig. 8.2). The main property of the entity package (not shown in Fig. 8.2) is its name, which uniquely identifies it.
Fig. 8.2. Simplified Java language model. Containment and inheritance relationships are shown.
The properties of the entity class include its name and visibility, as well as its superclass, implemented interfaces, etc. The entities contained in turn inside classes are the class members. Thus, the entity class is connected to the entity attribute and to the entity method. Moreover, classes can be nested inside other classes. This is the reason for the self-containment outgoing from the entity class.
The entity attribute has properties such as name, type, visibility, initializer, etc. Similarly, the entity method has properties such as name, formal parameters, return type, visibility, etc. The body of each method is represented as a sequence of statements in the language model (containment from method to statement labeled body in Fig. 8.2).
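The containment relationships described so far can be sketched as plain Java classes. This is only an illustrative sketch: the class and field names (PackageEntity, ClassEntity, AttributeEntity, MethodEntity, StatementEntity, countClasses) are assumptions introduced here, not names taken from the tool, and properties are reduced to strings for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// All class and field names below are hypothetical; only the entities,
// their properties, and the containment relationships come from the text.

/** Package entity: identified by its name; may nest packages and contain classes. */
class PackageEntity {
    final String name;
    final List<PackageEntity> subPackages = new ArrayList<>(); // self-containment (nesting)
    final List<ClassEntity> classes = new ArrayList<>();       // containment package -> class

    PackageEntity(String name) { this.name = name; }

    /** Counts the classes reachable through package nesting and class nesting. */
    int countClasses() {
        int total = 0;
        for (ClassEntity c : classes) total += c.countClasses();
        for (PackageEntity p : subPackages) total += p.countClasses();
        return total;
    }
}

/** Class entity: holds its members and any nested classes. */
class ClassEntity {
    final String name;
    String visibility;                                          // e.g. "public"
    String superclass;                                          // null when none is declared
    final List<String> interfaces = new ArrayList<>();          // implemented interfaces
    final List<ClassEntity> nestedClasses = new ArrayList<>();  // self-containment
    final List<AttributeEntity> attributes = new ArrayList<>();
    final List<MethodEntity> methods = new ArrayList<>();

    ClassEntity(String name) { this.name = name; }

    /** Counts this class plus all classes nested inside it. */
    int countClasses() {
        int total = 1;
        for (ClassEntity c : nestedClasses) total += c.countClasses();
        return total;
    }
}

/** Attribute entity: name, type, visibility, and optional initializer. */
class AttributeEntity {
    final String name;
    final String type;
    String visibility;
    String initializer; // source text of the initializer, null when absent

    AttributeEntity(String name, String type) { this.name = name; this.type = type; }
}

/** Method entity: signature properties plus the body as a statement sequence. */
class MethodEntity {
    final String name;
    final String returnType;
    String visibility;
    final List<String> formalParameters = new ArrayList<>();
    final List<StatementEntity> body = new ArrayList<>();       // containment labeled "body"

    MethodEntity(String name, String returnType) { this.name = name; this.returnType = returnType; }
}

/** Placeholder for the statement abstraction; its subtypes are not modeled here. */
class StatementEntity {
    final String text;

    StatementEntity(String text) { this.text = text; }
}
```

A traversal such as countClasses illustrates why the two self-containment relationships matter: an analysis that visits only the classes directly contained in a package would miss nested packages and inner classes.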
Statements can be of different types. Some of them are enumerated in Fig. 8.2, connected to their abstraction statement by an inheritance relationship. Conditional statements are used for constructs such as if and switch. Among their properties, they hold a reference to the expression entity used in the tested condition (not shown in Fig. 8.2). The if conditional statement has a then-part and an else-part, which are in turn sequences of statements (similarly to the body of a method). The switch statement is associated with a sequence of cases, each containing the respective statements to execute.
Loop statements include while, for and do-while loops. Their main properties are the tested condition (an expression entity, not shown in Fig. 8.2) and the loop body (a sequence of statements). For loops also have an initializer and an increment part.
Assignment statements have two main components, the left-hand side and the right-hand side. While the latter is a generic expression, the former must ultimately reference a location. This is achieved by constraining it to a unary expression, instead of a generic expression.
Call statements involve a dereference chain (primary expression), eventually leading to the object which is the target of the invocation. Other important properties are the name of the called method, the actual parameter list (a list of expressions), and links toward all type-compatible methods in the language model. In the case of an invocation of a library method, the call is marked as a library call.
When the control flow inside a method is interrupted to return a value to the caller, a return statement is encountered. The main property of this entity is the expression that defines the returned value.
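The statement kinds described above might be modeled along the following lines. This is a minimal sketch under stated assumptions: all class names (Expr, UnaryExpr, Stmt, IfStmt, LoopStmt, AssignStmt, CallStmt, ReturnStmt) and the size helper are hypothetical, expressions are reduced to their source text, the links to type-compatible methods are stood in for by method names, and the switch statement is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names; only the statement kinds and their properties come from the text.

/** Placeholder for the expression entity (its sub-hierarchy is not modeled). */
class Expr {
    final String text;
    Expr(String text) { this.text = text; }
}

/** Unary expression: the only kind allowed as the left-hand side of an assignment. */
class UnaryExpr extends Expr {
    UnaryExpr(String text) { super(text); }
}

/** Abstraction of all statement kinds. */
abstract class Stmt {
    /** Number of statements in this subtree, this one included. */
    int size() { return 1; }

    static int sizeOf(List<Stmt> stmts) {
        int n = 0;
        for (Stmt s : stmts) n += s.size();
        return n;
    }
}

/** if statement: tested condition, then-part and else-part. */
class IfStmt extends Stmt {
    final Expr condition;
    final List<Stmt> thenPart = new ArrayList<>();
    final List<Stmt> elsePart = new ArrayList<>();
    IfStmt(Expr condition) { this.condition = condition; }
    @Override int size() { return 1 + sizeOf(thenPart) + sizeOf(elsePart); }
}

/** while, for and do-while loops: tested condition and body. */
class LoopStmt extends Stmt {
    enum Kind { WHILE, FOR, DO_WHILE }
    final Kind kind;
    final Expr condition;
    final List<Stmt> body = new ArrayList<>();
    Expr initializer; // for loops only, null otherwise
    Expr increment;   // for loops only, null otherwise
    LoopStmt(Kind kind, Expr condition) { this.kind = kind; this.condition = condition; }
    @Override int size() { return 1 + sizeOf(body); }
}

/** Assignment: the type of lhs enforces that it references a location. */
class AssignStmt extends Stmt {
    final UnaryExpr lhs;
    final Expr rhs;
    AssignStmt(UnaryExpr lhs, Expr rhs) { this.lhs = lhs; this.rhs = rhs; }
}

/** Method invocation: target chain, method name, actuals, candidate targets. */
class CallStmt extends Stmt {
    final Expr target;                                           // primary expression
    final String methodName;
    final List<Expr> actuals = new ArrayList<>();
    final List<String> compatibleMethods = new ArrayList<>();    // stand-in for model links
    boolean libraryCall;                                         // true for library methods
    CallStmt(Expr target, String methodName) { this.target = target; this.methodName = methodName; }
}

/** return statement: the expression defining the returned value. */
class ReturnStmt extends Stmt {
    final Expr value;
    ReturnStmt(Expr value) { this.value = value; }
}
```

Declaring lhs as UnaryExpr rather than Expr mirrors the constraint stated in the text: the restriction is expressed in the model's types, so no separate check is needed when an assignment entity is built.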
Among the entities and relationships not shown in Fig. 8.2 for space reasons, the most important one is the entity expression, accounting for all mathematical expressions supported by the language, possibly intermixed with method invocations. The sub-hierarchy of the expression entities closely resembles that available in most programming languages (either procedural or Object Oriented).
The information represented according to the model in Fig. 8.2 is sufficient to build the OFG for a given source code, as well as to conduct all other analyses that do not depend on the OFG and have been described in the previous chapters. Thus, it can be used as the basic representation exploited by all reverse engineering techniques implemented in the Reverse Engineering module.
8.2 The eLib Program
The change request for the eLib program, anticipated in Section 1.2, is reconsidered now that several design views have been recovered from the eLib code and are available for inspection.
In summary, the modification to be implemented involves the following issues:
The program should support the reservation of books not available for loan (i.e., borrowed).
A document can be reserved by a user if it is currently borrowed by another user and if no other user has already reserved it (one reservation per document only).
Permission to reserve a document follows the same policy used for the loans: only users that are authorized to borrow a given document can reserve it when it is out.
When a reserved document is returned to the library, only the user who made the reservation can borrow it.
Reservations can be cleared at any time (both before and after a document is returned).
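The rules listed above can be made concrete with a small sketch. It is not the eLib code: the classes User and Document below are simplified stand-ins, the method names are hypothetical, and the loan policy (an internalOnly flag on documents) is an assumption chosen only to make the example self-contained. The sketch also assumes that a reservation is consumed when the reserving user borrows the document.

```java
// Simplified stand-ins, not the eLib classes; all names and the policy are assumptions.

class User {
    final String name;
    final boolean internal; // hypothetical attribute used by the stand-in loan policy

    User(String name, boolean internal) { this.name = name; this.internal = internal; }
}

class Document {
    final String title;
    final boolean internalOnly; // stand-in loan policy: restricted to internal users
    User borrower;              // null when the document is in the library
    User reservedBy;            // null when there is no reservation

    Document(String title, boolean internalOnly) {
        this.title = title;
        this.internalOnly = internalOnly;
    }

    /** Same authorization check for loans and reservations. */
    boolean authorizedLoan(User u) { return !internalOnly || u.internal; }

    /** A reserved document can be borrowed only by the user who reserved it. */
    boolean borrow(User u) {
        if (borrower != null || !authorizedLoan(u)) return false;
        if (reservedBy != null && reservedBy != u) return false;
        borrower = u;
        reservedBy = null; // assumption: borrowing consumes the reservation
        return true;
    }

    /** Allowed only if borrowed by another user, not yet reserved, and authorized. */
    boolean reserve(User u) {
        if (borrower == null || borrower == u) return false;
        if (reservedBy != null) return false;
        if (!authorizedLoan(u)) return false;
        reservedBy = u;
        return true;
    }

    /** Returning the document leaves any reservation in place. */
    void returnDocument() { borrower = null; }

    /** Reservations can be cleared before or after the document is returned. */
    void clearReservation() { reservedBy = null; }
}
```

Keeping the reservation as a single reference on the document is one way to enforce the rule of one reservation per document; where this state would actually live in eLib is a design decision that the recovered diagrams are meant to inform.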
The design diagrams extracted from the code in the previous chapters are used to locate the code portions to be changed and to define the approach to implement the change, at a high level. Then, design diagrams are recovered from the new system, to assess the portions of the system actually impacted by the change. These are expected to be the main target of the testing activity to be conducted before releasing the new version of the program.