Chapter 3 Security Driven Software Evolution Approach
3.2 Framework of SEMDA Approach
3.2.1 Legacy System Understanding for Security
Evolving a legacy system is difficult, and problems often arise because the original design information is incomplete or missing. Successful evolution of a legacy system depends on a proper comprehension of its functionality, context and architecture. To this end, software reverse engineering techniques are usually used to extract high-level diagrams from source code.
3.2.1.1 Choices of Models
The first step of the proposed framework is to extract higher-level models of the legacy system for use in the later security-based transformation. A legacy system has static and dynamic characteristics, which represent its structure and its behaviour respectively. Modelling a legacy system concentrates on reflecting and comprehending the system at a higher level of abstraction. The main purpose is to understand the structure of the target legacy system and its main tasks.
UML has proved to be a good platform for modelling real systems. When a legacy system is modelled with UML, the information in the legacy system is refined by means of UML diagrams. Extracting UML diagrams from legacy code provides a UML-based analysis platform, so that the legacy system can be comprehended through a general-purpose modelling language.
To evolve an existing legacy system, both static and dynamic information are useful. Static information describes the structure of the software, while dynamic information specifies its runtime behaviour. Together, static and dynamic analysis of the legacy system reveal its software artefacts and the relationships between them.
UML supports both static and dynamic modelling, and therefore satisfies the needs of software evolution. UML also presents a visual description of the system, which makes the process of software evolution easier to follow. In addition, a large number of tools support the transformation from UML diagrams to code, and UML facilitates reuse during software evolution, which is helpful for the forward engineering part of reengineering.
3.2.1.2 UML Model Extraction
Software evolution is supported by producing design models from the legacy software. The approach is useful for abstracting legacy software into high-level information. The extracted models are used to obtain an overall picture of the current state of the legacy software. The dynamic models support tasks such as understanding the current behaviour of the legacy software.
As the first step of the whole framework, the task of this phase is to extract UML diagrams from legacy source code. When a legacy system is reengineered, source code is regarded as the most reliable artefact to modify in order to satisfy the new requirements. Traditional reengineering methods rely on software representation techniques such as data and control flow diagrams, the Abstract Syntax Tree (AST), or UML class diagrams to represent the various software aspects and the interrelationships between them.
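As an illustration of AST-based static extraction, the following minimal sketch collects the information a class diagram needs (class names, superclasses and operations) from source text. It is only a sketch under stated assumptions: the legacy code is assumed to be Python so that the standard `ast` module can be used, and the function name `extract_class_model` and the sample classes are hypothetical. It does not represent the extraction toolset used in this thesis.

```python
import ast


def extract_class_model(source: str) -> dict:
    """Return {class_name: {"bases": [...], "methods": [...]}} for a source string.

    Hypothetical helper: it recovers only the class-diagram basics
    (generalisation and operations), not attributes or associations.
    """
    tree = ast.parse(source)
    model = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            # ast.unparse (Python 3.9+) turns a base-class expression back into text.
            bases = [ast.unparse(base) for base in node.bases]
            methods = [
                item.name
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
            model[node.name] = {"bases": bases, "methods": methods}
    return model


# Hypothetical legacy fragment used only for demonstration.
LEGACY_SOURCE = """
class Account:
    def login(self):
        pass

class Admin(Account):
    def audit(self):
        pass
"""

if __name__ == "__main__":
    for name, info in extract_class_model(LEGACY_SOURCE).items():
        print(name, info)
```

Each entry of the returned dictionary corresponds to one class box in a class diagram, and each base name corresponds to a generalisation arrow.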
UML extraction from legacy systems has two major stages: structural and behavioural.
• The structural stage covers the extraction of UML structural (static) diagrams. UML 2.0 uses six diagrams to model the static parts of a software system: the class diagram, object diagram, component diagram, deployment diagram, package diagram and composite structure diagram.
• The behavioural stage covers the extraction of UML behavioural (dynamic) diagrams. UML 2.0 uses seven diagrams to model the dynamic parts of a software system: the communication diagram, timing diagram, use case diagram, sequence diagram, activity diagram, state machine diagram and interaction overview diagram.
In this thesis, the class diagram and the sequence diagram are chosen to represent the static model and the dynamic model of the legacy system respectively, and both are extracted with an existing extraction toolset.
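For the dynamic model, a sequence diagram can be viewed as an ordered list of caller-to-callee messages observed at run time. The sketch below records such a list with Python's standard `sys.setprofile` hook. It is an assumption-laden illustration, not the extraction toolset referred to above: the target system is assumed to be executable Python, and `trace_calls`, `login` and `validate` are hypothetical names.

```python
import sys


def trace_calls(func, *args):
    """Run func(*args) and return the observed (caller, callee) pairs in call order."""
    calls = []

    def profiler(frame, event, arg):
        # "call" fires for Python-level function calls only;
        # calls into C built-ins arrive as "c_call" and are ignored here.
        if event == "call":
            caller = frame.f_back.f_code.co_name if frame.f_back else "<top>"
            calls.append((caller, frame.f_code.co_name))

    sys.setprofile(profiler)
    try:
        func(*args)
    finally:
        # Always remove the hook, even if func raises.
        sys.setprofile(None)
    return calls


# Hypothetical legacy behaviour used only for demonstration.
def validate(user):
    return bool(user)


def login(user):
    return validate(user)


if __name__ == "__main__":
    for caller, callee in trace_calls(login, "bob"):
        print(f"{caller} -> {callee}")
```

Each recorded pair corresponds to one message arrow in a sequence diagram. A fuller extractor would also record the receiving object and the return events.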
3.2.1.3 UML Model Slicing
As products grow in size and complexity, UML models extracted from the source code are likely to become large and complex. Hundreds of objects may be involved in thousands of interactions, which makes models extracted from such a large architecture hard to read and understand. On the one hand, such models tend to be tedious and poorly readable; on the other hand, it is valuable to be able to judge the impact that a change to one model element has on other parts of the model [96].
Different UML diagrams represent different views of the system. For a better understanding of the extracted UML diagrams, especially for security-related analysis, it is necessary to slice the diagrams. Unlike program slicing, model slicing reduces the view of the legacy system at the model level.
To slice the UML models, the different dependency relationships among classes need to be taken into consideration: relation dependency, operation dependency, control dependency, data dependency, call dependency and message dependency. An intermediate representation of the diagrams is constructed from the defined class dependency relationships and is then sliced according to the slicing algorithm and appropriate slicing criteria.
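The idea of slicing an intermediate dependency representation can be sketched as a reachability computation over a graph whose edges are labelled with a dependency type. This is a generic forward slice shown for illustration, not the slicing algorithm of this thesis. The class names, the edge labels chosen and the function names `build_graph` and `slice_model` are assumptions.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an adjacency list from (source, target, dependency_kind) triples."""
    graph = defaultdict(list)
    for src, dst, kind in edges:
        graph[src].append((dst, kind))
    return graph


def slice_model(graph, criterion, kinds=None):
    """Return every class reachable from the slicing criterion.

    kinds optionally restricts the traversal to selected dependency types,
    for example {"call", "data"}; None means all types are followed.
    """
    seen = {criterion}
    queue = deque([criterion])
    while queue:
        node = queue.popleft()
        for dst, kind in graph.get(node, []):
            if (kinds is None or kind in kinds) and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen


# Hypothetical dependencies among extracted classes.
EDGES = [
    ("LoginController", "AuthService", "call"),
    ("AuthService", "UserStore", "data"),
    ("ReportView", "ReportService", "call"),
]

if __name__ == "__main__":
    graph = build_graph(EDGES)
    print(sorted(slice_model(graph, "LoginController")))
```

With `LoginController` as the criterion, the report classes fall outside the slice, so a security analysis of the login path can ignore them.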
3.2.1.4 System Partition
The proposed approach aims to extract reusable legacy components from the underlying legacy system. In this context, a method and a process are needed to partition the existing system into distinct collections of components, each of which potentially implements an object. However, legacy systems are large and are usually packaged systems composed of rich and old structures. To address this problem, a specific decomposition method is developed, based on class dependency analysis and the model slicing technique.
Two main challenges must be addressed: how to determine the degree of cohesion within a cluster, and what kind of intermediate diagram can be used to facilitate the partition. After the dependency relationships among classes and objects in the UML diagrams have been examined, the proposed CSDG is used as the intermediate representation for system partition, and each type of edge in the CSDG is weighted so that it can serve as a parameter in deciding the degree of cohesion. A decomposition algorithm is then proposed to search for highly independent clusters on the basis of the CSDG and an independence metric.
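The role of edge weights in deciding cohesion can be illustrated with a small sketch. The weights, the threshold and the clustering rule (merge two classes when the summed weight of the edges between them reaches the threshold) are all assumptions made for illustration. The sketch is not the decomposition algorithm or the independence metric proposed in this thesis.

```python
from collections import defaultdict

# Hypothetical weight for each dependency type.
EDGE_WEIGHTS = {"relation": 3, "call": 2, "data": 2, "message": 1}


def partition(edges, threshold):
    """Group classes whose pairwise dependency strength is at least threshold.

    edges are (class_a, class_b, dependency_kind) triples. The result is a
    sorted list of clusters, each cluster being a sorted list of class names.
    """
    strength = defaultdict(int)
    nodes = set()
    for a, b, kind in edges:
        nodes.update((a, b))
        strength[frozenset((a, b))] += EDGE_WEIGHTS[kind]

    # Union-find structure over the classes.
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for pair, weight in strength.items():
        if weight >= threshold and len(pair) == 2:  # skip self-dependencies
            a, b = pair
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for n in nodes:
        clusters[find(n)].add(n)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Hypothetical dependency edges.
EXAMPLE_EDGES = [
    ("A", "B", "relation"),
    ("A", "B", "call"),
    ("B", "C", "message"),
    ("C", "D", "relation"),
]

if __name__ == "__main__":
    print(partition(EXAMPLE_EDGES, threshold=3))
```

Raising the threshold yields smaller, more cohesive clusters, while lowering it merges loosely coupled classes into larger candidate components.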
3.2.1.5 Security Mechanisms Detection
The security mechanisms detection process identifies the existing security countermeasures used in the legacy system, so that the security level of the target system can be fully evaluated. A method for detecting security countermeasures in a legacy system is proposed, which is divided into two main phases.
• Extraction phase. Reverse engineering tools are used to extract and gather all relevant information from the legacy system under evaluation.
• Identification phase. This phase inspects each piece of gathered information to determine whether it is relevant to the security artefacts listed in the Security Artefacts Base, which stores a list of possible security issues and is created and maintained by a security expert. The result of the identification phase is a mapping list showing whether there are any security artefacts in the legacy system and which types of security countermeasure they belong to.
After this stage, the system security analyst can evaluate whether the existing security mechanisms are sufficient to meet the user's security objectives, and on that basis decide whether a security evolution is needed.
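The identification phase can be sketched as a lookup of extracted element names against a keyword table. The table below stands in for the Security Artefacts Base, whose real contents and format are not given here, so its categories and keywords are hypothetical, as are the function name `identify` and the sample element names. Simple substring matching is an assumption chosen for brevity and would produce false positives in practice.

```python
# Hypothetical stand-in for the Security Artefacts Base:
# countermeasure type -> keywords that suggest it.
SECURITY_ARTEFACTS_BASE = {
    "authentication": ["login", "authenticate", "password"],
    "encryption": ["encrypt", "decrypt", "cipher"],
    "access control": ["authorize", "permission", "role"],
}


def identify(extracted_names, base=SECURITY_ARTEFACTS_BASE):
    """Map each extracted element name to the countermeasure types it matches.

    An empty list means that no security artefact was recognised for that element.
    """
    mapping = {}
    for name in extracted_names:
        lowered = name.lower()
        mapping[name] = sorted(
            kind
            for kind, keywords in base.items()
            if any(keyword in lowered for keyword in keywords)
        )
    return mapping


# Hypothetical output of the extraction phase.
EXTRACTED = ["UserLogin.checkPassword", "ReportPrinter.print", "Vault.encryptRecord"]

if __name__ == "__main__":
    for element, kinds in identify(EXTRACTED).items():
        print(element, kinds)
```

The resulting dictionary plays the role of the mapping list: it shows which elements carry security artefacts and which countermeasure types they suggest.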