The notion of software evolution is mainly associated with a specific software type known as E-Type [111]. This section describes E-Type software and how it is distinguished from other software types.
Software evolution has often been viewed as a property of large systems, where a system is considered large if it contains more than some arbitrary number of source code lines. Lehman [113] criticised this view for its arbitrariness and argued that large systems should instead be identified by the ways in which they are designed, developed and maintained. To address this concern, he proposed a new classification [114], based on the realisation that there is a fundamental distinction between the evolution of systems that are implemented from a formal specification and that of systems developed to be part of day-to-day activities. He proposed three software types¹ as follows:
S-Type Programs: Lehman defined a program as Type S “if it can be shown that it satisfies the necessary and sufficient condition that it is correct in the full mathematical sense relative to a pre-stated formal specification” [111]. This definition assumes that the problem (requirement) can be formally specified prior to the implementation, that the problem can be solved using an algorithmic method, and that it is feasible to prove that the program is correct against the formal specification. These assumptions limit the domain of S-Type programs to mathematical applications or formally defined transformations such as compilers.
¹ Lehman [111] uses program and software interchangeably with more emphasis on programs. In this section,
E-Type Programs: Lehman initially defined a program as Type E if it mechanises a human or societal activity [114]. This definition was subsequently amended to “all programs that operate in or address a problem or activity of the real world” [111].
A characteristic of E-Type programs is their integration into a domain. Changes in that domain raise new requirements for these programs and necessitate their evolution with respect to their environment. Hence, software evolution is a direct consequence of the nature of E-Type programs, and one cannot expect them to remain static.
P-Type Programs: Lehman defined this class as intermediate between S-Type and E-Type [111]. The programs in this class address problems that can be fully specified, but their users are concerned with the execution results rather than with validating the implementation against its specification.
An example of this type is a program that plays chess. The rules of the game can be fully specified; however, the decision tree at any given stage of the game is too large to be searched exhaustively by a personal computer. Hence, the program must approximate a good decision within the limited resources available. A chess program is valued for its performance, not for its validation against the specification.
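The trade-off in the chess example can be illustrated with a minimal sketch of depth-limited game-tree search. The sketch does not use chess; it assumes a much smaller stand-in game (players alternately remove one to three stones from a pile, and whoever takes the last stone wins). The names `legal_moves`, `heuristic`, `negamax` and `choose_move` are hypothetical and are introduced here only for illustration. The rules are fully specified, yet when the search depth is capped the program falls back on a heuristic guess rather than an exact answer.

```python
def legal_moves(pile: int) -> list[int]:
    """Moves allowed by the (fully specified) rules of the toy game."""
    return [take for take in (1, 2, 3) if take <= pile]


def heuristic(pile: int) -> float:
    """Placeholder estimate used at the depth cutoff (assumption: 'unknown')."""
    return 0.0


def negamax(pile: int, depth: int) -> float:
    """Score for the side to move: +1 win, -1 loss, heuristic at the cutoff."""
    if pile == 0:
        # The opponent took the last stone, so the side to move has lost.
        return -1.0
    if depth == 0:
        # Resources exhausted: approximate instead of searching further.
        return heuristic(pile)
    return max(-negamax(pile - take, depth - 1) for take in legal_moves(pile))


def choose_move(pile: int, depth: int) -> int:
    """Pick the move whose resulting position is worst for the opponent."""
    return max(legal_moves(pile),
               key=lambda take: -negamax(pile - take, depth - 1))
```

With a sufficient depth, `negamax(8, 8)` reports a lost position exactly, whereas `negamax(8, 1)` can only return the heuristic value. A program of this kind would be judged by how well it plays under its depth budget, not by a proof against the rules.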
Cook et al. refined this classification with an emphasis on the role of stakeholders in the evolution of system requirements [41]. Their classification is derived from Kuhn’s concept of normal science [105] and the concept of paradigm [132]. Kuhn explains that the development of scientific knowledge consists of successive periods of what he called “normal science”, each of which takes place within a paradigm [41]. In this view, a paradigmatic domain contains a stable and well-structured body of knowledge. This implies that an analyst must use methodological hermeneutics² and the baseline model of the domain to validate the requirements. In contrast, non-paradigmatic domains lack such a rigid knowledge structure, and consequently the requirements are open to objective interpretation.
Cook et al. [41] argue that E-Type programs are situated in non-paradigmatic domains. In such domains, sources of domain knowledge are less extensive and less reliable, and validation of requirements often relies wholly on the interpretation of stakeholders’ statements. This implies that stakeholders can define and redefine problems without any paradigmatic constraints, and the scope of the system is open to reinterpretation.
P-Type has been redefined as “Paradigm-based” programs, which address problems in paradigmatic domains [41]. The evolution of these programs is restricted to changes in their paradigms, and changes to the system are constrained by the stakeholders’ decision to keep the system consistent with the domain.
S-Type programs are somewhat different. Cook et al. [41] argue that these programs do not evolve. Once the requirements are specified, these programs should detach from the paradigm, and they will no longer be affected by changes in their domain.
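The distinction drawn by the refined classification can be summarised in a small sketch. This is a deliberate simplification and not part of Cook et al.'s work: the names `ProgramType`, `Change` and `triggers_evolution` are hypothetical, and the two kinds of change are assumed only for illustration. The sketch encodes which kinds of domain change, under the description above, would be expected to drive the evolution of each program type.

```python
from enum import Enum


class ProgramType(Enum):
    S = "S"  # implemented from a formal specification
    P = "P"  # paradigm-based
    E = "E"  # situated in a non-paradigmatic, real-world domain


class Change(Enum):
    # Hypothetical change kinds, assumed for this illustration only.
    PARADIGM_CHANGE = "paradigm change"
    STAKEHOLDER_REINTERPRETATION = "stakeholder reinterpretation"


def triggers_evolution(ptype: ProgramType, change: Change) -> bool:
    """Return True if the change is expected to drive evolution of the program."""
    if ptype is ProgramType.S:
        # Once specified, S-Type programs detach from their domain.
        return False
    if ptype is ProgramType.P:
        # P-Type evolution is restricted to changes in the paradigm.
        return change is Change.PARADIGM_CHANGE
    # E-Type: any change in the domain can raise new requirements
    # (simplifying assumption of this sketch).
    return True
```

For example, `triggers_evolution(ProgramType.P, Change.STAKEHOLDER_REINTERPRETATION)` is false in this sketch, while the same change applied to an E-Type program is true.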
This thesis does not consider S-Type and P-Type programs; it focuses only on E-Type programs, where the lack of a rigorous specification makes the validation of software changes a challenge for software maintainers. In these systems, the implemented program is often the only actual model that can provide reliable information about the potential impact of software changes. The uncertainty in requirements for E-Type systems makes them the default case of software evolution.
The laws of software evolution and how they apply to E-Type systems are discussed further in the following section.