Chapter 1. Introduction
1.1 Motivation: structure and maintenance
This section makes a case for the importance of knowing which source code characteristics affect the longevity of an application. It defines maintenance, explains its importance, describes the difficulties of maintaining an application, and shows how some of these difficulties can be dealt with by using an appropriate structure for the application. Finally, it argues that not only the design but also the source code may help to facilitate maintenance.
This thesis uses the terms software maintenance and software evolution in the same way as [Bennett '00]. Software maintenance is the phase of the software lifecycle when the system is modified after its first delivery [Bennett '00]. The attributes of an application that bear on the effort needed to make modifications are called maintainability [Sanders '95]. Software evolution is the first sub-phase of maintenance, in which the application undergoes major changes like the addition of new functionality or adaptation to new environments [Bennett '00]. To refer to the life of the application (called evolution by Lehman and Belady [Lehman '85]) we will use the term software history. Finally, we refer to the way in which a property changes over time as the evolution of the property.
1.1.1 On the importance of software maintenance
Software systems are expected to have a long life due to their cost. Successful applications fulfill their users' requirements. In order to keep fulfilling those requirements, the application requires modifications to fix bugs, to implement requirements that were not foreseen, to enhance functionality for a changing environment, or to implement requirements that ripened with the use of the system [Lehman '85].
Software maintenance is the longest and most expensive stage of the software lifecycle, which makes it a critical stage. Moreover, the better an application is evolved, that is, the better major changes are integrated so that the design stays intuitive and self-documented, the longer its maintenance phase will be. Consequently, software evolution is a critical phase of software maintenance. In fact, some studies have found that more than half of professional programmers' time is consumed by maintenance tasks [Lientz '81; Singer '98]. Furthermore, more than 40% of maintenance effort is spent on enhancements and extensions [Lientz '81], that is, on the evolution phase.
Therefore, if concrete properties that reduce or increase maintenance effort in the long term are identified, the costs of software development can be significantly reduced. In addition, such properties could provide insights for research on software quality.
1.1.2 Difficulties of achieving long term maintenance
The difficulty of performing maintenance tasks increases with the age of the application and with the number of maintenance tasks performed on it. This section discusses the relation between the age of an application and the difficulty of making changes to it.
1.1.2.1 Key elements to achieve long term maintenance
The evolution phase is facilitated when there is software architecture integrity and software team knowledge, as “they allow the team to make substantial changes without damaging the architectural integrity” [Bennett '00]. A less coherent architecture requires more extensive knowledge in order to evolve it, and a lack of knowledge results in a faster deterioration of the
architecture [Bennett '00]. However, neither the integrity of the architecture nor the knowledge about the system is preserved as the system evolves.
1.1.2.2 Difficulties of maintaining the system’s knowledge
The loss of system knowledge is difficult to avoid for two reasons: first, because documentation is inaccurate, and second, because the programmers' knowledge about the application is not retained.
Since early studies, documentation has been identified as one of the top maintenance problems [Lientz '81]. Inadequate documentation affects maintenance. Basili and Perricone [Basili '84] found that misunderstanding of a module’s specifications or requirements, and mistaken assumptions, were the main factors provoking defects when modifying existing code. Recording design rationale is a complex task. Therefore, documentation tends to have several issues: it is poorly organized [Parnas '94], imprecise [Parnas '94], incomplete [Parnas '94], has uneven coverage, and in some cases is incoherent. Even when documentation is used, it is usually not trusted [Singer '98], so the only, or at least the most reliable, source of information about an application is likely to be its source code.
In addition, as time passes, programmers leave the project, taking with them the domain and system knowledge they have acquired. When maintainers do not know the original design, they might have difficulty agreeing on the system's components and their relations, or grasping how the whole application works [Parnas '94]. This lack of knowledge may be more frequent in organizations that have separate development and maintenance teams, or that have a high turnover in the maintenance team.
1.1.2.3 Difficulties of maintaining the architecture integrity
The loss of architecture integrity seems to be an inevitable result of maintenance. Such loss of architecture integrity makes future maintenance more difficult. Evolving an application requires introducing changes without introducing defects. Nevertheless, accidental defects might be introduced every time the application is changed. Therefore, the cost of a change can grow exponentially with respect to the system's age [Lehman '85]. Moreover, if changes do not comply with the original design contracts, the design degrades [Parnas '94]. An eroded design implies that neither the original programmers nor the maintainers have enough knowledge about the system as a whole [Parnas '94]. Therefore, the system becomes expensive to update because “changes take longer and are more likely to introduce new bugs” [Parnas '94]. Given that these events are directly related to the application’s age, it is said that the software is aging¹ whenever the application becomes incapable of accommodating changing needs [Parnas '94], i.e. when it can only accommodate minor changes. There is some empirical evidence that supports the theory of degradation in the architecture. For instance, changes lose locality over time; this may indicate that the modularization degrades, as it is no longer hiding changes behind abstractions [Eick '01]. Furthermore, the degree of connectivity inside and across software components, and inside and across abstraction layers, also increases over time [Bianchi '01]. Moreover, the minimal path of objects that must be traversed to reach a second object increases over time [Burd '99].
Keeping the architecture integrity over time is known as anti-regressive work [Lehman '85]. The goal of anti-regressive work is to keep the design intuitive and self-documented, so that it remains easy to understand and therefore easy to change. Anti-regressive work is to software what waste collection, recycling, pest control, research in alternative energy sources, etc. are to cities [Lehman '85]. This means that anti-regressive work avoids degradation but does not add, adapt or enhance functionality. It does not improve the perception of the final user, but it improves the perception of the developer in charge of the application, i.e. it improves the maintainability [Lehman '85]. In that sense, although anti-regressive work is not an urgent task, it is very important because it affects the ease of introducing changes and, as a result, the life expectancy of the application. Therefore, it is necessary to allocate time for anti-regressive work and to use that time efficiently.
In summary, software maintenance is the longest and most expensive stage in the software lifecycle. Software maintenance requires changes in the application to keep supporting users’ needs. However, achieving those changes in the long term is difficult because the original developers may no longer work on the application, and because the introduction of changes increases the complexity of the application’s structure, making future changes more difficult. In order to facilitate future changes, it is necessary to do anti-regressive work to keep the complexity stable and the design understandable. The following section discusses in more detail the relation between the structure of the application and its maintainability.
1 Software aging has been used in two ways: (i) Systems incapable of meeting the recent needs of their users due to structural degradation. Such structural degradation is usually due to modifications that did not comply with the original design [Parnas '94]. (ii) Systems that run for long enough that some performance problems emerge (e.g. due to inaccessible memory because of lost pointers) until the system halts and needs to be restarted [Grottke '06].
1.1.3 A good design facilitates maintenance
A good design reflects the domain concepts and provides abstractions to separate the concerns among those concepts. Such characteristics are highly desirable for maintenance tasks because bugs are introduced by lack of awareness of the parts involved in executing a requirement [Coleman '94], and because maintenance time is spent on understanding the application to decide which parts to change [Singer '98].
Most models proposed to measure maintainability take into account the quality of the application’s structure. Figure 1-1 shows several models proposed to assess maintenance; all references to the structure quality are shown in italics. Notice that all models identify structural properties as key elements of maintenance.
In fact, there is empirical evidence that software structure affects software quality. For instance, several studies have found that structural metrics are good indicators of software defects [Basili '96; Ferneley '99; Subramanyam '03]. Furthermore, other studies have found that structural metrics are highly correlated with maintenance effort [Coleman '94; Harrison '98; Fioravanti '01; Bandi '03; Dagpinar '03].
The structure of software may therefore facilitate software maintenance. However, a software application has several levels of granularity: architecture, design, and source code. It is unclear whether all levels of granularity of software structure improve software maintenance or whether different levels have different impacts. In particular, it is not clear whether the source code must satisfy any requirements to achieve a maintainable application.
In summary, software maintenance is an important problem that can be partially tackled with an appropriate application structure. In order to keep the application maintainable for longer, it is necessary to perform anti-regressive work. However, it is not clear whether anti-regressive work should be limited to the higher structural levels or whether it should also cover the source code. The next section details this gap by rephrasing it as research questions.
Figure 1-1. Conceptual Models of Software Quality. Factors that affect maintainability. (Factors related to the application’s structure are in italics).