Transferability and Scalability - Predicting change propagation using domain-based coupling

In this section, we discuss the transferability and scalability of the domain-based change impact analysis. We describe its applicability to various software categories and different kinds of software changes.

First, to what categories of software systems do these methods apply? Pressman [152] organised computer software under seven categories: system, application, engineering/scientific, embedded, product-line, artificial intelligence and web applications. Our approach is applicable to subsets of the application software, product-line software and web applications that are data driven and provide their functionality through a number of user-interface components. Our approach is not applicable to software whose functionality is not visible to the domain users, such as system software or embedded software. Also, domain-based change impact analysis may not be suitable where systems are not data driven or have few user-interface components, such as engineering/scientific or artificial intelligence software.

Second, for what kinds of software changes can we use these methods? Lientz and Swanson [118] classified software changes as perfective, adaptive, corrective and preventative. Preventative changes are typically initiated by programmers/developers or software engineers who are concerned with the non-functional properties of the system, such as the maintainability of the source code. Such changes might be difficult to map to domain functions; therefore, domain-based change impact analysis would not be a suitable approach for this kind of change. However, perfective, adaptive and corrective changes are typically performed in response to a request from the system users or in response to changes in the software environment. Such software changes, if related to changes in the software domain functions, can be assessed using domain-based change impact analysis.

Pressman [152] suggests four fundamental sources of software changes in relation to the business environment: (1) changes of business conditions, (2) changes in customer demands, (3) growth/downsizing of the business, and (4) budgetary or scheduling constraints. All of these changes might be defined as changes to the domain functions or user interface components or both; hence, their impact can be estimated using domain-based change impact analysis.

Finally, how scalable are these methods? Domain-based change impact analysis requires the domain experts’ knowledge about user-interface components. In Chapter 6, we described how domain information can be derived automatically from various sources, such as the actual working software, but we still rely on the domain experts’ feedback in the final stage of the process. The effort that the analysis process requires from the domain experts grows linearly with the number of user-interface components and domain variables.

7.5 Threats to Validity

We believe that the case studies presented in this thesis can be helpful for other researchers and practitioners. They demonstrate how domain-based coupling can be derived from information systems, and how it can be used to predict architectural dependencies and the impact of software changes. Nevertheless, our results should not be generalised too hastily without first considering the following possible threats to validity.

Internal validity concerns uncontrolled factors that can be responsible for the results. In the ADempiere and BEIMS case studies (Chapters 4 and 5), we identified the following threats to internal validity:

• The domain information is collected by the domain experts, and human error is a factor that can affect the results. To minimise the risk of human error, we extracted the relationship between the domain variables and the UICs from user manuals and help documents. In ADempiere, this information is stored in the database. We used manual input from the domain experts only to confirm this information and kept the manual additions and alterations to a minimum.

• One other factor that could affect the results is the granularity of the UICs. In both the ADempiere and BEIMS studies, we chose windows as the UICs. Each window contains multiple tabs and each tab provides one or more functions. Different results could be achieved if the evaluation were performed on the fine-grained tabs or the coarse-grained modules.

• In the BEIMS case study, we derived evolutionary coupling from the co-changes at the file level. However, a developer can apply unrelated changes to two files in a close time frame. For example, a developer can work on two unrelated bugs in the same time frame and send the changes to the repository as part of the same transaction. Such co-changes can lead to false-positive evolutionary couplings and reduce the recall of the predictions made from the domain-based coupling.

• BEIMS is a proprietary software system developed by a single company. The company's standard practices and development culture might influence both the software architecture and the maintenance activities, including the way the developers fix bugs and enhance the system. To reduce this impact, we examined more than 12 years of the maintenance history of the system. The longevity of this maintenance history reduces the influence of the individual developers by including more developers and different software versions.
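The co-change threat described above can be illustrated with a minimal sketch. It is not the procedure used in the BEIMS study: the file names, the commit history, the predicted pair and the min_support threshold are all hypothetical, and the domain-based prediction is assumed to be already mapped to file pairs. The sketch counts how often two files appear in the same transaction and shows how a single unrelated co-commit enters the reference set and lowers the measured recall.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(transactions):
    """Count how often each unordered pair of files is committed together."""
    counts = Counter()
    for files in transactions:
        # Sort so that each pair has one canonical (a, b) ordering.
        for pair in combinations(sorted(set(files)), 2):
            counts[pair] += 1
    return counts


def coupled_pairs(counts, min_support=1):
    """Treat a pair as evolutionarily coupled if it co-changed often enough."""
    return {pair for pair, n in counts.items() if n >= min_support}


def recall(predicted, actual):
    """Fraction of the actual couplings that were predicted."""
    if not actual:
        return 0.0
    return len(predicted & actual) / len(actual)


# Hypothetical commit history: each inner list is one repository transaction.
transactions = [
    ["invoice_file", "order_file"],
    ["invoice_file", "order_file"],
    ["asset_file", "report_file"],  # two unrelated bug fixes sent together
]

# Hypothetical domain-based prediction, expressed as file pairs.
domain_predicted = {("invoice_file", "order_file")}

counts = co_change_counts(transactions)
recall_all = recall(domain_predicted, coupled_pairs(counts, min_support=1))
recall_filtered = recall(domain_predicted, coupled_pairs(counts, min_support=2))
print(recall_all, recall_filtered)
```

In this toy history, the unrelated co-commit counts as an evolutionary coupling that the domain-based coupling cannot predict, so recall drops. Requiring a pair to co-change at least twice is one possible filter, shown here only as an assumption; it removes the spurious pair but would also discard genuine couplings that changed together only once.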

External validity concerns the generalisation of our findings. In the ADempiere and BEIMS case studies, we evaluated our approach on large-scale enterprise systems. Although the maturity of the data about these systems provided an insight into the relationships among the architectural dependencies, change propagation and the domain-based coupling, the following limitations in our studies should be considered before generalising our findings:

• ADempiere is developed in Java and based on a multi-tier architecture. The architecture of this system is designed to enhance the maintainability and extendibility of the system; it reduces the code coupling and code clones. As such, ADempiere represents state-of-the-art open source enterprise systems. However, one might get a different result for a system with much code coupling or a legacy system with a flat architecture.

• In the BEIMS case study, we examined the five subsystems that operate in the domain of facility management. Although these systems have separate functionalities, they have been developed based on a similar architecture and by the same company. These similarities limit the generalisation of our results to different domains and other systems with different architectures.

Construct validity concerns the relationship between the theory and the observations. In the BEIMS case study, we compared the domain-based coupling with the evolutionary coupling. In this study, we demonstrated the correlation between the two coupling metrics: domain-based coupling derived from the system behaviour and evolutionary coupling derived from the co-changes in the source code repository. However, our observations do not support a claim of a cause-and-effect relationship between these coupling metrics. The correlation only suggests that one coupling metric can be used as a proxy for the other.
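To make the distinction between correlation and causation concrete, the following sketch computes a Pearson correlation coefficient between two lists of coupling values. It is an illustration under stated assumptions only: the values are invented, and the choice of Pearson's coefficient is ours for this example rather than a statement of the statistic used in the case study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for four pairs of components (not data from the thesis).
domain_coupling = [0.1, 0.4, 0.5, 0.9]   # assumed domain-based coupling weights
co_change_count = [1, 3, 4, 8]           # assumed number of co-changes

r = pearson(domain_coupling, co_change_count)
print(round(r, 3))
```

A coefficient close to 1 for such data indicates only that the two metrics rise and fall together. It does not show that shared domain behaviour causes the co-changes, or the reverse, which is why the metric is treated here as a proxy.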
