Development in application engineering should be avoided as far as possible, as it increases the complexity of SPL evolution management [Kru06]. Unfortunately, companies cannot always avoid it. Industry has reported several reasons why products need to be adapted after being derived from the SPL release: to meet a product's changing deadline and budget [DSB05, Jen07, Sch06a]; to expedite bug fixes close to a release [FSK+16]; to respond quickly to unexpected functional changes in customers' needs [NNK16, CKM+08, IMY+16]; to keep single-product needs from increasing the complexity of reusable assets [DSB05, KH12, BB11]; and, especially, during the first stages of an SPL, when an initial, partial SPL does not provide all the functionality required by the products, so application engineering teams need to develop the “remaining” functionality themselves [JB09, KST+14, TFC+09]. We refer to this development as “product customization”.
1 The content of this Chapter has partially been previously published in [MDA17], and it is currently under
Following the grow-and-prune model, product customization (i.e. the growth) needs to be cleaned up by merging and refactoring (i.e. the pruning) [FV03]. Pruning requires SPL engineers to analyse how core-assets are being customized, i.e. to look at the differences between core-assets and their namesake assets once customized by products. In this context, a new range of concerns arises: how much effort are product developers spending on product customization? Which customizations need to be promoted to the core-asset base, and how? Which are the most customized core-assets? To what extent is the core-asset code being reused for a given product? We refer to this endeavor as “customization analysis”. Customization analysis is intended to help engineers plan the next SPL release according to the products’ needs. Evidence from industry reveals that customization analysis is periodically performed by domain experts, who inspect the source code versions looking for any functionality deemed useful. Below are two excerpts from two different industrial case studies:
“You must carry out such an effort with the support of the best domain experts of the system. Domain experts are required because only they understand the subtle differences between code unit versions and the needs of the users as they evolved historically, so are best equipped to prune and consolidate”. [FV03]
“... all required changes during product derivation are handled through product specific adaptation. Periodically, the functionality that is deemed useful for the product family is incorporated in the family assets.” [DSB05]
Traditional DIFF utilities might help to see the differences between a core file and the same file once customized by a product [SSRS16]. However, this one-diff-at-a-time approach can hardly scale up to SPLs, where both products and core-assets can easily account for hundreds of files. What is needed are mechanisms that move from code-level DIFF to assessing differences in higher-abstraction terms: features and products. Rather than DIFF(aFile, aFile), we long for DIFF(aFeature, aProduct) utilities that encapsulate the scanning of potentially hundreds of products for all the files a given feature has an impact upon. This involves gathering data from thousands of DIFFs. Yet this is just raw data that needs to be cleaned up and aggregated into meaningful analysis terms. Hence the following problem arises: analyzing how products customize core-assets is time-consuming and error-prone.
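To make the step from DIFF(aFile, aFile) to DIFF(aFeature, aProduct) concrete, the following is a minimal sketch of the aggregation involved. It is not the approach developed in this Chapter. It assumes that file-level diff statistics (lines added and deleted per product and file) have already been collected, for instance with a command such as `git diff --numstat`, and that a mapping from files to features is available. The record type `FileDiff`, the function `diff_feature_product`, and all file, feature and product names are hypothetical. As a simplifying assumption, a file that realizes several features contributes its churn to each of them.

```python
from collections import defaultdict
from typing import NamedTuple


class FileDiff(NamedTuple):
    """One file-level diff between a core-asset and its customized namesake."""
    product: str   # product that customized the file (hypothetical name)
    path: str      # path of the core-asset file
    added: int     # lines added by the product
    deleted: int   # lines deleted by the product


def diff_feature_product(diffs, feature_of):
    """Aggregate file-level churn into (feature, product) totals.

    `feature_of` maps a file path to the features it realizes.
    Files without a known feature are ignored in this sketch.
    """
    totals = defaultdict(lambda: [0, 0])
    for d in diffs:
        for feature in feature_of.get(d.path, ()):
            totals[(feature, d.product)][0] += d.added
            totals[(feature, d.product)][1] += d.deleted
    return {key: tuple(value) for key, value in totals.items()}


# Illustrative data only: file, feature and product names are made up.
FEATURE_OF = {
    "sensors/wind.js": ["WindSpeed"],
    "sensors/temp.js": ["Temperature"],
    "ui/gauge.js": ["WindSpeed", "Temperature"],
}
DIFFS = [
    FileDiff("ProductParis", "sensors/wind.js", 12, 3),
    FileDiff("ProductParis", "ui/gauge.js", 4, 0),
    FileDiff("ProductNY", "sensors/temp.js", 7, 2),
]

result = diff_feature_product(DIFFS, FEATURE_OF)
for (feature, product), (added, deleted) in sorted(result.items()):
    print(f"{feature} -> {product}: +{added} / -{deleted}")
```

Even in this toy setting, the answer for one feature depends on several file-level diffs across products, which hints at why doing the same by hand over hundreds of files is time-consuming and error-prone.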
Figure 3.1 depicts the problem definition as a mind map, outlining the causes and consequences of the problem. Refer to Chapter 1 for a detailed description of the root-cause analysis of the problem (i.e. its causes and consequences). The reader is encouraged to interact with the mind map at https://tinyurl.com/yay46us8. The nodes can be unfolded to uncover the supporting evidence for each of the claims.
Fortunately, mechanisms that help already exist: data warehouses. A data warehouse (DW) is a collection of decision support technologies aimed at enabling knowledge workers to make better and faster decisions [KR02].
In this Chapter we study the use of DWs for customization analysis. Specifically, our work revolves around three main research questions:
• RQ1: What are the information needs for customization analysis? How much time is needed to satisfy these information needs?
• RQ2: To what extent can these information needs be satisfied through a data warehouse? If so, what would its Star Schema look like?
• RQ3: How can customization analysis be visualized?
This work aims to address these research questions as follows:
• RQ1. We introduce a set of questions that might arise during "feature evolution" and "product evolution". The importance of these questions and the time required to answer them are addressed through a questionnaire delivered to SPL practitioners (Section 3.5).
• RQ2. We develop CustomDIFF, a DW approach that uses Git as the operational system from which fact data is obtained, and pure::variants as the SPL framework (Section 3.6).
• RQ3. We resort to Alluvial diagrams to visualize the customization effort at a glance. These diagrams are a type of flow diagram. Here, the flow stands for the customization effort that goes from core-assets to the SPL products where customization was needed (Section 3.7).
Figure 3.2: WeatherStationSPL branching model: the master branch holds the core-asset baselines from which SPL products are branched off.
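As a rough illustration of how a star schema can feed a flow-style visualization, the sketch below builds a tiny in-memory warehouse with SQLite. It is not the CustomDIFF schema, which is presented in Section 3.6. The table names (`dim_core_asset`, `dim_product`, `fact_customization`), their columns, and all sample rows are assumptions made only for this example. The final query aggregates churn per (feature, product) pair, which is the kind of weighted source-to-target flow an Alluvial diagram draws.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star schema: one fact table surrounded by two dimensions.
conn.executescript("""
CREATE TABLE dim_core_asset (
    asset_id INTEGER PRIMARY KEY,
    path     TEXT,
    feature  TEXT
);
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT
);
CREATE TABLE fact_customization (
    asset_id      INTEGER REFERENCES dim_core_asset(asset_id),
    product_id    INTEGER REFERENCES dim_product(product_id),
    lines_added   INTEGER,
    lines_deleted INTEGER
);
""")

# Illustrative rows only: names and numbers are made up.
conn.executemany(
    "INSERT INTO dim_core_asset VALUES (?, ?, ?)",
    [(1, "sensors/wind.js", "WindSpeed"), (2, "sensors/temp.js", "Temperature")],
)
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?)",
    [(1, "ProductParis"), (2, "ProductNY")],
)
conn.executemany(
    "INSERT INTO fact_customization VALUES (?, ?, ?, ?)",
    [(1, 1, 12, 3), (2, 1, 4, 0), (2, 2, 7, 2), (1, 1, 5, 1)],
)

# Each result row is one flow: source (feature), target (product), weight (churn).
flows = conn.execute("""
SELECT a.feature, p.name, SUM(f.lines_added + f.lines_deleted) AS churn
FROM fact_customization AS f
JOIN dim_core_asset AS a ON a.asset_id = f.asset_id
JOIN dim_product AS p ON p.product_id = f.product_id
GROUP BY a.feature, p.name
ORDER BY churn DESC, a.feature, p.name
""").fetchall()

for feature, product, churn in flows:
    print(f"{feature} -> {product}: {churn} changed lines")
```

Keeping the measures in a single fact table means other questions, such as which core-assets are the most customized, only require a different GROUP BY over the same data.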
The next Section illustrates the challenge with a motivating scenario.