4.2 Software Engineering KA
4.2.5 Software Maintenance
SWEBOK defines software maintenance as the totality of activities required to provide cost-effective support to software [17]. These activities include preparation tasks before the software is deployed as well as everything required to keep the software running smoothly after deployment. Planning and scheduling these pre- and post-deployment tasks is equally important.
The micro categories according to SWEBOK for the Software Maintenance KA are:
1. Maintenance Process
2. Techniques for Maintenance
Table 4.8 lists the research studies found related to this KA.

Table 4.7: Software Maintenance Micro Categories

Software Maintenance Micro Categories    Papers                       Count
Maintenance Process                      –
Techniques for Maintenance               [96], [169], [170], [176]    4
4.2.5.1 Analysis
Only four research studies were categorised under the Software Maintenance KA, all of which discussed techniques for maintenance. Li et al. provided a performance evaluation framework for identifying potential performance issues in MapReduce and for optimizing the performance of big data system components powered by MapReduce [96]. Yim introduced a fault-tolerant automation framework to automate end-to-end software deployment procedures and proposed principles and techniques for testing the automation programs used in such deployments [169].
Yongpisanpop et al. designed a bug tracking system that helps track and visualize the bugs reported during the maintenance phase of a big data system [170]. Zhou et al. present an empirical study of quality issues in big data systems and a diagnosis of the mitigation solutions commonly adopted in the development and maintenance of production big data platforms [176].
4.2.5.2 Open Research Challenges
No research studies were found that discussed maintenance processes for big data systems.
1. Maintenance Process - Once a software product is deployed, its performance needs to be monitored closely and continuously, and any issues tracked so that they can be solved as soon as possible. As components reach the end of their useful life and newer, better technology comes on the market, older components of the existing system need to be replaced or updated. According to IEEE Std 14764-2006 [73], maintenance process activities mainly involve process implementation, problem and modification analysis, modification implementation, maintenance review/acceptance, migration and software retirement. The rapid influx of new technologies in the big data scene means that there is always the possibility of finding more efficient, and sometimes even cheaper, substitutes for the currently deployed algorithms and existing frameworks of a big data system. There should be established processes to guide the deployment and software migration of big data systems while keeping the integrity of the big data system components intact. Modification requests must be accepted and clearly understood by the maintenance team, by the developers who were involved in the whole system or in the specific component to be modified, and by the appropriate management team. Traditional incident management mechanisms and off-the-shelf software may be insufficient for big data systems because of the rapid changes that in-database analytics algorithms perform. For an error or bug to be found and reported to the maintenance team(s) as an incident, sophisticated state-capturing mechanisms and stack traces should be in place to track each incident and to gather all the information about what could have caused it.
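The state-capturing idea described above can be illustrated with a minimal Python sketch. It assumes a single-process batch step; the names `capture_incident`, `run_job`, `run_with_capture` and the `aggregation-job` component label are hypothetical and are not taken from any of the cited studies. On failure, the sketch records the stack trace together with a caller-supplied snapshot of the job state, so that an incident report carries the context needed for root cause analysis.

```python
import json
import platform
import sys
import traceback
from datetime import datetime, timezone


def capture_incident(exc, component, context):
    """Build a JSON-serializable incident record for an exception."""
    return {
        "component": component,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "error_type": type(exc).__name__,
        "message": str(exc),
        # Full stack trace as a list of formatted lines.
        "stack_trace": traceback.format_exception(type(exc), exc, exc.__traceback__),
        # Caller-supplied snapshot of the job state at failure time.
        "context": dict(context),
        "runtime": {"python": platform.python_version(), "platform": sys.platform},
    }


def run_job(records):
    """Hypothetical batch step: average the 'value' field of the records."""
    total = sum(r["value"] for r in records)
    return total / len(records)


def run_with_capture(records):
    """Run the job; on failure return (None, incident) instead of raising."""
    try:
        return run_job(records), None
    except Exception as exc:
        context = {"record_count": len(records)}
        return None, capture_incident(exc, "aggregation-job", context)


# An empty batch triggers a division by zero, which is captured as an incident.
result, incident = run_with_capture([])
print(json.dumps(incident, indent=2))
```

In a distributed big data system the same record would have to be collected from every participating node and correlated, which is part of what makes this an open problem.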
There is a good chance of failure (∼16%) during software deployment of big data systems developed through iterative software life cycle models like agile. In many cases the failure is attributed to human errors, of which ∼51% could have been prevented via automation [169]. Observations like these could move research into software maintenance towards the automation of maintenance processes and techniques. The need for automation in big data systems is justifiable because the technologies in use are complex and new, and most software developers, testers and system reliability engineers may not yet have familiarized themselves with them. Familiarity with the pitfalls of deploying a complex technical system in a production environment comes only through experience, but most of the technologies incorporated in big data systems have been around for less than a decade. Procedure-driven processes that are formalized and planned in advance to account for the unpredictable nature of big data technologies and big data sources, and that are automated to perform the deployment, could be the answer to the challenges in software maintenance of big data systems.
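As a rough illustration of what a formalized, automated deployment procedure might look like, the following Python sketch applies a planned list of steps in order and, if one fails, undoes the completed steps in reverse. It is not the framework proposed in [169]; the `Step` type, the `deploy` function and the step names are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    apply: Callable[[], None]     # performs the step; raises on failure
    rollback: Callable[[], None]  # undoes the step


def deploy(steps: List[Step]) -> dict:
    """Apply steps in order; on failure undo completed steps in reverse."""
    done: List[Step] = []
    for step in steps:
        try:
            step.apply()
            done.append(step)
        except Exception as exc:
            for finished in reversed(done):
                finished.rollback()
            return {
                "ok": False,
                "failed_step": step.name,
                "error": str(exc),
                "rolled_back": [s.name for s in reversed(done)],
            }
    return {"ok": True, "applied": [s.name for s in done]}


# Hypothetical three-step plan in which the second step fails.
log = []


def fail_migration():
    raise RuntimeError("schema migration failed")


steps = [
    Step("copy-artifacts", lambda: log.append("copy"), lambda: log.append("undo copy")),
    Step("migrate-schema", fail_migration, lambda: log.append("undo migrate")),
    Step("restart-workers", lambda: log.append("restart"), lambda: log.append("undo restart")),
]
report = deploy(steps)
print(report)
print(log)
```

Because the plan is data rather than a sequence of manual actions, it can be reviewed and tested before it touches production. The sketch does not handle a rollback that itself fails; a real procedure would need a policy for that case.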
Open research challenges in the Software Maintenance KA include:
• Maintenance Tools: Software maintenance tools would help to deploy corrected or updated code into production big data systems, to track errors and outages in the production environment, and to report errors and performance issues to the responsible personnel. How to create incident management systems for the multiple interfacing technologies involved in a big data system? How to prioritize errors and fix service level agreements for each kind of error so that the response is appropriate to the error? How to capture the system state prior to a major production outage for efficient root cause analysis?
• Maintenance Management: Maintenance management involves identifying and prioritizing maintenance tasks according to the severity of the problems created by not performing them, and devising techniques to perform them in the most efficient manner. How to analyse and develop a priority scale for conflicting maintenance tasks? How to decide which activities during and after deployment would be better off automated? How to automate software deployment and modification activities in a big data system? Given that big data systems are fairly complicated, how to decide when a maintenance task is too complex and needs to be assigned to the development team? How to decide when to perform scheduled maintenance tasks?
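To make the questions about priority scales and per-error service level agreements more concrete, the following Python sketch orders maintenance tasks by severity and then by response deadline. The three-level severity scale, the response-time targets in `SLA_HOURS`, and the task names are all assumptions chosen for illustration; deciding what these values should be for a big data system is exactly the open question raised above.

```python
import heapq
from datetime import datetime, timedelta

# Hypothetical severity scale and response-time targets (assumptions).
SLA_HOURS = {"critical": 1, "major": 8, "minor": 72}
SEVERITY_RANK = {"critical": 0, "major": 1, "minor": 2}


def prioritize(tasks):
    """Return (name, deadline) pairs ordered by severity, then by deadline."""
    heap = []
    for seq, (name, severity, reported) in enumerate(tasks):
        deadline = reported + timedelta(hours=SLA_HOURS[severity])
        # seq breaks ties so that equal-priority tasks keep their input order.
        heapq.heappush(heap, (SEVERITY_RANK[severity], deadline, seq, name))
    ordered = []
    while heap:
        _, deadline, _, name = heapq.heappop(heap)
        ordered.append((name, deadline))
    return ordered


t0 = datetime(2024, 1, 1, 9, 0)
tasks = [
    ("rotate-logs", "minor", t0),
    ("fix-failed-ingest", "critical", t0 + timedelta(hours=2)),
    ("patch-scheduler", "major", t0),
    ("restore-replica", "critical", t0 + timedelta(hours=1)),
]
for name, deadline in prioritize(tasks):
    print(name, deadline.isoformat())
```

A fixed ranking like this cannot resolve conflicts between tasks that depend on each other or compete for the same cluster resources, which is why a more principled priority scale remains a research challenge.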