Software Evolution Analysis - Reverse engineering software ecosystems

tem since they do not require a priori knowledge about the system. For this type of process, interaction and support for exploratory analysis are critical [Riv04; KC99]. About the classical bottom-up tool Dali, Kazman et al. declare that “probably the most important component of the Dali workbench is the interaction element” (Kazman, 1999).

Commonalities between the two processes

There are two factors that unite the two previous architecture recovery processes:

1. The majority use multiple views to capture different aspects of the subject system’s architecture. The different views can correspond to the same viewpoint, as in the case of Bowman et al. [BHB99], or present different viewpoints, as in the case of Pinzger [Pin05].

2. They all involve the user playing an active exploratory role in the process. The user’s tasks vary based on the starting state of the exploration process and on the operations that he has at his disposal:

• In some processes the user starts the exploration with a view which contains all the artifacts in the system. He then proceeds to refine the view by filtering and grouping elements in the view. This is the case with most of the approaches based on Rigi [MK88; OS01; KC99; Riv04].

• In some processes the user starts the exploration from a very high-level view of the system. He then explores the architectural information by bringing information into the view as needed. This type of navigation is the dominant interaction in SHriMP and its Eclipse-based follow-up, Creole [SM95; LMSW03], as well as in our prototype Softwarenaut [LKGL05].

• In some processes the user starts with a blank visualization. He then proceeds to add elements as he explores and discovers the system. Janzen and De Volder use this approach in JQuery[JV03] and Sinha et al. use this approach in Relo [SKM06].

Even if there are no fully automatic techniques, there is still a need for increasing the amount of automation of the processes and decreasing the human involvement.

2.4 Software Evolution Analysis

In recent years considerable research effort has been directed towards understanding software evolution. The basis for this work is the widespread use of version control systems and the availability of open-source case studies.

There are two main research directions when studying system evolution: discovering general principles of software evolution, and supporting program understanding and maintenance. We elaborate on the main directions here.

Discovering general principles

In 1985 Lehman proposed a number of “laws of software evolution” [LB85; LPR+97]. The

over many years. The principles are very general. For example, the principle of continuing change postulates that systems must be continually adapted else they become progressively less satisfactory. Given the large variability in the environments, technologies and processes for software development, it is not surprising that overarching, more specific principles are hard to find.

When attempting to understand the fundamentals of the evolution of software systems researchers need to investigate large groups of projects. They analyze these projects in parallel, but they rarely look at the entire group of projects as the subject of the analysis. One such example is the analysis of all the projects on SourceForge by Weiss[Wei05].

The Libresoft research group in Spain has on several occasions investigated entire collections of software projects. In one instance, they analyzed the Debian Linux distribution and estimated the cost of implementing it from scratch [ARGBH05]. In another article, they proposed a methodology for analyzing how developer turnover affects open-source software projects by taking several representative open-source projects and analyzing the information in their versioning system repositories [RGB06]. In yet another case they studied, from a social networking point of view, the developers working in the Apache and Gnome projects [LFRGBH06].

One project that does not perform evolution analysis per se, but is rather an enabler for this type of analysis, is FlossMole [CHC05]. Every two months the project compiles a database with statistics about a very large number of open-source projects. The database includes projects from SourceForge, Freshmeat, RubyForge and a few other public project repositories.

Supporting Program Understanding and Maintenance

Since the versioning repository of a software project is a rich source of information, it is natural that researchers attempt to use this information to support the development and maintenance processes. The existing research in this direction analyzes the evolution of individual systems focusing on different goals:

• Focus on predicting the locations of future changes. Sahraoui et al. studied the interfaces of classes in libraries written in object-oriented systems in order to predict the future stability of their interfaces [SBLE00]. Gîrba et al. used historical information about a system to suggest starting points for the reverse engineering process, based on the assumption that those parts of the system that changed recently are the ones that need to be understood first [GDL04]. In related work, Zimmermann et al. showed that the parts of the system that have changed together in the past are likely to change together in the future [ZWDZ04]. Recently Robbes proposed an approach in which he keeps track of every individual change collected through the IDE to allow for a more fine-grained analysis of the changes [Rob08].

• Focus on increasing the quality of IDE support during forward engineering. Cubranic et al. have proposed using project information to support the automatic recommendation of the parts of the system which are involved in a maintenance task [CMSB05]. In their case, project information comprises a number of different sources, including the source code versions, modification task reports, newsgroup messages, email messages, and documentation. Kersten suggested program elements that are likely to be related to the task at hand based on a degree-of-interest model built from the history of the navigation in the IDE [KM05]. Singer et al. proposed NavTracks, a tool that keeps track of the navigation
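The observation in the first item above, that parts of a system that changed together in the past are likely to change together again, can be illustrated with a minimal sketch. It is not the method of [ZWDZ04]; it is a simplified pair-counting illustration that assumes the versioning history is already available as a list of commits, each given as the list of files it touched. The function names, the `min_support` threshold and the sample history are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count how often each pair of files was modified in the same commit."""
    pair_counts = Counter()
    for changed_files in commits:
        # Sort and deduplicate so each unordered pair has one canonical key.
        for pair in combinations(sorted(set(changed_files)), 2):
            pair_counts[pair] += 1
    return pair_counts


def likely_co_changes(commits, changed_file, min_support=2):
    """Return files that co-changed with `changed_file` at least `min_support` times."""
    suggestions = []
    for (a, b), count in co_change_counts(commits).items():
        if count >= min_support and changed_file in (a, b):
            other = b if a == changed_file else a
            suggestions.append((other, count))
    # Most frequent co-changes first; ties broken alphabetically.
    return sorted(suggestions, key=lambda item: (-item[1], item[0]))


# Hypothetical history: each entry lists the files touched by one commit.
history = [
    ["Parser.java", "Lexer.java"],
    ["Parser.java", "Lexer.java", "Ast.java"],
    ["Ast.java", "Printer.java"],
    ["Parser.java", "Ast.java"],
]
print(likely_co_changes(history, "Parser.java"))
```

Raising `min_support` trades recall for fewer spurious suggestions; a real analysis would also need to decide how to treat very large commits, which couple many unrelated files.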

