Related Concepts - Reverse engineering software ecosystems

3.3 Related Concepts

Project portfolios, product families, collections of unrelated projects, large individual systems, and software distributions are similar in some respects to ecosystems. In this section we clarify the similarities, differences, and relationships between these concepts and software ecosystems.

Project Portfolios

Project portfolio management is a term used by project managers and project management organizations to describe methods for analyzing and collectively managing a group of current or proposed projects based on numerous key characteristics. The fundamental objective of project portfolio management is to determine the optimal mix and sequencing of proposed projects to best achieve the overall goals of the organization - typically expressed in terms of hard economic measures, business strategy goals, or technical strategy goals - while honoring constraints imposed by management or external real-world factors. Typical attributes of projects being analyzed in a project portfolio management process include the total expected cost of each project, consumption of scarce human or material resources, expected timeline and schedule of investment, expected nature, magnitude and timing of benefits to be realized, and relationship or inter-dependencies with other projects in the portfolio.

The goal of project portfolio management is therefore to optimize the revenue of the company. Our goal, in contrast, is to support program understanding and to increase awareness of the interactions between the developers and the projects in an ecosystem. The different goals result in different methods. In our approach, interactive visualization, static analysis, and integration with lower-level analyses are preeminent.

Product Families

Product family engineering is a method that creates an underlying architecture for the product platform of an organization [JRvdL00b]. It provides an architecture that is based on commonality as well as planned variabilities. The various product variants can be derived from the basic product family, which creates the opportunity to reuse and diversify the products in the family. Product family engineering focuses on the process of engineering new software products in a way which allows reusing product components and applying variability with decreased costs and time. Product family engineering is about reusing components and structures as much as possible.

Although product families are collections of projects and ecosystems are also collections of projects, the ecosystem concept is more general. In fact, a product family can be considered a special case of an ecosystem: one in which all the projects share an underlying architecture. It is possible that some of the techniques that we apply in ecosystem analysis can be used with product families. However, it is likely that inside the organization that owns it, the product family is part of a larger software ecosystem.

As in the case of project portfolio management, the main difference with respect to ecosystems is the goal of the analysis: for ecosystems the goal is understanding, whereas for product families it is extracting and reusing the common architectural components of multiple projects.

Collections of Unrelated Projects

Random collections of projects resemble ecosystems only in that they are large aggregations of projects; they lack the organizational context of a software ecosystem.

One existing application for the analysis of unrelated collections of projects is code search. Code search engines, such as Krugle (krugle.com), Google (codesearch.google.com), and Koders (koders.com) index a large number of open source software projects, written in multiple languages. Academic research has also been directed towards supporting code search with projects that perform keyword-based search [BNL+06] or other semantics-based approaches [Rei09].

The goal of the code search engines is to encourage reuse by supporting the discovery of similar code [kru09]. A company or an organization which owns an ecosystem would indeed benefit from being able to search its codebase.
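To make the idea of keyword-based search over a collection of projects concrete, the following is a minimal sketch of an inverted index built over source files from several projects. It is an illustration only and does not describe how any of the engines named above work; the project names, file paths, and function names (`tokenize`, `build_index`, `search`) are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(source):
    """Split source text into lowercase identifier-like tokens."""
    return [t.lower() for t in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)]


def build_index(files):
    """Map each token to the set of (project, path) pairs that contain it."""
    index = defaultdict(set)
    for (project, path), source in files.items():
        for token in tokenize(source):
            index[token].add((project, path))
    return index


def search(index, query):
    """Return the files that contain every keyword in the query."""
    keywords = tokenize(query)
    if not keywords:
        return set()
    results = set(index.get(keywords[0], set()))
    for keyword in keywords[1:]:
        results &= index.get(keyword, set())
    return results


# Hypothetical files from two unrelated projects.
files = {
    ("project-a", "io/Reader.java"): "class Reader { void parseFile(String path) {} }",
    ("project-b", "util/Parser.java"): "class Parser { void parse(String input) {} }",
}
index = build_index(files)
hits = search(index, "parseFile path")
```

Because every hit carries the name of the project it comes from, the same index could serve a search restricted to the projects of a single ecosystem.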

One possible application of analyzing a semi-random group of projects is building a benchmark for design and quality assessment. One would create a set of projects that are representative of a given programming language or of a given technology, and then collect metrics about the systems in the benchmark. These metrics would then be used to assess new systems. The idea can also be applied inside an ecosystem.
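The benchmark idea can be sketched as follows, assuming a single numeric metric per system and a percentile rank as the assessment. Both are assumptions made for illustration; the metric, its values, and the function names are hypothetical.

```python
from bisect import bisect_right


def build_benchmark(metric_values):
    """Sort the metric values collected from the benchmark projects."""
    return sorted(metric_values)


def percentile_rank(benchmark, value):
    """Return the fraction of benchmark systems whose metric is <= value."""
    if not benchmark:
        raise ValueError("benchmark is empty")
    return bisect_right(benchmark, value) / len(benchmark)


# Hypothetical metric: average number of methods per class in each
# benchmark project (the values are made up for the example).
benchmark = build_benchmark([4.0, 5.5, 6.1, 7.3, 9.8])

# Assess a new system whose metric value is 8.0.
rank = percentile_rank(benchmark, 8.0)
```

A rank close to 0 or 1 marks the new system as an outlier with respect to the benchmark. Inside an ecosystem, the benchmark would be built from the ecosystem's own projects.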

Individual Systems

One question we still need to address is: “What is the difference between ecosystems and very large individual systems?” Both ecosystems and systems are containers of code, and if a system is large enough, there can be a very large number of developers contributing to it.

The first difference is that the goals of the analysis are different. Since an ecosystem contains multiple systems, the problems that are associated with ecosystems are distinct from the ones associated with individual systems. Nevertheless, we show later that ecosystem level analysis represents an entry-point for single system analysis.

The second difference is that a project is a unit of release. This results in dependencies between projects and dependencies between modules inside a project being qualitatively different. When a project depends on another project it depends on a certain version of the second; when a module depends on another, the dependency does not involve any explicit versions of the two. One effect of this is that the inter-project dependency graph is less cluttered than the intra-project dependency graph.
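The qualitative difference between the two kinds of dependencies can be sketched as a data model in which only inter-project edges carry a version. This is a minimal illustration under that assumption; the class names, project names, and version strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectDependency:
    """Inter-project edge: the client depends on a specific release."""
    client: str
    provider: str
    provider_version: str  # explicit, because a project is a unit of release


@dataclass(frozen=True)
class ModuleDependency:
    """Intra-project edge: no version is attached to either end."""
    source: str
    target: str


def outdated_dependencies(deps, latest):
    """Return the edges whose provider version differs from the latest release.

    The question is only meaningful for inter-project edges, since only
    they record a version.
    """
    return [d for d in deps if latest.get(d.provider) != d.provider_version]


# Hypothetical ecosystem: two projects depend on different releases
# of the same provider.
deps = [
    ProjectDependency("billing", "core-utils", "1.2"),
    ProjectDependency("reports", "core-utils", "1.4"),
]
latest_releases = {"core-utils": "1.4"}
stale = outdated_dependencies(deps, latest_releases)

# A module-level edge inside one project has nothing comparable to check.
internal = ModuleDependency("billing.invoice", "billing.tax")
```

Questions such as "which clients still depend on an old release?" arise only at the inter-project level, which is one way in which the two graphs differ in kind and not only in size.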

The third difference is that a project is a unit of deployment (frameworks can be considered as exceptions), whereas an ecosystem is neither deployable nor runnable itself. This limits the type of analysis that one can perform on an ecosystem. For example, dynamic analysis and performance optimization, which are intrinsically related to an individual project, do not necessarily benefit from the information available in the ecosystem. However, the results of the analysis performed on individual projects can be compared at the ecosystem level. For example, in this way one could discover that projects that use a certain technology are more efficient than others.

Software Distributions

Linux distributions are probably the most complex software ecosystems that currently exist. Built around the Linux kernel, such distributions bring together applications that interact with each other. However, there is no central coordination and there are no common goals for the teams. These applications are subject to complex dependency graphs between themselves. A distribution needs a large number of volunteers that manage the dependencies between the

