A Cost-Benefit Approach to Recommending Conflict Resolution for Parallel Software Development


Nan Niu∗, Fangbo Yang∗, Jing-Ru C. Cheng†, and Sandeep Reddivari∗

∗ Department of Computer Science and Engineering, Mississippi State University, USA

† Information Technology Laboratory, U.S. Army Engineer Research and Development Center, USA


Abstract—Merging parallel versions of source code is a common and essential activity during the lifespan of large-scale software systems. When a non-trivial number of conflicts is detected, there is a need to support the maintainer in investigating and resolving these conflicts. In this paper, we contribute a cost-benefit approach to ranking the conflicting software entities by leveraging both structural and semantic information of the source code. We present a study by applying our approach to a legacy system developed by computational scientists. The study not only demonstrates the feasibility of our approach, but also sheds light on the future development of conflict resolution recommenders.

Keywords-software merging; conflict resolution; recommendation; cost-benefit analysis

I. INTRODUCTION

Parallel changes, in which separate lines of development are carried out by different developers, are a basic fact of developing and maintaining large-scale software systems [1]. An optimistic version control mechanism [2] allows every developer to work on a local copy of the software artifact independently. Thus, a fundamental challenge in building and evolving complex large-scale software systems is how to merge parallel versions and variants of a software product to yield a consistent shared view.

The literature on software merging is extensive, e.g., Mens [3] provided an excellent summary of all but the most recent work in the field, showing the diverse range of techniques employed. The software merging process considered in our work is shown in Fig. 1, in which boxes represent entities and diamonds represent activities. Currently, there exist many approaches that facilitate the identification and classification of inconsistencies, such as discovering the differences at syntactic [4] or semantic [5] levels. The dimensions of recommendation tools for detecting software conflicts are discussed in [6].

In contrast to the considerable support for conflict detection, there has been modest support for conflict resolution. For example, earlier work on Infuse [7] proactively modularized the code base into workspaces so that developers working in separate workspaces would encounter few conflicts caused by parallel development. More recently, Palantír [8] informed code changes across workspaces by calculating a simple and coarse-grained measure of severity of those changes. While these approaches help separate the concerns and raise developers' awareness, little work has been done on recommending a fine-grained order for the maintainer to investigate and resolve multiple conflicts. Lack of this support is a serious problem because addressing software conflicts in a random order can be inefficient and costly [6].

In this paper, we narrow the gap by presenting a cost-benefit approach to ranking the conflicting software entities in an ordered list. The bottom path of Fig. 1 highlights this recommendation in the software merging process. Our goal is to provide relevant and valuable information to improve the efficiency when a maintainer resolves multiple conflicts. To that end, we leverage Semantic Diff [5] to quantify the cost of conflict resolution for each procedure. We then characterize the resolution benefit of a procedure according to the change impact caused by the global variables. The final recommendation is made by sorting the conflicting procedures in decreasing order of the benefit/cost ratio.

The contributions of our work lie in the development of a conflict investigation mechanism by synthesizing structural and semantic information of the source code. Our approach is particularly applicable for situations where: 1) a single maintainer is directly responsible for resolving all the conflicts resulting from parallel changes; and 2) the primary source of conflict resolution is the code base (e.g., change intent is not well documented, original developers become unavailable, etc.). We collaborate with computational scientists to address their software merging needs, and apply our approach to a legacy project that exhibits the two characteristics mentioned above. In what follows, we present our cost-benefit recommender in Section II, describe the application of our approach in Section III, discuss related work in Section IV, and conclude the paper in Section V.

II. A COST-BENEFIT RECOMMENDER

A. Cost Estimation

To estimate the effort required to fix conflicts, we adopt Semantic Diff [5], a tool that takes two versions of a procedure and reports the semantic differences between them. Because Semantic Diff outputs fewer false positives when compared with commercial, text-based diff tools [3], adopting it in our approach makes the recommendation more reliable. Key to Semantic Diff is the concept of dependence pair defined to capture a procedure's semantic effect. Specifically, a pair of variables, (x, y), forms a dependence pair if x's value after execution of the procedure depends on y's value before the procedure is executed [5].

Figure 1. Software merging process depicted in a flow diagram, the bottom of which highlights the recommendation for conflict resolution. [Diagram elements, in order: Version n; Parallel Development; Version n.1, Version n.2, ..., Version n.m; Inconsistency Detection and Classification; Conflicts; Recommendation for Resolution; Ranked Conflicting Software Entities; Conflict Resolution; Version n+1.]

Consider the procedure node_get_local of Version n.1 in Fig. 2 as an example. The assignment statement gives rise to the dependence pair (node_pair[ind].sd, gn.sd). We extract a procedure's dependence pairs based on the formulas defined in [5], and use DP(p) to denote the set of all dependence pairs generated from the procedure p. A procedure is considered a conflicting procedure if its parallel versions result in different sets of dependence pairs. In other words, a conflicting procedure exhibits different semantic effects during parallel changes.

We estimate the cost of fixing a conflicting procedure according to the number of inconsistent dependence pairs identified between the procedure’s parallel versions. Take the procedure in Fig. 2 as an example:

$$
\mathit{cost}(\mathit{node\_get\_local}) = \left| DP(\mathit{node\_get\_local}_{n.1}) \cup DP(\mathit{node\_get\_local}_{n.2}) \right| - \left| DP(\mathit{node\_get\_local}_{n.1}) \cap DP(\mathit{node\_get\_local}_{n.2}) \right|
$$

This operation is based on our conjecture that the maintainer needs to inspect and evaluate all the identified semantic differences. Because Semantic Diff is accurate in detecting software conflicts [5], our cost approximation represents a conservative estimate at the intra-procedural level.
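The cost formula amounts to counting the dependence pairs that appear in exactly one of the two versions. The following is a minimal Python sketch, assuming each dependence pair is modelled as a (target, source) tuple of strings. The function name `resolution_cost` is hypothetical, and only the pairs visible in Fig. 2 are used; the pairs elided in the figure are left out.

```python
def resolution_cost(dp_a, dp_b):
    """Return |A ∪ B| − |A ∩ B| for two sets of dependence pairs."""
    return len(dp_a | dp_b) - len(dp_a & dp_b)


# Pairs visible in Fig. 2 only (assumption: the elided pairs are ignored here).
dp_n1 = {
    ("edge_flags[ind]", "NORMAL"),
    ("node_pair[ind].sd", "gn.sd"),
}
dp_n2 = {
    ("node_pair[ind].sd", "gn.sd"),
    ("node_pair[ind].sd", "nd_list_item"),
}

cost = resolution_cost(dp_n1, dp_n2)

# The same quantity is the size of the symmetric difference, i.e. the
# inconsistent pairs the maintainer would have to inspect.
inconsistent = dp_n1 ^ dp_n2

print(cost)
print(sorted(inconsistent))
```

On these visible pairs the union has three elements and the intersection has one, so the sketch reports a cost of 2. The value for the real procedure depends on the pairs that Fig. 2 elides.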

B. Benefit Estimation

In contrast to the intra-procedural cost estimation, the benefit of conflict resolution in our approach is considered at the inter-procedural level. When changing a procedure from inconsistent to consistent, the effect of the changes may not be local (i.e., within the procedure only), but can propagate to the rest of the system [9]. The control of such a ripple effect [10] requires both the recognition of system interconnections and the coordination of modifying interdependent entities.

Figure 2. Using Semantic Diff's DP (dependence pair) generation to approximate the cost required to resolve a conflicting procedure. [Figure content: Version n.1 of node_get_local contains the statements "edge_flags[ind] = NORMAL;" and "node_pair[ind].sd = gn.sd;", from which DP generation yields DP(node_get_local_n.1) = {(edge_flags[ind], NORMAL), (node_pair[ind].sd, gn.sd), ...}. Version n.2 of node_get_local places "node_pair[ind].sd = gn.sd;" inside a test of "nd_list_item == NULL", from which DP generation yields DP(node_get_local_n.2) = {(node_pair[ind].sd, gn.sd), (node_pair[ind].sd, nd_list_item), ...}. The two sets feed the cost estimation for resolving node_get_local.]

Input: a set of all procedures P, a set of all global variables G, and a set of conflicting procedures CP ⊆ P

Output: benefit(p) for all p ∈ CP

Main Procedure

    Initialize int[] benefit    // indexed by p ∈ CP

    Foreach p ∈ CP

        If [ (x, y) ∈ DP(p) AND (x, y) contributes to cost(p) AND x ∈ G ]

            Foreach q ∈ P

                If (z, x) ∈ DP(q)

                    benefit(p)++

                    BREAK

    Return benefit

Figure 3. Calculating the benefit of conflict resolution.

In our work, we focus on the interconnections caused by global variable references; however, other types of dependency (e.g., API call usages [11]) can also be incorporated. Fig. 3 outlines the algorithm for calculating the benefit of resolving conflicting procedures. The key idea is to leverage the intra-procedural dependence pairs to track the inter-procedural semantic dependencies connected by some global variable.
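The algorithm of Fig. 3 can be sketched in Python as follows. This is one possible reading, not the prototype's implementation: the function name `compute_benefit`, the dictionary layout, and the sample procedures and variables are all hypothetical. In particular, BREAK is interpreted here as "count a procedure q at most once for a given global variable x", which is an assumption about the figure.

```python
def compute_benefit(dp, inconsistent, global_vars):
    """Benefit of resolving each conflicting procedure.

    dp           -- maps every procedure in P to its set of (target, source) pairs
    inconsistent -- maps every conflicting procedure in CP to the pairs that
                    contribute to its cost
    global_vars  -- the set G of global variable names
    """
    benefit = {p: 0 for p in inconsistent}
    for p, pairs in inconsistent.items():
        for x, _y in sorted(pairs):
            if x not in global_vars:
                continue  # only global targets propagate outside p
            for q, q_pairs in dp.items():
                # any() stops at the first pair (z, x) found in DP(q),
                # so q is counted once for this x (assumed BREAK semantics).
                if any(source == x for _z, source in q_pairs):
                    benefit[p] += 1
    return benefit


# Hypothetical data: g_flow is the only global variable.
dp = {
    "p_a": {("g_flow", "t"), ("tmp", "u")},
    "p_b": {("out", "g_flow")},
    "p_c": {("k", "g_flow"), ("m", "g_flow")},
    "p_d": {("k", "j")},
}
inconsistent = {
    "p_a": {("g_flow", "t"), ("tmp", "u")},
    "p_d": {("k", "j")},
}

benefit = compute_benefit(dp, inconsistent, {"g_flow"})
print(benefit)
```

In this toy input, p_a writes the global g_flow, which p_b and p_c read, so resolving p_a is credited with a benefit of 2. p_d only touches non-global variables and receives 0.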

The algorithm listed in Fig. 3 considers every conflicting procedure p and its dependence-pair set DP(p). For a dependence pair (x, y) that causes p to be inconsistent, i.e., (x, y) contributes to the calculation of cost(p), the algorithm checks whether x is a global variable. If it is, then a local modification within p can propagate the change effect throughout the system via x. In particular, procedure q will be influenced by the ripple effect if (z, x) ∈ DP(q). In this sense, q benefits from the resolution of p in that changing (x, y) from inconsistent to consistent saves the maintainer from investigating (z, x) in q. The benefit of resolving p increments each time q is identified. Note that q is looped over the set of all the procedures since conflict resolution can affect those procedures originally found to be consistent.

C. Recommendation List

The recommendation list is made by computing the benefit/cost ratio and ranking the conflicting elements in descending order of the ratio. Fig. 4 shows our recommender's prototype implementation. The maintainer specifies the merging scope by opening the root folder of each parallel version in the tool. The scope can range from the entire project to specific modules or libraries. The prototype tool currently compares two parallel versions and relies on Semantic Diff [5] to detect semantic inconsistencies within the chosen scope.

Figure 4. Screenshots of the prototype recommendation tool: (a) using cost-benefit ratio to rank the conflicting procedures; (b) showing details for further investigation of a particular procedure.

As shown in Fig. 4a, the conflicting procedures are initially displayed alphabetically. The recommender takes the source code and the parallel change information as input, computes the cost and benefit of conflict resolution as described earlier in this section, and outputs a ranked list as a reference point for the maintainer to resolve conflicts. In case of a tie (e.g., node_packi and node_unpacki in Fig. 4a), the procedures are listed in a random order. In addition to the default benefit/cost ratio ranking, the maintainer can (re-)order the procedures by other attributes (columns) such as name, cost, and benefit (cf. Fig. 4a).
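The ranking step can be illustrated with a short Python sketch. The function name `rank_conflicts`, the `seed` parameter, and the cost and benefit values are hypothetical and are not taken from Fig. 4a. The sketch assumes every conflicting procedure has a cost of at least 1, which follows from the definition of a conflicting procedure (its two dependence-pair sets differ).

```python
import random


def rank_conflicts(cost, benefit, seed=None):
    """Order procedures by descending benefit/cost ratio, ties in random order."""
    rng = random.Random(seed)
    procedures = list(cost)
    # Shuffle first; sorted() is stable, so procedures with equal ratios
    # keep the shuffled (random) relative order.
    rng.shuffle(procedures)
    return sorted(procedures, key=lambda p: benefit[p] / cost[p], reverse=True)


# Hypothetical values for three conflicting procedures.
cost = {"p_x": 4, "p_y": 2, "p_z": 5}
benefit = {"p_x": 2, "p_y": 6, "p_z": 5}

ranked = rank_conflicts(cost, benefit, seed=0)
print(ranked)
```

With these values the ratios are 0.5, 3.0, and 1.0, so p_y is recommended first and p_x last. Re-ordering by another column, as the tool allows, would correspond to swapping the sort key.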

To support further investigation of a particular procedure, the tool shows the detailed explanation in a separate window (cf. Fig. 4b). In this view, the dependence pairs inconsistent between the parallel versions are listed. These pairs contribute directly to the cost calculation in our approach. Meanwhile, the benefit of resolving the chosen procedure is explained by providing the change propagation information along with the global variable(s) behind each interconnection. The explanation viewer integrates both semantic (i.e., cost estimation) and structural (i.e., benefit estimation) information to offer insights into the recommendations.

III. A PRELIMINARY EVALUATION

We conducted an initial evaluation of our approach by collaborating with the computational scientists and software engineers at the U.S. Army Engineer Research & Development Center (ERDC). The objective is to assess the feasibility of our approach, and more importantly, to identify areas for improvement. In addition, we would like to explore how software conflicts are handled in practice.

Table I

CHARACTERISTICS OF THE PARALLEL VERSIONS

| | HDT n.1 | HDT n.2 |
|---|---|---|
| # of procedures | 928 | 922 |
| Mean LOC* per procedure | 78.8 | 82.0 |
| Mean SLOC† per procedure | 68.6 | 71.6 |
| Mean \|DP\| per procedure | 69.0 | 72.4 |
| # of conflicting procedures | 144 | 144 |

* LOC: lines of code.

† SLOC: source LOC (neither comment nor empty line is considered).

Table II

IMPACT SET SIZE OF THE CONFLICTING PROCEDURES

| Alphabetic order | Impact set size | Impact set size | Our recommended order |
|---|---|---|---|
| p_a | 2 | 12 | p_d |
| p_b | 8 | 8 | p_b |
| p_c | 5 | 8 | p_e |
| p_d | 12 | 10 | p_f |
| p_e | 8 | 5 | p_c |
| p_f | 10 | 2 | p_a |

The subject system of our study is a large-scale scientific application written in C. In order to honor confidentiality agreements, we use the pseudonym "HDT" to refer to the system. Table I lists some basic information of the two parallel versions in our study. These HDT variants are very similar in terms of the statistics reported in Table I. It is interesting to note that the |DP| : SLOC ratio per procedure in HDT is roughly 1:1, whereas a 4:1 ratio was reported (about 400 dependence pairs for a 100-SLOC procedure) when an industrial software system written in C from the real-time domain was analyzed [5]. The relatively low ratio suggests that fewer semantic dependencies exist in HDT than in the system studied in [5]. This could be due to the modular design and regular refactoring of HDT, allowing process libraries (e.g., sediment transport, multiple elemental cycles, etc.) to be accessed as required for a specific application.
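As a quick check of the roughly 1:1 figure, the means reported in Table I can be divided directly. Note that this is a ratio of means, which approximates but need not equal the mean of the per-procedure ratios:

$$
\frac{\overline{|DP|}_{n.1}}{\overline{\mathit{SLOC}}_{n.1}} = \frac{69.0}{68.6} \approx 1.01,
\qquad
\frac{\overline{|DP|}_{n.2}}{\overline{\mathit{SLOC}}_{n.2}} = \frac{72.4}{71.6} \approx 1.01,
\qquad
\text{versus } \frac{400}{100} = 4 \text{ in [5].}
$$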

HDT is a legacy software system that has evolved over several decades. Parallel changes are common and core HDT development activities. Currently, a single maintainer takes primary responsibility for merging HDT's multiple versions and variants. Although automated tools such as DiffMerge (http://www.sourcegear.com/diffmerge) are employed, the challenge of conflict resolution remains. Based on an informal interview with the HDT maintainer, when 5 to 10 inconsistent procedures are identified, deciding which one to resolve first can be challenging. In some situations, the maintainer resolves conflicts based on the order in which they show up in DiffMerge, e.g., by following an alphabetic order. Part of the problem is that there exists insufficient documentation of change intent. Therefore, the code base typically serves as the main source for conflict investigation and resolution. We believe our approach can improve the HDT project's merging process (cf. Fig. 1) by recommending a systematic order as well as the detailed explanations about each of the recommendations (cf. Fig. 4).

Table III

COMPARISON OF OUR COST-BENEFIT RECOMMENDER WITH OTHER RELATED APPROACHES BY THE DESIGN DIMENSIONS DISCUSSED IN [13]

| Approach | Description | Nature of Context | Recommendation Engine: Input Data | Recommendation Engine: Ranking | Output Mode: Mode | Output Mode: Presentation | Explanation | User Feedback |
|---|---|---|---|---|---|---|---|---|
| Infuse [7] | Partition workspaces to reduce interferences | Hybrid | Source Code & Change | No | Push | Batch | None | None |
| Flexible [14] | Offer possible resolution paths for users to choose | Hybrid | User Interface | No | Push | Inline | None | Individually Adaptive |
| TUKAN [15] | Assign weights to software conflicts | Implicit | Source Code & Change | Yes | Push | Inline | Detailed | Globally Adaptive |
| Treemap [16] | Visualize artifact states from other workspaces | Implicit | Source Code & Change | No | Push | Inline | None | Globally Adaptive |
| Palantír [8] | Raise awareness across workspaces along with severity of conflicts | Hybrid | Source Code & Change | No | Push | Inline | Detailed | Globally Adaptive |
| Rational Mgmt [17] | Detect and resolve conflicts in models | Implicit | Model | No | Pull | Batch | None | None |
| Cost-benefit Recommender (our approach) | Rank conflicting entities based on resolution's cost and benefit | Implicit | Source Code & Change | Yes | Pull | Batch | Detailed | None |

We applied our approach to one of HDT's core modules specified by the ERDC experts. In this module (M), 6 conflicting procedures (p_a to p_f) were identified, a subset of the 144 procedures reported in the bottom row of Table I. The domain experts also shared with us a set of test cases used to verify the correctness of M. We ran these test cases and recorded the execution traces that logged the sequence of procedures being activated at runtime. We then adopted the dynamic analysis technique described in [12] to determine the impact set of each conflicting procedure. For instance, if p_g was executed after p_a, then p_g would belong to the impact set of p_a. Compared to our use of static information, dynamic analysis could better predict a procedure's change impact to the whole system [12]. Therefore, we used the size of the impact set as a surrogate measure of "benefit" considered in our approach: the larger the size, the more beneficial the effect of conflict resolution.
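The "executed after" rule can be illustrated with a simplified Python sketch. It is not the execute-after-sequences technique of [12], which is more refined; it only shows the basic idea on flat traces. The function name `impact_sets`, the trace contents, and the decision to leave a procedure out of its own impact set are assumptions made for illustration.

```python
def impact_sets(traces, procedures):
    """For each procedure, collect every procedure executed after its first
    occurrence in any trace (simplified execute-after relation)."""
    impact = {p: set() for p in procedures}
    for trace in traces:
        for p in procedures:
            if p in trace:
                first = trace.index(p)
                impact[p].update(trace[first + 1:])
    for p in procedures:
        impact[p].discard(p)  # assumption: a procedure does not impact itself
    return impact


# Hypothetical execution traces recorded from two test cases.
traces = [
    ["main", "p_a", "p_g", "p_b"],
    ["main", "p_b", "p_h"],
]

impact = impact_sets(traces, ["p_a", "p_b"])
sizes = {p: len(s) for p, s in impact.items()}
print(sizes)
```

Here p_g and p_b run after p_a in the first trace, so the impact set of p_a has size 2, while p_b is only followed by p_h and has an impact set of size 1.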

Table II compares two conflict resolution orders: alphabetic and our recommended order. While the alphabetic order leads to a random distribution of impact set size, our recommended order exhibits a trend of decreasing impact set size. This implies that our approach can contribute to a more systematic and efficient conflict resolution strategy. The "outlier" procedure in Table II, p_f, whose impact set size is 10, warrants further investigation. In addition to carrying out more evaluations, we have also identified areas for improvement by collecting feedback from the ERDC maintainer. One primary issue is to explicitly annotate the version information in our recommender. Also of interest would be integrating the recommendations more seamlessly into software development environments (SDEs).
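One way to make the "decreasing trend" concrete is to count, for each order, the pairs of procedures in which a smaller impact set is listed before a larger one. This measure is an illustration added here and is not part of the evaluation itself; the function name `out_of_order_pairs` is hypothetical, and the two lists are the impact set sizes read off Table II.

```python
def out_of_order_pairs(sizes):
    """Count pairs (i, j), i < j, where an earlier entry is strictly smaller
    than a later one, i.e. violations of a non-increasing order."""
    n = len(sizes)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if sizes[i] < sizes[j]
    )


# Impact set sizes in the two orders of Table II.
alphabetic = [2, 8, 5, 12, 8, 10]     # p_a .. p_f
recommended = [12, 8, 8, 10, 5, 2]    # p_d, p_b, p_e, p_f, p_c, p_a

alphabetic_violations = out_of_order_pairs(alphabetic)
recommended_violations = out_of_order_pairs(recommended)
print(alphabetic_violations, recommended_violations)
```

Out of 15 possible pairs, the alphabetic order has 11 such violations and the recommended order has 2, both of which involve p_f.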

IV. RELATED WORK

In an attempt to classify the design dimensions of recommendation systems for software engineering (RSSE), Robillard et al. discussed three major components (nature of the context, recommendation engine, and output mode), along with two crosscutting features (explanation and user feedback) [13]. Table III uses these dimensions to compare our recommender with other conflict resolution approaches. Our approach establishes the merging context implicitly in that the tool parses and analyzes the parallel programs without the maintainer explicitly specifying contextual information such as change intent. Other approaches like Palantír [8] require a hybrid of implicit and explicit context gathering. We feel that automatically extracting the context for RSSE, though challenging, is promising as researchers and tool builders can take advantage of the ever-growing quantities and types of software development data [18].

RSSE must analyze more than context data to make recommendations [13]. In our approach, the recommendation engine takes the source code and the concurrent changes as input. Although the code base serves as the primary source for conflict resolution for most approaches shown in Table III, taking into account high-level artifacts (e.g., models [17]) can potentially increase the power of the recommendation engine. However, it is important to synchronize heterogeneous artifacts collected throughout the software lifecycle.

Despite the key role that ranking plays in RSSE [13], it is surprising to note that, among the existing approaches surveyed in Table III, only TUKAN [15] supports a fine-grained ranking mechanism. Our approach quantifies the cost and benefit of conflict resolution and puts the entities most valuable to the maintainer at the top of the ranked list. This feature enables a systematic way to investigate and resolve software conflicts.

Most software merging recommenders (e.g., [7, 8, 14, 15, 16]) operate in push mode in which the tools deliver results continuously. These tools notify relevant users when a conflict emerges. Pull mode is different from push mode in that users explicitly request recommendation generation [13]. For example, the model management tool presented in [17] requires user interaction to choose from the conflicting operations. Similarly, users receive recommendations from our tool simply by clicking a button. Some tools integrate into other environments, following an inline presentation style, e.g., Palantír [8] is built upon version control systems and TUKAN [15] is integrated in the VisualWorks/ENVY Smalltalk environment. In contrast, our approach outputs the set of recommendations in separate views (cf. Fig. 4), working in a batch mode. As the prototype tool is only a proof-of-concept, we plan to improve the implementation by shifting our recommender's output toward a more non-obstructive pull and inline mode.

In order to justify the recommendations made, we give detailed explanations about each recommended procedure. We not only provide rankings like TUKAN [15] does, but also offer rationales behind them. Some recommendation engines take user feedback into account, so that the interaction between the user and the system can affect the recommendation results. For collaborative software development tools like Palantír [8], the recommenders rely on users' inputs across multiple workspaces. Thus, they incorporate globally adaptive user feedback. We plan to incrementally develop a feedback mechanism, possibly starting at the locally adjustable level [13], in order to extend our tool along the collaborative conflict resolution dimension [6].

V. CONCLUSIONS

The ability to merge parallel changes is needed during the development and maintenance of large-scale software systems. In this paper, we have presented a cost-benefit approach to ranking the conflicting software entities by exploiting both structural and semantic information from the code base. We developed a prototype tool and carried out a preliminary evaluation by applying our recommender to an industrial-strength software project.

From our initial experiences with the approach, we feel that it has rich value in helping maintainers to understand, justify, and manage conflict resolution, and software merging in general. In the future, more in-depth empirical studies are needed to lend strength to the preliminary findings reported here. We also plan to enhance our tool by annotating the version information and by inlining the recommendations into the SDEs.

ACKNOWLEDGEMENT

The work is supported by the U.S. Army Engineer Research & Development Center (Award W912HZ-10-C-0101). Any opinions, findings, and conclusions expressed herein are the authors’ and do not necessarily reflect those of the sponsor.

REFERENCES

[1] D. E. Perry, H. P. Siy, and L. G. Votta, "Parallel changes in large scale software development: an observational case study," in International Conference on Software Engineering (ICSE), Kyoto, Japan, April 1998, pp. 251–260.

[2] R. Conradi and B. Westfechtel, "Version models for software configuration management," ACM Computing Surveys, vol. 30, no. 2, pp. 232–282, June 1998.

[3] T. Mens, "A state-of-the-art survey on software merging," IEEE Transactions on Software Engineering, vol. 28, no. 5, pp. 449–462, May 2002.

[4] N. Niu, S. Easterbrook, and M. Sabetzadeh, "A category-theoretic approach to syntactic software merging," in International Conference on Software Maintenance (ICSM), Budapest, Hungary, September 2005, pp. 197–206.

[5] D. Jackson and D. A. Ladd, "Semantic Diff: A tool for summarizing the effects of modifications," in International Conference on Software Maintenance (ICSM), Victoria, BC, Canada, September 1994, pp. 243–252.

[6] P. Dewan, "Dimensions of tools for detecting software conflicts," in International Workshop on Recommendation Systems for Software Engineering (RSSE), Atlanta, GA, USA, November 2008, pp. 21–25.

[7] D. E. Perry and G. E. Kaiser, "Infuse: A tool for automatically managing and coordinating source changes in large systems," in ACM Annual Conference on Computer Science (CSC), St. Louis, MO, USA, February 1987, pp. 292–299.

[8] A. Sarma, Z. Noroozi, and A. van der Hoek, "Palantír: raising awareness among configuration management workspaces," in International Conference on Software Engineering (ICSE), Portland, OR, USA, May 2003, pp. 444–454.

[9] I. I. Brudaru and A. Zeller, "What is the long-term impact of changes?" in International Workshop on Recommendation Systems for Software Engineering (RSSE), Atlanta, GA, USA, November 2008, pp. 30–32.

[10] K. Chen and V. Rajlich, "RIPPLES: tool for change in legacy software," in International Conference on Software Maintenance (ICSM), Florence, Italy, November 2001, pp. 230–239.

[11] C. McMillan, D. Poshyvanyk, and M. Grechanik, "Recommending source code examples via API call usages and documentation," in International Workshop on Recommendation Systems for Software Engineering (RSSE), Cape Town, South Africa, May 2010, pp. 21–25.

[12] T. Apiwattanapong, A. Orso, and M. J. Harrold, "Efficient and precise dynamic impact analysis using execute-after sequences," in International Conference on Software Engineering (ICSE), St. Louis, MO, USA, May 2005, pp. 432–441.

[13] M. P. Robillard, R. J. Walker, and T. Zimmermann, "Recommendation systems for software engineering," IEEE Software, vol. 27, no. 4, pp. 80–86, July/August 2010.

[14] W. K. Edwards, "Flexible conflict detection and management in collaborative applications," in ACM Symposium on User Interface Software and Technology (UIST), Banff, AB, Canada, October 1997, pp. 139–148.

[15] T. Schümmer, "Lost and found in software space," in Annual Hawaii International Conference on System Sciences (HICSS), Maui, HI, USA, January 2001.

[16] P. Molli, H. Skaf-Molli, and C. Bouthier, "State Treemap: an awareness widget for multi-synchronous groupware," in International Workshop on Groupware (CRIWG), Darmstadt, Germany, September 2001, pp. 106–114.

[17] M. Koegel, J. Helming, and S. Seyboth, "Operation-based conflict detection and resolution," in ICSE Workshop on Comparison and Versioning of Software Models (CVSM), Vancouver, BC, Canada, May 2009, pp. 43–48.

[18] A. Zeller, "The future of programming environments: integration, synergy, and assistance," in Future of Software Engineering (FOSE), Minneapolis, MN, USA, May 2010, pp. 316–325.