
Nirikshan: Mining Bug Report History for Discovering Process Maps, Inefficiencies and Inconsistencies

Monika Gupta

Ashish Sureka

Indraprastha Institute of Information Technology, Delhi (IIITD)

New Delhi, India

{monikag, ashish}@iiitd.ac.in

ABSTRACT

Issue tracking systems such as Bugzilla, Mantis and JIRA are Process Aware Information Systems that support the business process of issue (defect and feature enhancement) reporting and resolution. The process from issue reporting to resolution consists of several steps or activities performed by various roles (bug reporter, bug triager, bug fixer, developers, and quality assurance manager) within the software maintenance team. Project teams define a workflow or a business process (design time process model and guidelines) to streamline and structure the issue management activities. However, the runtime process (reality) may not conform to the design time model and can have imperfections or inefficiencies. We apply business process mining tools and techniques to analyze the event log data (bug report history) generated by an issue tracking system with the objective of discovering runtime process maps, inefficiencies and inconsistencies. We conduct a case study on data extracted from the Bugzilla issue tracking system of the popular open-source Firefox browser project. We present and implement a process mining framework, Nirikshan, consisting of the following steps: data extraction, data transformation, process discovery, performance analysis and conformance checking. We conduct a series of process mining experiments to study self-loops, back-and-forth transitions, issue reopens, unique traces, event frequency, activity frequency and bottlenecks, and we present an algorithm and metrics to compute the degree of conformance between the design time and the runtime process.

Categories and Subject Descriptors

H.4 [Information Systems Applications]: Miscellaneous; D.2.8 [Software Engineering]: Metrics - complexity measures, performance measures

General Terms

Algorithms, Experimentation, Measurement

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

ISEC’2014 Chennai, Tamil Nadu India

Copyright 20XX ACM X-XXXXX-XX-X/XX/XX ...$15.00.

Keywords

Mining Software Repositories, Process Mining, Software Maintenance, Issue Tracking System, Open-Source Software, Empirical Software Engineering and Measurements

1. RESEARCH MOTIVATION AND AIM

Issue Tracking Systems (ITS) such as Bugzilla¹, Mantis² and JIRA³ are Process Aware Information Systems (PAIS) used during software development and maintenance for issue (feature enhancement requests and defects) tracking and management. Large and complex software projects, both open-source and commercial, define a bug lifecycle with different states and transitions for a bug report, and a process model (activities, events, actors, transitions and workflow) that serves as a guideline for the project team. An ITS such as Bugzilla logs the events generated during a bug lifecycle (such as who reported the bug and when, changes to the resolution and status fields of a bug, and developer and component assignment or reassignment) and refers to these event logs as Bug History. Process Mining (the intersection of Business Process Management and Data Mining) consists of mining event logs generated from business process execution supported by a PAIS [10]. The focus of the research presented in this paper is on the application of process mining techniques to mine bug history data and event logs generated from an ITS (software repository). Issue tracking and management is an activity which is integral to software development and maintenance and is fraught with several challenges. The work presented in this paper is motivated by the need to mine process data generated by an ITS to extract insights, particularly inefficiencies and inconsistencies. This can be useful to practitioners for solving problems, resulting in productivity and efficiency improvements in software maintenance. The specific research aims of the work presented in this paper are the following:

1. To develop a generic algorithm for quantitatively measuring the compliance (conformance checking and verification) between the design time and the runtime (reality) process model using metrics in the context of issue tracking and management.

2. To investigate the application of process mining platforms such as Disco⁴ and state-of-the-art algorithms for discovering process maps from event logs (in our case, bug report history data) and then analyzing the discovered process map for understanding the actual runtime process.

3. To define a process mining framework and conduct a case study on popular open-source projects like Firefox and Core from the Mozilla Foundation for analyzing the performance of various process-related activities and identifying inefficiencies (statistics on events, activities and transitions) and imperfections (reopening of bugs, bottlenecks, self-loops, back-and-forth between activities).

¹ https://bugzilla.mozilla.org/
² http://www.mantisbt.org/
³ https://www.atlassian.com/software/jira
⁴ http://fluxicon.com/disco/

2. RELATED WORK AND RESEARCH CONTRIBUTIONS

In this section we discuss closely related work (to the research presented in this paper) and present the novel research contribution of this paper in the context of existing work. We conduct a literature review in the area of process mining software repositories. Table 1 presents a list of 7 closely related works arranged in chronological order (year 2006 to 2012). We synthesize existing work and classify the papers in terms of attributes such as software repository (Version Control Systems, Defect Tracking Systems, Mail Archives), project (ArgoUML, Linux, Proprietary), OSS or CSS, and research objective. The most closely related research to the work presented in this paper is by Sunindyo et al. [9], who proposed a framework for effective engineering process observation in OSS projects. They used a hypothesis testing approach to verify the design model against the runtime event log from bug history data. They obtained a process map for RHEL bug history data using the heuristic mining algorithm of the process mining tool ProM. The conformance verification is not in-depth and is done for a few dimensions by proving or disproving hypotheses. Poncin et al. [5] combined the information from different repositories by matching related events. This is achieved by a prototype, FRASR (FRamework for Analyzing Software Repositories). The resulting log is analyzed using the ProM dotted chart visualization and the fuzzy miner plugin for role classification and fuzzy graph generation (actual view of the bug lifecycle) respectively. However, the major emphasis is on the organizational perspective and not on analyzing the runtime process.

From the literature, we observe that the major research focus is on designing algorithms for process discovery, defining possible broad perspectives (such as process, cases and organizational) of event log analysis, and its applicability. However, to the best of our knowledge, the outcome of the process perspective (the discovered process map) for an ITS event log (which is a very important repository of a software project) has not been analyzed from multiple dimensions. We hypothesize that there is a need for an in-depth analysis to help the process analyst make informed decisions for process improvement and, in effect, for the overall stability of an organization.

In the context of previous work, the study presented in this paper addresses these challenges by making the following novel and unique research contributions:

1. To the best of our knowledge, the work presented in this paper is the first in-depth study on process mining event logs (bug life-cycle and history data) from an issue tracking system (process aware information system) for discovering runtime process maps (and statistics on activities, events, transitions, unique traces), inefficiencies (issue reopens, bottlenecks, self-loops, back-and-forth between activities) and inconsistencies (an algorithm to compute the degree of conformance between design time and runtime process models).

2. A case-study and empirical analysis (implementing the process mining framework and conformance checking algorithm) on issue tracking system data of Firefox and Core products (large, complex, long-lived and popular open-source projects) from Mozilla Foundation. We present new results and fresh perspectives (by mining large real-world data) motivating the need and demonstrating the usefulness of process mining event log data or traces generated from issue tracking systems.

3. RESEARCH METHODOLOGY AND FRAMEWORK

We conduct experiments on issue tracking system data from the Firefox and Core open-source projects. Table 2 displays the experimental dataset details. We chose Firefox and Core for our analysis as these projects are large, complex and long-lived. Firefox is Mozilla's flagship software product and the third most-used web browser for Windows, Mac and Linux. Core includes shared components used by Firefox and other Mozilla software. The bug report data for the Firefox and Core projects is publicly available, and hence our results can be replicated and used for comparison by other researchers. We conduct experiments on two projects (Firefox and Core) to remove one-project bias from our results and conclusions (that is, to increase the generalizability of our results). As shown in Table 2, the size of the experimental dataset is 12234 issue reports from Firefox and 24253 reports from Core (one-year period: 1 January 2012 to 31 December 2012).

| Attribute | Value |
|---|---|
| First issue creation date | 1 Jan 2012 |
| Last issue creation date | 31 Dec 2012 |
| Date of extraction | 14 July 2013 |
| Total issues in 2012 | 111234 |
| Issues not authorized for access | 15638 |
| Issues without history | 3149 |
| Total issues extracted for Firefox | 12234 |
| Total issues extracted for Core | 24253 |
| Total activities for Firefox | 40233 (including Reported) |
| Total activities for Core | 88396 (including Reported) |

Table 2: Experimental dataset details for Core and Firefox (open source Mozilla Project).

The complete framework (called Nirikshan, a Hindi word meaning inspection or review), as depicted in Figure 1, has three major stages: 1. data extraction, 2. data transformation, and 3. process mining.

1. Data extraction: We extract the bug report history (refer to Figure 2, which displays a snapshot of a Bugzilla bug history page) using the Bugzilla APIs (through the XML-RPC or JSON-RPC interface). The bug report history serves as the process event log generated by the ITS. We extract Status, Resolution (only for closed bugs), Assignee, QA Contact and Component from the bug history to derive the process map. In our experiments, we record the creation timestamp for the activity Reported.


| Author(s) | Year | Software Repository | Project | OSS/CSS | Objective |
|---|---|---|---|---|---|
| Kindler et al. [3] | 2006 | SCM | Dummy versioning log to make it independent of SCM system | OSS | Proposed incremental workflow mining algorithm to semi-automatically derive process model from versioning information. |
| Weijters et al. [10] | 2006 | Dummy event log with random traces | NA | NA | Distinguished three mining perspectives and focussed on the process perspective. Described and analysed HeuristicsMiner algorithm. |
| Rubin et al. [7] | 2007 | SCM, VCS | ArgoUML | OSS | Linear Temporal Logic (LTL) Checking (Conformance Analysis), Social Network Discovery, Performance Analysis, Petri Net Discovery. |
| Akman et al. [1] | 2009 | SCM | Industry project (for Turkish Armed Forces) | CSS | Analyzed and compared the effectiveness of four process discovery algorithms on software process. Analyzed discrepancies between real time and design time process. |
| Knab et al. [4] | 2010 | Issue Tracking System | Industrial Project (EUREKA/ITEA project SERIOUS) | CSS | Interactive approach to visualise effort estimation and process lifecycle patterns in ITS to detect outliers, flaws and interesting properties. |
| Poncin et al. [5] | 2011 | Bug repositories, mail archives, SVM | aMSN, GCC | OSS | Combined different repositories for analysis using a prototype, FRASR. Role classification and Bug life cycle construction using ProM. |
| Sunindyo et al. [9] | 2012 | Defect Tracking System | Red Hat Enterprise Linux (RHEL) | OSS | Framework for collecting and analyzing data from bug reporting system, conformance checking to improve process quality. |

Table 1: Literature survey of papers at the intersection of mining software repositories and process mining arranged in chronological order.

For open bugs, the status can be New, Unconfirmed, Assigned or Reopened, and each is captured as an activity. For closed bugs, the status Verified is captured as is, while for the status Resolved the resolution is captured as the activity. The resolution can be Fixed, Invalid, Wontfix, Duplicate, Worksforme or Incomplete⁵. We consider every closed resolution as a unique event for detailed analysis, as this also captures the transitions between different resolutions. Changes to Assignee, QA Contact and Component are recorded as Dev-reassign, QA-reassign and Comp-reassign respectively, as they are important events in a bug lifecycle and can lead to a change in state; for example, a bug should be marked New after developer reassignment⁶. Based on the significance of the captured activities, we hypothesize that they are important for a good characterization of a bug's control flow (the tasks performed during the progression of a bug). They cover the tasks performed from the inception to the closing of a bug. We obtain the timestamp corresponding to an activity from the when field of the bug history, as can be seen in Figure 2, to order the activities in the sequence of their actual execution (while generating the process map via Disco). The performer of the activity is treated as the resource and is captured from the who field of the bug history.

⁵ https://bugzilla.mozilla.org/page.cgi?id=fields.html#status
⁶ http://www.bugzilla.org/docs/2.18/html/lifecycle.html

2. Data transformation: One of the major challenges in software process mining is to produce a log conforming to the input format of the process mining tool [5]. Therefore, data preprocessing is crucial before analysis in a process mining tool. The following processing is performed to address the challenges:

• Activities with timestamp, bug ID and associated resource make up the event log for process map generation.

• The Reported event is not stored in the bug history, so it is added to the event log with the creation timestamp.

• The initial state is not captured in the history if no activity leading to a change in status and resolution is performed. For such cases only Reported (with the creation timestamp) is stored in the event log and not the initial state. Otherwise, the initial state is also stored with the same timestamp as Reported. For instance, the process map from the history of a bug is shown in Figure 2.

• There are cases when Comp-reassign, Dev-reassign and QA-reassign have the same timestamp as another activity. This “same timestamp conflict” is resolved by subtracting a delta from the timestamp of these activities such that the final ordering in all conflicts (except Reopen) is:

Comp-reassign → QA-reassign → Dev-reassign → Other activity

[Figure 1 diagram labels: bugzilla.mozilla.org; Data Extraction; Raw Data (MySQL); Data Transformation; Event Log (Case ID, Timestamp, Activity, Resource); Process Mining (Process Model Discovery, Conformance Verification, Performance Analysis)]

Figure 1: Nirikshan: Proposed research framework with three major phases: 1. data extraction, 2. data transformation, and 3. process mining.

[Figure 2 panels: snapshot of Mozilla bug history (bug ID 600028); event log with columns Activity, Resource, Timestamp; process map]

Figure 2: A snapshot of Mozilla bug report history (event log) and corresponding process map.

[...] developer and QA assignment⁷ followed by activities for bug progression. However, Reopen is the reactivation of a bug rather than the filing of a new bug, so for Reopen the conflict is resolved as:

Reopen → Comp-reassign → QA-reassign → Dev-reassign.

3. Process mining: The event log can be mined from various perspectives [10]. We perform process mining to discover the runtime process model (process perspective), analyze performance and verify conformance of the as-is model with the design time process model. We analyze statistics for activities, transitions, events and unique traces from the discovered process map. The performance analysis covers reopening of bugs, presence of loops, back-forth transitions and identification of bottlenecks.
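The extraction and transformation rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the input layout (tuples of when, who, field, added), the field names, and the 1 ms delta are hypothetical choices made for the example. The sketch also handles the Reopen case by shifting Reopened ahead of the reassignments, and it does not model the initial-state rule.

```python
from collections import defaultdict
from datetime import timedelta

# Assumed (hypothetical) history field names mapped to activity labels.
FIELD_TO_ACTIVITY = {
    "assigned_to": "Dev-reassign",
    "qa_contact": "QA-reassign",
    "component": "Comp-reassign",
}
# Status values captured directly; RESOLVED is represented by its resolution.
CAPTURED_STATUS = {"NEW", "UNCONFIRMED", "ASSIGNED", "REOPENED", "VERIFIED"}
# Order among events sharing one timestamp (lower rank comes first).
TIE_RANK = {"Reported": 0, "Reopened": 1, "Comp-reassign": 2,
            "QA-reassign": 3, "Dev-reassign": 4}
OTHER_RANK = 5
DELTA = timedelta(milliseconds=1)  # assumed delta, smaller than 1 s resolution


def to_activity(field, added):
    """Map one history change to an activity label, or None if not captured."""
    if field in FIELD_TO_ACTIVITY:
        return FIELD_TO_ACTIVITY[field]
    if field == "status" and added in CAPTURED_STATUS:
        return added.capitalize()
    if field == "resolution" and added:
        return added.capitalize()  # e.g. FIXED -> Fixed, WONTFIX -> Wontfix
    return None


def build_event_log(bug_id, created, reporter, changes):
    """Return (case id, timestamp, activity, resource) rows for one bug."""
    groups = defaultdict(list)
    groups[created].append(("Reported", reporter))  # Reported is not in history
    for when, who, field, added in changes:
        activity = to_activity(field, added)
        if activity is not None:
            groups[when].append((activity, who))
    log = []
    for when in sorted(groups):
        ordered = sorted(groups[when],
                         key=lambda e: TIE_RANK.get(e[0], OTHER_RANK))
        last = len(ordered) - 1
        for i, (activity, who) in enumerate(ordered):
            # Earlier-ranked events are shifted back; the last keeps its time.
            log.append((bug_id, when - (last - i) * DELTA, activity, who))
    return log
```

Each returned row carries the four columns that the event log needs (case ID, timestamp, activity, resource), so the rows could be written to a CSV file for import into a process mining tool.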

4. PROCESS DISCOVERY

There are many process mining tools, such as ProM (open source) and Disco⁸ (commercial), that can be used to obtain a process model. We import the preprocessed data into Disco to obtain the process map and other statistical information for Core and Firefox. The Disco miner is based on the proven framework of the Fuzzy Miner with a completely new set of process metrics and modeling strategies⁹. It is used to discover the runtime process from the actual event log generated during the progression of a bug. Bug ID is selected as the case (for process map generation in Disco) to associate all activities pertaining to the same bug ID, and hence we can visualize the lifecycle of a bug. We have 15 nodes, each corresponding to an activity, in the process map for both projects, and a directed edge represents a transition between two activities (nodes). With 15 nodes, we can have a maximum of 210 unique transitions including self-loops, given that Reported is always a source and hence can have no inward arrow. The process map obtained is fairly complex at the highest resolution (including infrequent transitions), with 160 and 156 unique transitions for Core and Firefox respectively. For clarity, the process map at 20% resolution (core transitions) for Firefox is shown in Figure 3. The label of each edge indicates the absolute frequency of the transition. The shade and thickness correspond to the frequency, with more frequent transitions drawn darker and less frequent ones lighter.

⁷ http://www.bugzilla.org/docs/2.20/html/components.html
⁸ Disco is a proprietary tool for which we availed an academic license and used for our analysis.
⁹ http://fluxicon.com/disco/files/Disco-Tour.pdf
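The bound of 210 transitions follows from a simple count: each of the 15 activities can be a source, but only 14 can be a target because Reported never has an inward arrow, giving 15 × 14 = 210. The sketch below reproduces that count and shows a plain directly-follows tally over traces. It is not Disco's mining algorithm, only an illustrative way to count unique transitions; the function names are hypothetical.

```python
from collections import Counter


def max_unique_transitions(n_activities, n_start_only=1):
    """Upper bound on distinct edges, self-loops included.

    Every activity may be a source, but start-only activities
    (here: Reported) can never be a target.
    """
    return n_activities * (n_activities - n_start_only)


def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts
```

The number of keys in the returned counter is the number of unique transitions observed, which is the quantity the text compares against the upper bound.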

4.1 Activity Frequency Analysis

The relative occurrence of all activities, with absolute frequency, is shown in Table 3. After Reported (which exists for all cases), the most frequent event is Unconfirmed for Firefox and New for Core. Since most Core bugs start as New rather than Unconfirmed, less confirmation effort is needed for reported bugs in Core. Figure 4 shows the distribution of activities over cases, that is, the percentage of cases that have the given activity in their event log.

As evident from Figure 4, a good percentage of cases have Dev-reassign and Comp-reassign events in their lifecycle, and the relative frequency of these events is also high, which is undesirable. We see that QA reassignment happens quite often. Developer reassignment is much higher for Core than for Firefox. Reassignment wastes developers' time and effort and delays bug fixing. Hence, efforts should be made to minimize reassignments. A bug once resolved should be verified. However, only around 4% of bugs are verified by the QA manager for both projects, which is significantly less than the number of resolved bugs and signals inefficiency. It highlights the need to uncover the reasons for infrequent verification and address them.

Figure 3: Process map for Firefox at 20% resolution with labels as absolute frequency and shade proportional to frequency.

Further, we can clearly observe from Figure 4 that a bug in Core has almost double the chance of being Fixed, which is the recommended situation. For Firefox, a comparatively large number of bugs are marked Duplicate, Invalid and Worksforme, which means that a lot of practitioners' time goes into addressing issues that are less useful for overall product quality improvement. To avoid duplicates, an efficient search mechanism can be used before reporting a bug, or an extension can give a “Duplicate bug” warning at the time of bug submission. Similarly, clear guidance on what should be reported as a bug would reduce invalid bugs, and the need for a clear description of bugs should be emphasized for better understanding.
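The two measures used in this subsection, relative frequency (Table 3) and the percentage of cases containing an activity (Figure 4), can be computed from an event log as sketched below. The input layout, a list of (case ID, activity) pairs, and the function name are assumptions made for the example.

```python
from collections import Counter


def activity_stats(log):
    """Per-activity (absolute freq, relative freq %, % of cases containing it).

    `log` is an iterable of (case_id, activity) pairs, one per event.
    """
    log = list(log)
    abs_freq = Counter(activity for _, activity in log)
    # De-duplicate (case, activity) pairs so each case counts at most once.
    case_freq = Counter(activity for _, activity in set(log))
    n_events = len(log)
    n_cases = len({case_id for case_id, _ in log})
    return {
        activity: (count,
                   100 * count / n_events,
                   100 * case_freq[activity] / n_cases)
        for activity, count in abs_freq.items()
    }
```

As a check against Table 3, the relative frequency of Reported for Firefox is 100 × 12234 / 40233, which rounds to 30.41.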

4.2 Transition Frequency Analysis

Intuitively, frequent transitions should be part of the design process. We find that frequent transitions are usually part of the lifecycle defined for Bugzilla. However, some transitions are infrequent even though they are allowed in the design time process, namely Verified → Reopen with only 14 and 4 cases, and Reopened → Assigned with 20 and 8 cases, for Core and Firefox respectively. Sometimes this may be because the source or destination activity is itself infrequent. Another reason can be that the majority of cases move from a given state to one specific state, so the frequency of the other possible transitions becomes low, which gives the impression that they are not part of the design process. The infrequent occurrence of the Assigned → New transition implies non-adherence to the defined process: a bug should be marked New after Dev-reassign, but this is not always practiced, as the frequency of the transition (25 for Firefox and 30 for Core) is much less than the frequency of developer reassignment for both projects. Also, the Verified → Unconfirmed transition, defined in the design process, never occurred at runtime, so the resources for maintaining this defined transition are never used.

Figure 4: Activity distribution over cases showing the percentage of cases with the given activity in the event log (Core and Firefox).

| Activity | Firefox: Rel freq. (Abs) | Core: Rel freq. (Abs) |
|---|---|---|
| Reported | 30.41 (12234) | 27.44 (24253) |
| New | 11.42 (4596) | 17.27 (15267) |
| Fixed | 6.93 (2788) | 14.22 (12573) |
| Dev-reassign | 6.65 (2677) | 12.98 (11471) |
| Comp-reassign | 7.16 (2880) | 6.61 (5843) |
| Assigned | 4.48 (1802) | 5.79 (5121) |
| QA-reassign | 3.23 (1300) | 3.50 (3095) |
| Unconfirmed | 12.16 (4894) | 3.27 (2893) |
| Duplicate | 5.85 (2354) | 2.62 (2319) |
| Works for me | 2.93 (1180) | 1.81 (1599) |
| Verified | 1.27 (512) | 1.22 (1081) |
| Invalid | 4.21 (1692) | 1.22 (1075) |
| Reopened | 0.82 (330) | 1.18 (1045) |
| Won't Fix | 1.23 (494) | 0.66 (587) |
| Incomplete | 1.24 (500) | 0.20 (174) |
| Total | 40,233 | 88,396 |

Table 3: Relative frequency percentage (Rel freq.) and absolute frequency (Abs) for all 15 activities.

Undefined transitions are mostly infrequent, with a few exceptions. Such transitions should be analyzed: if they are found to have a good influence on overall efficiency, they can be added to the design process; otherwise, steps should be taken to avoid them in future. For instance, Reported → Assigned occurs in 1659 and 325 cases for Core and Firefox respectively, even though it is not defined in the design process. We analyze such cases to determine whether the chances of developer reassignment are high or low when the bug is directly marked Assigned at the time of reporting. Developer reassignment occurs in 4.88% and 14.77% of the cases assigned at the time of reporting, which is less than the 47.87% and 18.38% of cases with an indirect transition, for Core and Firefox respectively. This suggests that the reporter had a clear idea of who would be the right person to fix the bug and assigned it directly, resulting in comparatively lower chances of reassignment.

4.3 Event Analysis

A sequence of events is performed from the time a bug is reported until it is closed. The number of events varies from case to case. We use only closed cases for our analysis. Bugs which are resolved and not reopened thereafter are taken as closed. The sequence of events until a bug is resolved is taken into consideration and the events thereafter are ignored. There are a total of 16989 and 8259 closed cases with 71615 and 31797 events for Core and Firefox respectively. The largest share of cases has 3 events in the lifecycle for both projects, as indicated in Figure 5. A lifecycle with 3 events can be represented as Reported → New/Unconfirmed → Resolved, which is the shortest possible sequence of events for resolution. We find that, out of all the closed cases, the majority executed the minimum number of events to get resolved, for both projects. Very few bugs (cases) have more than 10 events in the lifecycle, which shows stability. The maximum number of events for a case is 22 for Core and 21 for Firefox, which are outliers.
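One possible reading of the closed-case rule above is sketched below: a case counts as closed if its last resolution is not followed by Reopened, and only the events up to that resolution are kept. How the authors treat bugs that were resolved more than once is not stated, so cutting at the last resolution is an assumption, as are the function names.

```python
from collections import Counter

RESOLUTIONS = {"Fixed", "Invalid", "Wontfix", "Duplicate",
               "Worksforme", "Incomplete"}


def closed_prefix(trace):
    """Return the events up to the last resolution, or None if not closed.

    Assumption: a bug is closed when no Reopened follows its last resolution.
    """
    last = max((i for i, a in enumerate(trace) if a in RESOLUTIONS),
               default=None)
    if last is None or "Reopened" in trace[last + 1:]:
        return None
    return trace[:last + 1]


def events_per_case(traces):
    """Histogram: number of events -> number of closed cases."""
    prefixes = (closed_prefix(t) for t in traces)
    return Counter(len(p) for p in prefixes if p is not None)
```

Dividing each histogram value by the number of closed cases gives the percentages plotted in Figure 5.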

4.4 Unique Traces

Figure 5: Events per case (x-axis: events per case; y-axis: percentage of cases; series: Core and Firefox), with the maximum number of cases having 3 events in the lifecycle.

The complete sequence from Reported to the last event makes up the trace for a case (bug). Different bugs can have different sets of transitions and hence different traces. We consider a trace unique even if only a single transition is different. For instance, a trace with the sequence A → B → C is different from A → B → B → C because the second has a self-loop which is not present in the first. Only the closed bugs (as in the event analysis) are used for the experiments. In effect, there are 1164 and 622 variants for Core and Firefox respectively. However, most of them are infrequent. The distribution is almost the same for both projects. For Core and Firefox, 80% of the cases are covered by around 2% of the variants (those with at least 50 cases) and 3% of the variants (those with at least 29 cases) respectively, whereas 8% and 13% of the variants are required to cover 90% of the cases. The ten most frequent variants, with the percentage of cases, are shown in Table 4. Efforts should be made to minimize delay in the transitions involved in these traces, as they constitute the path followed by the majority of the closed cases.
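Counting variants and their coverage amounts to grouping identical traces and accumulating the most frequent groups until the target share of cases is reached. The sketch below illustrates this; the function name and the integer percentage parameter are choices made for the example.

```python
from collections import Counter


def variant_coverage(traces, target_percent=80):
    """Return (number of variants, variants needed to cover target_percent).

    Two traces belong to the same variant only if their activity
    sequences are identical, so A-B-C and A-B-B-C are distinct.
    """
    variants = Counter(tuple(trace) for trace in traces)
    total = sum(variants.values())
    covered = 0
    for k, (_, n_cases) in enumerate(variants.most_common(), start=1):
        covered += n_cases
        if covered * 100 >= target_percent * total:
            return len(variants), k
    return len(variants), len(variants)
```

Dividing the second value by the first gives the share of variants needed, which is the figure quoted above (around 2% for Core and 3% for Firefox at 80% coverage).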

5. PERFORMANCE ANALYSIS

5.1 Reopen Analysis

Reopened bugs¹⁰ increase maintenance costs, degrade the overall user-perceived quality of the software and lead to unnecessary rework by busy practitioners [8]. If a fair number of fixed bugs are reopened, it could indicate instability in the software system [11]. Bug reopens are of significant importance in both open source and commercial software systems. Understanding bug reopening is of significant interest to the practitioner community in order to characterize the actual quality of the bug fixing process [11]. Therefore, the analysis of reopened bugs will help the process owner take preventive actions to minimize the reopening of bugs.

¹⁰ According to Bugzilla@Mozilla, the bug was once resolved, [...]

| S.No. | Firefox %age | Firefox Trace | Core %age | Core Trace |
|---|---|---|---|---|
| 1 | 14.24 | Rep→Unconfirmed→Duplicate | 18.71 | Rep→New→Dev-R→Fixed |
| 2 | 12.06 | Rep→Unconfirmed→Invalid | 14.34 | Rep→New→Fixed |
| 3 | 9.55 | Rep→New→Dev-R→Assigned→Fixed | 12.36 | Rep→New→Dev-R→Assigned→Fixed |
| 4 | 6.27 | Rep→New→Duplicate | 8.48 | Rep→New→Fixed |
| 5 | 5.21 | Rep→Unconfirmed→Worksforme | 6.62 | Rep→New→Duplicate |
| 6 | 5.13 | Rep→New→Fixed | 4.99 | Rep→New→Worksforme |
| 7 | 4.15 | Rep→New→Dev-R→Fixed | 2.39 | Rep→New→Invalid |
| 8 | 3.95 | Rep→New→Worksforme | 1.44 | Rep→New→Wontfix |
| 9 | 3.86 | Rep→Unconfirmed→Incomplete | 0.98 | Rep→Unconfirmed→Invalid |
| 10 | 2.88 | Rep→Assigned→Fixed | 0.94 | Rep→New→Dev-R→Dev-R→Fixed |

Table 4: Top 10 most frequent event traces with case frequency percentage.

Figure 6: Reopen Analysis. (a) Percentage of issues reopened for different states (x-axis: percentage of transition to Reopen; y-axis: source state; series: Core and Firefox). (b) Contribution of source activities in total Reopened for Core. (c) Contribution of source activities in total Reopened for Firefox.

Values recovered from the chart labels of Figure 6b and 6c, matched to the legend in the order in which they appear:

| Source activity | Core (6b) | Firefox (6c) |
|---|---|---|
| Fixed | 61.15% | 52.42% |
| Dev-reassign | 0.76% | 0.61% |
| Wontfix | 5.55% | 11.82% |
| QA-reassign | 0.57% | 0.61% |
| Invalid | 6.41% | 7.27% |
| Incomplete | 1.05% | 1.51% |
| Worksforme | 10.90% | 9.70% |
| Verified | 1.34% | 1.21% |
| Duplicate | 12.06% | 14.54% |
| Comp-reassign | 0.19% | 0.30% |

For both products, of all the labels, Wontfix is the most likely to be reopened, as shown in Figure 6a. Also, very few cases are reopened after QA reassignment, component reassignment and developer reassignment. Shihab et al. [8] showed that the last status of a closed bug is an important influential factor for reopening. The reasons [11] [8] for reopening a bug closed with a given label could be:

• Wontfix/Invalid/Incomplete/Worksforme: Clear steps to reproduce and additional information become available to better understand the reported bug and determine the root cause. Also, the priority of the bug may have been underestimated and the bug closed as a result.

• Duplicate: A bug is accidentally marked duplicate because of similar symptoms or a title matching an existing bug, without proper understanding of the root cause.

• Fixed: Incompletely or incorrectly fixed bugs due to poor understanding of the root cause, a regression bug (the bug reappears in a new version) or boundary cases missed by the developer during testing.

• Verified: Incorrectness is realized later, or extra information becomes available which triggers reopening of the bug.

Figure 6a shows the percentage of bugs getting reopened for all the possible closed states. The interpretations from the figure are as follows:

• A comparatively high percentage of bugs closed with the labels Wontfix, Invalid, Incomplete and Worksforme are reopened for Core. To improve quality and understanding, more effort should be made to ensure that sufficient information is obtained from the reporter before closing. This can be done in two ways: 1. being interactive and clarifying things with the reporter after a bug has been reported, and 2. improving the initial template to capture sufficient details beforehand, which reduces delay. Disagreement on the priority of bugs should be minimized by defining clear guidelines for deciding priority.

• The chances of a Duplicate bug being reopened in Core (around 5%) are more than double those in Firefox (around 2%). A bug should be thoroughly understood before it is marked Duplicate. The decision should not be based on superficial attributes.

• After Wontfix, bugs resolved as Fixed are the most prone to reopening in Firefox (6%), whereas in Core it is bugs resolved as Worksforme. A proper understanding of the root cause is necessary for fixing a bug. Assumptions about the reported bug should be avoided. If anything is unclear, the owner should ask the reporter for clarification and the necessary information. A minimal required level of testing should be performed on fixed bugs to avoid failures at later stages.

• There is very little chance of reopening for Verified bugs (1% or less). So, the QA manager should be encouraged to verify the final resolution carefully.

Of all the reopened bugs, the major contribution is from Fixed, Duplicate, Wontfix, Worksforme and Invalid. The rest constitute a very small percentage, as evident from Figure 6c for Firefox and Figure 6b for Core. Since the total number of Fixed bugs is high, even if a small percentage of Fixed bugs get reopened, they contribute more towards the total number of reopened bugs. Hence, the reasons for reopening of Fixed bugs should be dealt with at high priority.
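The two views in Figure 6 can be derived from the same transition counts. In the sketch below, the reopen rate of a source state is taken to be the number of transitions from that state to Reopened divided by the number of occurrences of that state, and the contribution is the share of all Reopened events preceded by that state. The paper does not spell out these denominators, so they are assumptions, and the function name is hypothetical.

```python
from collections import Counter


def reopen_stats(traces):
    """Per source state: (reopen rate %, contribution to all reopens %)."""
    occurrences = Counter()  # how often each activity occurs
    to_reopen = Counter()    # how often it is directly followed by Reopened
    for trace in traces:
        occurrences.update(trace)
        for src, dst in zip(trace, trace[1:]):
            if dst == "Reopened":
                to_reopen[src] += 1
    total_reopens = sum(to_reopen.values())
    return {
        src: (100 * n / occurrences[src], 100 * n / total_reopens)
        for src, n in to_reopen.items()
    }
```

Under this reading, the reported Firefox numbers are mutually consistent: Fixed contributes 52.42% of 330 Reopened events, about 173 transitions, and 173 out of 2788 Fixed events is roughly 6%, matching the rate quoted for Fixed in Firefox.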

5.2 Self-loop and Back-forth Analysis

It is important to study recurrent loops as they are indicators of deeper problems. It is difficult to detect such ping-pong patterns, where a bug is passed around repeatedly without any progress [2]. A self-loop is an edge that starts and ends at the same node (activity). It shows consecutive repetition of the same activity. The presence of loops is undesirable as it indicates imperfection. The diagonal elements of the confusion matrix shown in Table 6 denote the absolute number of loops, where the first entry corresponds to Core and the second to Firefox. The process maps obtained for Core and Firefox have loops mainly for Comp-reassign, Dev-reassign and QA-reassign, as observed in Table 6. For Verified, there are 9 and 3 cases for Core and Firefox respectively, which is very few and can be ignored. Most of the cases have one loop, that is, the decision made in the first attempt could be imperfect and need correction, so the same activity is repeated twice. The largest numbers of loops are for developer reassignment (1296/222) and component reassignment (471/257). It is not easy to decide the component to which a bug pertains and the person to whom the bug should be assigned. The number of loops is more than one in some cases, which means that a lot of time goes into deciding on the right assignment and identifying the component. Loops cause undesired delay, which is as high as 50.25 days and 30.35 days for Firefox and Core respectively in the case of developer reassignment, as shown in Table 5.

          Firefox                   Core
          Dev     Comp    QA        Dev     Comp    QA
Mean      50.2    12.7    13.7      32.3    12.2    13.9
Median    9.8     0.2     0.3       5.4     0.2     0.8
Min       8 s     9 s     4 s       7 s     12 s    4 s
1Q        13.0h   18.7m   1.0m      12.2h   21.6m   10.4m
3Q        65.9    2.6     19.4      28.7    2.4     6.9
Max       437.8   334.9   133.1     546.7   450.5   161.1
SD        84.8    41.5    28.3      67.2    43.4    34.5

Table 5: Duration of loops in the given states (default unit: days; h: hours, m: minutes, s: seconds), where Min: minimum, 1Q: 1st quartile, 3Q: 3rd quartile, Max: maximum, SD: standard deviation.

Back-forth: A transition from a state A to another state B and then back to state A is defined as one back-forth loop. We observe the back-forth phenomenon between pairs of activities (also referred to as a ping-pong pattern in the literature) in the runtime process model. Table 6 shows the results of our empirical analysis for identifying back-forth patterns. Each cell of the Table 6 matrix (except the diagonal values) shows the frequency of back-forth loops between an activity pair (a 14×14 matrix in which the row activity is state A and the column activity is state B of the back-forth loop). We notice that Unconfirmed, Component-reassignment, Developer-reassignment, QA-reassignment and Reopen are the activities most frequently involved in back-forth patterns. Our experimental results show that a Fixed bug being reopened and then resolved again as Fixed is frequent (refer to Table 6: 466 times for Core and 114 times for Firefox), indicating a regression bug or an imperfect fix. A back-forth loop between the Invalid, Worksforme, Duplicate or Wontfix state and Reopened shows disagreement between developers. One explanation is that team members were not convinced by the initial decision and reopened the bug, resulting in a loss of productivity and an increase in the mean time to repair. Back-forth between Unconfirmed and any resolution is permitted according to the pre-defined Bugzilla lifecycle, and we also observe it in the as-is process.
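The two definitions above (self-loop as A → A, back-forth as A → B → A) can be sketched as simple scans over a trace. This is an illustrative sketch only: the function names are hypothetical, a trace is assumed to be an ordered list of activity names for one bug, and overlapping patterns are counted separately (A → B → A → B → A yields two (A, B) loops and one (B, A) loop), which may differ from the counting used for Table 6.

```python
from collections import Counter

def count_self_loops(trace):
    """Count consecutive repetitions of the same activity (A -> A), keyed by activity."""
    return Counter(a for a, b in zip(trace, trace[1:]) if a == b)

def count_back_forth(trace):
    """Count A -> B -> A patterns, keyed by the pair (A, B) with A != B."""
    return Counter(
        (a, b)
        for a, b, c in zip(trace, trace[1:], trace[2:])
        if a == c and a != b
    )
```

Summing these counters over all traces of a project gives the diagonal (self-loop) and off-diagonal (back-forth) entries of a matrix in the style of Table 6.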

5.3 Bottleneck Analysis

A bottleneck is defined as a specific part of the runtime process map which is relatively time consuming and reduces the overall performance of the end-to-end process. Bottleneck identification (the most time consuming states, activities and transitions) from an as-is process model provides actionable information for the process owner (from the perspective of root-cause analysis and process improvement). We compute the mean time for every transition defined in the bug lifecycle and make the following observations:

1. Our analysis shows that for the Firefox project, confirmation to the New state takes 33.4 days (mean value). In contrast, direct assignment (directly to the Assigned state) takes 19.4 days on average. Confirmation seems to be a bottleneck (and should be expedited, as it is resulting in delays). For the Core project, the mean time taken for confirmation to New is 9.3 days, which is much less than for Firefox.

2. The time taken for the transition New → Assigned is 17.1 days and 17.4 days for Core and Firefox respectively. This means that, on average, there is a delay of around 17 days in assigning a bug from the New state.

3. An Assigned bug is labelled as New in case of a developer reassignment event (restart). Experimental results show that the time elapsed between the two states is 57.4 days for Firefox and 6.4 days for Core. This should be avoided by assigning the correct developer in the first attempt, or by reassigning quickly when a bug is wrongly assigned, because one wrong assignment delays resolution for Firefox by 57.4 days on average.

4. Resolution of a bug as Duplicate from the possible source states (Unconfirmed, Assigned (not for Core), Reopened and New) takes comparatively little time for both projects, indicating efficiency in duplicate bug report identification. For Firefox, it is also effective, as the chances of reopening are fairly low (Figure 6a).

5. Experimental results show that identification of cases where the resolution is Worksforme, Wontfix or Incomplete is more time consuming, irrespective of the source state from which the resolution is made. We believe the actionable information is that their identification needs to become more efficient. For instance, resolution from New to Worksforme and Wontfix takes the exceptionally long duration of 19.3 weeks and 31.1 weeks respectively for Firefox, and 20.5 weeks and 81.7 days for Core. For Core, Assigned to Wontfix takes even longer, that is, 14.8 weeks.

6. For both projects, resolving a bug as Fixed takes less time from Assigned, as the developer has already taken possession of it and worked to fix it. When a bug is Fixed directly from Unconfirmed or New, it takes more time.

7. The chances of a Verified bug getting reopened are very low, and it takes 8.3 and 18.9 days on average to reopen a verified bug for Firefox and Core respectively. However, for Firefox, Wontfix, Fixed and Worksforme bugs are more likely to be reopened, and there is a delay of 14.9, 13.9 and 45.6 days in reopening bugs from the respective states. For Core, there is comparatively more delay in reopening bugs resolved as Worksforme, Wontfix and Incomplete (24.1, 18.3 and 38.1 days respectively), and these have a good chance of getting reopened. Therefore, the reasons for their reopening should be dealt with at higher priority.
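The mean transition times behind the observations above can be computed from a timestamped event log along the following lines. This is a sketch under stated assumptions rather than the tooling used in the study: the log is assumed to be an iterable of `(case_id, state, timestamp)` tuples with `datetime` timestamps, and the function name `mean_transition_days` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def mean_transition_days(event_log):
    """Return {(source, target): mean elapsed days} over all cases in the log."""
    by_case = defaultdict(list)
    for case_id, state, timestamp in event_log:
        by_case[case_id].append((timestamp, state))

    durations = defaultdict(list)
    for events in by_case.values():
        events.sort(key=lambda event: event[0])  # chronological order per case
        for (t0, s0), (t1, s1) in zip(events, events[1:]):
            durations[(s0, s1)].append((t1 - t0).total_seconds() / 86400)

    return {pair: sum(days) / len(days) for pair, days in durations.items()}
```

Sorting the resulting dictionary by value in descending order gives a ranked list of candidate bottleneck transitions. Since Table 5 shows medians far below the means, it may be worth reporting a median alongside each mean before acting on a single slow transition.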


Activity UN New CompR DevR QAR Asg Invalid ReO Wfm InC WF Dup Fixed Ver
UN 3/5 22/53 3/22 5/8 1/16 13/35 /2
New 13/5 2/1
CompR /2 75/17 471/257 66/4 245/90 /1 6/7 3/3 1 2/1 7/4 6/2
DevR 3/ 66/ 92/ 1296/ 21/ 172/ 2 2/ 9/ 1 21 17 222 3 99 2 1
QAR 1 192/65 13/1 88/29 8/2 1 5 3
Asg 16/7
Invalid 13/22 10/3 1 /1
ReO 5/2 2/2 5/7 4/1 52/16
Wfm 2/9 17/8 /1 2
InC 4/4 1 /1 1
WF 4/15 9/12 /1
Dup 8/25 1/1 13/4 /1 /1 1/1
Fixed 7/ 466/ 6 3 114
Ver 9/3

Table 6: Loop and Back-forth confusion matrix where 1st entry corresponds to Core and 2nd to Firefox (blank if no occurrence).

6. CONFORMANCE TESTING

Conformance checking aims to detect inconsistencies between the design time process model and the model obtained for the as-is process from the runtime event log. We define metrics to measure fitness, that is, how well the observed process complies with the control flow defined in the design time process model, and to locate the points of inconsistency. We believe that this can be useful for identifying and removing obsolete parts, which demand extra maintenance effort [6].

We propose Algorithm 1 to evaluate the fitness metric. We find the number of cases with valid traces (only defined transitions) and take its ratio to the total number of cases to measure the extent of fitness. The inputs are the event log and an adjacency matrix A (with rows as source states and columns as destination states, that is, 15×15 in our case), which has 1 in a cell if the transition is permitted and 0 otherwise. We obtain the event trace (an array of states from the event log in sequential order of their occurrence) for each case ID, as shown in step 4 of Algorithm 1. As an optimization, we identify unique traces and count the frequency of each. This is useful because most of the cases share the same trace (refer to Table 4), so we need not validate each case individually. Each unique trace is verified against the adjacency matrix A for conformance. If it has only permitted transitions, then the valid bit V_i is assigned the value 1, else 0. If the evaluated value of the fitness metric (refer to line 21 of Algorithm 1) is less than 1, then there is a deviation from the defined model. To detect the cause of inconsistency, inconsistentDetector() (Algorithm 2) is invoked, which takes the event log and the adjacency matrix A as input and returns inconsistency metrics. We create the footprint matrix of the runtime event log (refer to step 10 of Algorithm 2) and compare it with the design model (adjacency matrix) to identify inconsistent transitions (non-zero elements in ITF). All states (15 in our case) are stored in an array, state. The frequency of transition between each pair of states is counted from the event log and stored in the Transition Frequency matrix, TF. The total number of inconsistent transitions is evaluated by adding the elements of ITF (line 14 of Algorithm 2). We evaluate the highest frequency (maximum element of ITF) and the most frequent inconsistent transition in steps 15 and 16 of Algorithm 2.
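The fitness computation described above can be sketched in a few lines. This is an illustrative reading of the described procedure, not the authors' code: the adjacency matrix A is represented as a set of permitted `(source, target)` pairs (the cells of A that hold 1), each trace is assumed to be an ordered sequence of state names, the input is assumed to be non-empty, and the function name `fitness_metric` is hypothetical.

```python
from collections import Counter

def fitness_metric(traces, allowed):
    """Fraction of cases whose trace uses only permitted transitions.

    traces  -- one state sequence per case
    allowed -- set of permitted (source, target) pairs, i.e. the 1-cells of A
    """
    # F_i: frequency of each unique trace, so each variant is validated once.
    frequency = Counter(tuple(trace) for trace in traces)
    valid_cases = sum(
        count
        for trace, count in frequency.items()
        # V_i = 1 only if every consecutive pair of states is permitted.
        if all(step in allowed for step in zip(trace, trace[1:]))
    )
    return valid_cases / sum(frequency.values())
```

Grouping identical traces first mirrors the optimization in the text: with 24253 Core cases but only 1164 unique traces, the validity check runs about twenty times less often than it would per case.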

To perform experiments, we consider only closed cases, up to the last resolution state. Since we consider more activities than are defined in the Bugzilla lifecycle, we define a design model (adjacency matrix) with extra states based on practitioner's inputs and our understanding. The value of the fitness metric, FM, computed using Algorithm 1 is 0.86 and 0.91 for Core and Firefox respectively. This shows that Firefox has higher conformance than Core, although both projects comply well with the defined design model. 818 out of 1164 unique traces (about 70%) are valid for Core. Similarly, 403 out of 622 unique traces (about 65%) are valid for Firefox. Because the share of valid cases (86% and 91%) is much higher than the share of valid unique traces, invalid traces usually have low frequency. As FM < 1 for both projects, we analyze the cause of non-conformance, that is, we detect the undefined transitions occurring in the runtime process. Algorithm 2 is invoked, resulting in 2412 and 739 total inconsistent transitions for Core and Firefox respectively. Surprisingly, the most frequent inconsistent transition is Reported → Assigned for both projects, with frequencies of 1643 and 305 respectively.
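The inconsistency metrics reported above (total, highest frequency and most frequent undefined transition) can be derived as sketched below. The sketch makes the same assumptions as the description of A: permitted transitions are given as a set of `(source, target)` pairs, and a sparse counter stands in for the dense matrices TF and ITF. The function name `inconsistent_transitions` is hypothetical.

```python
from collections import Counter

def inconsistent_transitions(traces, allowed):
    """Return (itf, total, most_frequent) for transitions not permitted by the design model."""
    # TF: frequency of every observed transition across all cases.
    tf = Counter(step for trace in traces for step in zip(trace, trace[1:]))
    # ITF = TF - TF∘A: keep only the transitions whose cell in A is 0.
    itf = Counter({step: n for step, n in tf.items() if step not in allowed})
    total = sum(itf.values())
    most_frequent = itf.most_common(1)[0] if itf else None
    return itf, total, most_frequent
```

Here `most_frequent` is a `(transition, frequency)` pair, so for the reported data it would correspond to Reported → Assigned with its count; `None` signals that the log conforms fully.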

7. CONCLUSIONS

The runtime process map (the reality, or actual process model) derived by process mining the event logs of 12234 Firefox and 24253 Core issue reports reveals state transitions and event traces which are common, as well as traces which are inconsistent with the design time process model. Empirical analysis reveals the distribution of 15 different activities, and we infer that a statistically significant percentage of bug reports undergo component, developer and QA manager reassignment. Data analysis shows certain infrequent transitions (such as Verified to Reopen and Assigned to New) which can be used as a guide to correct issues with the runtime process. We infer that the majority of bug lifecycles have 3 to 4 events, while several bug reports have more than 7 events in their lifecycle. The number of unique traces from start to finish is large (1164 and 622 variants for Core and Firefox respectively), and the top 2 unique traces constitute more than 25% and 33% of the Firefox and Core bug reports respectively. Empirical analysis reveals the extent of transition from various states (such as Worksforme, Wontfix, Verified, Invalid, Incomplete, Fixed and Duplicate) to Reopened and shows that several Wontfix and Worksforme bug reports are getting reopened. We observe self-loop and back-and-forth transitions (undesirable and inefficient) and notice a back-forth pattern for the states Unconfirmed, Comp-reassign, Dev-reassign, QA-reassign and Reopened. We identify bottlenecks to help the process owner make improvements. We propose a conformance checking algorithm (between the design time and the runtime model), and the value of the fitness metric discovered is 0.86 and 0.91 for Core and Firefox respectively.

Algorithm 1 evaluateFitnessMetric()
Require: Event log, Adjacency matrix A
Ensure: Fitness measure
 1: while not at the end of Event log do
 2:   ∀ entries with Case ID i
 3:   if ts_C > ts_B > ts_A > ts_Reported then
 4:     Trace, T_i = {Reported, A, B, C};
 5:   end if
 6: end while
 7: Count frequency of each unique trace UT_i as F_i;
 8: while ∃ UT_i do
 9:   m := size of trace;
10:   p := 1; q := 2; valid bit for UT_i, V_i = 1;
11:   while p < m do
12:     if A[UT_i[p]][UT_i[q]] == 1 then
13:       p++;
14:       q++;
15:     else
16:       V_i = 0;
17:       break;
18:     end if
19:   end while
20: end while
21: Calculate Fitness metric: $FM = \frac{\sum_{i=1}^{N} (F_i \cdot V_i)}{\sum_{i=1}^{N} F_i}$, where N = number of unique traces.
22: if FM < 1 then
23:   inconsistentDetector(Event log, Adjacency matrix A);
24: else
25:   No inconsistency;
26: end if

8. ACKNOWLEDGEMENT

The work presented in this paper is supported by the Prime Minister's Fellowship awarded to the first author. The author would like to acknowledge SERB, CII and Infosys Limited for their support. The authors would like to thank Dr. Srinivas Padmanabhuni for his valuable inputs.

Algorithm 2 inconsistentDetector()
Require: Event log, Adjacency matrix A
Ensure: Inconsistent transition metrics
 1: Array of states, state = {state_1, state_2, ..., state_f};
 2: for i = state[1] : state[f] do
 3:   for j = state[1] : state[f] do
 4:     count = 0;
 5:     while not end of Event log do
 6:       if transition, i → j then
 7:         count++;
 8:       end if
 9:     end while
10:     Transition Frequency, TF[i, j] = count;
11:   end for
12: end for
13: Inconsistent Transition Frequency matrix, ITF := (TF − TF ∘ A)
14: Total Inconsistent Transition := Σ (TF − TF ∘ A)
15: Highest frequency of inconsistent transition := max(TF − TF ∘ A)
16: Most frequent inconsistent transition := argmax(TF − TF ∘ A)

9. REFERENCES

[1] B. Akman and O. Demirors. Applicability of process discovery algorithms for software organizations. In Euromicro Conference on Software Engineering and Advanced Applications (SEAA '09), pages 195–202, 2009.

[2] Christine A Halverson, Jason B Ellis, Catalina Danis, and Wendy A Kellogg. Designing task visualizations to support the coordination of work in software development. In Proceedings of Computer supported cooperative work, pages 39–48. ACM, 2006.

[3] Ekkart Kindler, Vladimir Rubin, and Wilhelm Schäfer. Activity mining for discovering software process models. In Software Engineering, volume 79 of LNI, pages 175–180. GI, 2006.

[4] Patrick Knab, Martin Pinzger, and Harald C. Gall. Visual patterns in issue tracking data. In Proceedings of New modeling concepts for today's software processes: software process, ICSP'10, pages 222–233, Berlin, Heidelberg, 2010. Springer-Verlag.

[5] W. Poncin, A. Serebrenik, and M. van den Brand. Process mining software repositories. In European Conference on Software Maintenance and Reengineering (CSMR), pages 5–14, 2011.

[6] Anne Rozinat and Wil MP van der Aalst. Conformance checking of processes based on monitoring real behavior. Information Systems, 33(1):64–95, 2008.

[7] Vladimir Rubin, Christian W. Günther, Wil M. P. Van Der Aalst, Ekkart Kindler, Boudewijn F. Van Dongen, and Wilhelm Schäfer. Process mining framework for software processes. In Proceedings of the international conference on Software process, ICSP'07, pages 169–181. Springer-Verlag, 2007.

[8] Emad Shihab, Akinori Ihara, Yasutaka Kamei, Walid M Ibrahim, Masao Ohira, Bram Adams, Ahmed E Hassan, and Ken-ichi Matsumoto. Studying re-opened bugs in open source software. Empirical Software Engineering, pages 1–38, 2012.

[9] Wikan Sunindyo, Thomas Moser, Dietmar Winkler, and Deepak Dhungana. Improving open source software process quality based on defect data mining. In Software Quality. Process Automation in Software Development, volume 94 of Lecture Notes in Business Information Processing, pages 84–102. Springer Berlin, 2012.

[10] AJMM Weijters, Wil MP van der Aalst, and AK Alves De Medeiros. Process mining with the heuristics miner-algorithm. Technische Universiteit Eindhoven, Technical Report WP, 166, 2006.

[11] Thomas Zimmermann, Nachiappan Nagappan, Philip J Guo, and Brendan Murphy. Characterizing and predicting which bugs get reopened. In International Conference on Software Engineering (ICSE), pages 1074–1083. IEEE,
