Development Productivity for Commercial Software Using Object-Oriented Methods

Tom Potok

Mladen Vouk

1. Abstract

Object-oriented software development is widely believed to improve programmer productivity; however, surprisingly little evidence of this has been published. This paper presents an initial analysis of the productivity measured for 19 software products developed at IBM Programming Laboratories. Roughly half of the products (eleven) were developed using object-oriented methods, while the others (eight) were developed using traditional "procedural" methods. Our study indicates that first generation object-oriented projects achieve the same productivity rates as follow-on releases of procedural projects. There does not appear to be a productivity penalty for moving to object-oriented technology, but there is apparently no immediate productivity benefit either. Furthermore, for this data, productivity rates increase as project size increases. This runs against the "conventional wisdom" about large projects, and is more surprising given that smaller projects had proportionately larger development teams than larger projects. An initial study of this phenomenon indicates that scheduling effects, such as the Parkinson effect and the Deadline effect, may be responsible for this economy of scale.

2. Introduction

Many believe that the object-oriented approach is the next step in the evolution of software development. One of the reasons is that object-oriented development has great potential for increasing programmer productivity.

However, there is little quantitative evidence that real-life productivity of object-oriented software development is indeed consistently better than that of "classical" or "procedural" software development.

For example, Lewis et al. performed an experiment with undergraduate software engineering students where they studied four cases: use of procedural development without reuse, use of object-oriented development without reuse, use of procedural development with reuse, and use of object-oriented development with reuse [12]. From this experiment, Lewis et al. concluded that, through reuse, the object-oriented paradigm substantially improves productivity. Similarly, Melo et al. conducted an experiment with graduate students that tested the hypothesis that a high level of reuse results in a higher productivity rate [13]. They reported a linear relationship between productivity and reuse rates. This is based on seven projects ranging in size from 5000 to 25000 lines of code, all with varying rates of reuse. In another study, Boehm-Davis et al. report on a comparison of Jackson program design, object-oriented design, and functional decomposition [3]. Three groups of professional programmers, who were familiar with the methodologies, volunteered to develop designs for three experimental problems. Various metrics were recorded, and it was found that different methodologies produce different solutions to the same problems, and that a methodology may reduce both the design time and the complexity of a design.

There are many other studies concerned with the value of the object-oriented approach, but very few focus on productivity for software developed in commercial environments by professional programmers who use object-oriented methods. In fact, it would appear that most organizations that develop software may not measure software reuse and the associated productivity [8], and may have only superficial and anecdotal information about the effects object-oriented technology has in their environment.

In this paper we present initial results of a comparative evaluation of object-oriented and procedural development productivity in a commercial environment. In Section 3 we give some background information. In Section 4 we present the empirical productivity data and discuss it in the context of a milestone-driven software process. The summary is in Section 5.

3. Background

A software product is a commercially available software system that includes the packaged software, documentation, and support. It is assumed that during development the software product is properly analyzed, designed, implemented and tested, and its quality certified prior to release. We call the first version of a software product the first release (or initial release); subsequent versions are called follow-on releases. A software product schedule is the schedule that directs its development from the initial planning stages through the final product shipment. The significant schedule dates are called milestones. We call a software process where progress is primarily driven by pre-set milestones (schedule-based, or based on some other criteria) a milestone-driven software process [7]. A total project team is a group of people who work closely together to create and deliver a successful software product. This team is made up of programmers, testers, managers, writers, planners, designers, human factors experts, and whatever other skills are needed to create and deliver a successful product.

3.1 Object-Orientation

There are three basic phases in developing software using object-oriented methods. The first is the analysis phase, producing an analysis model; the next is the design phase, producing a design model; and the final phase is implementation, producing executable software. Object-oriented analysis (OOA) is the process of defining a model that solves a given problem or meets a set of requirements. This model is expressed in terms of a real world problem space. It consists of an object model, and may include a process model or a state model. OOA may include requirements analysis, identification of real world objects, and the definition of the interaction among objects. An OOA model may be expressed informally in diagrams or text, or more formally in entity-relationship (ER) diagrams, data-flow diagrams, state-transition diagrams, Petri nets, or a variety of specialty representations currently available [10, 11, 14, 16].

The object-oriented design (OOD) phase is a transformation from the analysis model into design descriptions. The design phase may include decomposing the analysis level objects, evaluating the use of existing class libraries, and iteratively creating a hierarchy of the decomposed objects. The decomposition of the analysis model is ideally a step-wise refinement, yielding a strong mapping from the analysis model to the final system [5, 11, 14, 16, 17].

Object-oriented implementation is a refinement of the design model into executable software. The languages most often used are Smalltalk and C++.

Object-orientation may increase productivity because OOA and OOD, if done correctly, result in products that are cleaner, more understandable, and easier to implement and test, and because re-use, particularly in follow-on releases, may reduce the maintenance and enhancement effort.

3.2 Productivity

We define average productivity of a total software programming team by the following relationships [1, 2]:

$$\text{Team Productivity} = \frac{\text{ProjectSize}}{\text{ProjectDuration}} \tag{1}$$

Let a team consist of N persons, let the project size be in thousands-of-lines-of-code (KLOC), and let the project duration be in months. The average personal productivity is the team productivity divided by N. Equation (1) can be rearranged to calculate the effort (in person-months) expended on a project as follows:

$$\text{Effort} = \frac{\text{ProjectSize}}{\text{PersonalProductivity}} \tag{2}$$
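As a worked illustration of equations (1) and (2), the following minimal Python sketch computes team productivity, personal productivity, and effort for a single project. The function names are hypothetical, and the sample values are those listed for Project 9 in Table 1 (51.9 KLOC, 11 months, average team of 36). It illustrates the definitions only and is not a description of the authors' tooling.

```python
def team_productivity(size_kloc, duration_months):
    """Equation (1): KLOC delivered per calendar month by the whole team."""
    return size_kloc / duration_months


def personal_productivity(size_kloc, duration_months, team_size):
    """Team productivity divided by N: KLOC per person-month."""
    return team_productivity(size_kloc, duration_months) / team_size


def effort_person_months(size_kloc, duration_months, team_size):
    """Equation (2): project size divided by personal productivity."""
    return size_kloc / personal_productivity(size_kloc, duration_months, team_size)


# Illustrative values taken from the Project 9 row of Table 1.
size, duration, team = 51.9, 11.0, 36.0
effort = effort_person_months(size, duration, team)
```

Under these definitions the effort reduces to the team size multiplied by the duration, so the example yields about 396 person-months.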

The above equation implies a linear relationship between the effort and the size of a project. However, a number of studies have shown that programmer productivity is not constant over the project and is a function of many factors including product size (see [2] and references therein). The following equation describes a general relationship between project effort, and size:

$$\text{Person Months} = \alpha\,(\text{KLOC})^{\beta} \tag{3}$$

The common values for α and β found in the literature range from 0.7 to 28 person-months for α, and 0.9 to 1.8 for β. Of course, only certain combinations of α and β values occur, and these depend on the type of software, programmer experience, technology, etc. The COCOMO model is one of the best known instantiations of this relationship [2]. The most common experience is that the larger the product, the lower the productivity on that project. The sub-linear form of the model (β < 1), which implies economy of scale, i.e., productivity increases as larger products are developed, is less common, but it has been reported ([6]; also see [2] and references therein).
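To see why the exponent β decides between economy and dis-economy of scale, one can divide project size by the effort of equation (3). The short derivation below is a sketch that uses only the symbols already defined, except for P, which is introduced here for convenience to denote personal productivity in KLOC per person-month.

```latex
\begin{align*}
P &= \frac{\text{KLOC}}{\text{Person Months}}
   = \frac{\text{KLOC}}{\alpha\,(\text{KLOC})^{\beta}}
   = \frac{1}{\alpha}\,(\text{KLOC})^{1-\beta}, \\
\frac{dP}{d(\text{KLOC})} &= \frac{1-\beta}{\alpha}\,(\text{KLOC})^{-\beta}.
\end{align*}
```

For α > 0 the derivative is positive when β < 1 (productivity grows with size), zero when β = 1 (the linear case of equation (2)), and negative when β > 1 (the more commonly reported dis-economy of scale).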

The parameters α and β can be determined in a number of ways, but frequently this is done by regression on the linear version of this model:

$$\ln(\text{Person Months}) = \ln(\alpha) + \beta \ln(\text{KLOC}) \tag{4}$$
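A regression on equation (4) can be sketched with ordinary least squares on the log-transformed values. The Python example below is a minimal illustration, not the authors' procedure: the function name is hypothetical, and the data are synthetic and noise-free, generated from assumed values α = 2.0 and β = 0.8 only to check that the fit recovers them.

```python
import math


def fit_effort_model(kloc, person_months):
    """Least-squares fit of ln(PM) = ln(alpha) + beta * ln(KLOC)."""
    xs = [math.log(k) for k in kloc]
    ys = [math.log(pm) for pm in person_months]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                           # slope on the log-log scale
    alpha = math.exp(mean_y - beta * mean_x)   # back-transform the intercept
    return alpha, beta


# Synthetic, noise-free data generated from alpha = 2.0 and beta = 0.8
# (hypothetical values chosen only to check that the fit recovers them).
sizes = [1.0, 5.0, 20.0, 100.0, 500.0]
efforts = [2.0 * s ** 0.8 for s in sizes]
alpha_hat, beta_hat = fit_effort_model(sizes, efforts)
```

A fitted β below 1 corresponds to the sub-linear, economy-of-scale form of the model discussed above.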

Figure 1. Project effort vs. project size. [Plot: ln(Person-Years) vs. ln(KLOC); series: Procedural, Object Oriented, Object Oriented (port).]

4. Productivity Study

4.1 Projects

We examined 19 commercially available software products. Eleven were developed using object-oriented methods, and eight using traditional procedural methods. Four of the object-oriented products were ports of software from one platform to another. All products were developed by IBM. Five were developed for mainframe use, and fourteen for workstation use. The products range in size from about 1 thousand lines of code (1 KLOC) to about 1 million lines of code. Some of the collected information is shown in Table 1. The presented information includes the size (KLOC) of the code developed in-house, the project development duration (calendar months), and the staffing level (average number of persons). The staffing level represents the number of people on the total team, not just the programmers. For a small project it is not unusual for half the total team to be non-programmers. It must be stressed that the project size reflects only the code developed in-house. For small projects this code was often coupled with code developed elsewhere, while the team size also reflects the effort to procure, evaluate and integrate the two bodies of code.

4.2 Analysis

In Figure 1 we plot the logarithm of the product effort versus the logarithm of the product size; see equation (4). Procedural projects are marked with a "+" symbol, object-oriented projects by "•", and object-oriented ports by "x".

We see the familiar growth in effort with the project size [2], but, excluding the ported software, there are no obvious differences between the procedural and object-oriented products. However, there is a significant difference between the ported and non-ported products. Porting software can be viewed as software development where all of the design and significant parts of the code are reused. Although porting should not be confused with re-use, it offers a hint of the effort reduction that can be obtained through re-use. Ports are generally less costly than development of new software [15], so the observed difference between the ported and other categories is not unusual in itself. It is important to remember, however, that the object-oriented approach inherently provides a mechanism for re-use not only when software is ported, but also for enhancements and in development of future releases. Hence the productivity gains observed during the port of object-oriented software may be a good reflection of the possible gains the technology offers.

Table 1. Project data.

| Product | Release¹ | KLOC | Duration (months) | Avg. team | Method |
|---|---|---|---|---|---|
| Project 1 | 2.0 | 1.8 | 7.0 | 12.1 | Proc |
| Project 2 | 2.0 | 1.1 | 8.0 | 10.1 | Proc |
| Project 3 | 2.0 | 59.3 | 21.0 | 62.0 | Proc |
| Project 4 | 2.0 | 25.7 | 15.0 | 18.2 | Proc |
| Project 5 | 2.0 | 208.5 | 15.0 | 48.9 | Proc |
| Project 6 | 2.0 | 54.0 | 15.0 | 39.5 | Proc |
| Project 7 | 2.0 | 52.0 | 9.0 | 51.6 | Proc |
| Project 8 | 1.0 | 1093.3 | 27.1 | 60.5 | OO |
| Project 9 | 2.0 | 51.9 | 11.0 | 36.0 | OO |
| Project 10 | 2.0 | 5.0 | 8.0 | 38.9 | OO |
| Project 11 | 2.0 | 2.0 | 11.0 | 12.5 | Proc |
| Project 12 | 2.0 | 2.0 | 8.0 | 15.9 | OO |
| Project 13 | 1.0 | 64.0 | 16.0 | 8.1 | OO-Port |
| Project 14 | 2.0 | 40.0 | 14.0 | 7.5 | OO-Port |
| Project 15 | 1.0 | 112.0 | 16.0 | 9.3 | OO-Port |
| Project 16 | 2.0 | 80.0 | 14.0 | 7.6 | OO-Port |
| Project 17 | 1.0 | 130 | 20.0 | 36 | OO |
| Project 18 | 1.0 | 38 | 19.0 | 39 | OO |
| Project 19 | 1.0 | 6 | 23.0 | 13 | OO |

¹ Release 1.0 indicates an initial release; 2.0 indicates a follow-on release, not strictly a second release.

To further examine the issue we used the following regression model on the non-ported data:

$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \varepsilon \tag{5}$$

where Y is ln(Person-Months), X₁ is ln(KLOC), and X₂ is a class variable that indicates the development method. This relationship provides a single regression model for the full data, and a means of testing whether two models are needed. We found that the model was sub-linear, and that, based on an F-test of the hypothesis that β₂ and β₃ are both equal to zero, one regression equation is sufficient to model both the procedural and the object-oriented products² that were in the non-port category. This indicates that, in this case, there is no statistically significant difference between the productivity of object-oriented software development and procedural software development. As expected, a similar evaluation of the data on ported software vs. other software shows a significant difference between the two.
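The test described above can be sketched in a few lines. The following Python example is an illustration under stated assumptions, not the authors' analysis. It relies on the fact that, for a two-valued class variable, the full model of equation (5) produces exactly the same fitted values as two separate straight lines, one per group, so the F statistic for the hypothesis that β₂ and β₃ are both zero can be computed from three simple regressions. The function names and the synthetic data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, rss)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, rss


def f_statistic_one_vs_two_models(x_a, y_a, x_b, y_b):
    """F statistic for H0: beta2 = beta3 = 0 (one line fits both groups).

    Restricted model: a single line through the pooled data (2 parameters).
    Full model: a separate line per group (4 parameters), equivalent to
    equation (5) with a 0/1 class variable.
    """
    _, _, rss_a = fit_line(x_a, y_a)
    _, _, rss_b = fit_line(x_b, y_b)
    _, _, rss_pooled = fit_line(x_a + x_b, y_a + y_b)
    rss_full = rss_a + rss_b
    n = len(x_a) + len(x_b)
    numerator = (rss_pooled - rss_full) / 2   # two restrictions tested
    denominator = rss_full / (n - 4)          # residual d.o.f. of full model
    return numerator / denominator


# Synthetic ln(KLOC) / ln(Person-Months) values, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.10, -0.10, 0.05, -0.05, 0.0]
y_group_a = [1.0 + 0.8 * xi + e for xi, e in zip(x, noise)]
y_same = list(y_group_a)                     # second group on the same line
y_shifted = [yi + 5.0 for yi in y_group_a]   # second group offset by +5

f_same = f_statistic_one_vs_two_models(x, y_group_a, x, y_same)
f_shifted = f_statistic_one_vs_two_models(x, y_group_a, x, y_shifted)
```

The statistic is compared against the F distribution with 2 and n − 4 degrees of freedom. A small value, as for the two identical groups here, indicates that one regression equation is sufficient; a large value, as for the shifted group, indicates that two models are needed.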

4.3 Discussion

The effort expended in object-oriented software development does not appear to be different from that of procedural development. This suggests that in this environment the introduction of object-oriented technology is not detrimental to productivity, as the introduction of new technology often is. This is good news.

However, it is somewhat disappointing that there is apparently no significant productivity gain from the object-oriented technology.

Of course, many factors are at work. The current analysis does not account for productivity drivers other than project size, so it is possible that productivity gains are confounded by other project characteristics, such as deadlines and generational issues. For example, in our case, all procedural projects were follow-on releases, while four of the seven regular object-oriented projects were first releases. Generally, first generation projects have a somewhat lower productivity than later generation projects. Unfortunately, analysis using a generational class variable did not add insight. Also, the sample size of the two product groups was small, and a larger sample is needed before firmer conclusions can be drawn.

However, one factor that we believe has a significant role is the milestone-driven character of the process used to develop the products. The models of the type given in equations (3) and (4) assume that the project is driven by product and programmer characteristics. Extra effects, such as the cost of the requirements planning, documentation, procurement and integration of software into the project, are usually not accounted for. However, practice shows that schedule deadlines and third-party software are probably the most important modifiers of software production. For example, a few years ago it was not unusual for a competitive product to take two to three years to complete. In recent years, marketing pressure has reduced this development cycle to 12 to 18 months, and the most recent trends are towards a 9-month cycle. If schedules drive productivity, then a project team's productivity may vary greatly depending on how aggressive or lax the project schedule is. Apart from product characteristics (complexity, language, etc.), the team ability, and the technology, there are two process related effects that strongly impact schedules: Parkinson's Law and the Deadline Effect [2, 9]. Parkinson's Law states that work will expand to fill the allocated time. For example, if the same project is given to three similar development teams with three easily achievable, but different, deadlines, the projects will usually not complete at the same time, but according to the deadlines set. The Deadline Effect occurs when programmers are compelled to work extra time to complete a task by a given deadline. If a deadline is set, and there is strong pressure to meet it, people will work additional hours solely to meet the deadline [4, 7].

² The assumption of constant variance for the two individual models was verified.

An examination of the project histories has shown that, as in many other organizations, one of the important drivers was indeed the need to comply with pre-determined product delivery schedules. This means that the development was driven by schedule milestones. In some instances schedules were aggressive; in others they were more permissive. We have analyzed schedule profiles of three projects in detail.

Figure 2. First generation object-oriented project. The actual (dotted line) and planned (solid line) project duration vs. milestone number. [Plot titled "First Generation Object-Oriented": Days (0 to 2000) vs. Milestones (1 to 26); series: Plan, Actual.]

Two of these projects used object-oriented software development methods, with the third using procedural methods. The first object-oriented project is an initial release, the second is a follow-on release, and the procedural project is also a follow-on release. The plots shown in Figures 2, 3 and 4 illustrate the relationship between the actual project completion times, and the original project schedules for these three projects. It is clear that, in all three cases, the project plan tracks the actual project duration quite closely. How was that done?

Figure 3. Second generation object-oriented project. The actual (dotted line) and planned (solid line) project duration vs. milestone number. [Plot titled "Second Generation Object-Oriented": duration (0 to 450) vs. Project Milestones (1 to 17); series: Plan, Actual.]

Figure 4. Procedural project. The actual (dotted line) and planned (solid line) project duration vs. milestone number. [Plot titled "Procedural": Days (0 to 700) vs. Project Milestones (1 to 20); series: Plan, Actual.]

First of all, there was a very wide variation in the average productivity for these three projects. The procedural project had the lowest productivity, the first generation object-oriented project had a productivity rate about nine times greater than the procedural project, while the second generation object-oriented project had a productivity rate about four times that of the first project. This productivity range is not unusual, but may be a reflection of using lines of code as an effort metric. Often on small projects the total team will be involved in more non-project related activities than on larger projects. These activities include user groups, trade shows, product support, and marketing support. Conversely, a larger project on a tight schedule may limit any non-project related activity to a minimum. Two projects may show vastly different productivity rates, with both development teams working equally hard: one team may focus strictly on project related activities, while the other does not.

We hope that at least some credit can be given to the object-oriented methodology, but it is unlikely that the productivity rate of these programming teams was so well known in advance in all three cases that the original project schedule estimates would do such a surprisingly good job at projecting the actual completion times. It is evident that control was exercised over the development process to achieve the milestone compliance. The explanation that schedules were met because software functionality was changed or testing time was reduced does not hold in this case. Examination of the project records shows that no major functions were added or deleted in these projects, and time was not saved by shortening testing cycles. For example, the projects in Figures 3 and 4 showed delays in the 7th through 9th milestones and the 6th through 9th milestones, respectively. These were coding milestones, and the schedules were brought back in line during this phase, not during testing. The Figure 3 project also had delays in milestones 15 and 16, which were testing milestones. Milestone 15 started on schedule, and milestone 16 ended on schedule. This testing effort may have been shifted, but it was not shortened.

The most likely explanation is that individual and team productivity was actively controlled over the development life-cycle in order to meet the pre-determined schedule milestones. It is well known that productivity is higher for an aggressive schedule, and lower for a lax schedule. For an aggressive schedule the deadline pressures require people to work overtime and weekends to meet project milestones, thus raising overall productivity rates. For lax schedules, Parkinson's effect will fill programmers' working hours with legitimate work which, however, may not be directed at early completion of the project at hand, thus lowering the apparent overall productivity rate. It is also well known that co-processes, such as development of software components at two different sites, or procurement of one part of the product from a third party, may be a bottleneck and may appear to lower productivity.

Figure 5. Project staffing normalized to project size. [Plot titled "Programmer/KLOC vs Project Size": programmers per KLOC (0 to 9) vs. ln(Project Size) (0 to 7).]

Figure 6. Team productivity vs. project size. [Plot titled "Size vs Team Productivity": team productivity (0 to 12) vs. ln(KLOC) (0 to 7).]

Further analysis seems to support this explanation. Figure 5 shows the number of persons per KLOC vs. the project size. Apparently, there were more people per KLOC on small projects than on larger projects. This is not so surprising, since it is usually harder to fully staff a larger project than a smaller project. Furthermore, smaller projects may be part of an enhancement effort or of a larger on-going effort, and so they will appear overstaffed when they actually reflect the true personnel load needed to deliver the complete product (plan, develop, integrate, test, document, manage, etc.).

Nevertheless, what appears to run against the "common wisdom" are the productivity data. Figure 6 shows that, as the size of the project increased, the team productivity rate also increased. This contradicts observations often reported in the literature, namely that software development suffers from dis-economy of scale, so that productivity decreases as project size increases. Note that there is no obvious difference between object-oriented and procedural projects, and that the same growth in productivity was observed in both categories.
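The quantities plotted in Figures 5 and 6 can be recomputed from Table 1. The sketch below is a minimal illustration using three rows of that table (Projects 2, 9 and 8). The tuple layout and function names are hypothetical, and because the axis units of the original plots are not fully recoverable, team productivity is expressed in KLOC per month as in equation (1).

```python
import math

# (name, KLOC, duration in months, average team size), copied from Table 1.
projects = [
    ("Project 2", 1.1, 8.0, 10.1),
    ("Project 9", 51.9, 11.0, 36.0),
    ("Project 8", 1093.3, 27.1, 60.5),
]


def staffing_per_kloc(kloc, team_size):
    """Average number of team members per KLOC (the quantity in Figure 5)."""
    return team_size / kloc


def team_productivity(kloc, duration_months):
    """Equation (1): KLOC per calendar month for the whole team."""
    return kloc / duration_months


# One row per project: name, ln(KLOC), persons per KLOC, KLOC per month.
rows = [
    (name, math.log(kloc), staffing_per_kloc(kloc, team),
     team_productivity(kloc, months))
    for name, kloc, months, team in projects
]

for name, ln_size, staff, prod in rows:
    print(f"{name}: ln(KLOC)={ln_size:.2f}, "
          f"persons/KLOC={staff:.2f}, KLOC/month={prod:.2f}")
```

For these three rows, staffing per KLOC falls and team productivity rises as project size grows, which is the pattern described in the text.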

One explanation is that larger projects were understaffed and restricted to a shortened development cycle. These constraints forced aggressive schedules, eliminated the Parkinson effect, and forced higher productivity rates (the Deadline Effect). On the other hand, smaller projects were probably fully staffed (perhaps even overstaffed in some cases), with schedules that could comfortably be met within a given development cycle.

Other explanations are possible, and the issue is still under investigation. For example, Elmaghraby et al. reported similar economy-of-scale effects for projects that involved co-development of software and hardware [6]. Hence, on smaller projects the overall slow-down may be due to co-processes, such as participation of the staff in one or more other projects and procurement of third party software, that generate an effect similar to the Parkinson effect.

5. Summary

An analysis of commercially developed software shows no significant difference between the productivity observed for object-oriented and procedural software development. The reason is not clear. While it is comforting that the introduction of object-oriented technology does not penalize productivity, it is disappointing that the productivity on the object-oriented projects is not better than the productivity on procedural projects. Initial investigation appears to indicate that scheduling effects may be in part responsible for this. It appears that milestones, and possibly co-processes, drove the productivity and, in fact, that larger projects, which had relatively fewer people per KLOC working on them, exhibit better productivity than smaller projects. The hypothesis is that aggressive schedules seen on larger projects increase the overall productivity rate, while more liberal schedules, for a given methodology and co-processes, may be counterproductive. Analysis continues.

5.1 Acknowledgments

We would like to thank Al Zollar, Dan Blum, and the IBM Software Solutions Laboratory (Research Triangle Park – RTP) for their strong support of this research, and Paritosh Dikshit of NCSU for his assistance with the statistical analyses. Work was supported in part by IBM Canada (CAS, Toronto), by IBM RTP, and by the IBM SUR program.

5.2 About the Authors

Tom Potok is spending 1995 on a one-year resident study at North Carolina State University to complete his Ph.D. in Computer Engineering. His research is on modeling the processes used to create object-oriented software. He successfully led the Software Solutions RTP lab in achieving ISO 9000 certification. Prior to this, he led a team in creating an object-oriented data model designed to work with CASE tools to improve application development and quality. He has led, and been a member of, various other software development efforts. He has a BS in computer science and an MS in computer engineering from NCSU. He has authored 9 publications, and has filed 2 patents. He can be reached at the following address: IBM Corporation, Research Triangle Park, NC, USA. E-mail: potok@carvm3.vnet.ibm.com, or [email protected].

Mladen A. Vouk received B.Sc. and Ph.D. degrees from the University of London (UK). He has extensive experience in both commercial software production and academic computing environments. He is the author, or co-author, of over 100 publications. He is currently an Associate Professor of Computer Science at North Carolina State University. During 1994/95 he is on leave at MCNC NC Supercomputing Center. His research and development interests include software process and risk management, software testing and reliability, issues related to development of large numerical and scientific software-based systems, and computer-based education. He regularly teaches courses in software engineering, software testing and reliability, software process and risk management, and the C programming language. Dr. Vouk is the chairman of the IFIP Working Group 2.5 on Numerical Software. He is also a senior member of IEEE, and a member of IEEE Reliability, Communications and Computer Societies, and of IEEE TC on Software Engineering, ACM, ASQC, and Sigma Xi. He is an associate editor of IEEE Transactions on Reliability. He can be reached at the following address: Computer Science Department, North Carolina State University, Raleigh, NC, USA. E-mail: [email protected].


6. References

[1] Badiru, A. B., Pulat, P. S. (1995). Comprehensive Project Management: Integrating Optimization Models, Management Principles, and Computers, Prentice-Hall, Inc., Englewood Cliffs, NJ.

[2] Boehm, B. W. (1981). Software Engineering Economics, Prentice-Hall, Inc., Englewood Cliffs, NJ.

[3] Boehm-Davis, D. A. and Ross, L. S. (1992). "Program Design Methodologies and the Software Development Process," International Journal of Man Machine Studies (UK), Vol. 36, No. 1, 1-19.

[4] Borger, D. S., and Vouk, M. A. (1991). "Modeling the Behaviour of Large Software Projects," Center for Communications and Signal Processing Technical Report TR-91/19, NCSU.

[5] de Champeaux, D., Lea, D., Faure, P. (1992). "The Process of Object Oriented Design," Proceedings of the Conference on Object-oriented Programming Systems, Languages and Applications, 1992 (OOPSLA 92).

[6] Elmaghraby, S. E., Baxter, E. I., Vouk, M. A. (1995). "An Approach to the Modeling and Analysis of Software Production Processes," International Transactions in Operational Research, Vol. 2, No. 1, 117-135.

[7] Fairley, R. E. (1985). Software Engineering Concepts, McGraw-Hill Book Company, New York, NY.

[8] Frakes, W. B. and Fox, C. J. (1995). "Sixteen Questions about Software Reuse," CACM, Vol. 38(6), 75-87.

[9] Gutierrez, G. J. and Kouvelis, P. (1991). "Parkinson's Law and Its Implications for Project Management," Management Science, Vol. 37(8), 990-1001.

[10] Hayes, F. and Coleman, D. (1991). "Coherent Models for Object-Oriented Analysis," Proceedings of the Conference on Object-oriented Programming Systems, Languages and Applications, 171-183.

[11] Henderson-Sellers, B., and Edwards, J. M. (1990). "The Object-oriented Systems Life Cycle," Communications of the ACM, Vol. 33, No. 9, 142-159.

[12] Lewis, J. A., Henry, S. M., Kafura, D. G. (1991). "An Empirical Study of the Object-Oriented Paradigm and Software Reuse," Proceedings of the Conference on Object-oriented Programming Systems, Languages and Applications, 184-196.

[13] Melo, W. L., Briand, L. C., Basili, V. R. (1995). "Measuring the Impact of Reuse on Quality and Productivity in Object-Oriented Systems," Technical Report CS-TR-3395, University of Maryland, Dept. of Computer Science, Jan. 1995.

[14] Monarchi, D. E., and Puhr, G. I. (1992). "A Research Typology for Object-Oriented Analysis and Design," Communications of the ACM, Vol. 35, No. 9, 35-47.

[15] Vouk, M. A. (1984). "On the cost of mixed language programming," ACM SIGPLAN Notices, Vol. 19(12), 54-60.

[16] Wirfs-Brock, R., and Wilkerson, B. (1989). "Object-Oriented Design: A Responsibility-Driven Approach," Proceedings of the Conference on Object-oriented Programming Systems, Languages and Applications, 1989 (OOPSLA 89), 71-75.

[17] Yuan, G. (1992). "An Evaluation of Object-oriented Analysis and Design Methodologies," IBM Technical Report