83 Software Development: Management and Business Concepts

Michael A. Cusumano
MIT Sloan School of Management

83.1 High-Level Process Concepts: No “One Best Process” • What to Emphasize Most
83.2 Innovation and Design Strategy: Level of Control • Risk Management • Late Changes • Economies of Scope
83.3 Architecture Strategy: Importance of Modular Designs • Long-Term versus Short-Term Trade-Offs
83.4 Team Management: Problem of Large Teams • Teamwork Principles
83.5 Project Management: Divide and Conquer • Individual and Project Discipline • Infrastructure Investments
83.6 Quality Assurance: Building in Quality • You Can Never Have Too Many Testers • Continuous Process and Product Improvement
83.7 Results from Project Surveys and Conclusions: Global Differences in Practices • Links between Practices and Performance Metrics • Regional Differences in Performance
References

The persistence of similar problems in software development since the 1960s, as well as common observations that the vast majority of software projects are typically late and overbudget, suggests that the field of software engineering has not made much progress. But any close look at the complexity, quality, and breadth of software systems available today, for devices ranging from supercomputers to smartphones, demonstrates that software engineering has indeed made remarkable advances. Beyond much more sophisticated and easy-to-use software applications for various devices as well as the Internet, there are now far better programming languages, code libraries and other reusable modules, and support tools available. In addition, we know a lot more today than we did in the past about how to run software projects and software businesses.
Nevertheless, software development remains very much dependent on individual talents for problem solving, design, and creativity, leaving considerable room for managerial discretion. Moreover, not all individual programmers or software project managers incorporate best practices. The development challenge is often further complicated in that many users do not know what they want until they see part of a working system or a prototype. Consequently, software managers and individual engineers still need to think about how to manage the process of software development and how software engineering skills can contribute to organizational success.
One reason why software projects do not always apply best practices seems to be ongoing disagreements about what approaches are most effective in different contexts. Some projects require more invention and innovation, as well as trial and error. Invention and innovation are usually more difficult to manage and predict compared to routine work. Some customers have mission-critical reliability requirements, and others do not. But, whatever the situation, in order to be more systematic, software managers and engineers need to avoid treating all software projects as unique events.
At the same time, even though many projects may have much in common, the reality is that software developers still need to adapt their techniques and processes to the needs of different contexts. For example, companies that offer mass-market products or Internet services like Microsoft, Adobe, Google, and Salesforce.com need to anticipate general user needs and release products or features relatively quickly. Professional services companies like IBM, Accenture, and Infosys, whether they are building or enhancing custom systems or hybrid solutions, have to work closely with individual clients and understand these distinct relationships as well as their customers’ technical and business requirements. Other kinds of software producers, such as those creating complex, real-time systems for defense or space applications, may have to invent as they go along and try to schedule what truly is rocket science.
In short, this chapter argues that there is no one best way to develop software and manage projects for all kinds of applications and contexts. It is, therefore, natural that the field of software engineering has generated disagreements over what is the best way to manage the development process. Be that as it may, some basic principles are useful to apply to a variety of projects: the need to have strategies for high-level process management, innovation and design, system architecture, team management, project management, and quality assurance. The focus of this chapter is on these general principles, through the lens of my personal experience as a researcher, teacher, and consultant.*
83.1 High-Level Process Concepts
When beginning a software project, rather than trying to plan out schedules and features in detail, managers and engineers should begin by deciding what process is most appropriate for their particular development effort (Cusumano et al. 2009). When making this decision, I have found three observations to be particularly useful. One is the recognition that all projects of more than a handful of people need some well-defined structure to keep everyone in synch (such as how and when to check in components, add features, fix bugs, or ship to the customer). This structure should be repeatable as well as adjustable across different phases or milestones of the same project and across multiple similar projects. Second, the structure should fit the particular technical characteristics and requirements of the product or system being built. This requires some extensive conversations with the customer or customer representatives as well as an understanding of the experience of the team. Third, the structure should fit the market and the business context of the project. This requires understanding the technology and competition as well as customer needs and goals.
83.1.1 No “One Best Process”
Many companies have tried to define a standard development process for their entire organization in an attempt to improve quality, predictability, and perhaps productivity or cost control. A single approach should make it easier to introduce a repeatable process, a goal that the Software Engineering Institute (SEI) at Carnegie Mellon University has emphasized since the mid-1980s (Humphrey 1989). While a repeatable process is highly desirable, however, one process for every project or department is usually not. Again, software products and custom (bespoke) projects can differ greatly according to the application and the market and due to unique customer requirements. It follows that software development projects even within a single organization need to define different processes or variations for their different needs.
The market and business strategy introduces complexity into process choices, though we can map these out across two dimensions: the level of uncertainty in requirements and thus the need for feedback during development versus the number of releases (i.e., chances to get it right) the team can expect in the future (Figure 83.1). For example, we have new applications where developers know little about what customers want and so feedback from early prototypes or other mechanisms is critical during development. In some cases, rapid prototyping, where engineers evolve the requirements and an actual working system side by side with the customer, may be the best approach. However, if user requirements are well understood, as in the case where a team has built several similar systems before, and managers do not expect there will be multiple releases in the future where engineers can gradually improve the system (such as in embedded satellite software where uploads of new software are difficult), then a carefully documented “waterfall” approach may be best. In a waterfall type of process, development generally proceeds in careful, documented, sequential phases, from concept to detailed design to coding and then to multiple levels and cycles of testing and debugging. In other contexts, again depending on the level of uncertainty in requirements and the number of expected releases in the future, different styles of “agile” or “iterative” development seem most appropriate (Boehm 1988; McConnell 1996; Larman and Basili 2003). In these kinds of projects, developers add functionality in small increments as well as over multiple versions of the system.
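As a rough illustration only, the two dimensions described above can be turned into a small decision helper. The sketch below is in Python; the `Level` scale, the function name `suggest_process`, and the cutoffs are all assumptions made for the example and do not come from the chapter or from Figure 83.1.

```python
from enum import Enum


class Level(Enum):
    """Coarse, hypothetical scale for uncertainty in requirements."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2


def suggest_process(uncertainty: Level, expected_releases: int) -> str:
    """Suggest a process family from the two dimensions in the text.

    The cutoffs are illustrative assumptions, not values from the chapter.
    """
    if expected_releases < 1:
        raise ValueError("expected_releases must be at least 1")
    # Well-understood requirements and a single chance to ship:
    # careful, documented, sequential phases.
    if uncertainty is Level.LOW and expected_releases == 1:
        return "waterfall"
    # Little is known about what customers want:
    # evolve requirements and a working system together.
    if uncertainty is Level.HIGH:
        return "rapid prototyping"
    # Everything in between: add functionality in small increments.
    return "iterative or agile"
```

For example, `suggest_process(Level.LOW, 1)` corresponds to the embedded satellite case mentioned above, while a new consumer application with unknown requirements would map to `Level.HIGH`. A real project would weigh more factors than these two inputs.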
83.1.2 What to Emphasize Most
It follows that depending on a company’s business objectives (or organizational goals in the case of a non-commercial entity, such as a government department), as well as the familiarity with the application domain and customer requirements, software managers need different high-level process strategies that lead to different development approaches. The process should vary in terms of how much emphasis to place on practices like writing as complete a spec as possible before coding or applying a “full-court press” in quality assurance (such as intensive design reviews, formal code reviews, and rigorous testing at multiple levels and stages of the project). In mission-critical projects with extremely high reliability requirements, projects generally need extensive specification and architecture work before coding, though still leaving room for feedback and evolution of the spec, such as in the user interface (Fagan 1976; Gilb 1988; Humphrey 1989).
In some cases, such as for mass-market consumer applications, speed to market and innovation may be more important than product reliability. If a firm waits to deploy the latest technology or create a zero-defect product, it may never deliver in time to beat the competition. On the other hand, if a firm targets the mass market, whether the users are enterprises or individual consumers, then quality in the sense of reliability will eventually become more important (Cusumano and Yoffie 1999).
[Figure 83.1 placeholder. Vertical axis: user feedback during project (early/often to late/occasional) and uncertainty in requirements (high to low). Horizontal axis: number of cycles or releases (1 to many). Approaches shown: rapid prototyping, XP, iterative or “agile” methods, incremental, traditional waterfall approaches.]
FIGURE 83.1 Spectrum of process approaches in software development. (Reprinted from Cusumano, M.A.,
In summary, if minimizing defects is important, research and anecdotal evidence suggest that project managers should insist on design and code reviews, continuous builds and testing of the evolving software, and careful documentation of the design (including changes) as well as the code and code changes and fixes. They should make sure teams have debugged features as thoroughly as possible before moving on to the next feature, function, or project milestone; otherwise, apparent progress will not be the real level of progress. Problems or unfinished work will accumulate and delay the project, result in serious quality problems, and eventually prevent shipping or releasing to the customer on time and on budget. Managers should also collect product and process data during and after a project and invest in some sort of formal or informal process review (postmortems) to figure out how to do things better with each project and to define at least some standards and common procedures. These are all good things to do in most if not all projects, but they are essential practices for firms building mission-critical software or enterprise applications and systems. For less critical projects, it is possible to cut some corners, although a software producer should never want to ship products or release systems with quality levels below the customer requirements or what competitors are offering. This is, of course, easier said than done in competitive markets with time and budget constraints.
83.2 Innovation and Design Strategy
In fast-paced markets, there will be many projects where managers stress creativity and invention and encourage design changes even late in a project and at the risk of producing more defects and delaying the ship date or overrunning the budget. The danger is that loosely managed projects will spin out of control—never ship anything because of “infinite defect loops” (the state where every code change, such as to fix a defect, introduces yet another defect—Cusumano and Selby 1995, pp. 256, 333) or because they are too ambitious, with too little time for experimentation. It follows that managers need a strategy for managing the process of innovation and design, so they do not leave too much to chance.
83.2.1 Level of Control
We have seen distinct and deliberate variations in how much control managers introduce into software projects. For example, at the dawn of the mass adoption of the Internet, a company called Netscape, founded in 1994, built a mass-market browser as well as enterprise-class servers and tools. Managers at this firm had a specific philosophy of allowing projects to be “slightly out of control” in order to increase speed to market as well as stimulate innovation and creativity (Cusumano and Yoffie 1998). Microsoft, founded two decades earlier in 1975, pioneered personal computer (PC) software and was more organized than Netscape but less structured than IBM, its initial partner in PCs and the market leader in mainframe computing. Microsoft, like Netscape, wanted to be faster to market and more flexible in experimenting with PC software technology than a highly bureaucratic, waterfall-ish IBM process allowed (Cusumano and Selby 1995).
The downside of less control, taken to the extreme, is that too little structure can lead to bad decisions, failed projects, and missed ship dates. For example, process problems at Netscape helped Microsoft win the browser wars of the 1990s, just as shipping delays for later versions of Windows encouraged users to adopt Apple computers and other alternatives to Microsoft products in the 2000s. Perhaps for this reason, we see a mixture of different approaches at companies such as Google, where mission-critical systems such as for AdWords or back-office operations have been managed much more tightly than smaller projects, and often in more of a waterfall-ish rather than agile style (Striebeck 2006).
83.2.2 Risk Management
One thing we have learned is that iterative or agile development techniques as we know them today (such as building systems in small increments, with frequent or continuous testing, debugging, integration, and synchronization of work under development) are particularly useful for risk management. These types of techniques provide mechanisms to add visibility and assess progress in a software project as well as make frequent adjustments such as in project scope, priorities, schedules, or staffing.
A form of iterative development called “synch-and-stabilize” (a term that Richard Selby and I coined in our 1995 book Microsoft Secrets) relies on techniques such as vision statements, evolving specs, development work broken up into subcycles, daily builds, milestone stabilizations, early integration testing, and various customer feedback mechanisms during development (Cusumano and Selby 1995, pp. 187–326). These practices represent a middle approach to risk management between a highly bureaucratic, waterfall-ish style of software development and a potentially chaotic “hacker” style of development. For example, the vision statement that kicks off a project is essentially a team contract that should scope out, very clearly, what the team hopes to do and what it is not going to do. Microsoft vision statements have been as short as one paragraph or one page, and others extend to many pages. Then evolving the product spec from an outline and reevaluating it periodically during the project avoids spending time detailing specs for features that the team will never get to or will reject. The Microsoft-style iterative process has about a 70% overlap with agile development techniques such as those followed in Extreme Programming (XP) projects (Cusumano 2007b).
Also, as part of risk management, project teams should have a “multiversion release” mentality whenever possible. The idea here is that there is no need to try to create a “perfect” product or system that includes every favorite feature from customers, executives, sales, and marketing or that tries to do the equivalent of rocket science in a commercial setting. If a company is making a software product and it is successful, there is probably going to be a second and a third version. Even in custom development, software and services companies usually have a chance to refine their work with maintenance releases or a phased rollout. So there is rarely the need to attempt so much in any one project that the risk of failure becomes higher than the likelihood of success.
83.2.3 Late Changes
Another important observation that comes from experience, and which has been associated with iterative or agile development, is that late design changes can be good; they do not necessarily reflect mistakes or a step backward. Some managers and engineers find this hard to understand, no doubt because of their education and experiences with delayed schedules and buggy code. Many software developers and managers once thought the same way, before experiencing the fast-paced markets for PCs and Internet software and services.
My recollection is that software engineering texts and classes in the 1970s and 1980s used to emphasize two particular ideas. One was that most problems occur because projects do not have a good requirements document and a complete specification before engineers start coding. A second was that late changes in the code or the design are too risky because they can destabilize the product, create more bugs, and make the project later, which then creates a destructive dynamic if the project shortchanges additional testing.
Mainframe software producers such as IBM and Japanese software factories got around this problem by giving tremendous authority to their QA departments. For example, at Hitachi, NEC, Toshiba, and Fujitsu, QA managers had to approve any product ship decision, and most used historical data telling them how many bugs they should be finding in design documents or code at different stages of development and how much more testing was necessary before they could consider a product to be of high quality (Cusumano 1991, 1993). But new products for new markets or on new platforms (such as mobile phones compared to mainframes or even PCs) do not have this kind of historical data. Furthermore, fast-paced markets may require different standards and procedures, within certain limits.
The main point is that managers and engineers should recognize that any initial specification will be incomplete. Encouraging evolutionary designs through an iterative, agile, or prototype-driven development process allows a team to respond to unforeseen market changes, user feedback, and competitors’ moves. Late changes may produce more defects and delay the schedule. But, as some research has shown, a coherent set of countermeasures (such as frequent builds, design and code reviews, and integration testing with each change) can mitigate the level of bugs and lateness (MacCormack et al. 2003). A process that expects and allows projects to accommodate change with a minimal impact on quality and productivity is a great competitive advantage in the business of software.
83.2.4 Economies of Scope
Another strategic aspect of innovation and design is how to increase not simply creativity or structure but also economies of scope: efficiency and effectiveness in building multiple products or conducting multiple customer engagements with the same engineering assets. This kind of efficiency and effectiveness is almost always important to an ongoing software business. Simple economies of scale are not generally present, except in replicating a software product or in funding in-house R&D or tools development used by multiple products or projects. Scope economies in general come from reusing system architectures, design frameworks, pieces of working code, support tools, test cases, and historical product and process data (such as how long particular kinds of projects usually take and how much testing resources they normally require).
Code reuse seems to happen most often when companies package components in ways that are easy for developers in other projects or departments to understand and redeploy, like “black box” parts in the auto industry. If programmers have to change a lot of the design or code to use a part, then reuse can become inefficient. For example, Toshiba, before the days of object-oriented design and class libraries, found that its engineers could change up to 20% of a module and still find the reuse effort cost-effective. If engineers had to change more than that, then they were better off writing new code from scratch. Toshiba also kept track of reused modules and gave out awards to encourage programmers to think about reuse and writing popular modules—an interesting way to channel the energies and creativity of the engineers (Cusumano 1991, pp. 264–265).
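The Toshiba rule of thumb can be written down as a one-line check. The Python sketch below is only an illustration: the 20% limit is the figure reported above, but measuring change in lines of code, and the names `REUSE_CHANGE_LIMIT` and `worth_reusing`, are assumptions made for the example.

```python
# Reported Toshiba rule of thumb: reuse stays cost-effective when
# engineers change up to 20% of a module.
REUSE_CHANGE_LIMIT = 0.20


def worth_reusing(changed_lines: int, total_lines: int,
                  limit: float = REUSE_CHANGE_LIMIT) -> bool:
    """Return True if the share of the module that must change is within the limit.

    Counting change in lines of code is an assumption of this sketch; the
    source only speaks of changing a percentage of a module.
    """
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    if not 0 <= changed_lines <= total_lines:
        raise ValueError("changed_lines must be between 0 and total_lines")
    return changed_lines / total_lines <= limit
```

Under these assumptions, a 1,000-line module that needs 150 lines changed would be a reuse candidate, while one that needs 300 lines changed would be a candidate for rewriting from scratch.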
Another way to achieve scope economies is to buy or license components or whole products and incorporate them into a new system, including open-source “freeware.” Cheap or free standardized packages or libraries of components often require some trade-offs in functionality. Nonetheless, from a business point of view, surveying what you can buy or get for free rather than build should be part of every organization’s innovation and design strategy.
83.3 Architecture Strategy
An iterative or agile style of development (evolving specs, encouraging lots of feature changes during a project, and doing daily builds and continuous testing) requires a particular type of product or system architecture. With the wrong architecture, too many components and teams will be highly interdependent, leading to difficulties in testing and wasted time when trying to work in parallel.
The best strategy is to encourage a team to think ahead and, from the beginning of the release cycle for a new product or system, devote some engineering effort to designing an architecture that will last at least a few years and accommodate functionality likely to be important in the future. For example, in the early years of developing products such as Office and Windows, senior Microsoft managers believed they should devote about 20% of their engineering resources to architectural work and reworking code (Cusumano and Selby 1995, pp. 280–281). They might allocate more for new strategic products or best-selling products that desperately need rearchitecting, though not too much more.
83.3.1 Importance of Modular Designs
Modular designs are important to decouple components and allow at least some coding and testing to proceed independently or in parallel. Modularity also facilitates future design changes and enhancements. There are no specific rules on how large or small a software module should be; it depends on the system. Moreover, there is a sliding scale of modularity for almost any complex product. For example, most automobiles have about 15,000 discrete components but automakers design and build their cars using subsystems. For some companies, the number of subsystems is relatively low, such as 25; for others, it is relatively high, such as 300. The difference is the degree of functionality that each company is designing into the modules or subsystems of its products (Cusumano and Nobeoka 1998, pp. 43–47, 97). However one defines it, a module in software should be some subset of functionality that is smaller than the whole product and hides information from other modules so that programmers can isolate it from other small chunks of functionality (Parnas 1972). Software companies making products, smartphone apps, or Internet services often think in terms of “features,” which contain modules within some larger subsets of functionality understandable to a user.
The opposite of a modular architecture is an integral architecture, where components are tightly coupled. It is difficult to change and test pieces of a product with an integral architecture without creating problems in dependent components (Baldwin and Clark 2000; Ulrich and Eppinger 2006). Integral architectures do the equivalent of binding the legs of everybody in a project together, slowing down even fast programmers. Managerial discretion is important here, though, because, in some software systems, an integral architecture is necessary to reduce size or generate superior performance, analogous to a custom-built racing car optimized for speed.
But, as code bases have grown to millions of lines of code for many common products and applications, a modular architecture often becomes essential for development and testing. The architecture needs to lay out what the subsystems of the product are, how the subsystems (collections of modules) relate to each other, perhaps what a module is within a subsystem, and, most important, what the interfaces are so that subsystems and individual modules can exchange data or instructions and work together. Interfaces should be stable for some period of time and not altered without communicating changes carefully to a development team because developers depend on knowing how to get modules to interact with each other.
Modularization also helps a project team prioritize features and build them in order of importance to the product or the business, like a sequential (or “horizontal”) list that the team gets to one by one. With prioritization and modularization, a team usually has the option to cut lower priority features if the project falls behind schedule. If the modules are too interdependent, then a project might need a very large team to build all the desired features in parallel. A smaller team might have to build pieces of the product sequentially. With the sequential process, though, the project will usually have to adopt a waterfall type of schedule and not test the pieces in an integrated fashion until the team is mostly done, when it may be too late to fix major problems or make important changes for the customer.
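To make the idea of a stable interface with hidden internals concrete, here is a minimal Python sketch. The spell-checking scenario and every name in it (`SpellChecker`, `WordListChecker`, `count_errors`) are hypothetical and chosen only for illustration; they are not taken from any product discussed in this chapter.

```python
from typing import Protocol


class SpellChecker(Protocol):
    """Hypothetical interface that other modules code against."""

    def misspelled(self, text: str) -> list[str]:
        ...


class WordListChecker:
    """One module behind the interface; how it stores words stays hidden."""

    def __init__(self, known_words: set[str]) -> None:
        # Internal representation: callers never touch this directly.
        self._known = {word.lower() for word in known_words}

    def misspelled(self, text: str) -> list[str]:
        return [word for word in text.split() if word.lower() not in self._known]


def count_errors(checker: SpellChecker, text: str) -> int:
    """A client module that depends only on the interface."""
    return len(checker.misspelled(text))
```

Because `count_errors` relies only on the `misspelled` signature, the team that owns `WordListChecker` could swap its set for another data structure without touching client code, provided the interface itself stays unchanged.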
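The prioritized list described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: features are treated as independent modules, effort is estimated in days, a lower `priority` number means more important, and the names `Feature` and `plan_release` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    priority: int     # 1 is most important (assumed convention)
    effort_days: int  # estimated effort (hypothetical unit)


def plan_release(features: list[Feature],
                 capacity_days: int) -> tuple[list[str], list[str]]:
    """Work down the list in priority order and cut the tail that does not fit.

    Once one feature no longer fits the remaining capacity, it and every
    lower-priority feature are cut, mirroring a list the team gets to one by one.
    """
    kept: list[str] = []
    cut: list[str] = []
    remaining = capacity_days
    for feature in sorted(features, key=lambda f: f.priority):
        if not cut and feature.effort_days <= remaining:
            kept.append(feature.name)
            remaining -= feature.effort_days
        else:
            cut.append(feature.name)
    return kept, cut
```

If a schedule slip shrinks `capacity_days`, rerunning the function shows which lower priority features would drop out. This only works when the features can be built and shipped independently, which is the point the paragraph makes about modular architecture.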
Again, I can cite examples from my study of Microsoft (Cusumano 2006a, 2007a). This company experienced firsthand the problems of inadequate modularity with the Windows Vista release, which eventually shipped to businesses in late 2006 and to consumers in early 2007 after several years of delays and wasted efforts. Microsoft may have adopted a particular strategy in the late 1990s and early 2000s to tie as many functions together into Windows so as not to appear, again, to violate antitrust law. The Vista (formerly called “Longhorn”) project began with the Windows NT code base. The desktop version of the product was supposed to contain many new features and quickly grew to more than 50 million lines of “spaghetti” (i.e., nonmodular) code that proved impossible to build daily or test thoroughly and stabilize. The poor state of the code forced Microsoft to abandon years of work on new features, go back to the Windows Server 2003 code base, and make some refinements to its engineering and design approach.
Microsoft eventually decided to break up Windows into different branches and rewrite as much code as possible into smaller, tighter modules. This approach resembled how company engineers had designed Word, Excel, and PowerPoint and treated these products as “branches” or subsystems of Office, building them separately and then integrating the branches periodically, such as weekly. The new modularization and branching strategy made coding and daily builds for Windows and Office much more manageable than designing and building these as single “monolithic” products. Some teams also used new testing tools from Microsoft Research that helped check automatically for a wider variety of errors (code coverage and correctness, application programming interfaces and component architecture breakage, security, problematic component interdependencies, and memory use) and automatically reject code at desktop builds and branch check-in points (Larus et al. 2004).
It is possible to evolve the architecture of a software system incrementally to make it more modular, even if it did not start out that way because of time pressures or simply a lack of foresight and experience. Microsoft, my example again, gradually rearchitected Office over several years to make the applications within Office able to share features. The company formerly sold Office as a collection of packaged “vertical” applications that were really separate products. Each product (mainly Word, Excel, and PowerPoint) had its own separate features for text processing, file management, table creation, cut and paste, printer drivers, etc. A separate team figured out how to redesign the products and share at least some of these features across the applications. Within a few years, Office became the product, with Word, Excel, and PowerPoint becoming subsystems that shared about half of their code, at least in some versions. This sharing evolved to the point where, for Office 2000, fully 38% of the developers working on the product were creating common features shared by one or more of the applications (MacCormack 2000).
83.3.2 Long-Term versus Short-Term Trade-Offs
Working to get the architecture right or improving it incrementally is really an investment in the future, one that makes it easier to maintain and enhance a system. Not all software companies or producer organizations have the money and time to make such investments, and users may not want to pay higher prices for such work when the trade-off may be fewer new features. Many new software companies also have to ship products quickly lest the window of opportunity for their market disappear.
It is unreasonable to expect new software businesses to devote too much effort to figuring out how to design a product architecture that will last for years. It is not clear what the right number of staff is to allocate to architecture development, especially for a start-up. But to make zero investment in architecture for the future would seem to be a technical and business mistake for any company that hopes to have a future.
83.4 Team Management
Another piece of common wisdom in the software engineering field is that a small team of very talented programmers works much better than a large team of mediocre people, and that talent is more important than experience (Brooks 1975; DeMarco and Lister 1987). Every experienced software manager has encountered programmers who can write much more and often much better code than other members of the same team. The rule of thumb given by Tom DeMarco and Timothy Lister, the authors of Peopleware, is that your best programmer will probably be about 10 times better than your worst programmer and about 2.5 times better than your average programmer (DeMarco and Lister 1987, pp. 44–46).
83.4.1 Problem of Large Teams
But one problem with managing by this philosophy is that “super programmers” are hard to find and maybe harder to keep. No rapidly growing company is likely to find enough programmers at the very upper end of the talent level to develop all the software it needs. So the more common problem in software development is how to get relatively large groups of programmers with varying skills to work together like nimble, efficient small teams.
The synch-and-stabilize techniques, as well as iterative or agile development more broadly, can help managers tackle this problem of how to make large teams work like small teams. Selby and I made this argument in Microsoft Secrets, describing how this company tackled the problem with the following approaches: project size and scope limits; modular architectures; project architectures mapped to the product (so that everyone knows why they are building what they are building); projects divided into small relatively autonomous teams (three to eight developers per feature team); rigid rules to force coordination and synchronization, such as through daily builds and periodic milestones; good communications and shared functional responsibilities; and product and process flexibility to handle real-time learning and the appearance of unpredictable problems (Cusumano and Selby 1995, pp. 409–417; Cusumano 1997).
83.4.2 Teamwork Principles
Many researchers have also found it important to have strong project leaders to make sure that even the top programmers follow a few basic rules to improve teamwork. One is the need for overlapping functional responsibilities. In the Microsoft case, for example, product managers take charge of writing vision statements, but they are responsible for consulting program managers in order to do this. Program managers write functional specs, but they have to consult developers, who generally have de facto veto power because they have to estimate the time and people required to write the code. Developers and testers are paired and jointly responsible for testing code. Good communications and overlapping responsibilities help an organization avoid becoming too functionally oriented, bureaucratic, and compartmentalized, with large separate groups that simply hand off work to each other.
In general, software product companies tend to organize separate groups for each product and then smaller feature teams within these product units. Managers can also scale up this type of structure by creating more product units and more feature teams, as long as the product architectures allow teams to proceed more or less in parallel. Within a product unit that has an effective leader, the right set of development techniques, and a modular product architecture, a company can have a large team of several hundred people or more working together almost like one nimble small team.
Companies such as IBM, Accenture, or Infosys that design or enhance custom systems also rely on small teams to build small chunks of functionality, but their structures are often far more complex. They generally organize at the company level in a matrix, with some managers and personnel assigned to industry or “vertical” specializations, such as manufacturing or banking sectors, and others in a variety of “horizontal” functions and specializations that cut across industry domains, such as experts in SAP, Oracle, Microsoft, or open-source systems. Bespoke projects generally have some team members, such as for the requirements phase or final customer acceptance testing, located at the customer site, while much of the actual software development takes place elsewhere (Carmel and Tjia 2005; Cusumano 2008).
83.5 Project Management
It is worth repeating that the traditional waterfall model, though it may sometimes deliver software on time and meet customer requirements relatively closely and with few bugs, is not a good process for fast-paced markets driven by the need to adapt to continuous innovation, uncertainty in customer requirements, and unpredictable competition. The waterfall model originally came about in fairly stable but complex development projects, like rocket systems, where NASA needed to control requirements and schedules in great detail (Royce 1970). To NASA and contractors such as Lockheed, not making changes that might create bugs has been far more important than being innovative or fast to market. Most commercial software producers, however, whether they make mass-market products or custom-built systems, need a process that lets them evolve designs and incorporate customer feedback in real time, during a project. For these kinds of environments, projects should do some preliminary planning but then carry out more detailed requirements specification, program design, coding, and testing as concurrently as possible and with as much customer involvement as possible.
83.5.1 Divide and Conquer
Tackling any complex task brings up the age-old principle of “divide and conquer.” In software, this means that managers should break large projects into multiple subprojects (many firms use the term subcycles or milestones) of no more than a few days', weeks', or months' duration. It is much easier to manage several small groups that are doing a focused, small amount of work and that have a deadline not too far in the future than to manage a large group building a lot of features scheduled for completion in months or years. Too many things can go wrong or change when a project deadline is too far into the future, when a team is too large, or when the amount of code that needs to be integrated is too extensive.
83.5.2 Individual and Project Discipline
It is also important to get commitments from people to work as a team and deliver on individual promises. Programmers and testers should schedule their own work, rather than have managers dictate schedules. It is often not necessary to press programmers in product companies working in highly competitive markets to shorten their estimates because they tend to be overly optimistic about what they can do. Self-scheduling by developers, therefore, produces aggressive schedules, which managers like, as well as fair schedules because they come from the bottom up. But historical project data are still useful so that managers can evaluate the realism of individual estimates and schedule some buffer time into a project to accommodate misjudgments as well as unforeseen changes or problems that turn out to be more difficult than anticipated.
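One way a manager might turn historical project data into buffer time is sketched below. This is only an illustration: the rule (scale the self-made estimates by the median past overrun), the function names, and the sample figures are all assumptions introduced here, not a method described in the studies cited.

```python
from statistics import median


def overrun_ratios(history):
    """Return actual/estimated ratios for past tasks.

    `history` is a list of (estimated_days, actual_days) pairs; the data
    format is a hypothetical one chosen for this sketch.
    """
    return [actual / estimated for estimated, actual in history if estimated > 0]


def buffered_schedule(estimates, history):
    """Return (base_days, buffer_days) for a set of self-made estimates.

    The buffer scales the summed estimates by the median historical
    overrun, and is never negative (ratio is floored at 1.0).
    """
    base = sum(estimates)
    ratio = max(1.0, median(overrun_ratios(history)))
    return base, base * ratio - base


# Hypothetical past tasks: (estimated days, actual days)
history = [(10, 12), (5, 5), (8, 12), (20, 26)]
base_days, buffer_days = buffered_schedule([6, 9, 15], history)
```

The developers' own estimates stay untouched in this sketch; the buffer is added at the project level, which keeps the schedule bottom-up while still accounting for habitual optimism.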
83.5.3 Infrastructure Investments
Software producers generally invest heavily in various tools and infrastructure such as build teams or process experts. For example, it is important to make checking in easier and faster for programmers and to automate as much testing as possible. If check-in times take too long, then the frequent build process becomes burdensome and programmers will avoid it. But the real benefit of a smoothly working build process is that, again, a few simple rules can be enough to impose discipline while remaining subtle. Programmers do not like to rewrite their code.
83.6 Quality Assurance
Finally, we come to the topic of testing and quality assurance. This is an entire subject in itself, as well as essential for any software-producing organization to be successful.
83.6.1 Building in Quality
In automobiles and other industries, we learned from Japanese companies decades ago that it is cheaper ultimately to “build in” quality continuously rather than to test and fix product flaws at the end of a development cycle or production process. This has been especially true in waterfall-style software development projects, where it has long been recognized that fixing a bug at a customer site can cost perhaps a hundred times more than finding and fixing the bug early in a project (Boehm 1976, 1981). Software products with modular architectures and built with iterative or agile techniques appear able to make design changes and fix problems late in a project more easily and with much lower costs than traditional waterfall processes. But fixing problems in the field, such as by sending out patches, is still expensive and can be harmful to a company’s reputation.
Another point here is the importance not only of continuous feature testing (done manually by testers or through automated tests) but also of continuous integration and system-level testing. For example, creating a better drawing feature is fine. But if the user cannot print the object, then the new feature is not properly integrated and the product has a bug. It is better to identify these problems earlier rather than later. Data from process surveys have reinforced this observation: the earlier a project can do integration testing, and the more it does integration testing, the higher the quality and the more likely the project will be closer to the schedule and budget targets (MacCormack et al. 2003).
It is also important for projects to automate as much feature or unit testing as well as component-integration and system-level testing as possible. Automation makes it possible to rerun tests frequently, such as with each code change, and find bugs generated by those changes. However, it is a myth that automation significantly reduces the need for people. The reason is that organizations always need people to update the tests as a project moves forward and incorporates more functionality or changes in user interface designs. Most automated tests run off the user interface and have to change as the user interface changes.
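The difference between a feature test and an integration test, and the idea of rerunning both on every change, can be sketched with Python's standard `unittest` module. The `Circle` class and `print_document` function are hypothetical stand-ins for the drawing and printing example above; they are not taken from any real product.

```python
import math
import unittest


class Circle:
    """Hypothetical drawing object standing in for a new feature."""

    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2

    def to_print_commands(self):
        # A new object type must supply this for the print path to work.
        return [f"circle r={self.radius}"]


def print_document(objects):
    """Hypothetical print path: collects print commands from every object."""
    commands = []
    for obj in objects:
        commands.extend(obj.to_print_commands())
    return commands


class FeatureTests(unittest.TestCase):
    def test_area(self):
        # Feature-level check: the new object behaves correctly on its own.
        self.assertAlmostEqual(Circle(2).area(), 12.566, places=2)


class IntegrationTests(unittest.TestCase):
    def test_new_object_can_be_printed(self):
        # Integration-level check: the new object works with the print path.
        self.assertEqual(print_document([Circle(2)]), ["circle r=2"])


def run_all():
    """Run every test; intended to be called after each build or check-in."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(FeatureTests))
    suite.addTests(loader.loadTestsFromTestCase(IntegrationTests))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Even in a sketch this small, someone has to revise the expected print commands whenever the output format changes, which is the maintenance cost the text describes.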
83.6.2 You Can Never Have Too Many Testers
In general, software managers would do well to adopt the philosophy that they can never do too much testing. This is especially true for a mass-market software products company. But it is also true for any mission-critical or enterprise-class software system. And, increasingly, consumers are expecting flawless functionality even in inexpensive or free software or functionality accessed over the Internet.
Many people are surprised to learn how many people Microsoft started to allocate to testing in the 1990s: as many as it does to programming, with testers usually assigned as “buddies” to developers in a one-to-one ratio (Cusumano and Selby 1995; Cusumano 2004). Some people are also surprised that, given this enormous investment in testing, Microsoft’s quality is not higher. It is important to understand these two observations. First, Microsoft’s quality has improved dramatically over time and in multiple ways: not only fewer bugs but also products that are far more complex yet much easier to install and use than the old MS-DOS or early Windows systems and applications. These results directly reflect the enormous investment in testing as well as in process and product improvement more broadly. Second, many of Microsoft’s testers, especially in the applications groups, are more like an advance army of beta users. These testers try to use a new product or version under development as a user would and try to detect user types of problems early on. It is a good investment to make because Microsoft sells tens or hundreds of millions of copies of its most popular software products. A few bugs or products that are difficult to install and use can generate millions of customer complaints.
83.6.3 Continuous Process and Product Improvement
Over multiple projects, it is desirable to have a strategy to improve process and product quality on a continuous basis, the now-familiar Japanese notion of kaizen. One way is to conduct postmortem analyses, share the results with the team, and then act on the conclusions. For example, in the late 1980s and early 1990s, Microsoft adopted a practice of creating postmortem reports. These generally had three parts to them, compiled by the managers for each function on each project: product management, program management, development, testing, customer support, and user education (documentation). Each manager was supposed to interview his or her team members and come up with a summary of (1) what went well on the project, (2) what went poorly, and (3) what they should do differently the next time. Most of the Microsoft teams stayed together for a few years and had an opportunity to apply what they learned in a subsequent project (Cusumano and Selby 1995, pp. 331–339).
Another important source of learning is data from customers through the product support organization. With data-collection tools, it is possible to create detailed lists of bugs and fixes. Good teams will also develop heuristics about how to avoid and fix common bugs. They will create checklists or handbooks or run training sessions for their testers and developers to help avoid common errors.
A phrase common at Microsoft in the 1990s and at other companies such as Netscape and Google, “eat your own dog food,” captures yet another useful practice. That is to use the product internally as you are building it so that you get a firsthand experience of whether or not it is any good. For example, if you are building the next version of Windows, as soon as the team gets to a point where basic functionality works, a developer will start using it to do basic things, like saving files and running the e-mail program. If the product is lousy and crashes, then the developer has to eat the programming equivalent of dog food.
Microsoft and other software product companies have also introduced a variety of mechanisms to get customer feedback during development. Early beta releases can provide important feedback on the quality as well as the design of a product from actual customers. Betas that come too late in the development cycle do not allow the team enough time to make major design changes. Another technique is the usability lab, where companies bring in people to test features or user interfaces under development. In the Microsoft case, programmers watch from behind one-way mirrors to see what percentage of users struggle to understand their new features. Microsoft also has sent developers and testers to staff customer support lines after a new product ships so that, again, they can get firsthand feedback on customer reactions. Product teams complement usability lab data and customer support data (summaries of which teams received on a weekly basis) with customer satisfaction surveys, product usage surveys, and other feedback mechanisms.
Another good practice to monitor and improve the operations of a software development organization is to have each project track a small number of quantifiable metrics covering product quality, the size and performance of the product, and the development process. It is important to understand the major factors driving performance of teams and customer responses to products. Project managers need to be able to measure these factors quantitatively if they are to manage them effectively. This idea of statistical analysis and feedback has also been central to the SEI philosophy, especially for projects aiming to reach CMM Level 5 (Humphrey 1989).
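As a minimal sketch of what tracking a small number of quantifiable metrics might look like, the record below computes two commonly used figures: defects per 1000 lines of code and lines of code per programmer-month. The class, its field names, and the sample numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProjectMetrics:
    """Hypothetical per-project record of a few quantifiable measures."""

    name: str
    lines_of_code: int
    defects_reported: int
    programmer_months: float

    def defects_per_kloc(self):
        """Defects reported per 1000 lines of delivered code."""
        return self.defects_reported / (self.lines_of_code / 1000)

    def loc_per_programmer_month(self):
        """Nominal code output; a measure of volume, not of business value."""
        return self.lines_of_code / self.programmer_months


# Made-up figures for a single project
project = ProjectMetrics("hypothetical-project", 120_000, 6, 300)
```

Keeping the set of metrics this small makes it practical to collect the same figures on every project and compare them over time.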
83.7 Results from Project Surveys and Conclusions
During 2001–2003, several colleagues and I became curious about how widespread iterative development techniques were becoming around the world, in contrast to more waterfall-ish approaches, and what, if any, measurable impact they were having on project output measures like quality, productivity, and scheduling. We conducted a pilot study at Hewlett-Packard (HP) and Agilent and then followed this with a survey of 104 projects from a variety of major software producers around the world (Cusumano et al. 2003).
83.7.1 Global Differences in Practices
In the global survey, we first asked about conventional best practices. For example, how many projects wrote architectural and functional specifications as well as detailed designs before coding? How many used code-generation tools? How many implemented design and code reviews? Then we asked about the newer techniques: How many projects divided up into subcycles or milestones, used beta tests, paired programmers with each other and with testers, followed daily builds, and did regression tests on each build? We also divided the sample into regions: India, Japan, the United States, Europe, and others.
About 85% of the sampled projects wrote functional specs, and nearly 70% wrote architectural and detailed design documents, rather than just writing code with minimal planning and documentation. These conventional good practices were especially popular in India, Japan, and Europe. The major difference was that few US projects wrote detailed designs. (I had observed this practice earlier at Microsoft, where projects in general did not write detailed designs but went straight from a functional specification to coding in order to save time and not waste effort writing specs for features that teams might later delete.) Code generation (a technique that uses special software programs to generate code from design frameworks or design tools) was most popular in the Indian sample. Design and code reviews require particular process discipline, as promoted in the SEI recommendations. Not surprisingly, all the Indian and Japanese projects did design reviews, and all but one of the Indian projects did code reviews as well. Most projects in the other regions also followed these good practices, though not universally.
As for the iterative techniques, these were by now popular around the world, but with some variations. Most projects used subcycles, for example, though these were most common in our Indian and European and other samples and least popular in Japan. Projects that did not use subcycles, in our definition, followed a conventional waterfall process. More than half the Japanese projects, therefore, seemed to follow a conventional waterfall schedule. Most projects also used beta releases, which had become a useful testing and feedback tool since the widespread use of the Internet in the mid-1990s. Over 40% of the projects surveyed paired testers with developers, a Microsoft-style practice especially popular in India. Thirty-five percent of the projects used the XP practice of pairing programmers, and, again, this was especially popular in our Indian sample (58%). More than 80% of the sample used daily builds at some time during the project and about 46% used daily builds at the beginning or middle, which is closer to the Microsoft style of development. More than 83% of the projects also ran regression tests on each build. Again, this good practice was most common in the Indian sample (nearly 92%).
83.7.2 Links between Practices and Performance Metrics
Researchers on software engineering over the past two decades will not be surprised by two of our findings from analyzing the HP and Agilent survey (MacCormack et al. 2003). This set of projects was roughly comparable in techniques and metrics, compared to the global sample, which had a wider variety of projects.
First, the HP and Agilent developers tended to be more productive in terms of code output when they had a more complete design before starting to write code. Second, more complete designs before coding correlated with lower levels of bugs. These results make sense and have led many software managers to insist on having complete specs before programmers start writing code—the old waterfall process. It is logical that programmers can be more productive in a technical sense if they make fewer changes during a project and thus have less rework to do. They also have less chance of introducing errors if they make fewer design and code changes.
In a business sense, however, locking a project early into a particular design may not produce the best result for the customer or enable the firm to compete effectively in a rapidly changing market. We did, in fact, find some evidence that HP and Agilent managers thought their customers were more satisfied with designs that evolved during a project. We also found that use of early betas and prototypes (opportunities for customers to provide early feedback on the design) was associated with higher code productivity and fewer defects, probably because the HP and Agilent projects were able to make early adjustments. In addition, running regression tests with each build, breaking projects into multiple subcycles, and conducting design reviews were associated with fewer bugs.
The most important conclusion from the HP and Agilent data is that, at least within a single development culture, iterative techniques such as described by the synch-and-stabilize philosophy appear to form a coherent, effective set of practices. Software projects can be more flexible in the sense of accommodating design changes with minimal impact on quality and productivity when they use several of these techniques together, rather than just selecting one or two. Our results suggest that there are trade-offs associated with using different techniques, especially with regard to allowing specifications to evolve after the start of coding. Use of particular techniques, however, helps projects overcome these potential trade-offs.
In short, when the HP and Agilent projects used beta releases to get early user feedback, conducted design reviews, and ran regression tests on each build of the code (i.e., after each change or addition of new code), then the correlation between having an incomplete design when coding starts and high levels of bugs disappeared. It seems, then, that software producers can have the best of both worlds. With the right set of techniques, they can write high-quality code in a productive manner and quickly adapt to customer feedback and changes in the marketplace during rather than after a project, as in the old waterfall style.
83.7.3 Regional Differences in Performance
Project performance across firms is difficult to measure and even more difficult to compare regionally from such a small sample, but we used some crude measures and found some noteworthy differences in our global sample. Based on the data we collected, the Japanese had the best quality levels (median of 0.005) in terms of defects reported per 1000 lines of code in the 12 months after implementation at customer sites. The Indian projects (0.033) and US projects (0.030) were quite similar and very good by historical standards but still six times more “buggy” than the Japanese projects. Projects from Europe and other areas (0.05) fell in between the Japanese and the US and Indian levels. In terms of lines of code delivered per programmer per month (a measure of programmer output but not really productivity), the Japanese ranked at the top. They had a median output level of about 469 lines of code per programmer per month, unadjusted for programming language or type of project. This was more than twice the level of the Indian projects and about 70% more than the US projects, though only slightly higher than the European and other projects.
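The “six times” comparison can be checked with a few lines of arithmetic. The sketch below uses only the median defect figures quoted above for Japan, India, and the United States; the function name and the rounding to one decimal place are choices made for this illustration.

```python
# Median defects per 1000 lines of code, as quoted in the text above
median_defects_per_kloc = {
    "Japan": 0.005,
    "India": 0.033,
    "United States": 0.030,
}


def ratio_to_baseline(values, baseline):
    """Express each region's median as a multiple of the baseline region's."""
    base = values[baseline]
    return {region: round(value / base, 1) for region, value in values.items()}


ratios = ratio_to_baseline(median_defects_per_kloc, "Japan")
```

On these figures the Indian and US medians come out at roughly 6.6 and 6.0 times the Japanese median, consistent with the rounded “six times” in the text.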
Early research on Japanese software practices found few defects along with high levels of nominal code productivity and reuse, so these results are not surprising (Zelkowitz et al. 1984; Cusumano and Kemerer 1990). US programmers often have different objectives and development styles. They tend to emphasize shorter or more innovative programs and spend much more time thinking about what they are writing and in optimizing code—which reduces lines of code productivity in a gross sense. The Indian companies have mainly US clients and seem to have adopted a US-type programming style, which tends to view shorter programs as better than longer programs. Nonetheless, we expected bug levels in India to be similar to the Japanese, given the emphasis of the Indian companies on achieving high SEI levels.
Overall, the data suggested some technical strengths in India and Japan with regard to software development but ongoing weaknesses from a business perspective. As is common in this type of research, due to the extreme variations in performance from project to project, it is hard to draw any definite conclusions. But it is important to remember as well that no Indian or Japanese company has made any real global mark in software innovation or establishing globally adopted technology platforms, which have long been the province of US and a few European firms. Code productivity is by no means a good measure of business performance and is less valuable than quality numbers in judging a software development organization. Japanese companies still seemed preoccupied with producing close to zero-defect code, and one can only wonder how much this practice constrains their willingness to experiment and innovate in software development. The global survey data also suggest that Indian companies are doing an admirable job of combining conventional best practices with iterative techniques. But they tend to treat software as a custom service business and, like the Japanese, lag behind the United States and some European firms in establishing global platforms for software products (Cusumano 2005, 2006b).
References
Boehm, B. W. 1976. Software engineering. IEEE Transactions on Computers C-25(12): 1226–1241.
Boehm, B. W. 1981. Software Engineering Economics. Englewood Cliffs, NJ: Prentice Hall.
Boehm, B. W. 1988. A spiral model of software development and enhancement. Computer May: 61–72.
Brooks, Jr., F. P. 1975. The Mythical Man-Month: Essays on Software Engineering. Reading, MA: Addison-Wesley.
Carmel, E. and P. Tjia. 2005. Offshoring Information Technology. Cambridge, U.K.: Cambridge University Press.
Cusumano, M. A. 1991. Japan’s Software Factories. New York: Oxford University Press.
Cusumano, M. A. 1993. Objectives and context of software measurement, analysis, and control. In D. Rombach et al. (eds.), Experimental Software Engineering Issues: Critical Assessment and Future Directions, Lecture Notes in Computer Science 706. London, U.K.: Springer-Verlag.
Cusumano, M. A. 1997. How Microsoft makes large teams work like small teams. MIT Sloan Management Review 39(1): 9–20.
Cusumano, M. A. et al. 2003. Software development worldwide: The state of the practice. IEEE Software 20(6): 28–34.
Cusumano, M. A. 2004. The Business of Software. New York: Free Press.
Cusumano, M. A. 2005. The puzzle of Japanese software. Communications of the ACM 48(7): 25–27.
Cusumano, M. A. 2006a. What road ahead for Microsoft and Windows. Communications of the ACM 49(7): 21–23.
Cusumano, M. A. 2006b. Envisioning the future of India’s software services business. Communications of the ACM 49(10): 15–17.
Cusumano, M. A. 2007a. What road ahead for Microsoft the company? Communications of the ACM 50(2): 15–18.
Cusumano, M. A. 2007b. Extreme programming compared with Microsoft-style iterative development. Communications of the ACM 50(10): 15–18.
Cusumano, M. A. 2008. Managing software development in globally distributed teams. Communications of the ACM 51(2): 15–17.
Cusumano, M. A. et al. 2009. Critical decisions in software development: Updating the state of the practice. IEEE Software 26(5): 84–87.
Cusumano, M. A. 2010. Staying Power. Oxford, U.K.: Oxford University Press.
Cusumano, M. A. and C. F. Kemerer. 1990. A quantitative analysis of U.S. and Japanese practice and performance in software development. Management Science 36(11): 1384–1406.
Cusumano, M. A. and K. Nobeoka. 1998. Thinking Beyond Lean: How Multi-Project Management Is Transforming Product Development at Toyota and Other Companies. New York: Free Press.
Cusumano, M. A. and R. W. Selby. 1995. Microsoft Secrets. New York: Free Press.
Cusumano, M. A. and D. B. Yoffie. 1998. Competing on Internet Time: Lessons from Netscape and Its Battle with Microsoft. New York: Free Press.
Cusumano, M. A. and D. B. Yoffie. 1999. Software development on Internet time. IEEE Computer, Special issue on Software Engineering & Management, October: 2–11.
DeMarco, T. and T. Lister. 1987. Peopleware: Productive Projects and Teams. New York: Dorset.
Fagan, M. E. 1976. Design and code inspections to reduce errors in program development. IBM Systems Journal 15(3): 182–211.
Gilb, T. 1988. Principles of Software Engineering Management. Wokingham, England: Addison-Wesley.
Humphrey, W. S. 1989. Managing the Software Process. Reading, MA: Addison-Wesley.
Larman, C. and V. R. Basili. 2003. Iterative and incremental development: A brief history. IEEE Computer 36(6): 2–11.
Larus, J. R. et al. 2004. Righting software. IEEE Software 21(3): 92–100.
MacCormack, A. 2000. Microsoft Office 2000. Boston, MA: Harvard Business School, Multimedia Case #9-600-023.
MacCormack, A. et al. 2003. Trade-offs between productivity and quality in selecting software development practices. IEEE Software 20(5): 78–85.
McConnell, S. 1996. Rapid Development. Redmond, WA: Microsoft Press.
Parnas, D. L. 1972. On the criteria to be used in decomposing systems into modules. Communications of the ACM 15(12): 1053–1058.
Royce, W. 1970. Managing the development of large software systems. Proceedings of IEEE WESCON 26 (August): 1–9.
Striebeck, M. 2006. Ssh! We are adding a process. Proceedings of Agile 2006 Conference. Los Alamitos, CA: IEEE Computer Society, pp. 1–8.
Ulrich, K. T. and S. D. Eppinger. 2006. Product Design and Development. New York: McGraw-Hill.
Zelkowitz, M. et al. 1984. Software engineering practice in the U.S. and Japan. IEEE Computer 17(6): 57–66.