Estimating Software Costs - GAO Cost Estimating GAO. Best Practices for Developing and Managing Capital Program Costs

Chapter 12

Software is a key component in almost all major systems the federal government acquires. Estimating software development, however, can be difficult and complex. To illustrate, consider some statistics: a Standish Group International 2000 report showed that 31 percent of software programs were canceled, more than 50 percent overran original cost estimates by almost 90 percent, and schedule delays averaged almost 240 percent.[43] Moreover, the Standish Group reported that the number of software development projects that are completed successfully on time and on budget, with all features and functions as originally specified, rose only from 16 percent in 1994 to 28 percent in 2000.[44]

Most often, creating an estimate based on an unachievable schedule causes software cost estimates to be far off target. Playing into this problem is an overwhelming optimism about how quickly software can be developed. This optimism stems from a lack of understanding of how staffing, schedule, software complexity, and technology all interrelate. Furthermore, optimism about how much savings new technology can offer and how much reuse can be leveraged from existing programs also causes software costs to be underestimated. Case study 37 gives an example.

Case Study 37: Underestimating Software, from Space Acquisitions, GAO-07-96

The original estimate for the Space Based Infrared System for nonrecurring engineering, based on actual experience in legacy sensor development and assumed software reuse, was significantly underestimated. Nonrecurring costs should have been two to three times higher, according to historical data and independent cost estimators. Program officials also planned on savings from simply rehosting existing legacy software, but those savings were not realized because all the software was eventually rewritten. It took 2 years longer than planned to complete the first increment of software.

GAO, Space Acquisitions: DOD Needs to Take More Action to Address Unrealistic Initial Cost Estimates of Space Systems, GAO-07-96 (Washington, D.C.: Nov. 17, 2006).

Our work has also shown that the ability of government program offices to estimate software costs and develop critical software is often immature. Therefore, we highlight software estimation as a special case of cost estimation because of its significance and complexity in acquiring major systems. This chapter supplements the steps in cost estimating with what is unique in the software development environment, so that auditors can better understand the factors that can lead to software cost overruns and failure to deliver required functionality on time. Auditors should remember that all the steps of cost estimating have to be performed for software just as they have to be performed for hardware.

[43] Daniel D. Galorath, Software Projects on Time and within Budget—Galorath: The Power of Parametrics, PowerPoint presentation, El Segundo, California, n.d., p. 3. http://www.galorath.com/wp/software-project-failure-costs-billions-better-estimation-planning-can-help.php.

[44] Jim Johnson and others, “Collaboration: Development and Management—Collaborating on Project Success,” Software Magazine,

The 12 steps of cost estimating described in chapter 1 and summarized in table 15 also apply to software. That is, the purpose of the estimate and the estimating plan should be defined in steps 1 and 2, software requirements should be defined in step 3, the effort to develop the software should be defined in step 4, GR&As should be established in step 5, relevant technical and cost data should be collected in step 6, and a method for estimating the cost for software development and maintenance should be part of the point estimate in step 7. Moreover, sensitivity analysis in step 8, risk and uncertainty analysis in step 9, documenting the estimate in step 10, presenting results to management in step 11, and updating estimates with actual costs in step 12 are all relevant for software cost estimates.

Table 15: The Twelve Steps of High-Quality Cost Estimating Summarized

Step  Summary
1     Define the estimate’s purpose
2     Develop the estimating plan
3     Define the program characteristics, the technical baseline
4     Determine the estimating structure, the WBS
5     Identify ground rules and assumptions
6     Obtain the data
7     Develop the point estimate and compare it to an independent cost estimate
8     Conduct sensitivity analysis
9     Conduct a risk and uncertainty analysis
10    Document the estimate
11    Present the estimate to management for approval
12    Update the estimate to reflect actual costs and changes

Source: GAO.

In this chapter, we discuss some of the best practices for developing reliable and credible software cost estimates and fully understanding typical cost drivers and risk elements associated with software development.

Unique Components of Software Estimation

Since software is not tangible like hardware, it can be more ambiguous and difficult to comprehend. In addition, software is built only once, whereas hardware is often mass produced once design and testing are complete. Unlike hardware, for which the industry changes more slowly, software changes constantly, making it difficult to collect good data for cost estimating. Despite these differences, software estimating is otherwise similar to hardware estimating in that it follows the same basic development process.[45] For instance, both use the same types of estimating methods: analogy, engineering build-up, and parametric. Size and complexity are cost drivers for both. Finally, how quickly hardware and software can be produced depends on the developer’s capability, available resources, and familiarity with the environment.

[45] A source for more information on hardware cost estimating is the International Society of Parametric Analysts, Parametric Estimating Handbook, 4th ed.

Software is mainly labor intensive, and all the tasks associated with developing it are nonrecurring—there is no production phase. That is, once the software is developed, it is simple to produce a copy of it. How much effort is required to develop software depends on its size and complexity. Thus, estimating software costs has two basic elements—the software to be developed and the development effort to accomplish it.
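The two elements named above, size and development effort, can be related in a minimal sketch. The guide gives no formula at this point, so the linear relation, the productivity figure, the complexity multiplier, and the labor rate below are all illustrative assumptions, and the function names are hypothetical.

```python
def development_effort_pm(size, productivity_per_pm, complexity_factor=1.0):
    """Return effort in person-months for a given software size.

    size: size in any consistent unit (for example SLOC or function points).
    productivity_per_pm: size units one person completes per month (assumed input).
    complexity_factor: hypothetical multiplier; values above 1.0 mean harder software.
    """
    if size < 0 or productivity_per_pm <= 0 or complexity_factor <= 0:
        raise ValueError("size must be >= 0; productivity and complexity must be > 0")
    return size / productivity_per_pm * complexity_factor


def development_cost(effort_pm, labor_rate_per_pm):
    """Return cost as effort times a loaded labor rate per person-month."""
    return effort_pm * labor_rate_per_pm


# Illustrative numbers only: 50,000 SLOC, 250 SLOC per person-month,
# complexity multiplier 1.5, and a labor rate of 20,000 per person-month.
effort = development_effort_pm(50_000, 250, 1.5)
cost = development_cost(effort, 20_000)
print(effort, cost)
```

Because software has no production phase, the whole estimate in this sketch is nonrecurring labor; real estimates would draw productivity and complexity values from historical data rather than fixed constants.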

Estimating Software Size

Cost estimators begin a software estimate by predicting the sizes of the deliverables that must be constructed. Software sizing is the process of determining how big the application being developed will be. The size depends on many factors. For example, software programs that are more complex, perform many functions, have safety-of-life requirements, and require high reliability are typically bigger than simpler programs.

Estimating software size is not easy and depends on having a detailed knowledge about a program’s functions in terms of scope, complexity, and interactions. Not only is it hard to generate a size estimate for an application that has not yet been developed, but the software process also often experiences requirements growth and scope creep that can significantly affect size and the resulting cost and schedule estimates. Programs that do not track and control these trends typically overrun their costs and experience schedule delays. Methods for measuring size data include COSMIC (Common Software Measurement International Consortium) Functional Sizing Method, function point analysis, object point analysis, source lines of code, and use cases (described in table 16).

Table 16: Sizing Metrics and Commonly Associated Issues

Metric: COSMIC functional sizing
Description: Measures the size of software based on functional user requirements; sizes software independently of the technology to be used to implement it, focusing on practices and procedures the software must follow to meet user needs. COSMIC points are based on four different data movements: entry, exit, read, and write. Each one constitutes a COSMIC function point. The method can be used to determine the software size of various applications including business, real-time (telecommunications, process control), embedded software (cellular phones, electronics), and infrastructure software (operating system software)
Advantages: Sizing is easily understood and simplified because all data movements have the same value; sizing does not depend on data attributes; it applies to real-time and embedded systems and allows for end-user and developer viewpoints; standards exist for counting
Disadvantages: Recently developed, so benchmarking data are limited; not accurate for counting highly algorithmic software; detailed information about data movements takes time to collect; automated counting does not exist

Metric: Function point analysis
Description: Considers how many functions a program does rather than how many instructions it contains; functions typically include user inputs (add, change, delete), outputs (reports), data files to be updated by the application, interfaces with other applications, and inquiries (searches or retrievals). Each function is weighted for complexity and total count is adjusted for the effect of 14 characteristics such as data communications, transaction rate, installation ease, and whether there are multiple sites. Accurate counting requires in-depth knowledge of standards, experience, and, preferably, function point certification. Function point analysis is linked directly to system requirements and functionality, so size analysis is measured in terms users can understand. The size estimates (and resulting cost and schedule estimates) can be based on quantifiable analysis through the project life cycle as requirements change. Function points are particularly useful in many development environments that might use unified modeling language, commercial off-the-shelf components, or object-oriented approaches to software development and implementation
Advantages: Many types of data sources can be used throughout development: user or estimator interviews, requirements and design documents, data dictionaries and models, end user guides, screen captures; not dependent on language or technology; count is unaffected by language or tools used to develop the software; counts are available early in development from requirements and design specifications; nontechnical users can understand what function points are measuring; function points can be used to determine requirements creep; counts are fully documented and auditable; standards are established and reviewed often by the International Function Point Users Group; counting can be quick and efficient
Disadvantages: Counting involves subjectivity; difficult to derive requirements from top-level specifications; does not capture technical and design constraints; untrained or inexperienced people can develop inconsistent function point counts; definitions can be confusing; automated function point analysis counting does not exist; database is not as big as for source line of code counts; counts tend to underestimate algorithmic intensive systems

Metric: Object point analysis
Description: Uses integrated computer-aided software engineering tools (CASE) to count number of screens, reports, and third-generation modules for basic sizing; CASE tools take over the job of manually writing software code by using graphical user interface generators, libraries of reusable components, and other design tools. Object points focus on actors involved in the solution and any actions they must take. One benefit of using objects (i.e., actors) is that similar behaviors can be grouped into classes, allowing for behaviors from upper classes (parent) to be inherited by lower classes (children). Inheritance results in reduced coding effort; each count is weighted for complexity, summed to a total count, and adjusted for reuse
Advantages: Relies on a graphical user interface; automates manual activities; objective measures; easier calculations; accounts for reuse through inheritance
Disadvantages: Counts occur at the end of design; no standards for counting; not widely used, and therefore validated productivity metrics are not available

Metric: Reports, interfaces, conversions, extensions, and forms/workflows (RICEF/W)
Description: Commonly used to size the effort associated with implementing Enterprise Resource Planning (ERP) systems; identifies changes that need to be made to configure the ERP system so that it satisfies user needs and fits within the target operating environment. Can be used to add functionality through custom development. RICEF/W needs to be adjusted for complexity
Advantages: Represents ERP modifications and enhancements that do not require custom development
Disadvantages: Specific to ERP systems; no standards for counting; does not capture costs for integrating bolt-on functionality

Metric: Source lines of code (SLOC)
Description: Considers the volume of code required to develop the software; includes executable instructions and data declarations and normally excludes comments and blanks. Estimation is by analogy, engineering expertise, or automated code counters. SLOC sizing is particularly appropriate for projects preceded by similar ones (e.g., same language, developers, type of application); helps ensure that experience is aligned to future development. When developing lines of code counts, it is critical to define what is and is not included. When developing databases or relying on software cost models, consistency in defining what the lines of code include is key
Advantages: Widely used for many years; can be used to estimate real-time systems; easily counted, manually or by automated code counter; objective; large databases of historical program sizes are available; can obtain precise counts of existing software using the USC Code Counter
Disadvantages: No standard definition of what should be counted as lines of code (e.g., physical line vs. logical statement); different lines of code count for the same function, depending on language and programmer’s style; hard to capture lines of code for commercial off-the-shelf systems; hard to translate lines of code counts between other programming languages such as object oriented code; variations in definition make it hard to compare studies using SLOC; hard to estimate program SLOC early; emphasizes coding effort, which is small compared to overall software development effort

Metric: Use cases and use case points
Description: Defines interactions between external users and the system to achieve a goal (e.g., capture fingerprint or facial biometric to enroll applicants). A use case model describes a system’s functional requirements, consists of all users and use cases (tasks performed by the end user of a system that has a useful outcome), and identifies reuse by use case inclusions and extensions. Sizing count is arrived at by categorizing use cases as small, medium, or large and applying an average “use case points per category.” Adding a complexity factor to the sizing count based on number and types of users and transactions improves the count accuracy
Advantages: Applies to interactive end-user applications and devices users interact with; intuitive to stakeholders and development team; identifies opportunities for software reuse; traceable to development team’s plans and output; increasingly applied to real-time systems; can be mapped to test cases and business scenarios, which helps in staggered deployment
Disadvantages: Often yields an inaccurate final estimate if the system engineering process is immature and historical data are lacking; no standards for counting; developer must be using object oriented design techniques so required documentation is available; estimate cannot be done until design document with the defined use case is available; requires a design team with a great deal of experience with object oriented design

Source: DOD, NASA, SCEA, and industry.
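Two of the counting rules in table 16 can be sketched in code. The table says only that each function is "weighted for complexity" and adjusted for 14 characteristics, so the specific weights and the adjustment formula (0.65 plus 0.01 times the sum of the ratings) below are assumptions taken from commonly published IFPUG-style counting rules, not from this guide. All function and variable names are hypothetical.

```python
# Complexity weights per function type (assumed, IFPUG-style; not from table 16).
FP_WEIGHTS = {
    "input":     {"low": 3, "average": 4, "high": 6},
    "output":    {"low": 4, "average": 5, "high": 7},
    "inquiry":   {"low": 3, "average": 4, "high": 6},
    "data_file": {"low": 7, "average": 10, "high": 15},
    "interface": {"low": 5, "average": 7, "high": 10},
}

# The four COSMIC data movements named in table 16.
COSMIC_MOVEMENTS = {"entry", "exit", "read", "write"}


def unadjusted_function_points(functions):
    """Sum the complexity weight of each (function_type, complexity) pair."""
    return sum(FP_WEIGHTS[ftype][complexity] for ftype, complexity in functions)


def adjusted_function_points(functions, characteristic_ratings):
    """Adjust the raw count for the 14 general characteristics.

    Each rating runs from 0 (no influence) to 5 (strong influence); the
    adjustment factor 0.65 + 0.01 * sum(ratings) is an assumed convention.
    """
    if len(characteristic_ratings) != 14:
        raise ValueError("expected exactly 14 characteristic ratings")
    if any(not 0 <= rating <= 5 for rating in characteristic_ratings):
        raise ValueError("each rating must be between 0 and 5")
    adjustment = 0.65 + 0.01 * sum(characteristic_ratings)
    return unadjusted_function_points(functions) * adjustment


def cosmic_size(data_movements):
    """Count one COSMIC function point per data movement, all equally valued."""
    unknown = set(data_movements) - COSMIC_MOVEMENTS
    if unknown:
        raise ValueError(f"unknown data movements: {sorted(unknown)}")
    return len(data_movements)


# Illustrative inputs only.
functions = [
    ("input", "low"), ("input", "average"), ("output", "high"),
    ("data_file", "average"), ("inquiry", "low"),
]
print(unadjusted_function_points(functions))
print(round(adjusted_function_points(functions, [3] * 14), 2))
print(cosmic_size(["entry", "read", "write", "exit", "read"]))
```

The contrast is visible in the code: the function point count needs a judgment about complexity for every function, which is where the subjectivity noted in the table enters, while the COSMIC count treats every data movement as equal.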

While software sizing can be approached in many ways, none are accurate because the “size” of software is an abstract concept. Moreover, with the exception of COSMIC and function points, none of the methods table 16 describes has a controlling body for internationally standardizing the counting rules. In the absence of a universal counting convention, different places may take one of the source definitions for the basic approach and then “standardize” the rules internally. This can result in different counts. Therefore, it is critical that the sizing method used is consistent. The test of a good sizing method is that two separate individuals can apply the same rules to the same problem and yield almost the same result. Before choosing a sizing approach, one must consider the following questions of maturity and applicability:

■ Are the rules for the sizing technique rigorously defined in a widely accepted format?

■ Are they under the control of a recognized, independent controlling body?

■ Are they updated from time to time by the recognized, independent controlling body?

■ Does the controlling body certify the competency (and, hence, consistency) of counters who use their rules?

■ Are statistical data available to support claims for the consistency of counting by certified counters?

■ How long have the rules been stable?

Auditors should know a few things about software sizing. The first is that reused and autogenerated software source lines of code should be differentiated from the total count. Reused software (code used verbatim with no modifications), adapted software (code that needs to be redesigned, may need to be converted, and may need some code added), and autogenerated software provide the developer with code that can be used in a new program, but none of these comes for free, and additional effort is usually associated with incorporating them into a new program. For instance, the effort associated with reused code depends on whether significant integration, reverse engineering, and additional design, validation, and testing are required. But if the effort to incorporate reused software is too great, it may be cheaper to write the code from scratch. As a result, the size of the software should reflect the amount of effort expected with incorporating code from another source. This can be accomplished by calculating the equivalent source lines of code, which adjusts the software size count to reflect the fact that some effort is required.
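The equivalent source lines of code adjustment described above can be sketched as a weighted sum. The guide does not give the weights, so the values below are purely hypothetical placeholders expressing effort relative to writing new code; the function name and category labels are also assumptions.

```python
# Hypothetical effort weights relative to new code (1.0 = same effort as new).
# Real programs would calibrate these from historical data.
DEFAULT_WEIGHTS = {
    "new": 1.0,
    "adapted": 0.5,
    "reused": 0.2,
    "autogenerated": 0.1,
}


def equivalent_sloc(counts, weights=None):
    """Return the size in equivalent SLOC.

    counts: mapping from code category to its raw SLOC count.
    weights: mapping from category to the fraction of new-code effort needed
             to incorporate that code (assumed values if omitted).
    """
    weights = DEFAULT_WEIGHTS if weights is None else weights
    missing = set(counts) - set(weights)
    if missing:
        raise ValueError(f"no weight for categories: {sorted(missing)}")
    return sum(counts[category] * weights[category] for category in counts)


# Illustrative counts only.
counts = {"new": 20_000, "adapted": 10_000, "reused": 30_000, "autogenerated": 8_000}
print(round(equivalent_sloc(counts)))
```

With these placeholder weights, 68,000 raw lines shrink to roughly 31,800 equivalent lines. A weight at or above 1.0 for a category would signal the situation the text warns about, where incorporating the code costs as much as writing it from scratch.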

Software porting is a special case of software reuse that is getting increasing visibility in cost estimation with respect to specific technologies, such as communications systems (waveforms). Porting presents hidden pitfalls, depending on the amount of capability to be transferred from special purpose processors (such as field-programmable gate arrays). Also, the quality of software commenting and documentation and the modularity of the initial code’s design and implementation greatly affect the porting of standard code in general purpose processors. Therefore, assumptions regarding savings from reused, adapted, and autogenerated software code (for example, assuming that less effort is required and no testing is necessary) should be looked at skeptically because of the additional work to research the code and provide necessary quality checks. At a minimum, regression testing will be required before integrating the software with the hardware for this type of code.

Second, while function points generate counts for real-time software, like missile systems, they are not optimal in capturing the complexity associated with high levels of algorithmic software. Therefore, for programs that require high levels of complex processing like operating systems, telephone switching systems, navigation systems, and process control systems, estimators should base the count on COSMIC points or SLOC rather than function points to adequately capture the additional effort associated with developing algorithmic software.

Finally, choosing a sizing metric depends on the software application (purpose of the software and level of reliability needed) and the information that is available. Since no one way is best, cost estimators should work with software engineers to determine which metric is most appropriate. Since SLOC has been used widely for years as a software sizing metric, many organizations have databases of historical SLOC counts for various completed programs. Thus, source lines of code tend to be the predominant method for sizing software. If the decision is made to use historical source lines of code for estimating
