A TOOL FOR SOFTWARE PROJECT MANAGEMENT FOR ESTIMATION, PLANNING & TRACKING AND CALIBRATION

DISSERTATION

SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF TECHNOLOGY IN INFORMATION TECHNOLOGY (SOFTWARE ENGINEERING)

INDIAN INSTITUTE OF INFORMATION TECHNOLOGY (DEEMED UNIVERSITY)

DEOGHAT, JHALWA ALLAHABAD – 211011 (U.P.)

Submitted By

Nilesh Chandra Shukla

MS200511

IIIT-Allahabad

Under the Guidance of

Dr. Ratna Sanyal

Coordinator, UDL IIIT-Allahabad

INDIAN INSTITUTE OF INFORMATION TECHNOLOGY

Allahabad

(Deemed University)

(A Center of Excellence in Information Technology Established by Govt. of India)

DATE:___________________________

I DO HEREBY RECOMMEND THAT THIS THESIS PREPARED UNDER MY SUPERVISION BY NILESH CHANDRA SHUKLA ENTITLED A TOOL FOR SOFTWARE PROJECT MANAGEMENT FOR ESTIMATION, PLANNING & TRACKING AND CALIBRATION BE ACCEPTED IN PARTIAL FULFILLMENT OF THE REQUIREMENT OF THE COMPLETION OF MASTER OF TECHNOLOGY IN INFORMATION TECHNOLOGY (SOFTWARE ENGINEERING) PROGRAM FOR EXAMINATION.

COUNTERSIGNED

Dr. Ratna Sanyal
THESIS ADVISOR

Dr. U. S. Tiwary
DEAN ACADEMIC

CERTIFICATE OF APPROVAL

The foregoing thesis is hereby approved as a creditable study in the area of Information Technology carried out and presented in a manner satisfactory to warrant its acceptance as a pre-requisite to the degree for which it has been submitted. It is understood that by this approval the undersigned do not necessarily endorse or approve any statement made, opinion expressed or conclusion drawn therein but approve the thesis only for the purpose for which it is submitted.

COMMITTEE ON FINAL EXAMINATION FOR EVALUATION OF THE THESIS

Declaration

This is to certify that this thesis work entitled “A Tool for Software Project Management for Estimation, Planning & Tracking and Calibration”, which is submitted by me in partial fulfillment of the requirement for the completion of M.Tech. in Information Technology, specialization in Software Engineering, to Indian Institute of Information Technology, Allahabad, comprises only my original work, and due acknowledgement has been made in the text to all other material used.

Nilesh Chandra Shukla
M.TECH (Information Technology)
Specialisation in Software Engineering

Abstract

Software engineering is the discipline that paves the way for developing software within the given schedule and effort and with the desired quality. The process begins with estimating the size, effort and time required for the development of the software and ends with the product and the other work products built in the different phases of development. The model based technique is one of the best techniques used for estimation. It uses different parameters for estimation. For the estimates to be accurate, these parameters need to be stratified with the organization’s past project experience; failing to do so leads to wrong estimates and consequently to a software crisis. Handling the large volume of data for these processes is a tiresome task.

The tools available for automating some of these activities are a great help in the whole development process. However, these tools isolate the processes of estimation, planning & tracking, and calibration. Secondly, software engineering is a nascent discipline, and the metrics introduced for quantifying the attributes of software are still not sufficient to brush off experts’ judgments.

The goal of this thesis is to develop a web based tool that integrates the estimation (using the COCOMO II model), planning & tracking, and calibration processes. The emphasis of the calibration process is on combining experts’ judgments with the organization’s past project experience.

Acknowledgements

Completing a task is never a one-man effort. It is often the result of the valuable contributions of a number of individuals, made directly or indirectly, that help in shaping and achieving an objective. This thesis would have been nothing without the help and inputs of my supervisor, Dr. Ratna Sanyal. Her direction, supervision and constructive criticism were indeed a source of inspiration for me.

It has been a privilege to study at the Indian Institute of Information Technology, Allahabad. The first and foremost person who comes to my mind, and to whom I wholeheartedly express my deep sense of gratitude, is Dr. Winsor Brown, Assistant Director, USC Center for Software Engineering, Lecturer, Dept. of Information Technology. He was there to help me through the thick and thin of this thesis.

I express my indebtedness to my batch mates for their constant encouragement throughout the thesis. Some of them need special mention. Anand Arun Atre and Imran Khan were always there to help me when I succumbed to the mathematical intricacies of the thesis. Kamal Sawan and Vineet Chauhan are the names important to me when it comes to the logical design and programming of the tool. Pankaj Kandpal and Dinh Ngoc Lan were the masters of databases without whom the database design would have been impossible for me. Abhay Pawane, Sampath Mada, Prabhat Singh Saheja and [...] of hurdle in the thesis. There is a large and continuing debt owed by me to Dhirendra Pratap Singh, Mahindra Giri Vasi Reddy and Alkesh Patel who, despite not being from my branch, found time to teach me Matlab.

I also express my deep sense of gratitude to Mr. Balwant Singh for his efforts in giving timely support for hardware and software requirements.

Lastly, I would like to express my gratitude to my parents for their unbounded love and priceless support throughout my life.

Nilesh Chandra Shukla

25-jun-2007

TABLE OF CONTENTS

DECLARATION... I

ABSTRACT... II

ACKNOWLEDGMENT... III

1 INTRODUCTION ... 1

1.1 THE CURRENT SCENARIO... 2

1.2 DRAWBACKS IN THE CURRENT SCENARIO... 4

1.3 THE PROPOSED SOLUTION OVERVIEW... 5

1.4 BENEFITS OF PROPOSED SOLUTION... 5

1.5 SYSTEM OVERVIEW... 6

1.6 SYSTEM DEVELOPMENT PHASES... 7

1.6.1 Phase I... 7

1.6.2 Phase II... 8

1.6.3 Phase III ... 8

1.7 SYSTEM DEVELOPMENT SCHEDULE... 9

1.8 DEVELOPMENT ENVIRONMENT... 10

1.8.1 Software Required... 10

1.8.2 Hardware Required ... 10

1.9 LIST OF FUNCTIONALITIES TO BE PROVIDED BY THE PROPOSED SOLUTION... 11

1.10 NON-FUNCTIONAL REQUIREMENTS... 11

1.11 CHAPTER SUMMARY... 12

2 LITERATURE REVIEW ... 13

2.1 PROJECT MANAGEMENT... 13

2.2 PURPOSE OF PROJECT MANAGEMENT... 13

2.3 DESCRIPTION OF ESTIMATION... 15

2.3.1 Size Estimation ... 15

2.3.2 Effort Estimation ... 16

2.3.3 Schedule Estimation ... 16

2.3.4 Cost Estimation ... 17

2.4 DESCRIPTION OF PROJECT PLANNING... 17

2.5 MONITORING THE PROJECT (TRACKING)... 17

2.6 PURPOSE OF CALIBRATION... 18

2.7 FUNCTION POINT ANALYSIS... 18

2.7.1 Overview... 18

2.7.2 Types of Function Points ... 19

2.7.2.1 Development Project Function Point Count... 19

2.7.2.2 Enhancement Project Function Point Count... 19

2.7.2.3 Application Project Function Point Count... 20

2.7.3 Function point counting process ... 20

2.7.3.1 Components in function point counting ...20

2.7.3.1.1 External Inputs... 20

2.7.3.1.2 External Outputs ...21

2.7.3.1.3 External Inquiry ... 21

2.7.3.1.4 External Interface Files ... 21

2.7.3.1.5 Internal Logical Files ... 21

2.7.3.1.6 File Type Referenced, Data Element Type and Record Element Type... 21

2.7.3.2 Six Step Counting Process ... 22

2.7.4 Benefits of FPA... 24

2.7.5 When Not to Use FPA... 25

2.8 COCOMOII... 25

2.8.1 Overview... 26

2.8.2 COCOMO II over COCOMO 81... 27

2.8.3 COCOMO II Models ... 28

2.8.3.2 Early Design...29

2.8.3.2.1 Converting function points to SLOC ... 29

2.8.3.2.2 Cost Drivers ... 30

2.8.3.3 Post-Architecture ...31

2.8.4 Adjustment for Reuse... 31

2.9 DESCRIPTION OF CALIBRATION... 33

2.9.1 Multiple Regression Method... 33

2.9.2 Bayesian Analysis... 34

3 APPLICATION DEVELOPMENT ... 37

3.1 ARCHITECTURE... 37

3.2 JAVA SERVER FACES... 39

3.3 DESIGN PATTERNS USED... 40

3.3.1 Data Access Object Pattern... 40

3.3.2 Singleton Pattern... 43

3.4 MODULE SEQUENCE... 44

3.4.1 Size Estimation Module ... 44

3.4.2 Effort Estimation Module ... 44

3.4.3 Calibration Module ... 44

3.4.4 Report generation Module... 44

3.4.5 Utilities ... 45

3.5 PACKAGE ORGANIZATION... 45

3.5.1 User ... 46

3.5.2 Utility... 46

3.5.3 MailerPkg... 46

3.5.4 ClientPkg ... 46

3.5.5 ProjectPkg ... 46

3.5.6 ModulePkg... 46

3.5.7 TaskPkg ... 47

3.5.8 ScaleFactorsPkg... 47

3.5.9 ActivityPkg... 47

3.5.10 Reports... 47

3.5.11 MasterValues ... 47

3.5.12 Calibration... 47

3.6 CHAPTER SUMMARY... 47

4 DATABASE DESIGN AND ORGANIZATION... 48

4.1 FUNCTIONAL TABLES DESIGN... 48

4.2 STANDARD VALUES TABLES DESIGN... 49

4.3 DESCRIPTION OF FUNCTIONAL DATABASE TABLES... 50

4.3.1 Projects... 50

4.3.2 Modules ... 51

4.3.3 Tasks... 52

4.3.4 Size_Details ... 52

4.3.5 Transaction_fp... 53

4.3.6 Developer ... 53

4.3.7 Cost_drivers ... 54

4.3.8 Scale_Factors... 55

4.3.9 Client ... 56

4.3.10 Linked_Content... 56

4.3.11 Activity ... 56

4.3.12 Productivity_of_EM... 57

4.4 DESCRIPTION OF STANDARD VALUES DATABASE TABLES... 58

4.4.1 Phase_Ratio... 58

4.4.2 Language... 58

4.4.3 Master_Effort_Multipliers... 58

4.4.4 Master_FP_Columns... 59

4.4.5 Master_FP_Rows ... 59

4.4.6 Master_FP_Map... 60

4.4.7 Master_FP_Values ... 60

4.5 CHAPTER SUMMARY... 60

5 STUDY OF OTHER TOOLS ... 61

5.1 COSTAR... 61

5.2 CONSTRUX ESTIMATE... 62

5.3 COCOMOII1999.0 ... 64

5.4 SLIM-ESTIMATE ... 65

5.5 COMPARISON OF THE TOOLS... 66

6 UNRIVALLED FEATURES OF THE TOOL ... 68

6.1 CALIBRATION USING BAYESIAN THEOREM... 68

6.2 TRACKING... 70

6.3 REPORT GENERATION... 72

7 A ROAD AHEAD ... 74

8 CONCLUSION ... 75

APPENDIX A DELPHI METHOD ... 77

APPENDIX B REGRESSION ANALYSIS ... 79

APPENDIX C PROJECT MANAGEMENT SOFTWARE- GUI... 81

LIST OF FIGURES

FIGURE 1.1: SOFTWARE ESTIMATION TECHNIQUES... 2

FIGURE 1.2: SYSTEM OVERVIEW... 6

FIGURE 2.1: TRIANGULAR RELATIONSHIP... 14

FIGURE 2.2: DEPENDENCIES OF EI, EO, EQ, EIF AND ILF ON DET, RET AND FTR... 22

FIGURE 2.3: SIX STEP FUNCTION POINT COUNTING PROCESS... 23

FIGURE 2.4: RELATIONSHIP BETWEEN PHASES AND ESTIMATION RANGES... 26

FIGURE 2.5: EFFECT OF REUSE... 32

FIGURE 3.1: MODEL-VIEW-CONTROLLER ARCHITECTURE... 38

FIGURE 3.2: JAVA SERVER FACES ARCHITECTURE... 39

FIGURE 3.3: DATA ACCESS OBJECT PATTERN... 40

FIGURE 3.4: IMPLEMENTATION OF DATA ACCESS OBJECT PATTERN IN THE TOOL... 41

FIGURE 3.5: SEQUENCE DIAGRAM FOR DATA ACCESS OBJECT PATTERN IN THE TOOL... 42

FIGURE 3.6: SINGLETON PATTERN... 43

FIGURE 3.7: ORGANIZATION OF PACKAGES IN THE TOOL... 45

FIGURE 4.1: FUNCTIONAL TABLES DESIGN... 49

FIGURE 4.2: MASTER VALUES TABLES DESIGN... 50

FIGURE 5.1: COSTAR FROM SOFTSTAR SYSTEMS... 62

FIGURE 5.2: OUTPUT OF CONSTRUX ESTIMATE... 63

FIGURE 5.3: COCOMO1999.0 DEVELOPED BY UNIVERSITY OF SOUTHERN CALIFORNIA... 64

FIGURE 5.4: ESTIMATES GENERATED BY SLIM-ESTIMATE... 66

FIGURE 6.1: THE CALIBRATION PROCESS... 69

FIGURE 6.2: TRACKING PROCESS... 70

FIGURE 6.3: REPORT GENERATION PROCESS... 72

FIGURE A.1: SCHEMATIC REPRESENTATION OF DELPHI ESTIMATION TECHNIQUE... 78

FIGURE C.1: LOGIN SCREEN... 81

FIGURE C.2: CREATING A PROJECT... 82

FIGURE C.3: CREATING A MODULE IN THE PROJECT... 82

FIGURE C.4: CREATING A TASK IN THE PROJECT... 83

FIGURE C.5: ADDING A DOCUMENT IN THE PROJECT... 83

FIGURE C.6: RESULT OF ESTIMATION: THE ESTIMATES... 84

FIGURE C.7: CREATING A NEW USER... 84

FIGURE C.8: CREATING A NEW CLIENT... 85

FIGURE C.9: SEARCHING A PROJECT OR CLIENT... 85

FIGURE C.10: SETTING PREFERENCES FOR A USER... 86

FIGURE C.11: SETTING THE STANDARD VALUES FOR SCALE FACTORS... 86

FIGURE C.12: SETTING THE STANDARD VALUES FOR EFFORT MULTIPLIERS... 87

FIGURE C.13: SETTING THE VALUE OF EQUIVALENT SLOC PER FUNCTION POINT FOR LANGUAGES... 87

FIGURE C.14: INPUT FORM FOR ENTERING THE DAILY WORK DONE BY EACH DEVELOPER... 88

FIGURE C.15: CHANGING THE PHASE RATIO FOR EACH PHASE... 88

FIGURE C.16: TRACKING USING GANTT CHART... 89

FIGURE C.17: TIME TAKEN BY EACH ACTIVITY... 89

FIGURE C.18: SELECTING PROJECTS FOR CALIBRATION... 90

FIGURE C.19: INPUT EXPERTS’ JUDGMENT FOR CALIBRATION... 90

FIGURE C.20: INPUT EXPERTS’ JUDGMENT FOR CALIBRATION... 91

FIGURE C.21: LIST OF CLIENTS... 91

LIST OF TABLES

TABLE 2.1: EQUIVALENT SLOC PER FUNCTION POINT COUNT FOR DIFFERENT LANGUAGES ... 30

TABLE 2.2: RELATIONSHIP BETWEEN EARLY DESIGN COST DRIVER AND POST-ARCHITECTURE COST DRIVERS. ... 30

TABLE 4.1: PROJECT TABLE ... 50

TABLE 4.2: MODULE TABLE ... 51

TABLE 4.3: TASK TABLE... 52

TABLE 4.4: SIZE DETAILS TABLE ... 52

TABLE 4.5: TRANSACTION_FP TABLE... 53

TABLE 4.6: DEVELOPER TABLE... 53

TABLE 4.7: COST DRIVERS TABLE... 54

TABLE 4.8: SCALE FACTORS TABLE... 55

TABLE 4.9: CLIENT TABLE... 56

TABLE 4.10: LINKED CONTENT TABLE... 56

TABLE 4.11: ACTIVITY TABLE ... 56

TABLE 4.12: PRODUCTIVITY OF EM TABLE... 57

TABLE 4.13: PHASE RATIO TABLE ... 58

TABLE 4.14: LANGUAGE TABLE ... 58

TABLE 4.15: MASTER EFFORT MULTIPLIER TABLE... 58

TABLE 4.16: MASTER_FP_COLUMNS TABLE ... 59

TABLE 4.17: MASTER_FP_ROWS TABLE ... 59

TABLE 4.18: MASTER_FP_MAP TABLE ... 60

TABLE 4.19: MASTER_FP_VALUES TABLE... 60

TABLE 5.1: COMPARISON OF THE TOOLS ... 66

1 Introduction

Software engineering is the discipline that applies scientific and technological knowledge, through sound engineering principles, to the production of computer programs and to the requirements definition, functional specification, design description, program implementation and test methods that lead up to the code [1]. Software engineering is about engineering the software development process. It requires the highest degree of analysis and hard work, and the management of the two. With the increasing size and complexity of software, development has become a more demanding process in which even the simplest activity needs care. The problems faced in software development are cost overrun, schedule overrun and quality degradation.

At the core of these problems lies poor estimation. Wrong estimation leads to disaster in the development process. Effective estimation is essential for proper project planning and control and is one of the most critical and challenging tasks in the development process. Under-estimating a project leads to quality degradation, over-exploitation of employees and unrealistically short schedules, and hence results in missed deadlines. Over-estimating is even worse: more resources are allocated to the project than it needs, which increases its cost without any increase in scope.

Proper planning of the project and tracking of its development is the second essential task for assuring the success of the project. Once the estimates are available, the next task is to assign the tasks to individuals. Regular feedback from the development process helps in determining the status of the task and the project. Tracking gives the project manager the opportunity to handle any unexpected situation during development.

As stated earlier, estimation plays the key role in the management of the development process. It is therefore essential that the model or method being used is correct and stratified with the most recent data available, and that any standard parameters used in the method are well calibrated with the available data.

This chapter discusses the current scenario of typical software project management and its drawbacks, the proposed solution overview, the benefits of the proposed solution, the system overview, the system development phases, the system development schedule and the development environment.

1.1 The Current Scenario

In this section the current scenario for software estimation, planning and tracking, and calibration is discussed.

Figure 1.1 shows five categories of software estimation techniques in practice.

Figure 1.1: Software Estimation Techniques

[Figure: software estimation techniques in five categories. Model based: SLIM, COCOMO, SEER. Expertise based: Delphi, Rule-Based. Learning based: Neural, Case-Based. Regression based: OLS, Robust. Composite based: Bayesian (COCOMO II).]

Various estimation techniques that follow a mathematical model have been developed in the past. SLIM (Software Lifecycle Management), COCOMO (Constructive Cost Model) and SEER (System Evaluation and Estimation of Resources) are some of the model based techniques for software estimation. Project-related data is used as input to these techniques, and past projects’ data is used for calibrating the models.

When past projects’ data is not available, experts’ knowledge is used for estimation. The Delphi and rule-based techniques come under this category. The Delphi technique is based purely on the experts’ judgment, whereas the rule-based technique is adopted from the artificial intelligence domain: a set of rules works together to produce the output, i.e. the estimates.

A lot of work has been devoted to the development of learning based techniques for estimation. Neural networks, defined by three entities (the neurons, the interconnection structure and the learning algorithm), are one of the popular learning based techniques. The case-based technique is another learning based technique, in which a database of completed projects is maintained and the cost of a new project is estimated by comparing it with similar projects in the database.
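The case-based idea can be illustrated with a minimal sketch. It assumes that each completed project is described by only two features, size in KSLOC and team size, and that similarity is measured by a scaled Euclidean distance. The class PastProject, the function estimate_by_analogy and the choice of features are hypothetical and are not taken from any of the cited tools.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class PastProject:
    """A completed project kept in the case base (hypothetical structure)."""
    name: str
    size_ksloc: float
    team_size: int
    effort_pm: float  # actual effort in person-months


def estimate_by_analogy(size_ksloc, team_size, history, k=2):
    """Average the actual effort of the k most similar completed projects."""
    if not history:
        raise ValueError("history must contain at least one completed project")
    # Scale each feature by its largest value so that no feature dominates.
    max_size = max(max(p.size_ksloc for p in history), size_ksloc)
    max_team = max(max(p.team_size for p in history), team_size)

    def distance(p):
        d_size = (p.size_ksloc - size_ksloc) / max_size
        d_team = (p.team_size - team_size) / max_team
        return sqrt(d_size ** 2 + d_team ** 2)

    nearest = sorted(history, key=distance)[:k]
    return sum(p.effort_pm for p in nearest) / len(nearest)
```

With k=1 the estimate is simply the effort of the single closest project; larger values of k smooth out the influence of any one past project.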

The standard or ordinary least squares (OLS) method and robust regression are the regression methods used for estimation. Robust regression addresses outliers, the most common problem in software engineering data.
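As an illustration of the OLS approach, the sketch below fits an effort model of the form effort = a × size^b by ordinary least squares on the logarithms of both variables. The model form, the function name fit_effort_model and the units (KSLOC and person-months) are assumptions made for the example. Robust regression would replace the squared-error criterion so that outliers carry less weight.

```python
from math import exp, log


def fit_effort_model(sizes_ksloc, efforts_pm):
    """Fit ln(effort) = ln(a) + b * ln(size) by ordinary least squares.

    Returns the pair (a, b) of the model effort = a * size ** b.
    """
    if len(sizes_ksloc) != len(efforts_pm) or len(sizes_ksloc) < 2:
        raise ValueError("need at least two (size, effort) pairs")
    xs = [log(s) for s in sizes_ksloc]
    ys = [log(e) for e in efforts_pm]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all project sizes are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope in log-log space
    a = exp(mean_y - b * mean_x)       # intercept transformed back
    return a, b
```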

Model based techniques are the most widely used in the industry because they do not depend on any previous information and because they work on a fixed set of parameters pertinent to the model being used. In the model based techniques, the values of the different standard parameters are chosen according to the project being developed, and the estimates are calculated using the equations defined in the model. Various tools are available in the market for automating the process of estimation; a few of them are Construx Estimate, Costar 7d and QSM [2, 3, 4].

While planning the project development, the estimates and the productivity of the developers are taken as the baseline, and tasks are assigned to developers according to their abilities. Developers have to report the work done by them on a daily basis. This information is used to track the status of the project and answers the question “How much of the task has been completed so far?”

The model based technique depends on the parameters of the method being used. These parameters need to be calibrated according to the past data available for different projects in the organization. Calibration is important because it affects the overall process in future, and hence needs great care. The data collected is first checked for consistency, correctness and completeness. The approved data is then used for calibrating the parameters of the model, and new values are assigned to the parameters.

1.2 Drawbacks in the Current Scenario

A study of the tools [2, 3, 4] has revealed the following drawbacks in the current scenario.

1. The tools available for the above activities are isolated from each other, i.e. a tool is either an estimation tool or a planning and tracking tool.

2. The planning tools send information about the tasks assigned to individuals through e-mail, while the information pertinent to the assigned task is kept in some version control system.

3. Supporting documents and reports, such as the SRS or the design specification of the project, should be available to the people in the organization. Current tools do not have this feature.

4. During development, the management needs to keep track of the status of the project; the available tools do not have such features.

5. Reports are needed at any stage of development, another important feature absent in the available tools.

6. During calibration, past projects’ data needs to be fetched manually.

7. The method used for calibrating the tools does not incorporate the experts’ judgment in the resulting parameter values.

1.3 The Proposed Solution Overview

The major problem in the current scenario is that estimation, planning & tracking and calibration are isolated from each other, so the solution is a Project Management Software that combines these activities. The proposed system will first store the details of the projects, clients and developers, which are at present kept on paper or, if available in electronic form, are isolated from each other. The information about the projects, clients and developers will then be easily available. The system will automate the process of estimation, using the COCOMO II model [5] for effort estimation. The system will also help in tracking the status of a project by taking daily input from each developer in the organization, and will show the status in the form of a Gantt chart. The system will generate reports for the projects. While calibrating the model, the system will incorporate the experts’ judgment in the final values of the parameters of the model. The system will also give information about the activities in the organization and the time taken by each activity.

1.4 Benefits of Proposed Solution

1. Clumsy manual calculation is no longer needed for estimation.

2. Planning and tracking become simpler tasks.

3. Information about the projects, clients and developers no longer needs to be stored in other forms.

4. Activity details are easily available.

5. Reports can be generated with a single mouse click.

6. Notifications on various conditions can be customized according to the users’ choice.

7. Data for calibration is available in the tool itself, and no manual data entry is required for calibration.

8. The calibration is more accurate, and hence so is the estimation.

9. With all the information available, the management has an edge in improving the conditions of project development.

10. The solution is available at low cost.

11. The system can easily be extended to meet future requirements.

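1.5 System Overview

The proposed system, called PROJECT MANAGEMENT SOFTWARE, uses the COCOMO II method for estimation and the Bayesian method for calibration of the model. Bayesian analysis is used for testing hypotheses on the basis of available sample data. Figure 1.2 shows the system at a glance. The modules in the diagram are:

1. Size and Effort Estimation Module

2. Scheduling Module

3. Tracking Module

4. Calibration Module

Figure 1.2: System Overview

[Figure: the Size & Effort Estimation, Scheduling, Tracking and Calibration modules and a repository. Labelled flows: project details, FPA and COCOMO II factors, size, effort, daily work done (developer), old project data and new values for factors (admin).]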

All the information about the projects is fed to the estimation module, which stores it in the database for estimation and for future use. With the help of the stored standard values of the parameters used in the COCOMO II model, the module calculates the size estimates in source lines of code (SLOC), function points or adapted source lines of code (for reusable components), and the effort estimates in person-months for the intended project.

The scheduling module uses the estimated size and effort values to estimate the time required for the given project in months.
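The calculation performed by the estimation and scheduling modules can be sketched as follows. The equations are the Post-Architecture effort and schedule equations of COCOMO II. The constants A, B, C and D are the published COCOMO II.2000 values and are assumed here, since the tool keeps its own (calibrated) standard values in the database. The function names are illustrative, and the schedule formula assumes a nominal required development schedule (SCED).

```python
from math import prod

# COCOMO II.2000 constants (assumed; a calibrated tool would load its own).
A, B = 2.94, 0.91   # effort coefficient and base exponent
C, D = 3.67, 0.28   # schedule coefficient and base exponent


def effort_person_months(size_ksloc, scale_factors, effort_multipliers):
    """PM = A * Size^E * product(EM), with E = B + 0.01 * sum(SF)."""
    exponent = B + 0.01 * sum(scale_factors)
    return A * size_ksloc ** exponent * prod(effort_multipliers)


def schedule_months(effort_pm, scale_factors):
    """TDEV = C * PM^F, with F = D + 0.2 * (E - B) = D + 0.002 * sum(SF).

    effort_pm should be the effort computed with SCED at its nominal value.
    """
    exponent = D + 0.2 * 0.01 * sum(scale_factors)
    return C * effort_pm ** exponent
```

Because the scale factors enter through the exponent, their influence grows with project size, whereas each effort multiplier scales the effort linearly.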

The tracking module takes daily input from the developers about the work done by them on the tasks assigned to them. The input is stored in the database and is used for comparing the estimated time with the actual time.
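The comparison between estimated and actual time can be sketched as below. The input format (a list of developer, task and hours entries) and the function name track_progress are assumptions made for illustration; the tool itself stores these entries in its database tables.

```python
from collections import defaultdict


def track_progress(estimated_hours, work_log):
    """Compare estimated hours per task with the hours actually logged.

    estimated_hours: dict mapping task name to estimated hours.
    work_log: iterable of (developer, task, hours) daily entries.
    """
    actual = defaultdict(float)
    for _developer, task, hours in work_log:
        actual[task] += hours

    report = {}
    for task, estimate in estimated_hours.items():
        spent = actual.get(task, 0.0)
        report[task] = {
            "estimated": estimate,
            "actual": spent,
            "variance": spent - estimate,  # positive means overrun
            "percent_used": 100.0 * spent / estimate if estimate else None,
        }
    return report
```

A report of this kind is the data behind a Gantt or bar chart: each task contributes one planned bar and one actual bar.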

Once past projects’ data is available, the standard values of the parameters can be calibrated through the calibration module. The module takes the experts’ judgment obtained through the Delphi method [6] and the sample data obtained through the regression method, and combines the two using Bayesian analysis.
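The combination step can be illustrated for a single parameter. Assuming that both the experts’ judgment (the prior) and the regression result (the sample) are summarised by a mean and a variance, and assuming normal distributions, the posterior mean is the precision-weighted average of the two. This is a simplified scalar sketch and not the tool's implementation; the function name bayesian_combine is hypothetical.

```python
def bayesian_combine(prior_mean, prior_variance, sample_mean, sample_variance):
    """Combine expert judgment (prior) with sample data for one parameter.

    Returns (posterior_mean, posterior_variance) under normal assumptions.
    """
    if prior_variance <= 0 or sample_variance <= 0:
        raise ValueError("variances must be positive")
    prior_precision = 1.0 / prior_variance
    sample_precision = 1.0 / sample_variance
    posterior_variance = 1.0 / (prior_precision + sample_precision)
    posterior_mean = posterior_variance * (
        prior_precision * prior_mean + sample_precision * sample_mean
    )
    return posterior_mean, posterior_variance
```

The source with the smaller variance pulls the result towards itself: when the sample data is noisy the experts’ value dominates, and when the experts disagree widely the data dominates.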

1.6 System Development Phases

The proposed system has three development phases.

1.6.1 Phase I

Phase I was dedicated to designing the database and the system, and to developing the part which estimates the size, effort and schedule for the project, along with the programs for inserting data into the backend and manipulating it. Most of the work was done in this phase because of the intricacies of the estimation model.

An interactive and user friendly interface with an accurate estimation model was the goal of this phase.

1.6.2 Phase II

With the estimation model in place, the next point of focus was the development of the planning and tracking module. Phase II was concerned with developing the system for taking inputs from the developers, comparing them and showing them in useful forms such as Gantt charts and bar charts.

Dividing tasks into activities and then showing the details of the status of each task was the purpose of this phase.

1.6.3 Phase III

The last but most important phase was Phase III, in which the calibration method was implemented using regression analysis [7] and Bayesian analysis. An effective combination of the experts’ judgment and the available sample project data was the purpose of this phase.

1.7 System Development Schedule

The schedule followed for the development of the tool is given below.

[Gantt chart: planned versus actual activity for each milestone, July 2006 to May 2007. Milestones: Literature Survey, Study of FPA and COCOMO II; Requirement Analysis; Architecture; Detailed Design; Implementation of the Tool and Unit Testing; Integration and Testing; Final Documentation.]

1.8 Development Environment

1.8.1 Software Required

The software used in the tool is as follows:

1. Java 5 [8]

2. Java Server Faces [9]

3. MySQL-4.1.14 [10]

4. MySQL-connector-3.1.12 [11]

5. Jfreechart-1.0.2 [12]

6. chartcreator-1.2.0-RC1 [13]

7. Tomahawk-1.1.1 [14]

8. Jdom-1.0 [15]

9. Struts-1.3.5 [16]

10. JMatlink130 [17]

11. Matlab

All the software mentioned above, except Matlab, is freely available.

1.8.2 Hardware Required

The minimum hardware requirement for the software is:

1. Intel or AMD motherboard

2. Pentium IV or above processor

3. 1 GB RAM and

1.9 List of Functionalities to be Provided by the Proposed Solution

1. Client Management

Management of client details such as the name, URL, contact number, address and e-mail address, the projects from the client and the details of those projects.

2. Developer Information Management

Management of information about the developer and the tasks assigned to the developer. It includes the list of projects, modules and tasks assigned to the developer, the date of task assignment, the status of the task and the client’s name.

3. Estimation

Estimation of size, effort and time for the given project.

4. Report generation

Report creation for a given project or for a given client.

5. Tool’s Calibration

Calibration of the model used in the tool, using data from past projects and the judgment of the experts.

6. Activities summary

Summary of the activities (details of the activities are given in chapter 5) and the time given to each activity in the organization.

1.10 Non-Functional Requirements

1. Security

The tool has three privilege levels, and only an authorized person can access the facilities in the tool through his/her user id and password. Individuals can change their own passwords, and the administrator has the privilege of changing another person’s password in case that person has forgotten it.

2. User friendly

The system has a user friendly interface for accessing the various functionalities in the tool. Minimal training time is required for the tool.

3. Reliability

The system gives accurate results for the given input, and if the given input is not correct the system raises an alert.

4. Maintainability

The system is open for additions in the future.

5. Performance

Since the system is a web application, its performance should be high. Performance issues were handled during the development of the tool.

1.11 Chapter Summary

This chapter discussed the current trends in estimation, planning and tracking, and calibration, the drawbacks of the current scenario, and the proposed solution and its benefits. It presented an overview of the system developed as a result of the thesis, together with its functional and non-functional requirements. It also gave concise details of the system development phases and of the hardware and software requirements of the tool.

The next chapter discusses project management and its purpose, and estimation, planning, tracking and calibration. It also contains overviews of function point analysis, the COCOMO II model, and the regression analysis and Bayesian methods for calibrating COCOMO II.

2 Literature Review

This chapter discusses project management and its purpose, and estimation, planning, tracking and calibration. It also gives a concise description of function point analysis, the COCOMO II model, and the regression analysis and Bayesian methods for calibrating COCOMO II.

2.1 Project Management

A project is an effort put towards achieving an objective. Its mission is to deliver a constructive product or service. Project management is the organization and management of resources in such a way that all the work required to complete a project can be done within defined scope, quality, time and cost constraints [18].

2.2 Purpose of Project Management

Resources and activities are the key players in the completion of any project in any organization. The purpose of project management is first to find out the activities needed to take the project to its end, and secondly to allocate resources to these activities in a planned way. The term project management covers the following activities [18]:

1. Analysis & Design of objectives

2. Organizing the work

3. Estimating resources

4. Planning the work or objectives

5. Allocation of resources

6. Acquiring human and material resources

7. Assigning tasks

9. Controlling project execution

10. Tracking and reporting progress

11. Analyzing the results based on the facts achieved

12. Defining the products of the project

13. Forecasting future trends in the project

14. Quality Management

15. Issues Management

16. Issues solving

17. Defect prevention

18. Project Closure meets

19. Communicating to stakeholders

Project management is a vast area which includes all the activities in the above list. The scope of this thesis is limited to project estimation, planning and tracking.

The triangle of relationship of the project management is shown in figure 2.1.

Figure 2.1: Triangular Relationship

[Figure: triangle relating Time, Effort and Quality, labelled Project Management.]

Quality, effort and time are inter-related. If the project demands higher quality, it will use more resources, the effort required will be higher and the effect will percolate to time. The first challenge that project management faces is to ensure that the project is delivered within time and budget and with the desired quality. The second challenge, more crucial and grueling, is to optimize the resource requirements. These challenges make project management a taxing and conspicuous task for any organization.

2.3

Description of Estimation

Management in any project starts with estimation. Effective estimation is the backbone of the development of any project. Without effective estimates, proper project planning and tracking are impossible. If the estimates are too low, project management tries to employ more personnel in order to expedite development, which eventually results in a poor-quality product and employee dissatisfaction [19].

Basic steps in software estimation are as follows:

1. Estimation of the size of the intended project. This results in either source lines of code (SLOC) or function point counts (FPC) or new object points (NOP) for the project, but other measures for the size are also available.

2. Estimation of the effort for the project in man-months or man-hours.

3. Estimation of the schedule in calendar-months.

4. Estimation of the cost in local currency.

2.3.1

Size Estimation

A sound size estimate is a good foundation for software estimation. The information source for estimation can be the project proposal, system specification or software requirement specification. If size estimation is done in later stages, such as design or coding, then design specifications and other work products can be used as the information source [19].

1. By Analogy: If the organization has experience of similar projects, the size of the new project can be estimated with the help of that experience. This is done by dividing the new project into small modules and comparing those modules with the past project data. This method can give a nearly accurate estimate of the project size if the past projects were similar to the new one [19].

2. By Parametric Measurement: The size can be estimated by counting features of the project and using them as parameters for a parametric measurement approach such as object point analysis or function point analysis. Even if the organization has no experience of the intended kind of project, the features of the project can be used for parametric measurement.

2.3.2

Effort Estimation

Once the size estimates are available, effort can be estimated for the project. The ways by which effort could be estimated for the project are as follows [19]:

1. By Using Past Projects' Data: The best way to derive the estimates for the project is to use data from past projects. This approach assumes that the organization maintains this data properly and documents the relevant information. It also assumes that the organization has done similar projects earlier.

2. By Using Parametric Measures: If data from past projects is not available, parametric models can be used for effort estimation. These models take the features of the new project and use standard values of their parameters to calculate the effort for the software.

2.3.3

Schedule Estimation

Schedule estimation covers the number of people who will work on the project, what work they will do, and their start and end times. Once the effort estimates are available, the schedule can be laid out in calendar months.

$$\text{Schedule in months} = 3.0 \times (\text{effort-months})^{1/3}$$

Alternatively, parametric models like COCOMO can be used for estimating the calendar schedule.
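The rule-of-thumb schedule equation above can be evaluated directly. The following is a minimal sketch; the function names `schedule_months` and `average_staff` are illustrative and not part of the tool, and the average staffing figure (effort divided by schedule) is an added assumption for illustration.

```python
def schedule_months(effort_person_months: float) -> float:
    """Calendar schedule from effort: 3.0 * (effort-months)^(1/3)."""
    if effort_person_months <= 0:
        raise ValueError("effort must be positive")
    return 3.0 * effort_person_months ** (1.0 / 3.0)


def average_staff(effort_person_months: float) -> float:
    """Assumed helper: average head count implied by effort and schedule."""
    return effort_person_months / schedule_months(effort_person_months)


if __name__ == "__main__":
    for effort in (8, 27, 64):
        months = schedule_months(effort)
        staff = average_staff(effort)
        print(f"{effort} person-months: {months:.1f} months, about {staff:.1f} people")
```

Because the schedule grows with the cube root of effort, an eightfold increase in effort only doubles the calendar time under this rule.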

2.3.4

Cost Estimation

Software cost estimation involves many factors, such as hardware cost, labor cost and tools cost. How the cost is estimated depends on the organization. Labor cost for developing the software constitutes the major portion of the total cost. Once the effort in man-months is available, the cost of the software can be calculated using the salaries of the individual employees engaged in the project.
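As a minimal sketch of the labor cost calculation described above, the effort per role can be multiplied by the monthly salary of that role and summed. The role names and salary figures below are hypothetical, and hardware and tool costs are deliberately left out.

```python
def labor_cost(effort_by_role: dict, monthly_salary_by_role: dict) -> float:
    """Sum of (man-months * monthly salary) over all roles on the project."""
    return sum(
        months * monthly_salary_by_role[role]
        for role, months in effort_by_role.items()
    )


if __name__ == "__main__":
    # Hypothetical effort split (man-months) and salaries (local currency per month).
    effort = {"developer": 10, "tester": 4}
    salary = {"developer": 50000, "tester": 40000}
    print(labor_cost(effort, salary))
```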

2.4

Description of Project Planning

Projects are expensive in terms of both time and money. With ineffective planning, even a project of moderate complexity may take far longer than necessary to complete. Careful planning before and during the development of the project helps in avoiding serious mistakes. After the first phase, when requirements collection is over, the next step is to identify the dependencies among the various modules and tasks and to lay out a road map for the development process. Assigning the right task to the right person is a major challenge in this phase. The available estimates play a key role in the whole planning process by providing information about the time and effort required for the project and for its various tasks.

2.5

Monitoring the Project (Tracking)

While a project is under development, it is necessary to collect feedback from the development process and analyze the status of the project. This helps in detecting problems that occur during development, including schedule or cost slippage, and signals them to project management so that the necessary corrective actions can be taken.

While tracking the status of a project, the estimated values are compared with the actual values collected during development. Gantt charts are the most widely used tool for such analyses.
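The comparison of estimated and actual values can be sketched as a simple variance check. This is only an illustration: the `TaskStatus` record, the percentage variance measure and the 10 percent threshold are assumptions, not the tool's actual data model.

```python
from dataclasses import dataclass


@dataclass
class TaskStatus:
    name: str
    estimated_hours: float
    actual_hours: float


def effort_variance_percent(task: TaskStatus) -> float:
    """Positive values mean the task consumed more effort than estimated."""
    return (task.actual_hours - task.estimated_hours) / task.estimated_hours * 100.0


def slipped_tasks(tasks, threshold_percent: float = 10.0):
    """Names of tasks whose overrun exceeds the (assumed) tolerance."""
    return [t.name for t in tasks if effort_variance_percent(t) > threshold_percent]


if __name__ == "__main__":
    tasks = [TaskStatus("design", 40, 42), TaskStatus("coding", 100, 130)]
    print(slipped_tasks(tasks))
```

A report of this kind is what signals a slippage to project management before it shows up in the delivery date.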

2.6

Purpose of Calibration

Model-based techniques use various parameters, pertinent to the proposed software, to estimate the project size, effort and time required for development. For example, COCOMO II uses 5 scale factors (Precedentedness, Development Flexibility, Architecture/Risk Resolution, Team Cohesion and Process Maturity) and 17 effort multipliers in estimating the effort for a project [5]. These factors have predefined values, which were set by the developers of the model on the basis of a study of the projects available when the model was being designed. Every organization has its own processes, so the appropriate values for these factors vary from company to company. Hence it is necessary to adjust the parameter values of the model according to the data of the past projects of the particular organization. This process is known as calibration. The more past data is available for calibration, the more accurate the estimates will be.

2.7

Function Point Analysis

This section describes function point analysis, the types of function points, the function point counting process, the benefits of Function Point Analysis (FPA) and a comparison between the traditional SLOC method and FPA.

2.7.1

Overview

As a system grows in size, it becomes hard to estimate the size of the software early in development. Divide and conquer has long been the preferred strategy for tackling bigger problems. Function point analysis, introduced by Allan J. Albrecht of IBM in the late 1970s, follows the divide and conquer strategy for estimating the size of any software [19].

FPA breaks the system into smaller pieces so that the intricacies of the system become more visible and can be analyzed better. Function point analysis measures the size of the software on the basis of the functionality to be provided by the software. The method quantifies the functionality of the software from information provided by the user, based on the logical design. FPA estimates the size of software in terms of function point counts (FPC), which can be converted into SLOC easily if the equivalent SLOC per unit FPC is available.

2.7.2

Types of Function Points

Size estimation is a critical issue for all kinds of projects, i.e. for development, enhancement and application projects. On the basis of these project categories, function point counts can be categorized as follows [19]:

2.7.2.1

Development Project Function Point Count

When the project is under development, the amount of information about the project varies from phase to phase. Development Project Function Point Count is useful for the projects under development. It can be used in any phase. Using FPA in every phase of development allows tracking of size overrun.

2.7.2.2

Enhancement Project Function Point Count

Every software product goes through an enhancement stage, through the addition of either a new functional requirement or a non-functional requirement. Enhancement project function point count estimates the size of enhancement projects. It helps in understanding the movement of a project from the development stage to the enhancement stage.


2.7.2.3

Application Project Function Point Count

Once the application is developed function points can be calculated to make baseline for future uses. It can be used for predicting maintenance size.

2.7.3

Function point counting process

2.7.3.1

Components in function point counting

This section gives a high-level view of the steps for counting function points. Conceptually, function point analysis considers data at two levels: data in motion and data at rest [19].

Every application has numerous elementary processes, which include various transactions for data movement. These include transactions bringing data into the application domain and transactions taking data out of the application domain. They are referred to as transaction functions.

The data maintained by the application or by another application is known as data at rest and is referred to as data functions.

Following are the types of data and transaction functions:

2.7.3.1.1 External Inputs

An External Input (EI) is a process in which data comes from outside the application domain. The data may come from an input screen or from another application. Both control data and business data are counted as EI. The input can manipulate one or more files maintained by the application. If an input performs insertion, update and deletion, it is counted as three external inputs.


2.7.3.1.2 External Outputs

A process in which derived data crosses the application boundary from inside to outside is known as an External Output (EO). Derived data here means processed data, not data obtained through simple retrieval from the external interface files or internal logical files. It is usually the result of some calculation or algorithmic operation.

2.7.3.1.3 External Inquiry

An External Inquiry (EQ) is a process with both input and output components which retrieves data either from the internal logical files or from the external interface files. An EQ does not update any internal logical files or external interface files.

2.7.3.1.4 External Interface Files

User-identified, logically related data stored outside the application boundary is known as an External Interface File (EIF). A file containing logically related data can be counted as an external interface file or as an internal logical file, but not both. Each EIF should have at least one EI or EO for it.

2.7.3.1.5 Internal Logical Files

User-identified, logically related data maintained inside the application through external inputs is known as an Internal Logical File (ILF). Each such file should have at least one external input for it.

2.7.3.1.6 File Type Referenced, Data Element Type and Record Element Type

A file type referenced (FTR) is a file referenced by a transaction. It must be either an internal logical file or an external interface file. A data element type (DET) is a unique item of information in an FTR. A DET could be information that initiates a transaction or additional information about the transaction. A record element type (RET) is a unique subgroup of data in an FTR. DET, RET and FTR are used in rating each EI, EO, EQ, EIF and ILF. The dependencies of EI, EO, EQ, EIF and ILF on DET, RET and FTR are shown in figure 2.2.

      DET  RET  FTR
EI    √    -    √
EO    √    -    √
EQ    √    -    √
ILF   √    √    -
EIF   √    √    -

Figure 2.2: Dependencies of EI, EO, EQ, EIF and ILF on DET, RET and FTR

2.7.3.2

Six Step Counting Process

The information required for counting is obtained from the software requirement specification. The steps for counting function points are as follows:

1. Identify data functions (External Interface Files and Internal Logical Files) and rate them.

2. Identify transaction functions (External Input, External Output and External Inquiry) and determine their complexity.

3. Compute unadjusted function points (UFP). The number of EI, EO, EQ, ILF and EIF at each complexity level (Simple, Average and Complex) is obtained, and each count is multiplied by the corresponding weight for that complexity level; the sum of these products is the unadjusted function point count. Details of the function point count are available in the appendices.

4. Determine the ratings of the 14 general system characteristics.

5. Calculate the value adjustment factor (VAF).

VAF = (TDI * 0.01) + 0.65

Where, TDI = Total Degree of Influence, obtained by summing the ratings of the general system characteristics.

6. Calculate the adjusted function point count (FPC).

FPC = UFP * VAF

Figure 2.3 shows the steps for counting function points.

Figure 2.3: Six step function point counting process

[Flowchart boxes: Software Requirements Specification; Identify data functions and determine the complexity (Internal Logical Files, External Interface Files); Identify transaction functions and determine the complexity (External Input, External Output, External Inquiry); Compute Unadjusted Function Points; Rate 14 General System Characteristics; Compute Value Adjustment Factor (VAF)]
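The counting steps can be sketched in a few lines of code. This is an illustrative sketch, not the tool's implementation: the weights are the standard IFPUG weights for the three complexity levels, which are not listed in this chapter (the thesis defers them to the appendices), and the level names and function names are assumptions.

```python
# Standard IFPUG weights per function type and complexity level (assumed here;
# check them against the appendices before relying on the numbers).
WEIGHTS = {
    "EI": {"simple": 3, "average": 4, "complex": 6},
    "EO": {"simple": 4, "average": 5, "complex": 7},
    "EQ": {"simple": 3, "average": 4, "complex": 6},
    "ILF": {"simple": 7, "average": 10, "complex": 15},
    "EIF": {"simple": 5, "average": 7, "complex": 10},
}


def unadjusted_function_points(counts: dict) -> int:
    """counts maps a function type to {complexity level: number of functions}."""
    return sum(
        WEIGHTS[ftype][level] * n
        for ftype, levels in counts.items()
        for level, n in levels.items()
    )


def value_adjustment_factor(gsc_ratings) -> float:
    """VAF = (TDI * 0.01) + 0.65, with TDI the sum of the 14 ratings (0 to 5)."""
    if len(gsc_ratings) != 14:
        raise ValueError("exactly 14 general system characteristics are rated")
    if any(r < 0 or r > 5 for r in gsc_ratings):
        raise ValueError("each rating must lie between 0 and 5")
    return sum(gsc_ratings) * 0.01 + 0.65


def function_point_count(counts: dict, gsc_ratings) -> float:
    """FPC = UFP * VAF."""
    return unadjusted_function_points(counts) * value_adjustment_factor(gsc_ratings)


if __name__ == "__main__":
    counts = {
        "EI": {"simple": 3, "average": 2},
        "EO": {"average": 4},
        "EQ": {"simple": 2},
        "ILF": {"average": 2},
        "EIF": {"simple": 1},
    }
    print(unadjusted_function_points(counts))
    print(round(function_point_count(counts, [3] * 14), 2))
```

Since each of the 14 ratings lies between 0 and 5, TDI ranges from 0 to 70 and the VAF from 0.65 to 1.35, so the adjustment can move the unadjusted count by at most 35 percent in either direction.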


2.7.4

Benefits of FPA

1.

Technology Independence

Function point analysis estimates the size of software on the basis of the functionality it provides, irrespective of the tool or technology used for its development. Software written in languages like COBOL or FORTRAN will have more SLOC than the same software written in Java, VB etc. However, the functionality provided would be the same, and hence the size in function point count would be the same [20].

2.

Consistency and Repeatability

The rules defined by the International Function Point Users Group (IFPUG) have increased the consistency of FPA. Even if the person counting the FPC changes, the result will be the same. Since the rules are well documented, the count can be repeated [20].

3.

Data Normalization

Because function points depend on functionality rather than on SLOC, FPA is useful for normalizing data such as cost, effort, schedule, staff and defects. For example, with measures like SLOC, the time taken by two applications cannot be compared, because the complexity of the applications may differ and SLOC does not take this into account. With FPA, in contrast, it can be concluded that application 1 took more time than application 2 to implement one function point [20].


4.

Estimation

Since function point analysis needs only the details of the functionality of the software, it is useful earlier in development than SLOC [20].

5.

Beneficial for Managers

Function points help project managers examine the project in greater depth and define the scope of the system more accurately. They also help project managers communicate to clients the cost of the enhancements and changes they propose [20].

2.7.5

When Not to Use FPA

Function points are not a suitable measure for maintenance work. Maintenance is less a development activity and more an investigation: understanding the existing product becomes the major work, rather than adding new things to it or making corrective modifications. This depends more on individual skills. A highly skilled person takes less time to understand the existing code and identify the problem in it. FPA, however, measures functionality rather than individual performance, and hence is not useful under such circumstances [19].

2.8

COCOMO II

Budgeting, planning and tracking, risk analysis and return-on-investment analysis are some of the uses of software cost, schedule and effort estimation. COCOMO II is one of the most widely used parametric models for effort and schedule estimation. This section contains a short description of the COCOMO II model, its comparison with COCOMO 81 [5] and the effort estimation process.


2.8.1

Overview

COCOMO II was first published in the Annals of Software Engineering in 1995. The purpose behind the research on COCOMO II was to accommodate the development practices of the new generation, i.e. COTS (Commercial Off-The-Shelf) software, use of reusable components and rapid development processes. Figure 2.4 shows the range of effort estimates at different stages of development when COCOMO II is used [22].

The estimates obtained at the feasibility analysis phase may vary by a factor of 4; as development progresses, information becomes finer and the accuracy of the estimates increases [5].

Figure 2.4: Relationship between phases and Estimation Ranges

[Plot: the horizontal axis lists the phases Feasibility Analysis, Requirement Analysis, High Level Design, Detailed Design, Development and Testing; the vertical axis shows the estimation range with ticks at 4x, 2x, 1.5x, 1.25x, x, 0.75x, 0.5x and 0.25x.]


2.8.2

COCOMO II over COCOMO 81

COCOMO 81 was the model of 1980s. This section compares both the flavors of COCOMO [5].

1.

In the era of COCOMO 81, software was developed with a limited scope and reusability was not a popular concept; hence COCOMO 81 had no mechanism to account for it. COCOMO II incorporates reuse and adjusts the estimates for it.

2.

The estimation model needs to be consistent with the information available for the project. COCOMO 81 has only three modes, organic, semi-detached and embedded, which describe the nature of the project. They say nothing about the phase of development. COCOMO II has three models, application composition, early design and post-architecture, to be used according to the phase of development.

3.

COCOMO 81 gives output in the form of an exact value, which in most of the cases is not accurate. COCOMO II gives output in the form of ranges (optimistic, pessimistic and most likely) according to the phases of development, which is a better way to plan the development process.

4.

B is the exponent used in both versions of COCOMO. In COCOMO 81, B is a constant value that depends on the type of project (organic, semi-detached or embedded), whereas in COCOMO II B is the result of an equation containing five scale factors.

5.

The earlier version has 15 cost drivers for rating various attributes of the intended software whereas in the other version 17 effort multipliers are present.


2.8.3

COCOMO II Models

In the future, the software development market can be divided into the following categories [5]:

1. End-User Programming: Increased computer literacy has increased the number of end users. New tools available in the market allow users to develop their own software for simple uses or for information processing. Some examples are spreadsheets, query browsers and planning tools.

2. Application Generators: The sector that produces ready-made solutions which need to be customized according to the user.

3. Application Composition: Problems which cannot be solved through a single prepackaged solution need to be addressed by combining different reusable components. Such development comes under the category of application composition.

4. System Integration: Large-scale software that requires a high degree of systems engineering and cannot be generated by application composition comes under this category.

5. Infrastructure: The area concerned with the development of operating systems, database management systems etc. comes under this category.

The first category (end-user programming) does not need COCOMO II for estimation, because its applications have very low complexity and can be developed within hours. For the other four sectors, COCOMO II has three models of estimation.


2.8.3.1

Application composition

This model is useful for applications which cannot be generated through application generators but can be created by combining prepackaged solutions [5]. Examples are GUI builders, query browsers, database managers etc.

The model uses object points for size estimation. It estimates the size of an application on the basis of the number of screens, reports and 3GL components. The output of object point analysis is the number of object points.
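Object point counting can be sketched as a weighted sum over screens, reports and 3GL components. The complexity weights and the reuse adjustment below are the values commonly published for the COCOMO II application composition model; they are not given in this chapter, so treat them as assumptions to be verified against [5]. The function names are illustrative.

```python
# Commonly published COCOMO II object point weights (assumed, see lead-in).
OBJECT_WEIGHTS = {
    "screen": {"simple": 1, "medium": 2, "difficult": 3},
    "report": {"simple": 2, "medium": 5, "difficult": 8},
}
THREE_GL_COMPONENT_WEIGHT = 10


def object_points(screens: dict, reports: dict, three_gl_components: int) -> int:
    """Weighted count of screens, reports and 3GL components."""
    total = sum(OBJECT_WEIGHTS["screen"][lvl] * n for lvl, n in screens.items())
    total += sum(OBJECT_WEIGHTS["report"][lvl] * n for lvl, n in reports.items())
    return total + THREE_GL_COMPONENT_WEIGHT * three_gl_components


def new_object_points(op: float, reuse_percent: float) -> float:
    """Object points left to build after discounting the reused share."""
    if not 0 <= reuse_percent <= 100:
        raise ValueError("reuse_percent must lie between 0 and 100")
    return op * (100 - reuse_percent) / 100


if __name__ == "__main__":
    op = object_points({"simple": 4, "medium": 2}, {"medium": 2}, 1)
    print(op, new_object_points(op, 25))
```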

The effort in person-months for the application can be calculated as:

$$PM = \frac{NOP}{PROD}$$

Where,

PM = effort in person-months

NOP = new object points

PROD = productivity rate, which depends on the developers' experience and capability

2.8.3.2

Early Design

This model can be used for the application generators, system integration and infrastructure development sectors. The model is used early in development, when very little is known about the project. The model uses unadjusted function points for size estimation. Size estimation using function points is explained in chapter 4.

2.8.3.2.1 Converting function points to SLOC

The COCOMO II early design and post-architecture models use SLOC in effort estimation. Hence the unadjusted function points need to be converted into equivalent SLOC. This conversion is performed on the basis of published tables for different languages, such as Table 2.1 given below:

Table 2.1: Equivalent SLOC per function point count for different Languages

Language          Equivalent SLOC
C                 75
C++               53
COBOL             107
DELPHI 5          18
HTML              14
JAVA 2            46
SQL DEFAULT       13
VISUAL BASIC 6    24

2.8.3.2.2 Cost Drivers

COCOMO II uses 17 cost drivers for the adjustment of effort. The Early Design model uses a reduced set of cost drivers in Equation 1. These cost drivers are obtained by combining cost drivers of the post-architecture model. If the rating of a cost driver falls between two levels, the level nearer to nominal is selected, i.e. if the rating of a driver is between very low and low, then low is selected. Table 2.2 below shows the relationship between the early design cost drivers and their post-architecture counterparts.

Table 2.2: Relationship between early design cost driver and post-architecture cost drivers.

Early Design Cost Driver                      Post-Architecture Counterparts
Product Reliability and Complexity (RCPX)     RELY, DATA, CPLX, DOCU
Required Reuse (RUSE)                         RUSE
Platform Difficulty (PDIF)                    TIME, STOR, PVOL
Personnel Experience (PREX)                   AEXP, PEXP, PCON
Facilities (FCIL)                             TOOL, SITE
Schedule (SCED)                               SCED

2.8.3.3

Post-Architecture

This model is suitable for the application generators, system integration and infrastructure development sectors. It has the same granularity as COCOMO 81 and uses all 17 cost drivers for estimation. The model uses unadjusted function points and source lines of code as size measures. Scale factors are used in the same form in both the early design and post-architecture models.

In summary, COCOMO II provides three models for the application generators, system integration and infrastructure sectors. For the early phases of the spiral model, where prototyping is one of the major activities, application composition is the most suitable model. The next phase includes exploring design alternatives, when better-quality data is available; the early design model fits this phase. When the project is ready for development and the maximum information is available for determining the values of the cost drivers, the post-architecture model is the best option.

2.8.4

Adjustment for Reuse

COCOMO II adjusts the nominal effort for reuse by adding the equivalent size of the reused code to the size of the task. Function points or source lines of code are used as the size metrics for this adjustment. The early design and post-architecture models both follow the same method for adjusting for reuse.

Figure 2.5 shows the effect of reuse on the cost of the software. The dotted line shows the usual assumption about the relationship between cost and the amount of modification in the reusable component, and the solid line shows the actual relationship between the two. The diagram shows that the solid line does not start from the origin but from some point above it. This difference is due to the assessment and assimilation effort required for the reusable component [5].

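Figure 2.5: Effect of Reuse

[Plot: relative cost (vertical axis) against amount modified (horizontal axis), both with ticks at 0.25, 0.5, 0.75 and 1.0; the value 0.046 is marked on the relative cost axis, and the dotted line is labeled "Usual Linear Assumption".]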

COCOMO II uses ASLOC (Adapted Source Lines of Code) for estimating reuse, along with three modification factors: Design Modified (DM), Code Modified (CM) and Interface Modified (IM). Other factors affecting the estimates for reuse are as follows:

1. SU (Software Understanding): Describes the structure, application clarity and self-descriptiveness of the component being reused.

2. AA (Assessment and Assimilation): It shows the amount of work required for searching the appropriate component for the reuse. It includes test and evaluation effort for the component.

3. UNFM (Unfamiliarity with the component).

The estimates for reuse are calculated with the following formulae:

$$AAF = 0.4(DM) + 0.3(CM) + 0.3(IM)$$

$$ESLOC = \frac{ASLOC\,[AA + AAF(1 + 0.02(SU)(UNFM))]}{100}, \quad AAF \le 0.5$$

Where,

AAF = Adaptation Adjustment Factor

ESLOC = Equivalent Source Lines of Code
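The reuse adjustment can be sketched as follows. The units are an assumption, since the text does not state them: DM, CM, IM, AA and SU are taken as percentages and UNFM as a value between 0 and 1, so the bound of 0.5 in the text corresponds to 50 on the percentage scale. The third term of AAF is read as the Interface Modified factor (IM) named in the text. The case of AAF above the bound is not given in this section, so the sketch refuses it rather than guess.

```python
def adaptation_adjustment_factor(dm: float, cm: float, im: float) -> float:
    """AAF = 0.4*DM + 0.3*CM + 0.3*IM, all inputs in percent (assumed units)."""
    return 0.4 * dm + 0.3 * cm + 0.3 * im


def equivalent_sloc(asloc: float, dm: float, cm: float, im: float,
                    aa: float, su: float, unfm: float) -> float:
    """ESLOC = ASLOC * [AA + AAF * (1 + 0.02 * SU * UNFM)] / 100."""
    aaf = adaptation_adjustment_factor(dm, cm, im)
    if aaf > 50:
        # The formula for larger modifications is not covered in this section.
        raise ValueError("AAF above 50 percent is outside the covered case")
    return asloc * (aa + aaf * (1 + 0.02 * su * unfm)) / 100


if __name__ == "__main__":
    # Hypothetical component: 10,000 adapted SLOC with light modification.
    print(equivalent_sloc(10000, dm=10, cm=20, im=30, aa=4, su=30, unfm=0.4))
```

Even with no modification at all (DM = CM = IM = 0), the result is not zero as long as AA is positive, which matches the offset above the origin described for Figure 2.5.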

2.9

Description of Calibration

Since the scope of thesis is limited to the COCOMO II model, only the method used for the calibration of the COCOMO II model is discussed in this section.

2.9.1

Multiple Regression Method

Multiple regression [22] is a curve-fitting method. It expresses the output as a function of n predictor variables. The coefficients of the variables are then estimated using the least squares method. A regression model can be represented as:

$$y_t = \alpha_0 + \alpha_1 x_{t1} + \alpha_2 x_{t2} + \dots + \alpha_n x_{tn}$$

where $x_{t1} \dots x_{tn}$ are the values of the predictor variables for the $t$-th observation, $\alpha_0 \dots \alpha_n$ are the coefficients to be estimated, and $y_t$ is the output for the $t$-th observation.

COCOMO II model has the following form:

$$\text{Effort} = A \times [\text{Size}]^{\,1.01 + \sum_{i=1}^{5} SF_i} \times \prod_{i=1}^{17} EM_i \qquad \text{(Equation 1)}$$

Where,

A = multiplicative constant

Size = size of the software in thousands of source lines of code

SF = scale factors

EM = effort multipliers
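Equation 1 can be evaluated with a short sketch. The conversion factors are a subset of Table 2.1; the value of A, the scale factor values and the nominal effort multipliers in the usage lines are purely illustrative and not calibrated constants. The scale factors are assumed to be already expressed as exponent increments, so that they can be added to 1.01 as the equation is written.

```python
import math

# Equivalent SLOC per function point, taken from Table 2.1 (subset).
SLOC_PER_FP = {"C": 75, "C++": 53, "COBOL": 107, "JAVA 2": 46}


def ufp_to_ksloc(ufp: float, language: str) -> float:
    """Convert unadjusted function points to thousands of SLOC."""
    return ufp * SLOC_PER_FP[language] / 1000.0


def cocomo2_effort(a: float, size_ksloc: float,
                   scale_factors, effort_multipliers) -> float:
    """Effort = A * Size^(1.01 + sum(SF)) * product(EM)."""
    if size_ksloc <= 0:
        raise ValueError("size must be positive")
    exponent = 1.01 + sum(scale_factors)
    return a * size_ksloc ** exponent * math.prod(effort_multipliers)


if __name__ == "__main__":
    size = ufp_to_ksloc(200, "JAVA 2")          # 200 UFP written in Java
    scale = [0.03, 0.02, 0.04, 0.01, 0.03]      # illustrative values only
    multipliers = [1.0] * 17                    # all drivers rated nominal
    print(f"{size:.1f} KSLOC -> {cocomo2_effort(2.45, size, scale, multipliers):.1f} person-months")
```

Because the scale factors sit in the exponent, their influence grows with project size, whereas each effort multiplier scales the estimate by the same proportion regardless of size.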

The above equation can be turned into a linear equation by taking the logarithm of both sides.

$$\ln(\text{Effort}) = \alpha_0 + \alpha_1 \cdot 1.01 \cdot \ln(\text{Size}) + \alpha_2 \cdot SF_1 \cdot \ln(\text{Size}) + \dots + \alpha_6 \cdot SF_5 \cdot \ln(\text{Size}) + \alpha_7 \cdot \ln(EM_1) + \dots + \alpha_{23} \cdot \ln(EM_{17})$$
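To illustrate how the linearized form is fitted, the sketch below calibrates a reduced version of the model with only two parameters: the multiplicative constant A and a single size exponent B, so that ln(Effort) = ln(A) + B ln(Size). This simplification is an assumption made to keep the example short; the full calibration fits all 24 coefficients in the same least squares manner. The project data in the usage lines is synthetic.

```python
import math


def calibrate_constant_and_exponent(projects):
    """Least squares fit of ln(effort) = ln(A) + B * ln(size).

    projects: iterable of (size_ksloc, effort_person_months) from past projects.
    Returns the pair (A, B).
    """
    xs = [math.log(size) for size, _ in projects]
    ys = [math.log(effort) for _, effort in projects]
    n = len(xs)
    if n < 2:
        raise ValueError("at least two past projects are needed")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("projects must differ in size")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope is the size exponent
    a = math.exp(mean_y - b * mean_x)    # intercept is ln(A)
    return a, b


if __name__ == "__main__":
    # Synthetic past projects generated from A = 3.0 and B = 1.1.
    history = [(s, 3.0 * s ** 1.1) for s in (5, 20, 80, 150)]
    a, b = calibrate_constant_and_exponent(history)
    print(f"A = {a:.3f}, B = {b:.3f}")
```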

The regression method for calibration is effective [22] when:

1. The number of observations available is large relative to the number of variables in the model. Data collection has always been a challenge in software engineering, mainly because of immature processes and the lack of cost-related data released by organizations.

2. The observations are free of outliers. In contrast, software engineering data is full of unexpected cases and has a large number of outliers.

3. The independent variables are not highly correlated with each other. Cost-related data is obtained from past projects and not from controlled experiments, hence the correlation among the variables is high.

The data available for the COCOMO II model does not satisfy all of the above conditions. This results in incorrect calibration of the model and hence finally leads to wrong estimates.

2.9.2

Bayesian Analysis

Software engineering is a new and emerging field; the metrics available for measuring software attributes are incomplete and not as accurate as the metrics in other areas. In addition, data collection has often been carried out carelessly. Under these circumstances, expert judgment cannot be disregarded and is arguably even more important for estimation. Bayesian analysis combines the output of the regression method with the experts' judgment and thus helps in the calibration process.

Bayesian analysis is used for testing the truth of a hypothesis. In this method, observations are used to decide whether the hypothesis is true or false. The general form of Bayes' theorem is as follows [23]:

$$P(H_0 \mid E) = \frac{P(E \mid H_0)\, P(H_0)}{P(E)}$$

Where,

H0 is the hypothesis, called the null hypothesis, that has to be tested against the observations.

P(H0|E) = Posterior probability of the hypothesis.

P(E|H0) = Conditional probability of the evidence when H0 is given.

P(E) = Marginal probability, i.e. the probability of E under all the mutually exclusive hypotheses.

P(H0) = Prior probability of H0.
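A small numeric sketch of Bayes' theorem for a hypothesis and its complement is given below. The marginal probability P(E) is expanded over the two mutually exclusive hypotheses, H0 and not H0; the probabilities in the usage lines are invented for illustration.

```python
def posterior_probability(prior_h0: float,
                          p_e_given_h0: float,
                          p_e_given_not_h0: float) -> float:
    """P(H0|E) = P(E|H0) * P(H0) / P(E), with P(E) summed over H0 and not H0."""
    evidence = p_e_given_h0 * prior_h0 + p_e_given_not_h0 * (1.0 - prior_h0)
    if evidence == 0:
        raise ValueError("the evidence has zero probability under both hypotheses")
    return p_e_given_h0 * prior_h0 / evidence


if __name__ == "__main__":
    # Illustrative numbers: a prior belief of 0.3 strengthened by the evidence.
    print(round(posterior_probability(0.3, 0.8, 0.2), 3))
```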

This method can be used for incorporating the experts’ judgment with the value obtained from the past data. The final values would be the combination of the two sources of information i.e. the sample data and the experts’ judgment [22].

Posterior = Sample × Prior

Sample information would be the values estimated from the past projects’ data and prior information would be the information collected from the experts (Delphi method).

Obtaining Sample Information

The regression method is used for obtaining the sample information. The productivity ranges of all the factors of COCOMO II are used for this calculation. Regression analysis is performed on the past projects' data. Using the normal probability density function on the projects' data, the probabilities of the sample values are calculated. These values represent the new values of the productivity ranges for the factors in the COCOMO II model.

Productivity Range = Highest Rating / Lowest Rating


Once the new productivity ranges are available, the next step is to collect the experts' judgment. The Delphi method is used for collecting it. Using the mean of the responses received from the experts and the deviation in their values, new productivity ranges are defined. These values form the prior information in the Bayesian analysis.

Combining The Prior And The Sample Information

The values obtained from the regression method and from the experts are used in Bayes' theorem. The output gives the probability that the value given by the experts is the accurate new value for the productivity range of the given factor.
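One common way to combine the two sources, assumed here for illustration rather than taken from the tool, is to treat the sample estimate and the expert estimate of a productivity range as normal distributions. The posterior mean is then the average of the two means weighted by their precisions (the inverse of their variances), so the less uncertain source dominates. The numbers in the usage lines are invented.

```python
def combine_prior_and_sample(sample_mean: float, sample_variance: float,
                             prior_mean: float, prior_variance: float):
    """Precision-weighted combination of two normal estimates.

    Returns (posterior_mean, posterior_variance).
    """
    if sample_variance <= 0 or prior_variance <= 0:
        raise ValueError("variances must be positive")
    sample_precision = 1.0 / sample_variance
    prior_precision = 1.0 / prior_variance
    total_precision = sample_precision + prior_precision
    mean = (sample_mean * sample_precision + prior_mean * prior_precision) / total_precision
    return mean, 1.0 / total_precision


if __name__ == "__main__":
    # Regression suggests 1.5 (noisy data); the Delphi experts agree on 1.3.
    mean, variance = combine_prior_and_sample(1.5, 0.04, 1.3, 0.01)
    print(round(mean, 3), round(variance, 4))
```

With noisy project data the result stays close to the experts' value, and it moves towards the regression value as more consistent data reduces the sample variance.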

2.10

Chapter Summary

This chapter has briefly discussed project management, its purpose, and estimation, planning, tracking and calibration. It has also presented function point analysis, the COCOMO II model, and the regression and Bayesian methods for calibrating COCOMO II in a simple way.

The next chapter contains the architecture of the tool and the design patterns used in the tool along with the packages created in the tool and their organization.


3

Application development

This chapter discusses the architecture of the tool, the design patterns used in it, and the packages together with the services they provide.

3.1

Architecture

The proposed tool follows the Model-View-Controller (MVC) architecture, shown in Figure 3.1. MVC is an architectural pattern used in software design. As software has grown in scale, its development has become more complex. The most volatile part of any software is the user interface, because it is the face that is directly visible to the user. In a highly coupled design it is very difficult to make even the smallest change in the code. Separating the business logic from the user interface is therefore a major concern for any designer.

The MVC Pattern decouples the Model, View and Controller, and hence is a solution with the desired flexibility. It is often appropriate if one or more of the following statements are true [24]:

1. Different representations of the same application data, such as a table representation and a graph representation, are needed.

2. Different Graphical User Interfaces (GUIs) are needed, perhaps for different environments (different operating systems), without affecting the rest of the application.

3. Events generated by the user must immediately update application data or other components of the application, while changes in application data must be reflected in the user interface components immediately.

4. Reusability of one or more GUI components is needed, independent of the application data.


Figure 3.1: Model-View-Controller Architecture

There are three main players in MVC architecture:

1. Model: Represents the application data and functional logic in the form of a component.

2. View: The part visible to the user is known as the view. A model can have more than one view according to the choice of the users. For example, data can be represented in the form of tables or as charts.

3. Controller: The view generates user events according to the actions of the users. The controller is responsible for processing these events and for any action taken in response. This may or may not result in manipulation of the user interface.

Views rely on the Model to display and render information to users but they do not change the Model directly. When changes occur in the Model, Views are notified and may then query the Model for additional information. This provides Views with the opportunity to immediately synchronize themselves with changes in the Model [24].

Views and Controllers are loosely coupled with the Model via this change notification mechanism. Views and Controllers register themselves with the Model, which in turn keeps an internal list of registered observers of changes. When changes to the Model occur, Views and Controllers are notified as necessary.
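The change notification mechanism described above can be illustrated with a minimal Java sketch. It is not taken from the tool itself: the names EstimateModel, ModelObserver, TableView, BarView and EstimateController, and the single effort value held by the model, are hypothetical and chosen only for illustration. For brevity, only the Views register as observers here; a Controller could register in the same way.

```java
import java.util.ArrayList;
import java.util.List;

/** Anything that wants to be told about Model changes (hypothetical name). */
interface ModelObserver {
    void modelChanged(EstimateModel model);
}

/** Model: holds the application data and an internal list of registered observers. */
class EstimateModel {
    private final List<ModelObserver> observers = new ArrayList<>();
    private double effortPersonMonths; // illustrative application data

    void register(ModelObserver observer) {
        observers.add(observer);
    }

    double getEffortPersonMonths() {
        return effortPersonMonths;
    }

    void setEffortPersonMonths(double value) {
        this.effortPersonMonths = value;
        // Notify every registered observer that the Model has changed.
        for (ModelObserver observer : observers) {
            observer.modelChanged(this);
        }
    }
}

/** View 1: renders the data as a table-like line of text. */
class TableView implements ModelObserver {
    private String rendered = "";

    @Override
    public void modelChanged(EstimateModel model) {
        // The View queries the Model; it never modifies it.
        rendered = "Effort (person-months) | " + model.getEffortPersonMonths();
    }

    String getRendered() {
        return rendered;
    }
}

/** View 2: renders the same data as a crude text bar chart. */
class BarView implements ModelObserver {
    private String rendered = "";

    @Override
    public void modelChanged(EstimateModel model) {
        long length = Math.round(model.getEffortPersonMonths());
        StringBuilder bar = new StringBuilder();
        for (long i = 0; i < length; i++) {
            bar.append('#');
        }
        rendered = bar.toString();
    }

    String getRendered() {
        return rendered;
    }
}

/** Controller: processes user events and updates the Model when the input is valid. */
class EstimateController {
    private final EstimateModel model;

    EstimateController(EstimateModel model) {
        this.model = model;
    }

    /** Returns true if the input was accepted; rejected input leaves the Model untouched. */
    boolean onEffortEntered(String rawInput) {
        try {
            double value = Double.parseDouble(rawInput.trim());
            if (!Double.isFinite(value) || value < 0) {
                return false;
            }
            model.setEffortPersonMonths(value);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}

class MvcDemo {
    public static void main(String[] args) {
        EstimateModel model = new EstimateModel();
        TableView table = new TableView();
        BarView bars = new BarView();
        model.register(table);
        model.register(bars);

        EstimateController controller = new EstimateController(model);
        controller.onEffortEntered("5"); // simulated user event
        System.out.println(table.getRendered());
        System.out.println(bars.getRendered());
    }
}
```

In this sketch the Model knows its Views only through the ModelObserver interface, so a further representation of the same data could be added without changing the Model or the Controller.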

3.2

Java Server Faces

Java Server Faces (JSF) is a framework introduced by Sun Microsystems [24] that follows the MVC architecture. Figure 3.2 shows the Java Server Faces architecture. The Model consists of the Java classes that implement the business logic of the software. JSP pages serve as the View. They use the Model to render data and to invoke the business logic, but they cannot operate on the Model directly. Instead, they generate events, and the Controller sits between the Model and the View, listening for these events and forwarding each request to the appropriate class. A servlet named ‘Faces Servlet’ acts as the Controller in the framework. The information about the navigations in the application and about the beans used in the application is kept loosely coupled from the