Human Resource Allocation in Software Project with Practical Considerations

(1)

Human Resource Allocation in Software Project

with Practical Considerations

Jihun Park*_{, Dongwon Seo}†_{, Gwangui Hong}‡_{, Donghwan Shin}§_, Jimin Hwa¶and Doo-Hwan Bae||

Department of Computer Science

Korea Advanced Institute of Science and Technology

Daejeon, 305–701, Korea *[email protected] †_{[email protected]} ‡_{[email protected]} §[email protected] ¶_{[email protected]} ||_{[email protected]}

Software planning is very important for the success of a software project. Even if the same developers work on the same project, the time span of the project and the quality of software may change based on the project plan. When software managers plan a software project, they strive to allocate human resources in a more e±cient way to produce a better software with less cost. The planning process is, however, time-consuming and complicated, especially when the size of the software project is large.

Many approaches have been proposed to help software project managers by providing optimal human resource allocations in terms of minimizing the cost. Previous approaches, however, only concentrated on minimizing the cost, and no existing works have considered the practical issues a®ecting project schedules in practice.

We elicited the practical considerations relating to the human resource allocation problem through discussions with a group of software project experts. The practical considerations can a®ect the project schedule in practice, but their importance has not been taken into consider-ation in previous approaches. Re°ecting the practical considerconsider-ations, we propose an approach for solving the human resource allocation problem using a genetic algorithm (GA).

We compare our approach to an approach that only considers minimization of the time span. Our evaluation shows that the proposed algorithm considers the practical considerations well, in terms of continuous allocation on relevant tasks, minimization of developer multitasking time, and balance of allocation. We also conducted a survey targeting software developers and managers, and the responses showed that practical considerations are as important as mini-mizing the cost, and our approach would be helpful to software managers. We also investigate the e®ect of weight factors and coe±cient between sub-scores, and ¯nd that it is di±cult to consider some practical considerations at the same time.

Keywords: Human resource allocation; genetic algorithm; software project. and Knowledge Engineering

Vol. 25, No. 1 (2015) 5–26

#

.

c World Scienti¯c Publishing Company

DOI:10.1142/S021819401540001X

5

Int. J. Soft. Eng. Knowl. Eng. 2015.25:5-26. Downloaded from www.worldscientific.com

(2)

1. Introduction

As the size of a software project increases, the software planning process becomes more complicated. The possible number of assignments is exponential to the number of developers and the number of tasks; software managers cannot inspect all can-didates when they plan a software project. In addition, making a good software project plan is important because an inappropriate software plan often results in low

quality software, or even results in failure of a project [15]. Since software planning is

a very important but time-consuming and di±cult process, software project man-agers can signi¯cantly bene¯t from the human resource allocation technique. The human resource allocation technique automatically allocates each developer to tasks to make an optimal project plan.

Many approaches have been proposed to deal with the human resource allocation

problem. For example, Alba et al. [1] and Chang et al. [6] suggested a genetic

al-gorithm (GA) approach for minimizing the project duration and project cost. Chang

et al. [7] suggested a ¯ne-grained representation of a human resource allocation result

and used a GA to ¯nd a schedule minimizing payment and delay penalty.

While these approaches only considered minimizing the cost in terms of time or money, they did not consider practical issues that can a®ect the actual development process when the plan is applied in the real world. We held discussions with a group of software experts to elicit practical issues that a®ect the project schedule in practice. We then suggested a GA approach re°ecting the practical considerations to maximize the development e±ciency.

Our evaluation shows that our approach considers the practical considerations well. The results from our approach show less multitasking time of developers, fewer assignments not considering task precedence, and more even allocations than the results from an approach only considering minimizing time span. While our previous

work [17] only conducted a quantitative evaluation of our approach, we conducted a

survey targeting software developers and managers, and correlation analysis between sub-scores. The survey showed that considering practical considerations is important in practice, and the results from our approach would be helpful for planning a software project. Our investigation on the e®ect of weights for each sub-score and the coe±cient between sub-scores shows that some practical considerations have nega-tive correlations; considering one issue may have a neganega-tive e®ect on the other.

The rest of this paper is organized as follows. Section 2 explains the practical

considerations for the human resource allocation problem and Sec. 3 describes our

genetic algorithm. Section4presents an evaluation of our approach, Sec.5discusses

threats to validity, Sec.6introduces related work, and Sec.7summarizes our ¯ndings.

2. Practical Considerations

By consulting with an expert group, we elicited practical issues that can a®ect the project schedule. The expert group consisted of managers and developers from a

(3)

software development and consulting company, military software experts from a research institute, and a professor and graduate students. The derived practical considerations are described below.

C1. Short project plan. The basic objective of the human resource allocation problem is to generate a project plan that can be ¯nished within the minimum time span, in order to reduce the entire project cost in terms of time and money. C2. Minimization of multitasking time. In practice, a developer can work on more than one task at the same time, whereas many studies have assumed that a

developer can work on only one task at a time [3, 9, 19]. We thus allow assigning

developers to work on multiple tasks at the same time. If a developer is involved in too many tasks at a certain moment, however, productivity will decrease since the developer cannot concentrate on one task due to the frequently switching tasks. In addition, a developer involved in multiple tasks at the same time is more likely to

introduce bugs [11].

C3. Assignment to relevant tasks. Since the task precedence relationship indi-cates closely related tasks, assigning a developer to both the pre-task and the post-task is e±cient in terms of minimizing the context-switching cost. If a developer should work on a series of tasks that are not related to each other, the developer must learn the new context for every assigned task, e.g. requirements or design of an unfamiliar module.

C4. Balance of allocation. Task size should be considered in the human resource allocation problem. On the one hand, if a few developers are assigned to a huge task, developers will be overwhelmed by the heavy workload. On the other hand, if too many developers are assigned to a small task, the high communication overhead causes ine±ciency. The sta® level (e.g. director/manager/engineer) of developers also should be considered. Developers with high sta® level have more experience than lower level developers, and as such they will manage a task rather than concentrating on the implementation, which is the main work of a low sta® level developer. To manage each task e±ciently, developers having di®erent sta® levels should be assigned together.

Previous project scheduling algorithms do not re°ect the ine±ciencies caused by these practical issues, but, in practice, project managers consider them very impor-tant. Our GA approach takes into account these practical considerations by repre-senting them as a part of the ¯tness function.

3. Human Resource Allocation with GA 3.1. Problem description

Our problem is described by the tasks and developers. A software project consists of

a set of tasks T ¼ ft1; t2; . . . ; tng. We use the task precedence graph (TPG) to

rep-resent the precedence relationship between tasks. Figure 1 shows an example of a

(4)

TPG. In Fig.1, task t1 must be completed before t2, t3, and t4 begin. Each task ti

(i¼ 1; 2; . . . ; n) is de¯ned by the following attributes.

. typei: The type of task. In our research, four task types that are the basic phases for

software process models are used: analysis, design, implementation, and testing.

. efforti: The e®ort estimated by the project manager, which is assumed to be

available as an input. The project manager can use existing e®ort estimation

techniques, such as COCOMO models [4, 5], or analogy-based software e®ort

estimation [16]. The unit of e®ort is man-hours.

. PTi: A set of preceding tasks for the task. A task cannot begin if any of the preceding

tasks is not completed. TPG is constructed based on the precedence relationship.

A set of developers D¼ fd1; d2; . . . ; dmg is allocated to tasks. We assume that the

developer information is managed by a database, and the project manager can access

it for planning a project. Developer dj (j¼ 1; 2; . . . ; m) is de¯ned by the following

attributes

. slj: The sta® level of the developer. In our problem, we assume that there are three

hierarchical sta® levels— director, manager, and engineer. Higher level developers

usually manage a task based on their experience or make a decision for a task, rather than implementing it directly themselves.

. abilityk

j: The ability of the developer for task type k. Developers have di®erent

levels of pro¯ciency for each task type. If a developer has 0.7 as the ability value on the design task type, the developer decreases 0.7 remaining man-hour per a time unit when he or she is only working on the task.

A single allocation for a task is represented by fti; fdk; . . . ; dlgg, and the

assign-ment result consists of the allocations for each task.

3.2. Genetic algorithm

A genetic algorithm (GA) is an evolutionary search-based heuristic algorithm

in-troduced by Holland [12]. Many studies tried to utilize the GA to ¯nd an optimal

solution for the human resource allocation problem, which has a very large search

Fig. 1. A Task Precedence Graph (TPG).

(5)

space [1,2,6,7]. We develop a GA approach re°ecting the practical considerations to

minimize ine±cient allocations that can delay the schedule in practice. Figure 2

shows the pseudo-code of our GA. Each step of the algorithm is explained in this section.

Representation. We de¯ne a chromosome to represent an assignment result as a

¯xed length of genes. The chromosome has jT j number of genes, where each gene

records the assigned developers for each task. For example, the ¯rst gene contains a

Fig. 2. Genetic algorithm.

(6)

set of developers who are assigned to task t1. Figure3 shows an example of a chro-mosome.

Initial population. The ¯rst step of our GA algorithm is to generate an initial population of chromosomes. The initial population comprises the populationSize number of chromosomes. For each chromosome of the initial population, we ran-domly assign a developer to a task, until every task has at least one assigned de-veloper.

Assessment. We evaluate each chromosome to identify a superior one among the population. The ¯tness function assesses a chromosome by simulating the solution and inspecting the assignment structure. Details of the ¯tness function are described

in Sec.3.3. A higher ¯tness score indicates a better solution, in terms of the practical

considerations described in Sec.2.

Selection. The selection step identi¯es superior chromosomes that should survive until the next generation and pass their genes down to the next generation. On the one hand, the elitism selection passes on the ¯ttest chromosome of the current generation to the next generation. This prevents degradation of the ¯ttest chromosome. On the other hand, the tournament selection chooses a chromosome to be a parent of the next generation. It randomly selects k (tournament size) chromosomes from the current

population, and chooses the ¯ttest one among them. In this study, k¼ 5 is used.

Crossover. The crossover step generates chromosomes for the next generation using two parent chromosomes selected by the tournament selection. We use uniform crossover, which generates a new chromosome by randomly taking each gene from the parents. We repeat the tournament selection and crossover steps to make new chromosomes until the number of chromosomes in the next generation becomes equal

to the prede¯ned population size. Figure4shows an example of the uniform crossover.

Fig. 3. An example of a chromosome.

Fig. 4. An example of the uniform crossover.

(7)

Mutation. The mutation step maintains genetic diversity. With a certain proba-bility of mutation (mutation rate), each gene is mutated using one of three mutation operators: assigning a random developer to the task, removing a random developer from the task, and replacing an assigned developer with a random developer. In this study, we use 0.05 for the mutation rate.

3.3. Fitness function

A ¯tness function assesses each chromosome to identify a superior one among the population. The ¯tness function encodes the requirements of our GA, which means better human resource allocation results in a higher ¯tness score. The formula for our ¯tness function is as follows.

FitnessScore¼ w₁ CMscore þ w₂ CEscore þ w₃ CCscore þ w₄ BAscore

Each wirepresents the weight for each sub-score, and each sub-score encodes our

practical considerations. CM (Cost Minimization) score assesses whether the project cost is minimized in terms of time, and CE (Concentration E±ciency) score repre-sents how many developers are involved in multitasking at each time unit. CM and CE scores are calculated with a scheduling simulation. CC (Concentration Consid-eration) score assesses whether developers are assigned to tasks that have precedence relationships, and BA (Balance of Allocation) score represents whether developers are evenly allocated to tasks considering the task size and the sta® level. The CM, CE, CC, and BA scores re°ect the practical consideration C1, C2, C3, and C4, respectively.

3.3.1. Scheduling simulation

We inspect the estimated time span of an assignment result and how assigned

developers perform each task, using a scheduling simulation algorithm (Fig.5). The

allocation result (variable solution) is an input of this algorithm. At the beginning of the algorithm, every task is added to remainingTasks and then the task that does not have any pre-task is added to enabledTasks. At the beginning of the while loop, the time tick increases. The start time and ¯nish time of each task are calculated with this time tick. The ¯nish time of the last task is regarded as the total time span of a project. For each task in enabledTasks, developers are assigned as recorded in solu-tion, and the task is moved to runningTasks. The remaining man-hour of each task in runningTasks is decreased by the sum of the assigned developers' ability. The ability of the developer is assumed to exist as an input. If a developer is working on more than one task, the developer's ability for each task is divided by the number of assigned tasks, because his or her capability is divided among the tasks. After de-creasing the remaining man-hours of running tasks, running tasks with remaining man-hours less than or equal to zero are moved to completedTasks. Tasks in remai-ningTasks for which every pre-task is completed are moved to enabledTasks.

(8)

3.3.2. Fitness scores

Cost Minimization (CM) Score. The CM score assesses whether the solution ¯nishes early. (C1) The formula for CM score is as follows.

CM score¼minTSðSÞ TSðSÞ minTSðSÞ ¼ X k2types X fti2T jtypei¼kgefforti maxðabilityk 1; . . . ; abilitykmÞ jDj :

The minTSðSÞ refers to the minimum time span, and TSðSÞ refers to the time

span of the solution S which can be calculated by the simulation algorithm. The minTSðSÞ is calculated with the assumption that every developer has the highest ability among developers at the given task type and there is no multitasking over-head or communication cost. The CM score becomes higher if the time span of the solution becomes lower.

Fig. 5. Scheduling simulation algorithm.

(9)

Concentration E±ciency (CE) Score. To assess the burden of multitasking (C2), we calculate the number of tasks that a developer has to deal with at each time unit. At a certain moment, a developer might work on more than one task because the assigned tasks are performed at the same time; such multitasking, however, can cause ine±ciency in practice, which delays the project progress relative to the expected schedule. The formula for the CE score is as follows.

CE score¼ 1 jDj

X

dj2D jf8 tjj#involveðdj; tÞj 1gj

P

finishTime t¼1 j#involveðdj; tÞj :

j#involveðdj; tÞj refers to the number of tasks that developer dj is involved in at

time t. For each developer, we divide the total working time units by the sum of the number of involved tasks at those time units to obtain the partial CE score. For example, if a developer performs [2, 1, 0, 4] tasks on time units 1 to 4, the developer

level CE score is calculated by 3=2 þ 1 þ 4. We average all the partial scores for each

developer to obtain the CE score. The CE score comes to 1.0 if every developer performs only one task at every time unit that he or she works.

Continuity Consideration (CC) Score. The CC score assesses how well the human resource allocation result considers the precedence relationship between tasks

(C3). We count the number of unit allocations uax¼ ftx; dxg that do not consider

continuity, in which the developer is not assigned any pre-tasks of task tx_{. The}

solution of human resource allocation S0 is represented by a set of unit allocations

(S0¼ fua1; ua2; . . . ; uang), and the CC score is calculated with the formula below.

CC score¼jfuax2 S

0_jð9ðua

kÞ 2 S0Þ ^ ðtk 2 PTxÞ ^ ðdk ¼ dxÞgj

jS0_j :

The numerator of the formula counts how many times the developer of a unit

allocation (dx_{) is assigned to any pre-task of the task (PT}x_{). The CC score comes to}

1.0 if every unit assignment considers continuity.

Balance of Allocation (BA) Score. We assess how evenly the developers are allocated to tasks considering the task size and sta® level using the BA score (C4). The

BA score uses Shannon entropy [18] which assesses the uncertainty in a distribution.

The normalized Shannon entropy is de¯ned as: H ¼ Pni¼1ðpi lognpiÞ, where piis

the probability that each element occurs (pi 0 and

Pn

i¼1pi¼ 1). The value is

maximized when all pis have the same probability, i.e. pi¼ 1=n; 8 i 2 1; 2; . . . ; n. We

calculate the BA score with the formula below.

BA score¼

X

s2staff levelEntropys jfstaff levelgj Entropys¼ X n i¼1 ðps i lognpsiÞ

(10)

ps_i ¼ #assigneds i efforti X tk2T #assigneds k effort_k :

Entropys _{represents the entropy value at each sta® level, and p}s

i represents the

normalized value for the number of assigned developers of sta® level s at task ti

(#assigneds_i) divided by the e®ort of the task (efforti). The average of the entropy

values for each sta® level is calculated to assess the BA score. The BA score becomes

1.0 if all piare the same, meaning that the number of assigned developers to a task is

proportional to the e®ort of the task.

4. Evaluation

We evaluate our GA in three ways. First, we quantitatively evaluate our approach through a comparison to the assignment results considering minimization of time span only. Second, we survey a group of software developers and managers who have experience in management of a software project. Third, we investigate the e®ects of the weight factors, and the correlation between each sub-score.

4.1. Quantitative evaluation

We assess how well our GA re°ects the practical considerations, by comparing the

result when the GA only considers cost minimization (Casetime, weights for the

¯tness score ¼ f1; 0; 0; 0g), and the result when considering all the objectives

(Caseall, weights for the ¯tness score =f0:25; 0:25; 0:25; 0:25g).

4.1.1. Experimental setup

We generate two experiment sets, Set 1 and Set 2. Set 1 and Set 2 consist of 11 tasks/5

developers and 21 tasks/7 developers respectively. Tables1and2describe a task set

Table 1. Task set T1.

ID typei efforti PTi t1 Analysis 400 t2 Design 320 1 t3 Design 240 1 t4 Design 240 1 t5 Implementation 240 2 t6 Implementation 600 3 t7 Implementation 160 4 t8 Test 100 5 t9 Test 80 6 t10 Test 70 7 t11 Test 80 8,9,10

(11)

T1with 11 tasks and a task set T2with 21 tasks. Tables3and4describe a developer

set D₁ with 5 developers and a developer set D₂ with 7 developers.

Our GA uses 100 population sizes, and we set the maximum generation count to 400. We inspect the tendency of the ¯tness value along di®erent GA parameters, and then determine the appropriate parameters under which the GA explores a su±cient search space to obtain the optimal solution.

We compare (1) the time span of the results, (2) the average time units that each developer works on more than one task, (3) the number of assignments not consid-ering the task precedence, (4) the number of tasks for which only one level developers are assigned, and (5) the mean and variance of (the number of assigned developers/

efforti) for each task. We repeat each experiment 100 times to compare the average

value. Table5summarizes the results.

Table 2. Task set T2.

ID typei efforti PTi t1 Analysis 640 t2 Design 240 1 t3 Design 480 1 t4 Design 560 1 t5 Design 400 2 t6 Design 320 2 t7 Design 240 3 t8 Implementation 400 4 t9 Implementation 400 4 t10 Implementation 480 4 t11 Implementation 480 5 t12 Implementation 400 6 t13 Implementation 400 7 t14 Implementation 240 7 t15 Implementation 320 8,9,10 t16 Implementation 320 11,12 t17 Implementation 160 13,14 t18 Test 240 16 t19 Test 240 17 t20 Test 240 15 t21 Test 400 18,19,20

Table 3. Developer set D1.

abilityk

j

ID slj analysis design implementation test

d1 3 1.25 1 1.25 1.25

d2 2 1.25 1 1.25 0.75

d3 1 0.75 0.75 1 1

d4 1 0.75 1 1 0.75

d5 1 1 0.75 0.75 1

(12)

4.1.2. Experimental results

The time span of the results. We ¯nd that Casetime takes a 6.44% and 13.56%

shorter time span than Caseallin Set 1 and Set 2, respectively. We simulate the time

span of each solution, but in practice, several issues can a®ect the actual time span, such as multitasking overhead, context switching cost, and hierarchical management e±ciency. Considering that we minimize these practical issues that can delay the project schedule, we conclude that the di®erence is acceptable.

The average multitasking time. We investigate how long developers are involved in multitasking, by studying the average time during which a developer works on more than one task. We ¯nd that the average multitasking time of each developer in

Casetime is 2.77 and 5.00 times that of Caseallin Set 1 and Set 2, respectively. High

multitasking time represents that developers should work on more than one task at

the same time for a longer time in Casetimeschedule than in Caseall.

Assignments not considering precedence relationship. We assess whether the human resource allocation result reduces the ine±ciency of context switching, by comparing the number of unit assignments (see Sec. 3.3.3) that do not consider the task precedence relationship. We ¯nd that the number of assignments not

consid-ering precedence relationship in Casetime is 1.91 and 2.11 times that of Caseall in

Set 1 and Set 2, respectively. This result indicates that developers are assigned to a

Table 5. Experiment results.

Set1 Set2

#task ¼ 11 dev ¼ 5 #task ¼ 21 dev ¼ 7

Metric All Time All Time

Time (h) 465.14 435.19 1183.95 1023.42

Multitasking Time (h) 55.16 152.86 115.94 579.46

# no precedence assignments 5.70 10.89 10.78 22.73

# tasks only one level assigned 2.91 2.65 5.82 2.87

Mean (# assigned devs/e®orti) 1.81e-02 2.52e-02 8.51e-03 1.36e-02

Variance (# assigned devs/e®orti) 1.50e-04 3.63e-04 2.84e-05 6.06e-05

Table 4. Developer set D2.

abilityk

j

ID slj analysis design implementation test

d1 3 1.25 1.25 1.25 1 d2 2 1 1.25 0.75 1.25 d3 2 1.25 0.75 1.25 1 d4 1 1 0.75 1 0.75 d5 1 0.75 0.75 1.25 1 d6 1 0.75 1 0.75 0.75 d7 1 1.25 0.75 1 0.75

(13)

task having a di®erent context from the previous task more often in Casetimethan in

Caseall.

The number of tasks for which only one sta® level of developers are allocated. To investigate whether developers having di®erent sta® levels are evenly assigned to each task, we compare the number of tasks for which only one sta® level

of developers is assigned in Casetimeand Caseall. We ¯nd that the number is higher in

Casetime than in Caseall. Because the number of high level developer is limited and

our GA minimize multitasking time in Case_all, there is not a signi¯cant di®erence in

the number between Casealland Casetime for Set 1. We ¯nd that the average

mul-titasking time and the number of tasks with only one sta® level developers are

inversely proportional. This result shows that our GA in Caseall considers the sta®

level slightly better than Casetime, but minimizes multitasking time in Caseall

sig-ni¯cantly.

The mean and variance of (the number of assigned developers/e®ort_i) for

each task. To identify how many developers are assigned to each task proportional to the e®ort, and how stable the value is, we investigate the mean and variance of #

of assigned developers/efforti. Overall, the means and variances are higher in Cas

etimethan in Caseall. This result indicates that more developers are assigned when we

only consider time, which can increase communication overhead and multitasking time. Moreover, large variances indicate there are more cases where too many developers are assigned to a small task or a small number of developers are assigned to a large task.

4.2. User study

We performed a user study to evaluate how our GA is useful for the software developers in practice. The user study consisted of three research questions: (1) What do software managers consider when they plan a software project? (2) Is the allo-cation result from our GA better than the alloallo-cation results from software managers? (3) Is the result from our algorithm useful for software managers?

We conducted a survey targeting software developers in a research institute, and obtained 16 responses. The respondents had 4.54 years' experience in software development, and management experience in 3.94 software projects on average.

4.2.1. What do software managers consider when they plan a software project? We asked what kind of issues the respondents considered when they plan a software

project. The possible choices were the four practical considerations in Sec.2, and the

respondents selected one or more issues as their answer (The respondents also could

select the option others). As Table 6 shows, 56.3% consider minimization of time,

18.8% consider minimization of multitasking time, 75.0% consider continuity, 37.5%

(14)

consider evenly assigning developers, and 6.4% considers other factors that we did not provide as an option.

The respondents believed that assigning developers to a set of relevant tasks was very important. This result indicates that the context switching cost, which involves learning and adapting to a di®erent context of a task, is very high in practice. This result is remarkable, because this continuity issue is not addressed by existing human resource allocation approaches. In addition, some respondents thought that assign-ing developers to relevant tasks was more important than minimizassign-ing the time span. This result indicates that some other factors, such as software quality or managing developers' experience, which are considered if we assign developers on relevant tasks, can a®ect the success of a software project.

4.2.2. Is the allocation result from our GA better than the allocation results from the software managers?

We asked the respondents to manually assign developers to each task in experiment Set 1 (11 tasks and 5 developers) and experiment Set 2 (21 tasks and 7 developers)

(see Sec. 4.1). We simulated the results of the respondents with the simulation

algorithm in Sec. 3.3.1, and then compared them with the results in Sec. 4.1.2

(average of 100 results of our GA).

Table7shows the results. We ¯nd that the results of our GA take less time, and

they also have less multitasking time, fewer assignments not considering precedence

Table 6. What kind of issues the software developers consider when they plan a software

project in practice.

What kind of issues do you consider when planning a software project?

Minimizing the time span of a project 56.3%

Minimizing multitasking time of each developer 18.8%

Assigning developers to a set of relevant tasks 75.0%

Assigning developers evenly considering sta® level and the e®ort of each task 37.5%

Others 6.3%

Table 7. The results of the respondents versus the results of our GA.

Set1 Set2

#task ¼ 11 dev ¼ 5 #task ¼ 21 dev ¼ 7

Metric Response Our GA Response Our GA

Time (h) 613.94 465.14 1777.07 1183.95

Multitasking Time (h) 123.4 55.16 392.06 115.94

# no precedence assignments 8.44 5.70 15.60 10.78

# tasks only one level assigned 3.25 2.91 7.67 5.82

Mean (# assigned devs/e®orti) 1.77e-02 1.81e-02 6.69e-03 8.51e-03

Variance (# assigned devs/e®orti) 1.50e-04 1.50e-04 9.27e-06 2.84e-05

(15)

relationship, and a smaller number of tasks where only one sta® level of developers is

assigned. The mean and variance of # assigned devs/e®orti are similar in the

ex-periment Set 1, and the results from the respondents show a lower number in the experiment Set 2.

These results indicate that considering the practical considerations when planning a software project manually is di±cult. The results automatically generated from our GA not only took less time but also considered the other practical considerations better. In addition, our algorithm can shorten the time to make a software project plan. Our algorithm takes only 0.5 and 2.7 minutes on average, while manual as-signment takes much more time, 8.0 and 10.9 minutes, in experiment Set 1 and Set 2, respectively.

4.2.3. Is our algorithm useful for the software managers?

After asking manually assigning developers, we showed assignment results generated from our GA to the respondents to qualitatively evaluate the assignment results

generated from our GA. The assignment results are shown in Fig.6(for experiment

Set 1) and Fig.7(for experiment Set 2). We showed assigned developers to each task

on a task precedence graph, and whether each developer was assigned to a pre-task of

Fig. 6. The assignment results for experiment Set 1.

(16)

the task. We also showed the simulated time span of the assignment results, and the multitasking time of each developer to the respondents.

We asked whether our GA re°ected the practical considerations well. We ask four questions for the two experiment set: Does the result consider well (1) minimization of project plan? (2) minimization of multitasking time? (3) assigning developers to relevant tasks? (4) balance of allocation? Possible choices range from strongly agree (5 point) to strongly disagree (1 point).

Table 8 shows the results. On average, except for the minimization of

multi-tasking time (3.1 and 3.5 in Set 1 and Set 2, respectively), the respondents agree that our GA considers the practical considerations well (3.8 to 4.4). We think that minimizing multitasking time of each developer, speci¯cally in a small size project, is

Table 8. Qualitative evaluation results.

Question Set1 Set2

Minimizing the time span of a project 3.8 4.0

Minimizing multitasking time of each developer 3.1 3.5

Assigning developers to a set of relevant tasks 4.4 3.9

Assigning developers evenly considering sta® level and the e®ort of each task 4.2 4.1

Fig. 7. The assignment results for experiment Set 2.

(17)

di±cult, because there are only one director and one manager, who should manage many tasks at the same time. The continuity consideration, which is the most im-portant practical consideration, shows 4.4 and 3.9 points on average, meaning that the respondents agree that our algorithm considers the continuity issue well.

4.3. Correlation analysis

Our GA uses the same weight (0.25) for the all sub-scores. Software managers, however, can change the weight when a speci¯c factor is more important than others. For example, if knowledge of a preceding task is very important to perform the post-task in a speci¯c project, the project manager might set a high weight on the Con-tinuity Consideration (CC) score.

We investigate how four sub-scores are changed when weights for each sub-score are changed in experiment Set 2. A higher sub-score means that the corresponding practical consideration is considered more. We change a weight for a sub-score from 0.1 to 0.9, and then evenly distribute the remaining weight value to the other sub-scores. For example, if we set the weight for the CM score to 0.1, then the other weights are set to 0.3. We average 50 experiment results for each setting of weight factors.

Figure8shows the experiment results. The X axis represents the weight factor for

a target sub-score, and the Y axis represents the sub-scores. Overall, a sub-score

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.5 0.6 0.7 0.8 0.9 1.0 Sub-score Weight of CM Score 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.5 0.6 0.7 0.8 0.9 1.0 Sub-score Weight of CC Score CM Score BA Score CC Score CE Score 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.5 0.6 0.7 0.8 0.9 1.0 Sub-score Weight of CE Score 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.5 0.6 0.7 0.8 0.9 1.0 Sub-score Weight of BA Score CM Score BA Score CC Score CE Score

Fig. 8. Sub-score values for di®erent weight factors.

(18)

increases as the weight for the sub-score increases, and the other sub-scores decrease as the weight for the sub-scores decrease. For example, as the weight for the CE score increases from 0.1 to 0.4, the CE score increases from 0.64 to 0.99 and the BA score decreases from 0.95 to 0.82. A higher weight for a sub-score means that the sub-score a®ects the total ¯tness score more, and this results in a higher value of the sub-score.

We also ¯nd that some sub-scores do not decrease even though the weight for the sub-score decreases and the weight for the other sub-score increases. As an example, when the weight for the CM score increases, the BA score does not decrease even when the weight for the BA score decreases. This result means that increment of the CM score does not require sacri¯ce of the BA score.

To investigate the relationship between each sub-score, we identify the correlation between the four sub-scores. We generate 100 assignment results from our GA with even weight factors (0.25 for all), and then calculate the Pearson correlation coe±-cient between each pair of sub-scores. Our GA maximizes the total ¯tness value disregarding the proportion of each sub-score. A positive correlation between two sub-score indicates that if one sub-score has a high value, the other is likely to have a high value. A negative correlation between two sub-score indicates that if a sub-score increases, then the other sub-score will decrease.

Table 9shows the results. If the absolute value of the Pearson correlation

coef-¯cient is less than 0.3, we can surmise that there is a negligible relationship. The CM score and CE score show a weak negative relationship (0.375), and the BA score and CE score show a rather strong negative relationship (0.672). These results mean that if we evenly distribute developers to each task, the average multitasking time increases. Because the number of high sta® level developers is limited, if we evenly assign them to the tasks, they are likely to work on more than one task at the same time. The results also indicate that some multitasking is needed to shorten the time span of a project.

5. Discussion

We use the task type to represent a characteristic of a task, rather than de¯ning a

skill set needed to perform a task. Many previous approaches [7,8,19] used skill sets

to represent the required capability required for a task, but we abstract the skill sets

Table 9. Pearson correlation coe±cient between sub-scores.

Sub-score pair Pearson correlation coe±cient

CM score and BA score 0.097

CM score and CC score 0.087

CM score and CE score 0.375

BA score and CC score 0.108

BA score and CE score 0.672

CC score and CE score 0.159

(19)

using the task type. Our assumption is that de¯ning and inputting required skill set for each task and the skill level of each developer is di±cult for project managers, especially when the number of skills is large. A project manager can use our tool by categorizing tasks into several types, and de¯ning the pro¯ciency of developers for each type, which is simpler than a skill set representation.

We generated task and developer information for our experiments (Experiment Set 1 and Set 2). Because we do not apply our algorithm to the real-world software project data, it can be a threat to validity. Our user study, however, can compensate for this threat, because real-world software developers and managers agreed that our algorithm would be useful for them.

Our approach generates only one solution, the ¯ttest one, not considering the

relative proportion of each sub-score. We can apply the MOEA approach [13], which

generates a set of non-dominated solutions when the ¯tness function comprises the weighted sum of multiple sub-scores.

Our approach is not domain speci¯c, and it can be applied to general software projects; however, survey results may be di®erent when applied to di®erent respondents. Because we conducted the user study in a research institute, software developers or managers from di®erent company or institute would consider di®erent issues as important.

As a practical issue that is derived from the discussion with a group of software experts, spare time is sometimes needed in carrying out a task or a project. We assume that a task can start right after the pre-tasks of the task are ¯nished, but software managers sometimes plan spare time after ¯nishing some tasks. Respon-dents said that unexpected events, such as change of requirements, a sick leave of a developer, or restructuring of the company, often occurred in practice. A respondent also said that 20% to 30% of spare time can improve the software quality, because developers can have enough time to prepare a task (e.g. ¯nding a new tool or methodology). The spare time can be considered in our approach by setting the e®ort of each task larger than the expected value.

6. Related Work

Duggan et al. [9] suggested a GA approach as a multi-objective evolutionary

ap-proach to consider package precedence, full team utilization, and cross-communi-cation overhead. They only considered that the successor package should be developed after the predecessor, not considering that the developers should be assigned to tasks with a precedence relationship to lessen the context switching cost.

Barreto et al. [3] suggested an optimization-based approach to ¯nd teams satisfying

constraints established by the software development organization. These work as-sumed that developers can work on only one task at a time, which is unrealistic in practice. Their approaches require ¯ne-grained project plan in which the task size is small enough to be ¯nished by a developer.

(20)

Chang et al. [7] proposed a human resource allocation technique based on a GA with the concept of a time-line. The assignment is represented by a three-dimensional array with time, tasks, and employee axes. As time elapses with the pre-de¯ned time unit, the human resources are allocated to tasks. A genetic algorithm is used to

search the schedule that minimizes payment and delay penalties. Chen et al. [8]

improved Chang et al.'s work [7] by introducing an event-based scheduler and an ant

colony optimization technique. These approaches could allocate human resources in a ¯ne-grained manner, but the allocation was often too fragmented. In addition, they only concentrated on minimizing the cost, and did not take into account the practical issues that are considered in this paper.

Gerasimou et al. [10] investigated the human resource allocation problem using a

particle swarm optimization technique. Their ¯tness function evaluated whether each task was ¯nished before the deadline and whether an appropriate developer was assigned to a task considering the ability of the developer. Their approach allowed developers to work on more than one task at the same time, but did not limit or manage the number of tasks on which a developer can work.

Kang et al. [14] proposed a constraint-based approach considering constraints

that a®ected the project schedule. Their constraints included continuous allocation to related tasks, minimization of sharing developers among tasks, restriction of the number of developers assigned to a task, and avoiding novice teams. These con-straints encoded some of the practical considerations of the present work, but they assumed that each program module can always be developed in parallel, without considering the precedence between modules and the integration of the modules.

Alba et al. [1] provided a GA approach for the human resource allocation problem

to minimize the time span and the cost of a project. The time span was calculated by summing up the working time of all developers for each task. The cost for each developer was calculated by multiplying the salary and the total working time of the developer, and the cost of the project was calculated by summing up the cost for each developer. Alba et al. considered the required skill set for each task, but they did not consider the ability level of each developer. In addition, they did not consider the task size; this could result in assigning too many developers or too few developers to a task.

7. Conclusion

Many approaches have been proposed to ¯nd optimal human resource allocations. The majority have only considered minimizing the expected cost of the plan, and not practical issues that can a®ect the actual schedule in practice. With a group of software experts, we elicited the practical considerations for the human resource allocation problem. We then designed a genetic algorithm to re°ect the practical issues, by encoding them as a part of a ¯tness function. The ¯tness function consists of the weighted sum of four ¯tness scores considering cost minimization, concen-tration e±ciency, continuity consideration, and balance of allocation.

(21)

Our quantitative evaluation shows that when the four ¯tness scores are consid-ered altogether, our GA generates a practical human resource allocation result, in terms of less multitasking time, fewer assignments not considering task precedence, and more even allocations than an algorithm where the sole objective is to minimize the time span. A survey of software developers and managers also shows that practical considerations are as important as minimizing the cost of a project, and the results of our GA will be very useful to software managers. Furthermore, we reveal the e®ects of weight factors and the correlation relationships between sub-scores.

Acknowledgments

This work was supported by the IT R&D program of MSIP/KEIT [10041313, UX-oriented Mobile SW Platform].

References

1. E. Alba and J. Francisco Chicano, Software project management with GAs, Information Sciences 177(11) (2007) 2380–2401.

2. G. Antoniol, M. Di Penta and M. Harman, Search-based techniques applied to optimi-zation of project planning for a massive maintenance project, in Proceedings of the 21st IEEE International Conference on Software Maintenance, 2005, pp. 240–249.

3. A. Barreto, M. Barros and C. Werner, Sta±ng a software project: A constraint satis-faction approach, SIGSOFT Softw. Eng. Notes 30(4) (2005) 1–5.

4. B. W. Boehm, Software Engineering Economics (1981).

5. B. W. Boehm, R. Madachy, B. Steece et al., Software Cost Estimation with COCOMO II with CD-ROM (Prentice Hall PTR, 2000).

6. C. K. Chang, M. J. Christensen and T. Zhang, Genetic algorithms for project manage-ment, Ann. Softw. Eng. 11(1) (2001) 107–139.

7. C. K. Chang, H. Yi Jiang, Y. Di, D. Zhu and Y. Ge, Time-line based model for software project scheduling with genetic algorithms, Information and Software Technology 50(11) (2008) 1142–1154.

8. W.-N. Chen and J. Zhang, Ant colony optimization for software project scheduling and sta±ng with an event-based scheduler, IEEE Transactions on Software Engineering 39(1) (2013) 1–17.

9. J. Duggan, J. Byrne and G. J. Lyons, A task allocation optimizer for software con-struction, IEEE Softw. 21(3) (2004) 76–82.

10. S. Gerasimou, C. Stylianou and A. S. Andreou, An investigation of optimal project scheduling and team sta±ng in software development using particle swarm optimization, in ICEIS, 2012, pp. 168–171.

11. A. E. Hassan, Predicting faults using the complexity of code changes, in Proceedings of the 31st International Conference on Software Engineering, 2009, pp. 78–88.

12. J. H. Holland, Adaptation in Natural and Arti¯cial Systems: An Introductory Analysis with Applications to Biology, Control, and Arti¯cial Intelligence (Univ. Michigan Press, 1975).

13. H. Ishibuchi and T. Murata, A multi-objective genetic local search algorithm and its application to °owshop scheduling, IEEE Transactions on Systems, Man, and Cyber-netics, Part C: Applications and Reviews 28(3) (1998) 392–403.

(22)

14. D. Kang, J. Jung and D.-H. Bae, Constraint-based human resource allocation in software projects, Software: Practice and Experience 41(5) (2011) 551–577.

15. H. R. Kerzner, Project Management-Best Practices: Achieving Global Excellence (John Wiley & Sons, 2014).

16. J. Li, G. Ruhe, A. Al-Emran and M. M. Richter, A °exible method for software e®ort estimation by analogy, Empirical Software Engineering 12(1) (2007) 65–106.

17. J. Park, D. Seo, G. Hong, D. Shin, J. Hwa and D.-H. Bae, Practical human resource allocation in software projects using genetic algorithm, in Proceedings of the Twenty-Sixth International Conference on Software Engineering & Knowledge Engineering, 2014, pp. 688–694.

18. C. E. Shannon, A mathematical theory of communication, ACM SIGMOBILE Mobile Computing and Communications Review 5(1) (2001) 3–55.

19. V. Yannibelli and A. Amandi, A knowledge-based evolutionary assistant to software development project scheduling, Expert Systems with Applications 38(7) (2011) 8403– 8413.