

3.2 PICEA-g

3.2.1 Algorithm design: PICEA-g

Harnessing the benefits of co-evolution for optimisation purposes is known to be challenging (Bongard and Lipson, 2005; Kleeman and Lamont, 2006). Although there are multi-objective examples, we are aware of only one existing work that has attempted to implement a concept similar to PICEA. Lohn et al. (2002) considered co-evolving a family of target vectors as a means of improving diversity across the Pareto front. The paper was published shortly before the advent of many-objective optimisation in EMO and the authors did not consider the benefits of the target vectors for improving solution comparability per se. However, the paper can certainly be interpreted in such terms. The fitness assignment of Lohn et al.'s method is very interesting and we retain it in our study of the first realisation of a preference-inspired co-evolutionary algorithm: PICEA-g. PICEA-g considers a family of goals, which is a more natural terminology than target vectors when thinking about decision-maker preferences, although the two are essentially equivalent.

According to the fitness assignment in Lohn et al. (2002), candidate solutions gain fitness by dominating a particular set of goal vectors in objective-space, but the fitness contribution from each goal vector is shared among all the solutions that dominate it. In order to gain high fitness, candidate solutions need to dominate as many valuable goal vectors as possible, where valuable goal vectors are those dominated by few candidate solutions. To do so, candidate solutions must move towards the Pareto optimal front, either by being closer to the front (convergence) or by spreading into regions where few solutions exist (diversity). Goal vectors only gain fitness by being dominated by a candidate solution, but this fitness is reduced the more times the goal vector is dominated by other candidate solutions in the population. Therefore, in order to gain high fitness, goal vectors should be placed in regions where only a few solutions can dominate them, that is, either closer to the Pareto optimal front or in a sparser region. The overall aim is for the goal vectors to adaptively guide the candidate solutions towards the Pareto optimal front. That is, the candidate solution population and the goal vectors co-evolve towards the Pareto optimal front.

We implement PICEA-g within a (µ+λ) elitist framework¹, shown in Figure 3.1. A population of candidate solutions, $S$, and a population of preference sets (goal vectors), $G$, of fixed sizes $N$ and $N_g$ respectively, are evolved for a number of generations. In each generation $t$, parents $S(t)$ are subjected to (representation-appropriate) genetic variation operators to produce $N$ offspring, $S_c(t)$. Simultaneously, $N_g$ new goal vectors, $G_c(t)$, are randomly generated. $S(t)$ and $S_c(t)$, and $G(t)$ and $G_c(t)$, are then pooled respectively and the combined populations are sorted according to their fitness. Truncation selection is applied to select the best $N$ solutions as a new parent population $S(t+1)$ and the best $N_g$ goal vectors as a new preference population $G(t+1)$.

Figure 3.1: A (µ+λ) elitist framework of PICEA-g.

¹ New parents (of size µ) are selected from a combined set of parents (of size µ) and offspring (of size

The fitness $Fit_s$ of a candidate solution $s$ and the fitness $Fit_g$ of a goal vector $g$ are defined by Equations 3.1, 3.2 and 3.3:

$$Fit_s = 0 + \sum_{g \,\in\, G \uplus G_c \,\mid\, s \preceq g} \frac{1}{n_g} \tag{3.1}$$

where $n_g$ is the number of solutions that satisfy (i.e., dominate) goal vector $g$ (note that if $s$ does not satisfy any $g$ then the $Fit_s$ of $s$ is defined as 0) and

$$Fit_g = \frac{1}{1+\gamma} \tag{3.2}$$

where

$$\gamma = \begin{cases} 1 & n_g = 0 \\ \dfrac{n_g - 1}{2N - 1} & \text{otherwise.} \end{cases} \tag{3.3}$$

To further explain the fitness assignment scheme, consider the bi-objective minimisation instance shown in Figure 3.2, with two candidate solutions $s_1$ and $s_3$, their offspring $s_2$ and $s_4$, two existing preferences $g_1$ and $g_3$, and two new preferences $g_2$ and $g_4$ (i.e. $N = N_g = 2$).

In Figure 3.2, $g_1$ and $g_2$ are each satisfied by $s_1$, $s_2$, $s_3$ and $s_4$ and so $n_{g_1} = n_{g_2} = 4$. $g_3$ and $g_4$ are satisfied by $s_3$ and $s_4$ only and therefore $n_{g_3} = n_{g_4} = 2$. In terms of fitness of solutions, from Equation 3.1, $Fit_{s_1} = Fit_{s_2} = \frac{1}{n_{g_1}} + \frac{1}{n_{g_2}} = \frac{1}{4} + \frac{1}{4} = \frac{1}{2}$ and $Fit_{s_3} = Fit_{s_4} = \frac{1}{n_{g_1}} + \frac{1}{n_{g_2}} + \frac{1}{n_{g_3}} + \frac{1}{n_{g_4}} = \frac{1}{4} + \frac{1}{4} + \frac{1}{2} + \frac{1}{2} = \frac{3}{2}$. Considering the goal vector fitnesses, using Equation 3.3, $\gamma$ for $g_1$ and $g_2$ is $\frac{n_{g_1} - 1}{2N - 1} = \frac{4 - 1}{4 - 1} = 1$ and so, using Equation 3.2, $Fit_{g_1} = Fit_{g_2} = \frac{1}{2}$. Similarly, $\gamma$ for $g_3$ and $g_4$ is $\frac{2 - 1}{4 - 1} = \frac{1}{3}$ and therefore $Fit_{g_3} = Fit_{g_4} = \frac{3}{4}$.
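The fitness assignment of Equations 3.1 to 3.3 can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `fitness_assignment` mirrors the pseudo-code, "satisfies" is assumed to mean component-wise less-than-or-equal under minimisation, and the coordinates below are hypothetical values chosen so that the satisfaction pattern matches the one described for Figure 3.2 (they are not read from the figure).

```python
import numpy as np

def fitness_assignment(joint_f, joint_g):
    """Fitness of pooled solutions and goals, Equations 3.1-3.3 (minimisation).

    joint_f: (2N, M) objective vectors of the pooled solutions.
    joint_g: (2Ng, M) pooled goal vectors.
    """
    joint_f = np.asarray(joint_f, dtype=float)
    joint_g = np.asarray(joint_g, dtype=float)
    # satisfies[i, j] is True when solution i satisfies goal j.
    # Assumption: "satisfies" is taken as component-wise <=.
    satisfies = np.all(joint_f[:, None, :] <= joint_g[None, :, :], axis=2)
    n_g = satisfies.sum(axis=0)  # number of solutions satisfying each goal

    # Equation 3.1: every satisfied goal contributes 1/n_g to the solution.
    share = np.zeros(len(joint_g))
    share[n_g > 0] = 1.0 / n_g[n_g > 0]
    fit_s = satisfies.astype(float) @ share

    # Equations 3.2 and 3.3: the pooled solution set has 2N members.
    two_n = joint_f.shape[0]
    gamma = np.where(n_g == 0, 1.0, (n_g - 1) / (two_n - 1))
    fit_g = 1.0 / (1.0 + gamma)
    return fit_s, fit_g

# Hypothetical coordinates for s1..s4 and g1..g4.
F = [(4, 2), (3, 1), (2, 4), (1, 3)]
G = [(5, 5), (6, 6), (2.5, 5), (2.2, 6)]
fit_s, fit_g = fitness_assignment(F, G)
print(fit_s)  # fitness of s1..s4
print(fit_g)  # fitness of g1..g4
```

With these coordinates the sketch yields the same values as the worked example: 1/2 for $s_1$ and $s_2$, 3/2 for $s_3$ and $s_4$, 1/2 for $g_1$ and $g_2$, and 3/4 for $g_3$ and $g_4$.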

Based on the fitness, $s_3$ and $s_4$ are considered the best solutions and would be selected into the next generation. However, $s_3$ is dominated by $s_4$. Although $s_2$ has a lower fitness than $s_3$, it is non-dominated with respect to $s_4$. Therefore, $s_2$ and $s_4$ are the solutions that should be kept in the population. To achieve this, the classic Pareto-dominance relation is incorporated. After calculating fitness values using Equations 3.1, 3.2 and 3.3, we next identify all the non-dominated solutions in the set $S \uplus S_c$. If the number of non-dominated solutions does not exceed the population size, then we assign the maximum fitness to all of the non-dominated solutions. However, if more than $N$ non-dominated solutions are found, we disregard the dominated solutions prior to applying truncation selection (implicitly, their fitness is set to zero). Based on fitness, the best $N$ non-dominated solutions are selected to constitute the new parent population $S(t+1)$. In the example in Figure 3.2, $Fit_{s_1} = 0$, $Fit_{s_2} = \frac{1}{2}$, $Fit_{s_3} = 0$ and $Fit_{s_4} = \frac{3}{2}$, so that $s_2$ and $s_4$ are selected.
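The dominance-aware selection step can be sketched as follows. The sketch follows the branch structure of lines 14 to 20 of Algorithm 1 (including its `numNomF < N` test); the helper names `non_dominated_indices` and `select_solutions` are hypothetical, and the objective vectors in the usage lines are illustrative values, not taken from Figure 3.2.

```python
import numpy as np

def non_dominated_indices(objs):
    """Indices of the non-dominated rows of objs (minimisation)."""
    objs = np.asarray(objs, dtype=float)
    idx = []
    for i, fi in enumerate(objs):
        dominated = any(
            np.all(fj <= fi) and np.any(fj < fi)
            for j, fj in enumerate(objs) if j != i
        )
        if not dominated:
            idx.append(i)
    return np.array(idx, dtype=int)

def select_solutions(joint_f, fit_s, n):
    """Return the indices of the n survivors among the pooled solutions."""
    fit = np.asarray(fit_s, dtype=float).copy()
    nd = non_dominated_indices(joint_f)
    if len(nd) < n:
        # Fewer non-dominated solutions than N: promote them to the maximum
        # fitness, then truncate over the whole pooled set.
        fit[nd] = fit.max()
        candidates = np.arange(len(fit))
    else:
        # Otherwise only non-dominated solutions compete; dominated ones
        # are disregarded (implicitly fitness zero).
        candidates = nd
    order = np.argsort(-fit[candidates], kind="stable")
    return candidates[order[:n]]

# Illustrative pooled set s1..s4 with the fitness values of the worked example.
F = [(4, 2), (3, 1), (2, 4), (1, 3)]
survivors = select_solutions(F, [0.5, 0.5, 1.5, 1.5], n=2)
print(sorted(survivors.tolist()))  # zero-based indices of the kept solutions
```

For these values the survivors are the second and fourth solutions, which is the outcome argued for in the text, even though the third solution has a higher raw fitness than the second.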

The pseudo-code of PICEA-g is presented in Algorithm 1. Both convergence and diversity are taken into account by the fitness assignment. In the following we explain the main steps of PICEA-g.

• Line 1 initialises the offline archive BestF as ∅.

• In lines 2 and 3, N candidate solutions S are initialised and their objective values FS are calculated. The offline archive BestF is updated by function updateArchive in line 4.

• Line 5 applies function goalBound to determine goal vector bounds GBounds for the generation of goal vectors. Line 6 generates Ng goal vectors by function goalGenerator.

• The offspring candidate solutions Sc are generated by function geneticOperation in line 8, and their objective values FSc are calculated in line 9. S and Sc, and FS and FSc, are pooled together, respectively, in line 10. Line 11 generates another set of goal vectors Gc based on the determined GBounds, and G and Gc are pooled together in line 12.

• Line 13 applies function fitnessAssignment to calculate the fitness of the combined solutions JointS and goal vectors JointG.

Algorithm 1: Preference-inspired co-evolutionary algorithm using goals (PICEA-g)

Input: Initial candidate solutions, S of size N, initial goal vectors, G of size Ng, maximum number of generations, maxGen, the number of objectives, M, a scaling parameter, α, archive size, ASize
Output: S, G, offline archive, BestF

 1  BestF ← ∅;
 2  S ← initializeS(N);
 3  FS ← objectiveFunction(S);
 4  BestF ← updateArchive(BestF, FS, ASize);
 5  GBounds ← goalBound(FS, α);
 6  G ← goalGenerator(Ng, GBounds);
 7  for t ← 1 to maxGen do
 8      Sc ← geneticOperation(S);
 9      FSc ← objectiveFunction(Sc);
10      (JointS, JointF) ← multisetUnion(S, Sc, FS, FSc);
11      Gc ← goalGenerator(Ng, GBounds);
12      JointG ← multisetUnion(G, Gc);
13      (FitJointS, FitJointG) ← fitnessAssignment(JointS, JointG);
14      find the index, ixNom, of all the non-dominated solutions from JointF and count the number of non-dominated solutions, numNomF;
15      if numNomF < N then
16          FitJointS(ixNom) ← maxFitness(FitJointS);
17          (S, FS) ← truncation(JointS, FitJointS, JointF, N);
18      else
19          (S, FS) ← truncation(JointS(ixNom), FitJointS(ixNom), JointF(ixNom), N);
20      end
21      G ← truncation(JointG, FitJointG, Ng);
22      BestF ← updateArchive(BestF, FS, ASize);
23      GBounds ← goalBound(BestF, α);
24  end

• Lines 14 to 21 select the best N candidate solutions and Ng goal vectors as new parents according to their fitness.

• Line 22 updates the offline archive with FS by function updateArchive.

• Line 23 updates goal vector bounds GBounds based on the offline archive solutions.
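The control flow of Algorithm 1 can also be expressed as a short Python skeleton. This is a structural sketch only: every callable argument (`init_s`, `objective`, `vary`, `goal_bound`, `goal_generator`, `fitness_assignment`, `select_solutions`, `update_archive`) is a hypothetical, user-supplied stand-in for the corresponding function in the pseudo-code, populations are assumed to be NumPy arrays with one row per member, and the default values of `alpha` and `archive_size` are placeholders.

```python
import numpy as np

def picea_g(init_s, objective, vary, goal_bound, goal_generator,
            fitness_assignment, select_solutions, update_archive,
            n, n_goals, max_gen, alpha=1.2, archive_size=100):
    """Skeleton of Algorithm 1; comments give the pseudo-code line numbers."""
    best_f = []                                             # line 1
    s = init_s(n)                                           # line 2
    fs = objective(s)                                       # line 3
    best_f = update_archive(best_f, fs, archive_size)       # line 4
    bounds = goal_bound(fs, alpha)                          # line 5
    g = goal_generator(n_goals, bounds)                     # line 6
    for _ in range(max_gen):                                # line 7
        sc = vary(s)                                        # line 8
        fsc = objective(sc)                                 # line 9
        joint_s = np.vstack([s, sc])                        # line 10
        joint_f = np.vstack([fs, fsc])
        gc = goal_generator(n_goals, bounds)                # line 11
        joint_g = np.vstack([g, gc])                        # line 12
        fit_s, fit_g = fitness_assignment(joint_f, joint_g) # line 13
        keep = select_solutions(joint_f, fit_s, n)          # lines 14-20
        s, fs = joint_s[keep], joint_f[keep]
        best_goals = np.argsort(-np.asarray(fit_g), kind="stable")[:n_goals]
        g = joint_g[best_goals]                             # line 21
        best_f = update_archive(best_f, fs, archive_size)   # line 22
        bounds = goal_bound(best_f, alpha)                  # line 23
    return s, g, best_f
```

Passing the building blocks in as arguments keeps the loop independent of any particular choice of variation operator, archive strategy or goal generator, which is convenient when comparing alternatives.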

Other basic functions

(i) Function geneticOperation applies genetic operators to generate offspring Sc. A number of genetic operators are available, for example, single point crossover, uniform crossover, simulated binary crossover (SBX) (Deb and Agrawal, 1994), simplex crossover (SPX), one bit-flip mutation and polynomial mutation (PM) (Deb et al., 2002a). These genetic operators have their own advantages and disadvantages. In this study the SBX and PM operators are chosen. It is worth mentioning that different genetic operators may lead to different algorithm performance on different problems, that is, selecting suitable genetic operators is often algorithm- and problem-dependent (Srinivas and Patnaik, 1994). This is beyond the scope of this research but is worth investigating in future.

(ii) Note that using function goalGenerator, goal vectors in Gc are randomly generated as objective vectors directly in objective-space within the goal vector bounds, GBounds. GBounds are determined by function goalBound based on all offline archive members, BestF. Specifically, the lower bound, $g_{\min}$, and the upper bound, $g_{\max}$, are estimated by Equation 3.4:

$$\begin{aligned} g_{\min} &= \min(BestF_i), && i = 1, 2, \cdots, M \\ \Delta F_i &= \max(BestF_i) - \min(BestF_i) \\ g_{\max} &= \min(BestF_i) + \alpha \times \Delta F_i, && \alpha \ge 1,\ i = 1, 2, \cdots, M \end{aligned} \tag{3.4}$$

where $\alpha = 1.2$ is suggested. A further discussion on the configuration of parameter $\alpha$ is provided in Section 3.5.

Genetic operators are not applied to generate offspring goal vectors. This is because, in our preliminary experiments, none of the classic genetic operators, such as SBX and SPX, worked more effectively than the random method. A possible reason is that applying genetic operators to goal vectors often produces arbitrary goal vectors. This is similar to the general observation that recombining two dissimilar candidate solutions often does not produce a fruitful solution; for this reason, mating restriction is often considered in genetic operations (Ishibuchi et al., 2008a). Although the random method might also generate some non-useful goal vectors, it guarantees that goal vectors in the entire objective-space are generated, and this helps the algorithm find solutions in all regions of the Pareto optimal front. It is expected that, in future, effective genetic operators can be developed and applied to goal vectors, further improving the performance of PICEA-g.

(iii) Function updateArchive updates the offline archive BestF with FS. For each solution (e.g. $Fs_i$) in FS, if $Fs_i$ is dominated by a solution in the archive, then $Fs_i$ is rejected. Otherwise it is accepted as a new archive member. Simultaneously, solutions in the archive that are dominated by $Fs_i$ are removed. When the number of archive solutions, cASize, exceeds the archive size ASize, the clustering method employed in SPEA2 (Zitzler et al., 2002) (i.e., an archive truncation strategy) is invoked, which iteratively removes solutions from the archive until cASize = ASize while maintaining a set of evenly distributed solutions. The clustering method is described as follows: at each iteration, the archive member $Fs_i$ chosen for removal is the one for which $Fs_i \le_d Fs_j$ for all $Fs_j \in BestF$, with

$$Fs_i \le_d Fs_j \;:\Leftrightarrow\; \forall\, 0 < k < cASize : dist_i^k = dist_j^k \;\lor\; \exists\, 0 < k < cASize : \left[\left(\forall\, 0 < l < k : dist_i^l = dist_j^l\right) \land \left(dist_i^k < dist_j^k\right)\right] \tag{3.5}$$

where $dist_i^k$ denotes the distance of $Fs_i$ to its $k$-th nearest neighbour in BestF.

Equation 3.5 means that the most crowded solution (the one that has the minimum distance to another solution) is chosen for removal at each stage; if there are multiple solutions with minimum distance, the tie is broken by considering the second smallest distances, and so on.
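The helper functions described in (ii) and (iii) can be sketched as below. The function names follow the pseudo-code, but the details are assumptions made for illustration: goal vectors are drawn uniformly at random within the bounds, distances are Euclidean in objective-space, dominance is the usual Pareto relation under minimisation, and the `rng` argument is a NumPy random generator introduced here for reproducibility.

```python
import numpy as np

def goal_bound(best_f, alpha=1.2):
    """Equation 3.4: per-objective lower and upper bounds for goal vectors."""
    best_f = np.asarray(best_f, dtype=float)
    g_min = best_f.min(axis=0)
    delta = best_f.max(axis=0) - g_min
    return g_min, g_min + alpha * delta

def goal_generator(n_goals, bounds, rng):
    """Draw goal vectors at random (uniformly, by assumption) within bounds."""
    g_min, g_max = bounds
    return rng.uniform(g_min, g_max, size=(n_goals, len(g_min)))

def dominates(a, b):
    """True when a Pareto-dominates b (minimisation)."""
    return bool(np.all(a <= b) and np.any(a < b))

def update_archive(best_f, fs, a_size):
    """Insert fs into the archive, then truncate it following Equation 3.5."""
    archive = [np.asarray(a, dtype=float) for a in best_f]
    for f in np.asarray(fs, dtype=float):
        if any(dominates(a, f) for a in archive):
            continue  # rejected: dominated by an archive member
        archive = [a for a in archive if not dominates(f, a)]
        archive.append(f)
    while len(archive) > a_size:
        pts = np.array(archive)
        dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
        np.fill_diagonal(dist, np.inf)  # ignore each member's self-distance
        keys = np.sort(dist, axis=1)    # k-th nearest-neighbour distances
        # Remove the member whose sorted distance list is lexicographically
        # smallest, i.e. the most crowded one, with ties broken by the
        # second smallest distance and so on.
        drop = min(range(len(archive)), key=lambda i: tuple(keys[i]))
        archive.pop(drop)
    return np.array(archive)

rng = np.random.default_rng(0)
archive = update_archive([], [[0, 10], [1, 9], [5, 5], [10, 0]], a_size=3)
bounds = goal_bound(archive, alpha=1.2)
goals = goal_generator(4, bounds, rng)
```

In the usage lines, the four inserted points are mutually non-dominated, so the truncation step removes the one in the most crowded part of the archive before the goal bounds are computed.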

With respect to the time complexity of PICEA-g, evaluating a population of candidate solutions runs in $O(M \times N)$, where $M$ is the number of objectives and $N$ is the number of candidate solutions. Fitness assignment for candidate solutions and goal vectors, which requires a cross-comparison to determine which candidate solution dominates which goal vector, runs in $O(M \times N \times N_g)$. Therefore, the overall time complexity of PICEA-g is $O(M \times N \times N_g)$. When the number of goal vectors equals the number of candidate solutions ($N_g = N$), the time complexity of PICEA-g is $O(M \times N^2)$, which is equivalent to the running time of NSGA-II (Deb et al., 2002a).