
4.3 Bi-objective Evolutionary Algorithm

4.3.1.2 Solution similarity measure

The concept of population diversity refers not only to the number of distinct solutions in the population, but also to how different those solutions are from one another. It is relatively simple to count the distinct solutions in a population and to make sure there are no duplicates. However, evaluating how solutions are spread over the search space generally requires encoding-specific and problem-specific tools. If this information is known, it can be used to boost and maintain population diversity.

To accomplish this, the obvious starting point was the edit distance, introduced in Section 2.5.4. However, as will be seen later, it is computationally intensive. Thus, a new similarity measure for solutions to the VRP was designed, and the two methods are compared in Section 4.3.2.1. The designed solution similarity measure is based on Jaccard's similarity coefficient [131, 73], which measures the similarity of two sets as the ratio of the cardinality of their intersection to the cardinality of their union. Formally, the Jaccard similarity J(A, B) of sets A and B is:

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}. \tag{4.2}$$

Thus if both sets contain the same elements, the intersection will equal the union, and J(A, B) = 1. On the other hand, if the sets do not share any element at all, the intersection will be the empty set, i.e. |A ∩ B| = 0, and J(A, B) = 0.
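The two boundary cases above can be checked with a minimal Python sketch of Equation (4.2). The function name `jaccard` and the handling of two empty sets are assumptions made for illustration; the source does not define the coefficient when the union is empty.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets (Equation 4.2)."""
    union = a | b
    if not union:
        # Assumption: two empty sets are treated as identical.
        # Equation (4.2) itself is undefined in this case.
        return 1.0
    return len(a & b) / len(union)


print(jaccard({1, 2, 3}, {1, 2, 3}))  # identical sets -> 1.0
print(jaccard({1, 2}, {3, 4}))        # disjoint sets -> 0.0
print(jaccard({1, 2, 3}, {2, 3, 4}))  # 2 shared out of 4 -> 0.5
```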

The natural way to implement Jaccard similarity for solutions to the VRP is to consider each solution R as the set of segments or arcs $(u(i, k), u(i + 1, k))$ of each route $r_k$, i.e.

$$R = \bigcup_{r_k \in R} \bigcup_{i=0}^{N_k} \{(u(i, k), u(i + 1, k))\}. \tag{4.3}$$

Then, the similarity of two solutions equals the ratio between the number of arcs that are common to both solutions and the total number of arcs used by them.
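As an illustration of this arc-set view, the following sketch converts a solution into its set of directed arcs. The representation is an assumption, not taken from the source: a solution is a list of routes, each route is a list of customer indices, and the depot has index 0 and is implied at both ends of every route. The names `DEPOT` and `solution_arcs` are hypothetical.

```python
from typing import List, Set, Tuple

Arc = Tuple[int, int]
DEPOT = 0  # assumed depot index; customers are numbered from 1


def solution_arcs(routes: List[List[int]]) -> Set[Arc]:
    """Return the set of directed arcs used by a solution (cf. Equation 4.3)."""
    arcs: Set[Arc] = set()
    for route in routes:
        if not route:
            continue  # an empty route uses no arcs
        # Each route starts and ends at the depot.
        path = [DEPOT] + route + [DEPOT]
        # Consecutive pairs of the path are the arcs of the route.
        arcs.update(zip(path, path[1:]))
    return arcs


# Two routes serving three customers: 3 + 2 = 5 arcs.
print(sorted(solution_arcs([[1, 2], [3]])))
# [(0, 1), (0, 3), (1, 2), (2, 0), (3, 0)]
```

Because arcs are stored as ordered pairs, (1, 2) and (2, 1) are distinct elements of the set.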

Figure 4.7: Two potential solutions to an example instance of the VRP: (a) solution R and (b) solution Q. The nine continuous lines in both solutions represent the arcs that they have in common. In total, 18 different arcs are used, therefore $\varsigma_{RQ} = 9/18 = 0.5$.

Denoting $y_{ijR} = 1$ if arc $(v_i, v_j)$ from customer $v_i$ to customer $v_j$ is traversed by any vehicle in solution R, and 0 otherwise, the similarity $\varsigma_{RQ}$ between solutions R and Q is

$$\varsigma_{RQ} = \frac{\sum_{i \in V} \sum_{j \in V} y_{ijR} \cdot y_{ijQ}}{\sum_{i \in V} \sum_{j \in V} \operatorname{sign}(y_{ijR} + y_{ijQ})}, \tag{4.4}$$

in which the term in the sum in the numerator equals 1 only if arc $(v_i, v_j)$ is used by both solutions, while that in the denominator equals 1 if either solution uses it. Note that arcs $(v_i, v_j)$ and $(v_j, v_i)$ are considered to be different, even if the cost of traversing them is the same, since we are interested in measuring solution similarity in the solution space and not in the objective space. Hence, if solutions R and Q are the same, that is, if they use the same arcs, $\varsigma_{RQ} = 1$, while if they are two completely different solutions with no arc in common, $\varsigma_{RQ} = 0$. Figure 4.7 shows two potential solutions to an example instance of the VRP, where the nine continuous lines in both solutions represent the arcs they have in common. Solution R uses four additional arcs, while solution Q uses five more. In total, 18 different arcs are used, therefore $\varsigma_{RQ} = 9/18 = 0.5$.

Algorithm 4.2: ComputeJaccardsSimilarity(R, Q)

Input: Solutions R and Q to the VRP

Output: Jaccard similarity ς_RQ between solutions R and Q

 1: E ← ∅
 2: for all arc (v_i, v_j) in R do
 3:     E ← E ∪ {(v_i, v_j)}
 4: end for
 5: shared ← 0
 6: total ← |E|
 7: for all arc (v_i, v_j) in Q do
 8:     if (v_i, v_j) ∈ E then
 9:         shared ← shared + 1
10:     else
11:         total ← total + 1
12:     end if
13: end for
14: return shared/total
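A possible Python rendering of Algorithm 4.2 is sketched below. It assumes that each solution is given as a collection of directed arcs (pairs of node indices) in which no arc is repeated, and that a hash-based `set` stands in for E. The function name and the error raised for two empty solutions are illustrative choices, not part of the original algorithm.

```python
from typing import Iterable, Tuple

Arc = Tuple[int, int]


def compute_jaccard_similarity(arcs_r: Iterable[Arc], arcs_q: Iterable[Arc]) -> float:
    """Jaccard similarity of two solutions given as collections of arcs."""
    seen = set(arcs_r)   # steps 1-4: E holds every arc of R
    shared = 0           # step 5
    total = len(seen)    # step 6: the union starts as the arcs of R
    for arc in arcs_q:   # steps 7-13
        if arc in seen:
            shared += 1  # arc used by both solutions
        else:
            total += 1   # arc used only by Q enlarges the union
    if total == 0:
        # Assumption: a valid VRP solution always uses at least one arc.
        raise ValueError("both solutions are empty")
    return shared / total  # step 14


# Depot is node 0. R uses routes 0-1-2-0 and 0-3-0; Q uses 0-1-0 and 0-2-3-0.
arcs_r = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 0)]
arcs_q = [(0, 1), (1, 0), (0, 2), (2, 3), (3, 0)]
print(compute_jaccard_similarity(arcs_r, arcs_q))  # 2 shared / 8 total = 0.25
```

Counting `total` incrementally avoids building the union explicitly, so only one set is stored.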

Algorithm 4.2 computes the Jaccard similarity between solutions R and Q. Each of the two loops, in lines 2–4 and 7–13, iterates at most 2N times: a solution serving N customers with K routes uses N + K arcs, and there can be a maximum of N routes, i.e. one route per customer. Assuming constant-time insertion into and lookup in E, the worst-case time complexity of this algorithm is therefore $O(N)$.

For the purposes of the proposed algorithm, a measure of how similar a solution is to the rest of the evolutionary population is also required. If P is the population of solutions, the similarity $\sigma_R$ of solution $R \in P$ with the rest of the solutions in the population is given by the average similarity of R with every other solution $Q_i \in P$, that is

$$\sigma_R = \frac{1}{popSize - 1} \sum_{Q_i \in P \setminus \{R\}} \varsigma_{RQ_i}. \tag{4.5}$$

Algorithm 4.3 computes this similarity. Line 7, where the similarity between solutions $R_i$ and $R_j$ is calculated, is executed popSize(popSize − 1)/2 times, i.e. $O(popSize^2)$ times. Thus the total time complexity of this algorithm is $O(N \cdot popSize^2)$.

Algorithm 4.3: ComputePopulationSimilarity(P)

Input: Population P = {R_1, . . . , R_popSize} of solutions to the VRP

Output: List of similarities S = [σ_1, . . . , σ_popSize] corresponding to solutions R_i ∈ P

 1: for i ← 1 to popSize do
 2:     σ_i ← 0.0
 3: end for
 4: S ← [ ]
 5: for i ← 1 to popSize do
 6:     for j ← i + 1 to popSize do
 7:         sim_ij ← ς_{R_i R_j}
 8:         σ_i ← σ_i + sim_ij
 9:         σ_j ← σ_j + sim_ij
10:     end for
11:     σ_i ← σ_i / (popSize − 1)
12:     S[i] ← σ_i
13: end for
14: return S
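Algorithm 4.3 could be sketched in Python as follows. The sketch assumes that each solution in the population is already available as a non-empty set of directed arcs, and it uses a small set-based helper, `similarity`, in place of Algorithm 4.2. Both function names are hypothetical. Unlike the pseudocode, the division by popSize − 1 is done after the double loop, which gives the same result.

```python
from typing import List, Set, Tuple

Arc = Tuple[int, int]


def similarity(r: Set[Arc], q: Set[Arc]) -> float:
    """Jaccard similarity of two non-empty arc sets."""
    return len(r & q) / len(r | q)


def compute_population_similarity(population: List[Set[Arc]]) -> List[float]:
    """Average similarity of each solution with the rest of the population."""
    pop_size = len(population)
    if pop_size < 2:
        raise ValueError("at least two solutions are required")
    sums = [0.0] * pop_size
    for i in range(pop_size):
        for j in range(i + 1, pop_size):
            # Similarity is symmetric, so each pair is evaluated only once
            # and credited to both solutions.
            sim = similarity(population[i], population[j])
            sums[i] += sim
            sums[j] += sim
    return [s / (pop_size - 1) for s in sums]


a = {(0, 1), (1, 2), (2, 0)}
b = {(0, 1), (1, 2), (2, 0)}  # same arcs as a
c = {(0, 2), (2, 1), (1, 0)}  # same route traversed in reverse
print(compute_population_similarity([a, b, c]))  # [0.5, 0.5, 0.0]
```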

Finally, we define the Jaccard diversity $\delta_{Jaccard}(P)$ of solutions in P as one minus the average solution similarity, i.e.

$$\delta_{Jaccard}(P) = 1 - \frac{1}{popSize} \sum_{R_i \in P} \sigma_{R_i}. \tag{4.6}$$
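Given the list of per-solution similarities, the diversity of Equation (4.6) reduces to a one-line computation. The sketch below assumes the similarities have already been computed (for example by Algorithm 4.3); the function name `jaccard_diversity` is hypothetical.

```python
from typing import Sequence


def jaccard_diversity(sigmas: Sequence[float]) -> float:
    """One minus the mean solution similarity (Equation 4.6)."""
    if not sigmas:
        raise ValueError("the population must not be empty")
    return 1.0 - sum(sigmas) / len(sigmas)


# A population of identical solutions has no diversity.
print(jaccard_diversity([1.0, 1.0, 1.0]))  # 0.0
# Solutions with no arcs in common give maximal diversity.
print(jaccard_diversity([0.0, 0.0, 0.0]))  # 1.0
```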

These solution similarity and population diversity definitions are part of the main contributions of this research. It is important to mention that one of the advantages of the Jaccard similarity measure is that it does not depend on the solution encoding, since only the information about the arcs forming the routes is required and this is known independently of the representation being used. For the same reason, another advantage is that this measure can be used for any variant of the VRP, since solutions to these problems can be represented as a set of arcs. Moreover, the worst-case time complexity of the solution similarity algorithm is $O(N)$, and that of the population similarity is $O(N \cdot popSize^2)$.

In document Multi-Objective evolutionary algorithms for vehicle routing problems (Page 156-160)