
4.3 Collaborative Solution Generation: Passive Learning

4.3.3 Proposed Approach

Figure 4.1 depicts a partially distributed setup in which collaborative participants join from their respective depots, each with its own commodity delivery capacity. The setup performs two main operations in a distributed manner: choosing customer nodes for commodity delivery and computing the transportation cost. However, the aggregation of the total solution cost and the decision to update the weight vectors at the depots are performed centrally. Although it is possible to implement a fully distributed setup, without any central authority to evaluate the total commodity delivery cost and to decide on the update of weight vectors, such a setup needs extensive peer-to-peer communication among depots. This, in turn, reduces the efficiency of the system. In what follows, we discuss a four-step task sharing based collaborative optimization (see Section 2.1.3 for details) in this setup. To simplify the discussion, we elaborate the task decomposition as the last step.

[Flowchart with distributed and centralized regions. Nodes: Initialization (template of weight vectors for each participant p); Start; Choose customer nodes for depot p; Compute service cost for depot p; Aggregate total service cost; Is it a better cost of delivery? (yes/no); Update weight vectors; Is stop condition matched? (yes/no); Stop.]

Figure 4.1: Collaborative solution generation for multi-depot vehicle routing problems

Task Allocation: In the aforementioned setup, at round $t$, the likelihood that depot $p$ serves customer $i$ is represented by the weight $w^t_{i,p}$ in the template $W^t_p$ maintained at participant $p$. A decision maker, who is associated with a depot, selects a customer node if and only if the depot's conclusive dominance over that node was previously established. Then, at each round, every decision maker decides, for the subset of customers under its dominance, whether or not to keep them. Each decision is made at participant $p$ using a pseudo-random function $f_{random}([w^t_{i,1}, w^t_{i,2}, \ldots, w^t_{i,m}])$ if node $i$ is under $p$'s dominance. The output of $f_{random}$ is a depot chosen in a biased random fashion, where the bias for each depot is derived from the input weight vector. At round $t$, if the output is depot $p$ itself, then the customer node is to be served by $p$. If the output is another depot $p'$, then in that particular round the node is served by $p'$. Depots must ensure unambiguously that the responsibility for serving a customer node is assigned to one and only one depot.
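The biased random choice described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `f_random` and `allocate_round`, the dictionary layout of the weight rows, and the use of Python's `random.choices` as the pseudo-random function are all assumptions made here.

```python
import random

def f_random(weights, rng):
    """Draw one depot index with probability proportional to its weight.

    `weights` plays the role of [w_{i,1}, ..., w_{i,m}] for one customer i.
    """
    depots = list(range(len(weights)))
    return rng.choices(depots, weights=weights, k=1)[0]

def allocate_round(dominated, weight_rows, rng):
    """For every customer under this depot's dominance, pick the depot
    that serves it in the current round (hypothetical helper)."""
    return {i: f_random(weight_rows[i], rng) for i in dominated}

# Illustrative data: customer 7 is mostly, but not always, kept by depot 0;
# customer 9 has all of its weight on depot 1.
rng = random.Random(0)
weight_rows = {7: [0.8, 0.15, 0.05], 9: [0.0, 1.0, 0.0]}
assignment = allocate_round([7, 9], weight_rows, rng)
print(assignment)
```

Because each customer is drawn exactly once per round, and only by its dominating depot, the returned mapping assigns every customer to a single depot.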

The aforementioned sub-problem design is important for four main reasons. First, it divides the original problem instance into multiple sub-problems. Second, it ensures that, in any round, only one depot decides on the commodity delivery to a customer. Third, it ensures that only one depot remains responsible for serving a customer. Finally, and most importantly, the $f_{random}$ function offers an error mitigation strategy for the near-optimal solution search. During a multi-round solution generation, the dominance of a particular depot over a customer node generally increases in each round, which helps the convergence of the heuristic/meta-heuristic solution search. However, it may also lead the solution search to a locally optimal solution. A fairly designed random function gives a customer node a lower, but nonzero, probability of being served by a depot other than its dominating depot. This allows the solution search to reassess slightly different customer assignment possibilities, which may lead the search process toward a globally optimal solution.

However, a task allocation does not guarantee the existence of a solution since, by design, a participant is not aware of the capacity of the others. Eq. (4.10) reflects this uncertainty in risk allocation. Therefore, multiple randomizations (calls to the $f_{random}$ function) may be required to reach a distribution in which the capacities of the participants are sufficient to compute a solution. This is how we handle uncertainty in risk allocation. Once the participants agree to start computing a solution, Eq. (4.8) is satisfied.
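The repeated randomization can be pictured as a retry loop. The sketch below is only an assumption-laden illustration: Eqs. (4.8) and (4.10) are not reproduced in this section, so a simple per-depot load check stands in for them, and the names `own_load_ok`, `allocate_until_agreed` and `max_tries` are hypothetical.

```python
import random

def own_load_ok(depot, assignment, demand, own_capacity):
    """A depot checks only its own capacity; it never sees the others'."""
    load = sum(demand[c] for c, d in assignment.items() if d == depot)
    return load <= own_capacity

def allocate_until_agreed(weight_rows, demand, capacity, rng, max_tries=1000):
    """Redraw the allocation until every depot accepts its own load.

    Returns the agreed assignment, or None if no agreement is reached
    within max_tries draws (assumed stopping rule).
    """
    depots = sorted(capacity)
    for _ in range(max_tries):
        assignment = {c: rng.choices(depots, weights=w, k=1)[0]
                      for c, w in weight_rows.items()}
        if all(own_load_ok(p, assignment, demand, capacity[p]) for p in depots):
            return assignment
    return None

# Two customers of demand 6 and two depots of capacity 10:
# only allocations that split the customers are acceptable.
rng = random.Random(1)
weight_rows = {"a": [0.5, 0.5], "b": [0.5, 0.5]}
demand = {"a": 6, "b": 6}
capacity = {0: 10, 1: 10}
result = allocate_until_agreed(weight_rows, demand, capacity, rng)
print(result)
```

In this toy case any draw that puts both customers on the same depot is rejected by that depot, so the loop continues until the customers are split.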

Task Accomplishment: The task of computing routes is performed in a distributed setting by individual participants at every round. The cost of each individual SDVRP instance is computed by applying a heuristic technique followed by a meta-heuristic improvement. We use the heuristics defined in Algorithm 1 and the associated meta-heuristic techniques to compute the serving cost of each individual depot for its assigned customers (no commonly shared problem instance is involved among depots). Every individually computed solution must satisfy Eqs. (4.4)-(4.7) of the distributed problem model.

Result Synthesis: Depots share their computed costs with a central entity, which calculates the total cost of service. If the total cost is better than the best overall cost previously found for the same MDVRP instance, it becomes the new minimum solution cost for the original problem instance. An Update request is then sent to each participating depot so that this assignment of customer nodes is taken into account in their decision making in the following rounds.
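The central aggregation step amounts to a sum and a comparison. A minimal sketch follows, in which the function name `synthesize` and the boolean standing in for the Update request are assumptions introduced for illustration.

```python
def synthesize(depot_costs, best_total):
    """Aggregate per-depot service costs and compare with the best total so far.

    Returns (new_best_total, send_update), where send_update is True when the
    round produced a strictly lower total cost (lower cost is assumed better).
    """
    total = sum(depot_costs.values())
    improved = total < best_total
    return (total if improved else best_total), improved

# Round with a lower total cost: the best value is replaced and an update is due.
print(synthesize({0: 40.0, 1: 55.5}, 100.0))
# Round with a higher total cost: the previous best value is kept.
print(synthesize({0: 60.0, 1: 55.5}, 100.0))
```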

Task Decomposition: Each depot individually adjusts the weights in its weight vectors while learning the situation with respect to the overall outcome of customer assignment. As detailed in Section 4.3.1, boosting potentially increases the likelihood of choosing an optimal allocation. Over successive customer allocation rounds, customers are gradually allocated more appropriately, toward a near-optimal solution. Thus, the interpretable function $f_i(u^t_{ip})$ contributes dominantly to determining whether a customer node is served by depot $p$. While certain customer nodes readily reflect their dominating depots, others may keep changing their dominating depots. As the multi-round allocation procedure progresses, the underlying boosting mechanism handles these customers with increasingly fewer candidate depots for the final allocation. Thus, in the collaborative MDVRP model, we handle Eq. (4.11) on the decision making for customer nodes.

In the solution search, task decomposition is critical for the convergence and success of the proposed approach. We propose a LogitBoost-based mechanism [81] to form a strong additive learner model from the elitist solutions to update the weights. This procedure is distinctive in two relevant aspects.

• Elitist Solutions: Elitist solutions are determined using a gap value with respect to the currently best found solution (denoted CurrentBest) as reference. Let $s^t_{min}$ denote the cost of the current best solution; then the subset of solutions having a cost within a specified maximum gap ($gap$) forms the elitist solution pool. For example, with a 10% gap, all solutions whose cost is not more than $1.1 \times s^t_{min}$ are considered elitist. It is important to note that an elitist solution is chosen based on its overall cost rather than on the contribution of a depot $p$ to that solution, denoted $s^p_j$. This requires collaborative decision making since the participants

• Dominance Selection: In each round, an array of elitist solutions sorted in descending order of solution cost is used to determine the dominance of a depot. First, a binary function $b_j(i, p)$ identifies whether customer node $i$ is allocated to depot $p$ in the elitist solution $j$. Second, a function $\mathrm{rank}_j(p)$ uniquely determines the position of solution $j$ at depot $p$. The CurrentBest solution always has ranking 1. Then, the weight $w^t_{i,p}$ can be calculated as $\sum_{j=1}^{r} \frac{1}{2^{\mathrm{rank}_j(p)}} \times b_j(i, p)$, where $r$ denotes the total number of elitist solutions.
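The two aspects above can be sketched together. This is a hedged illustration rather than the authors' code: solutions are represented as hypothetical `(cost, assignment)` pairs, the rank is assumed to be shared by all depots, and rank 1 is assumed to be the lowest-cost solution (CurrentBest), with ranks increasing with cost.

```python
def elitist_pool(solutions, gap):
    """Keep the solutions whose cost is within (1 + gap) times the best cost.

    `solutions` is a list of (cost, assignment) pairs, where assignment maps
    a customer to the depot serving it. The pool is returned with the
    CurrentBest solution first, so that its rank is 1.
    """
    best = min(cost for cost, _ in solutions)
    pool = [s for s in solutions if s[0] <= (1 + gap) * best]
    return sorted(pool, key=lambda s: s[0])

def dominance_weight(pool, i, p):
    """Sum of 1 / 2**rank over the elitist solutions that allocate i to p."""
    return sum(0.5 ** rank
               for rank, (_, assignment) in enumerate(pool, start=1)
               if assignment[i] == p)  # this test plays the role of b_j(i, p)

solutions = [
    (105.0, {"c1": 1}),
    (100.0, {"c1": 0}),   # CurrentBest, rank 1
    (120.0, {"c1": 1}),   # outside the 10% gap, not elitist
    (109.0, {"c1": 0}),
]
pool = elitist_pool(solutions, gap=0.10)
print(len(pool))                       # 3
print(dominance_weight(pool, "c1", 0)) # 0.625 = 1/2 + 1/8
print(dominance_weight(pool, "c1", 1)) # 0.25  = 1/4
```

In this example depot 0 serves customer c1 in the solutions of rank 1 and 3, so its weight is 1/2 + 1/8, while depot 1 serves it only in the rank-2 solution.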

As we see, the dominance selection uses a geometric series ($\sum_{j=1}^{\infty} \frac{1}{2^j} = 1$), which ensures that, in practice, the updated weight $w^t_{i,p}$ of a customer node $i$ never reaches 1 for a depot $p$. It also affirms that there is always a chance for a customer node to be served by another depot through the $f_{random}$ function, even when that depot is not dominating the customer node.
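To make the bound concrete, consider the extreme case in which customer $i$ is allocated to depot $p$ in all $r$ elitist solutions and the ranks are $1, \ldots, r$ (an assumption made here for illustration). The partial sum of the series then gives:

```latex
\[
w^t_{i,p} \;=\; \sum_{j=1}^{r} \frac{1}{2^{j}} \;=\; 1 - \frac{1}{2^{r}} \;<\; 1,
\qquad \text{e.g. } r = 3:\quad \frac{1}{2} + \frac{1}{4} + \frac{1}{8} = \frac{7}{8}.
\]
```

The weight therefore stays strictly below 1 for any finite pool of elitist solutions.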

The update of weights is based on an estimate, made at the end of each round, of which node allocation produces a better solution. If node $i$ is served by a depot $p$ in most of the competitive solutions with lower routing cost, it is most likely to be served by $p$. However, even if customer $i$ is served by depot $p$ only in the CurrentBest solution, customer $i$ is still conclusively under the dominance of $p$ (assuming more than 2 depots serve customers), since the rank-1 term alone contributes $1/2$ to $w^t_{i,p}$.

Thus, we define a weight adjustment function $adjust_p(\ldots)$ to update the weights of customer nodes in depot $p$ between successive rounds. Two additional implementation-specific thresholds are used, namely maxconf and minconf. They are user-chosen and prevent a weight from being updated above or below the respective value, giving every customer node a fair chance to be served by different depots. $adjust_p(\ldots)$ can be described as follows:

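$$
w^{t+1}_{i,j} =
\begin{cases}
\rho = \max\Big(\min\Big(\sum_{j=1}^{r} \frac{1}{2^{\mathrm{rank}_j(p)}} \times b_j(i, p),\ \mathit{maxconf}\Big),\ \mathit{minconf}\Big) & j = p \\[2ex]
\dfrac{w^{t}_{i,j} \times (1 - \rho)}{1 - w^{t}_{i,p}} & \text{otherwise}
\end{cases}
$$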
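A minimal sketch of this update for a single customer $i$ is given below. It is an illustration under stated assumptions, not the authors' implementation: the function name `adjust`, the list representation of the weight row, and the explicit guard against $w^t_{i,p} = 1$ are introduced here, and `pool_weight` stands for the already computed sum over the elitist solutions.

```python
def adjust(weight_row, p, pool_weight, minconf, maxconf):
    """Return the updated weights of one customer across all depots.

    weight_row  : current weights [w_{i,1}, ..., w_{i,m}] (assumed to sum to 1)
    p           : index of the depot whose weight is set from the elitist pool
    pool_weight : sum over elitist solutions of b_j(i, p) / 2**rank_j(p)
    """
    # Clamp the new weight of depot p between the two user-chosen thresholds.
    rho = max(min(pool_weight, maxconf), minconf)
    old_p = weight_row[p]
    if old_p >= 1.0:
        # Guard added for this sketch: the rescaling below would divide by zero.
        raise ValueError("w_{i,p} must be strictly below 1")
    # Rescale the other depots so that they share the remaining 1 - rho.
    scale = (1.0 - rho) / (1.0 - old_p)
    return [rho if j == p else w * scale for j, w in enumerate(weight_row)]

new_row = adjust([0.5, 0.3, 0.2], p=0, pool_weight=0.625, minconf=0.05, maxconf=0.95)
print(new_row)       # approximately [0.625, 0.225, 0.15]
print(sum(new_row))  # approximately 1.0
```

If the row sums to 1 before the update, the other depots hold $1 - w^t_{i,p}$ in total, so after rescaling they hold $1 - \rho$ and the row still sums to 1.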