The DCG algorithm is only designed to organise agents for the first stage of coalition formation. To complete the coalition formation process in a desirable manner, the agents will need to collaborate to find an optimal coalition structure and a stable payoff vector. Additionally, the approach that the agents take needs to take into account the distributed knowledge of the coalition values that the DCG algorithm has created. This is the aim of the distributed dynamic programming (DDP) algorithms introduced in the next chapter.
A Distributed Search for the Superadditive Cover Least Core
The computational cost of calculating the value of every coalition grows exponentially as the number of agents in the system increases. To tackle this issue, these calculations can be distributed among the agents of the system, as the previous chapter suggests. Yet distributing the value calculations means that each agent of the system will have only partial, incomplete knowledge of the characteristic function game.
Given this distributed knowledge, this chapter investigates and provides a method to solve the following issues¹: (1) how can a stable core-based solution to a characteristic function game be guaranteed to be found in a decentralised manner?; and (2) how can this stable solution be found when the values of the stable core-based solutions are not known beforehand?
The decentralised dynamic programming algorithm introduced in this chapter, named DDP, is a modification of the dynamic programming (DP) algorithm introduced in [138] and detailed previously in Section 2.4.2. The DDP algorithm shows how the agents can complete all three stages of the coalition formation process in a distributed manner. It can be used for any characteristic function game, and is guaranteed (through Lemma 5.1, Theorem 5.2 and Corollary 5.3) to find a coalition structure and payoff vector distribution in the solution concept of the weak least core for the superadditive cover of the characteristic function game when cross-coalition side payments are allowed². This guarantee comes from the fact that the DP and DDP algorithms can be used to identify the coalitions in the synergy coalitional group representation [41,81].
The contributions of the DDP algorithm are that: (i) a stable solution is found with distributed knowledge, which is equivalent to the complete knowledge solution; (ii) this stable solution is guaranteed to be in the weak least core+ solution concept for the superadditive cover of the characteristic function game, where the plus denotes that cross-coalition side payments are allowed; and (iii) every agent is motivated to perform its part of the algorithm.
¹ An early attempt at solving issues (1) and (2) of this chapter was presented via an argumentation-based dialogue game in [100].
² That is, the total payoff that the agents of a formed coalition receive can be less than the coalition's total value when cross-coalition side payments are allowed.
The rest of this chapter is structured as follows: Section 5.1 provides a detailed discussion on the solution concept guarantees of the DDP algorithm; Section 5.2 details the DDP algorithm; Section 5.3 describes how the communication costs of DDP can be lowered and constrained; Section 5.4 provides an example of how the DDP algorithm works once the communication costs have been constrained; Section 5.5 evaluates the DDP algorithm according to the communicated information, the number of agent operations, and the solution concept success rate; finally, Section 5.6 concludes.
5.1 Guaranteeing Stable Solutions
When the characteristic function game is superadditive (see Section 2.1 for definition), then the grand coalition is the optimal coalition structure [36]. In this chapter, the synergy coalitional group (SCG) representation (detailed in Section 2.4.1) is used to guarantee that a least core stable solution is found. In SCGs, a coalition and coalition value pair (i.e. $(C, v(C))$) is explicitly represented within the set of synergy coalitions, denoted $W$, if that coalition $C$ earns more utility from forming than any possible partition of $C$ could receive. The value of any other coalition $S$ not represented within $W$ is found by finding the maximum value partition of $S$ made up of coalitions explicitly represented within $W$. In Lemma 2 and Theorem 4 of [41] it is proven that the set $W$ allows a core solution to be found if the core is non-empty. This Lemma and Theorem have been modified below to show that a payoff vector in the weak least core can also be guaranteed to be found for the grand coalition using the $W$ set:
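The lookup rule for coalitions outside $W$ can be illustrated with a short sketch. It assumes $W$ is stored as a Python dictionary mapping frozensets of agents to values; the name `scg_value`, this data layout, and the choice to return `None` when no partition into $W$ coalitions exists are all illustrative assumptions rather than part of the SCG definition in [41].

```python
from itertools import combinations


def scg_value(coalition, synergy):
    """Best total value of a partition of `coalition` into coalitions in `synergy`.

    `synergy` is a hypothetical stand-in for W: {frozenset(agents): value}.
    Returns None if the coalition cannot be partitioned using W alone.
    """
    members = tuple(sorted(coalition))
    best = {frozenset(): 0}  # the empty coalition needs no blocks
    # Visit subsets by increasing size so every smaller result is ready.
    for size in range(1, len(members) + 1):
        for subset in combinations(members, size):
            current = frozenset(subset)
            anchor, rest = subset[0], subset[1:]
            candidates = []
            # Enumerate the block that contains the smallest agent.
            for k in range(len(rest) + 1):
                for extra in combinations(rest, k):
                    block = frozenset((anchor,) + extra)
                    remainder = best[current - block]
                    if block in synergy and remainder is not None:
                        candidates.append(synergy[block] + remainder)
            best[current] = max(candidates) if candidates else None
    return best[frozenset(members)]
```

For instance, with `W = {frozenset({1, 2}): 16, frozenset({3}): 9}`, the call `scg_value({1, 2, 3}, W)` combines the two stored coalitions and evaluates to 25.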
Lemma 5.1. Given a superadditive characteristic function game $G = \langle N, v \rangle$ and a payoff vector $x$ (where $x(N) = v(N)$), let $w$ be the maximum weak excess value that still gives a blocking coalition for $x$ given full knowledge of each coalition's value. Using only the coalition values in $W$, a blocking coalition $D$ for this maximum weak excess value will be present within $W$.
Proof. Suppose $x$ is blocked by a coalition $C$ through $v(C) - |C|w$, so that $v(C) - |C|w > x(C)$ and $\forall w' > w$ there is no blocking coalition. If $(C, v(C)) \in W$ then this proves the Lemma. If $(C, v(C)) \notin W$ then from the definition of an SCG, it is known that the value of the coalition can be found through its maximum value partition, i.e. $v(C) - |C|w = \sum_{1 \leq p \leq q} (v(C_p) - |C_p|w)$, for some set of coalitions $\{C_1, ..., C_q\}$ where:
1. $\bigcup_{p=1}^{q} C_p = C$
2. $C_i \cap C_j = \emptyset$ for any $i, j \in \{1, ..., q\}$ where $i \neq j$
3. $(C_p, v(C_p)) \in W$, for all $C_p \in \{C_1, ..., C_q\}$
Via substitution, it follows that $\sum_{1 \leq p \leq q} (v(C_p) - |C_p|w) > x(C)$. Because the coalitions $C_p$ partition $C$, $x(C) = \sum_{1 \leq p \leq q} x(C_p)$, and hence for at least one $C_p$ it holds that $v(C_p) - |C_p|w > x(C_p)$. Therefore, if a coalition $C$ that is not in the SCG representation blocks a payoff vector $x$ given the maximum weak excess penalty $w$ that still gives a blocking coalition, there exists a coalition $C_p \subset C$ that also blocks $x$ given the maximum weak excess penalty $w$, yet $C_p$ is represented explicitly within the SCG representation (i.e. $(C_p, v(C_p)) \in W$). Thus the proof of the Lemma is complete.
Lemma 5.1 shows that given a superadditive characteristic function game, the grand coalition and a payoff vector distributing the grand coalition's value, a coalition that blocks the payoff vector for the maximum weak excess (that allows a blocking coalition) will be present in the SCG representation. For further understanding, consider the following example:
Example 5.1. Consider a characteristic function game with agents $N = \{1,2,3,4\}$, where the coalition $C = \{1,2,3\}$ has a value of $v(\{1,2,3\}) = 25$. Given a payoff vector $x$ that gives the coalition $C$ the total value $x(C) = 18$, it is known that the maximum (integer) weak excess to still give a blocking coalition is $w' = 2$. For instance, $x(C) = 18 < v(C) - |C|w' = 25 - (3 \times 2) = 25 - 6 = 19$, and so coalition $\{1,2,3\}$ is a blocking coalition for $x$ when $w' = 2$ but not when $w' = 3$, since $25 - (3 \times 3) = 16 < 18$.
If $C \in W$ then a blocking coalition for the maximum weak excess value (that still admits a blocking coalition) has been found in the SCG representation. If $C \notin W$ then there must be coalitions within $W$ that make up a partition of $C$ and have an equal or greater combined value than $C$. In this example assume $(\{1,2\}, v(\{1,2\}) = 16), (\{3\}, v(\{3\}) = 9) \in W$. The values of these coalitions have been chosen because $v(\{1,2\}) + v(\{3\}) = v(\{1,2,3\})$, i.e. the values are the minimal needed to make sure that coalition $\{1,2,3\}$ is not in the SCG representation.
Given these preliminaries, to stop coalition $\{1,2\}$ being a blocking coalition when $w' = 2$, $x(\{1,2\})$ must be greater than or equal to $v(\{1,2\}) - |\{1,2\}|w' = 16 - (2 \times 2) = 12$. Assume that $x(\{1,2\}) = 12$ (i.e. the minimal payoff to satisfy the coalition has been given).
Recall $x(C) = 18$. Therefore $x_3 = x(C) - x(\{1,2\}) = 18 - 12 = 6$. But this gives $x(\{3\}) = 6 < v(\{3\}) - |\{3\}|w' = 9 - (1 \times 2) = 9 - 2 = 7$, and so the singleton coalition $\{3\}$ is a blocking coalition for $x$ when $w' = 2$, and this blocking coalition has been found in the SCG representation.
In conclusion, this example shows that if a blocking coalition $C$ with the maximum weak excess (that still admits a blocking coalition) is not present in the SCG representation, then a coalition $C' \subset C$ with the same weak excess value will be present in the SCG representation.
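The arithmetic of Example 5.1 can be replayed in a few lines. This is only a sketch: the helper name `is_blocking` and the dictionary of results are illustrative, while the coalition values and payoffs are the ones stated in the example.

```python
def is_blocking(value, payoff, size, w):
    """A coalition blocks x at weak excess w when v(C) - |C| * w > x(C)."""
    return value - size * w > payoff


def example_5_1():
    """Replay the checks of Example 5.1 and return them as named booleans."""
    x_c = 18                    # x({1,2,3}) from the example
    x_12 = 16 - 2 * 2           # minimal payoff that satisfies {1,2} at w' = 2
    x_3 = x_c - x_12            # what remains for agent 3
    return {
        "C blocks at w'=2": is_blocking(25, x_c, 3, 2),
        "C blocks at w'=3": is_blocking(25, x_c, 3, 3),
        "{1,2} blocks at w'=2": is_blocking(16, x_12, 2, 2),
        "{3} blocks at w'=2": is_blocking(9, x_3, 1, 2),
        "x_3": x_3,
    }
```

Calling `example_5_1()` reports that $\{1,2,3\}$ blocks at $w' = 2$ but not at $w' = 3$, that $\{1,2\}$ is exactly satisfied, and that the singleton $\{3\}$ blocks with $x_3 = 6$.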
The following theorem shows that given the grand coalition for a superadditive game, the coalitions in the SCG representation can be used to find a payoff vector within the weak least core:
Theorem 5.2. For the grand coalition, a payoff vector that minimises the maximum weak excess can be found using only the coalitions in $W$.
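Proof. The linear program to minimise $w$ is:
$\min w$ subject to: (5.1)
$x_i \geq 0$ for each $i \in N$ (5.2)
$x(N) = v(N)$ (5.3)
$x(C) \geq v(C) - |C|w$, for all $(C, v(C)) \in W$ (5.4)
By Lemma 5.1, for any payoff vector $x$ a coalition that blocks $x$ at the maximum weak excess is present in $W$, so restricting constraint (5.4) to the coalitions in $W$ does not change the optimal value of $w$.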
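For a fixed payoff vector, the smallest $w$ that satisfies constraint (5.4) is the largest per-agent shortfall over the coalitions in $W$. The sketch below computes that quantity; it does not solve the linear program itself, which would need an LP solver to search over $x$. The function name `max_weak_excess` and the dictionary layout for $W$ and $x$ are assumptions made for illustration.

```python
from fractions import Fraction


def max_weak_excess(x, synergy):
    """Smallest w such that x(C) >= v(C) - |C| * w for every (C, v(C)) in W.

    `x` maps each agent to its payoff; `synergy` maps frozensets to values.
    Fractions keep the result exact.
    """
    return max(
        Fraction(value - sum(x[i] for i in coalition)) / len(coalition)
        for coalition, value in synergy.items()
    )
```

As a hypothetical three-agent illustration reusing the values $v(\{1,2\}) = 16$ and $v(\{3\}) = 9$, the payoff vector $(6, 6, 6)$ has a maximum weak excess of 3, whereas $(17/3, 17/3, 20/3)$ equalises the two constraints and lowers it to $7/3$.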
Before introducing the corollary to this theorem, a formal definition of cross-coalition side payments is required, which is given through the weak $\epsilon$-core+ and weak least core+ definitions:
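Definition 80: The weak $\epsilon$-core+: For a characteristic function game $G = \langle N, v \rangle$, a coalition structure and payoff vector pair $\langle CS^*, x \rangle$ is in the weak $\epsilon$-core+ iff:
$\sum_{i \in N} x_i = v(CS^*)$ (5.5)
$\sum_{i \in C} x_i \geq v(C) - |C|\epsilon, \quad \forall C \subseteq N$ (5.6)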
The difference between the weak $\epsilon$-core+ and the weak $\epsilon$-core of Section 2.2.1 is that a weak $\epsilon$-core+ payoff vector totals the value of the optimal coalition structure, which may not be the grand coalition. The difference between the weak $\epsilon$-core+ and the $\epsilon$-CS-core of Section 2.2.3 is that the weak $\epsilon$-core+ does not have the condition that all the payoff of each coalition in the coalition structure is given to that coalition (i.e. $x(C) = v(C), \forall C \in CS^*$ does not have to hold). Given the definition of the weak $\epsilon$-core+, the weak least core+ can be defined:
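Definition 81: The weak least core+: For a characteristic function game $G = \langle N, v \rangle$, a coalition structure and payoff vector pair $\langle CS^*, x \rangle$ is in the weak least core+ iff:
$\langle CS^*, x \rangle$ is in the weak $\epsilon$-core+
$\forall \epsilon' < \epsilon$, the weak $\epsilon'$-core+ is empty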
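The two conditions of the weak $\epsilon$-core+ can be checked directly by enumerating every coalition. The sketch below does so for small games; the function name, the dictionary-based game encoding and the three-agent game mentioned afterwards are hypothetical, and the enumeration is exponential in the number of agents, so it is only meant to make the definition concrete.

```python
from itertools import combinations


def in_weak_eps_core_plus(agents, v, structure, x, eps):
    """Check the weak eps-core+ conditions for a pair (structure, x).

    `v` maps every non-empty frozenset of agents to its value, `structure`
    is a list of coalitions partitioning the agents, `x` maps agents to
    payoffs. Exact numbers (ints or Fractions) are assumed.
    """
    # Efficiency: the payoffs total the value of the coalition structure.
    structure_value = sum(v[frozenset(c)] for c in structure)
    if sum(x[i] for i in agents) != structure_value:
        return False
    # Stability: no coalition gains more than eps per member by deviating.
    ordered = sorted(agents)
    for size in range(1, len(ordered) + 1):
        for coalition in combinations(ordered, size):
            payoff = sum(x[i] for i in coalition)
            if payoff < v[frozenset(coalition)] - size * eps:
                return False
    return True
```

In a hypothetical game where $\{\{1,2\},\{3\}\}$ is worth $6 + 4 = 10$, the payoff vector $(2, 3, 5)$ gives coalition $\{1,2\}$ only 5 of its value 6 (a cross-coalition side payment to agent 3), so the pair fails the check for $\epsilon = 0$ but can pass for $\epsilon = 1$.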
Corollary 5.3. For the grand coalition of a superadditive cover game, a payoff vector that minimises the maximum weak excess of the superadditive cover can be found using only the coalitions in $W$.
Proof. Replace the characteristic function $v$ with the superadditive cover characteristic function $v^*$ in Lemma 5.1 and Theorem 5.2 (meaning the grand coalition's value will be equal to the value of the optimal coalition structure), to guarantee a weak least core+ solution for the superadditive cover.
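To make the superadditive cover concrete, the following sketch computes $v^*$ for every coalition with a centralised dynamic program over two-way splits, and records which coalitions are worth strictly more than any split. This is a minimal illustration under stated assumptions, not the DDP algorithm of this chapter: the function name, the dictionary encoding of $v$, and the treatment of every singleton as a synergy coalition are all choices made for the example.

```python
from itertools import combinations


def superadditive_cover(agents, v):
    """Return (cover, synergy) for a game given as {frozenset: value}.

    cover[C] is the best value obtainable by C over all of its partitions.
    synergy holds the coalitions whose own value strictly beats every split.
    """
    cover, synergy = {}, {}
    ordered = sorted(agents)
    for size in range(1, len(ordered) + 1):
        for subset in combinations(ordered, size):
            coalition = frozenset(subset)
            anchor, rest = subset[0], subset[1:]
            best_split = None
            # Two-way splits: the part holding the smallest agent is a
            # proper subset, and both halves already have cover values.
            for k in range(len(rest)):
                for extra in combinations(rest, k):
                    part = frozenset((anchor,) + extra)
                    total = cover[part] + cover[coalition - part]
                    if best_split is None or total > best_split:
                        best_split = total
            own = v[coalition]
            if best_split is None or own > best_split:
                synergy[coalition] = own
            cover[coalition] = own if best_split is None else max(own, best_split)
    return cover, synergy
```

In a hypothetical game with $v(\{1,2\}) = 6$, $v(\{3\}) = 4$ and $v(\{1,2,3\}) = 8$, the cover assigns the grand coalition the value 10, and the grand coalition itself is not recorded as a synergy coalition.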
Given Corollary 5.3, the DDP algorithms introduced in this chapter identify the coalitions in the synergy coalitional group (SCG) representation and the coalition structure that maximises the value of the superadditive cover of the grand coalition, to guarantee that a weak least core+ stable solution is found. If cross-coalition side payments were not used, it would be much more difficult for the agents to reason over which coalition structure and payoff vector pair is the most stable: [112] showed that the optimal coalition structure may not be the most stable in this situation, and so multiple coalition structures would have to be compared not just by their value, but also by their stability.
Searching all possible coalition structures is a highly complex task because the total possible number of coalition structures grows at a significantly higher rate than the number of potential coalitions [104]. For $n$ agents, the number of possible coalition structures is given by the Bell number $B_n$, which is $O(n^n)$ and $\omega(n^{n/2})$, and so grows significantly faster than the $\Theta(2^n)$ growth of possible coalitions [104]. It is therefore of great benefit to the agents to significantly reduce the number of possible coalition structures to compare.
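The gap between the two growth rates can be seen numerically. The sketch below computes Bell numbers with the Bell triangle recurrence; the function name `bell_numbers` is illustrative.

```python
def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max] using the Bell triangle."""
    row = [1]
    bells = [1]  # B_0
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row,
        # and every further entry adds the entry above-left of it.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells
```

For ten agents there are $2^{10} - 1 = 1023$ non-empty coalitions, while `bell_numbers(10)[10]` gives 115975 coalition structures.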
Additionally, as noted in [2,66], allowing cross-coalition side payments can benefit multi-agent systems: it was argued that allowing cross-coalition side payments can be considered a fairer payoff mechanism than disallowing them. The additional fairness comes from eliminating the effect of the coalition structure on agent payoffs. The example given in [2] is that, in the optimal coalition structure, some agents $M \subset N$ may be by themselves in singletons, or may be members of a coalition that is small compared to the other coalitions in the coalition structure. When cross-coalition side payments are not allowed, these $M$ agents do not benefit from the cooperation of others, even when the $M$ agents are in many potential coalitions that have a high value and that the agents $N \setminus M$ may have used to negotiate for a better payoff. Yet, for the greater good of the population, i.e. maximising social welfare, the $M$ agents may be forced to stay in the optimal coalition structure.
In this chapter, the DDP algorithm assumes cross-coalition side payments are allowed because this significantly reduces the computation costs of the agents, as multiple coalition structures will not have to be compared according to their stability.