Computational Complexity and Game Theory

The main solution concept in game theory is the Nash equilibrium [Lipton and Markakis, 2004], which explains why most studies on computational complexity focus on this concept and why the computation of such equilibria is currently a very active research field. Papadimitriou and Roughgarden [2004] described a polynomial-time algorithm for computing (i.e., finding in a given game) a Nash equilibrium as the “holy grail” of this research field. In fact, Papadimitriou [2001] thinks that this problem is not easy, i.e., that it is harder than P (the class of problems considered easy, because they can be solved in polynomial time), even if it must be easier than NP-hard problems (NP stands for Nondeterministic Polynomial; NP-hard problems are considered hard, because no polynomial-time algorithm is known for them).

Similarly, it was shown that determining whether Nash equilibria with certain natural properties exist (e.g., Is the equilibrium Pareto-efficient? Is there more than one equilibrium? Is there an equilibrium in which player one never plays A?) is NP-hard, and that counting Nash equilibria is #P-hard (another class of problems considered hard) [Conitzer and Sandholm, 2003].

As we focus in this dissertation only on pure strategies, rather than on all Nash equilibria, we can wonder whether this hardness remains. Gottlob et al. [2003] answer “yes” to this question. In fact, they have shown that determining the existence of a pure Nash equilibrium is NP-hard, even in very restrictive settings. Fortunately, there are examples in which a Nash equilibrium can be computed in polynomial time [Fabrikant et al., 2004]. The good news is that determining the existence of a pure Nash equilibrium and computing all such equilibria is feasible in logarithmic space [Gottlob et al., 2003], but Gottlob and his co-workers say nothing about computational time.

To gain some insight into the time needed to compute a Nash equilibrium, let us consider the method used in Section 4.2, in which each entry of the game is checked by hand. For instance, in the game “Battle of the Sexes” in Table 4.6, two players each have two possible strategies. Therefore, the game is represented by a 2 × 2 matrix in which the 2 × 2 = 4 entries have to be checked. But when a third player with two possible strategies is added, the game is represented by a 2 × 2 × 2 matrix in which 2 × 2 × 2 = 8 entries have to be checked. This algorithm thus incurs a combinatorial explosion, because the number of entries to check grows exponentially with the number of players (2^n entries for n players with two strategies each). This shows that the algorithm used to find Nash equilibria by hand takes exponential time, while Papadimitriou [2001] hopes that a quicker algorithm exists. Such a better algorithm is required if we want to analyze games with many players.
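The entry-by-entry check described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure of Section 4.2 verbatim: the function names are hypothetical, the Battle of the Sexes payoffs are commonly used textbook values (Table 4.6 may use other numbers), and the n-player coordination game is an assumption introduced only to count entries.

```python
from itertools import product

def pure_nash_equilibria(num_strategies, payoff):
    """Enumerate pure Nash equilibria by checking every entry of the game.

    num_strategies: number of strategies of each player (one entry per player).
    payoff: function mapping a strategy profile (tuple) to a tuple of payoffs.
    Returns (equilibria, number_of_entries_checked).
    """
    equilibria = []
    checked = 0
    for profile in product(*(range(n) for n in num_strategies)):
        checked += 1
        payoffs = payoff(profile)
        stable = True
        for player, n in enumerate(num_strategies):
            for deviation in range(n):
                alternative = profile[:player] + (deviation,) + profile[player + 1:]
                # A profitable unilateral deviation rules the entry out.
                if payoff(alternative)[player] > payoffs[player]:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria, checked

# Battle of the Sexes with assumed textbook payoffs (not taken from Table 4.6).
BOS = {(0, 0): (2, 1), (0, 1): (0, 0), (1, 0): (0, 0), (1, 1): (1, 2)}

def all_match(profile):
    """Hypothetical n-player game: everyone earns 1 if all choices coincide."""
    return tuple(int(len(set(profile)) == 1) for _ in profile)

equilibria, checked = pure_nash_equilibria([2, 2], BOS.get)
print(equilibria, checked)  # [(0, 0), (1, 1)] 4

for players in (2, 3, 10):
    _, entries = pure_nash_equilibria([2] * players, all_match)
    print(players, "players:", entries, "entries checked")
```

With two strategies per player, the number of entries checked is 4, 8 and 1024 for 2, 3 and 10 players respectively, which makes the exponential growth concrete.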

From a more practical point of view, the above results imply that when we enumerate all pure Nash equilibria, the calculation may last a very long time (because the time complexity is thought to be harder than P), without requiring excessive memory (because the required space is logarithmic [Gottlob et al., 2003]). Therefore, we need a means to accelerate the enumeration of pure Nash equilibria in Chapter 8. This is the reason why one would first simplify the game by removing all strictly dominated strategies. In fact, removing all strictly dominated strategies accelerates the enumeration of Nash equilibria without losing any of them [McKelvey et al., 2004].

Gambit 0.97.05 [McKelvey et al., 2004] performs these tasks of recursively eliminating strictly dominated strategies and enumerating pure Nash equilibria for us. Gambit is free software, licensed under the GNU General Public License [Free Software Foundation, 2004b], for analyzing games according to game theory principles. Gambit can be used to find pure Nash equilibria. Here, Gambit applies an algorithm based on a method called Simplicial Subdivision [McKelvey and McLennan, 1996], but its complexity is unknown [Lipton and Markakis, 2004]. We only know from some experiments that it is a complex algorithm.
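To illustrate what the recursive elimination of strictly dominated strategies does, the following Python sketch handles a two-player game given as two payoff matrices. It is not Gambit's implementation: the function name is hypothetical, only domination by pure strategies is considered, and the Prisoner's Dilemma payoffs are assumed textbook values.

```python
def eliminate_strictly_dominated(row_payoffs, col_payoffs):
    """Iteratively remove strictly dominated pure strategies (two players).

    row_payoffs[i][j] and col_payoffs[i][j] are the payoffs of the row and
    column players when the row player plays i and the column player plays j.
    Returns the lists of surviving row and column strategy indices.
    """
    rows = list(range(len(row_payoffs)))
    cols = list(range(len(row_payoffs[0])))
    changed = True
    while changed:
        changed = False
        for r in rows:
            # r is dominated if another row is strictly better against
            # every surviving column.
            if any(all(row_payoffs[o][c] > row_payoffs[r][c] for c in cols)
                   for o in rows if o != r):
                rows.remove(r)
                changed = True
                break
        for c in cols:
            # Same test for the column player against surviving rows.
            if any(all(col_payoffs[r][o] > col_payoffs[r][c] for r in rows)
                   for o in cols if o != c):
                cols.remove(c)
                changed = True
                break
    return rows, cols

# Prisoner's Dilemma (assumed payoffs): strategy 0 = cooperate, 1 = defect.
pd_rows = [[3, 0], [5, 1]]
pd_cols = [[3, 5], [0, 1]]
print(eliminate_strictly_dominated(pd_rows, pd_cols))  # ([1], [1])

# Battle of the Sexes (assumed payoffs): nothing is strictly dominated.
bos_rows = [[2, 0], [0, 1]]
bos_cols = [[1, 0], [0, 2]]
print(eliminate_strictly_dominated(bos_rows, bos_cols))  # ([0, 1], [0, 1])
```

In the first game a single entry remains to be checked instead of four, whereas in the second game the elimination brings no reduction, so the full enumeration is still needed.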

4.4 Conclusion

As the bullwhip effect can be seen as a problem of coordination in supply chains, this chapter has presented how to coordinate supply chains and multi-agent systems. We have pointed out some similarities and some differences between coordination in these two fields, which is a contribution of this dissertation.

As coordination is related to interactions, we have next presented game theory as a tool to study these interactions. In particular, we have introduced this theory, and presented its essential concepts. Next, we have illustrated this theory with some examples.

Finally, we have outlined the difficulty of applying game theory to computer science. Such applications are recent, but researchers think that computing game-theoretic concepts cannot be considered an easy problem, i.e., a problem that can be solved by a polynomial-time algorithm.

The next chapter is the first one describing the core of our research. To this effect, we show how stream fluctuations are induced in distributed systems, and we propose a coordination technique to limit this cause of stream fluctuations.

Delays as a Cause of the Bullwhip Effect

The previous chapter has concluded the introduction of the background of this research with the presentation of coordination and game theory. In this chapter, we present the theory on which our solution to stream fluctuations is based. For that purpose, we show in Section 5.1 why delays lead to an increase of stream fluctuations in the particular case of supply chains. Two principles are next suggested in Section 5.2 to reduce this cause of stream fluctuations. These two principles are then instantiated in two ordering schemes for the Québec Wood Supply Game (QWSG) in Section 5.3. The supply chain behaviour induced by one of these two ordering schemes is then presented in Section 5.4. Section 5.5 illustrates the use of this ordering scheme with a more realistic example than the QWSG.

Since we introduce and illustrate, on the case of supply chains, how delays incur stream fluctuations and how we propose to reduce this issue, we show in the last part of this chapter how to adapt this solution to any multi-agent system. To this end, we show in Section 5.6 that a simple replacement of words translates the content of this chapter into a multi-agent context.

5.1 Why Delays Cause the Bullwhip Effect

The Québec Wood Supply Game (QWSG) was designed to teach the phenomenon of stream fluctuations in supply chains. We recall that stream fluctuations in supply chains are also known as the bullwhip effect. In the QWSG, each player takes the role of a company that places orders according to its incoming orders and inventory level. This game simulates product and ordering streams, and the only decision taken by players concerns the placement of orders. Hence, the bullwhip effect in the QWSG is due to the method used by players to place orders. We present the QWSG in detail in Subsection 6.1.2.

It is important to note that, in our research, we replace human players with intelligent agents, and that the orders placed by these agents are ruled by their ordering scheme. Our problem in this chapter is to propose a behaviour for a company such that the bullwhip effect is reduced in the QWSG. Before doing that, we have to understand what makes the bullwhip effect appear. In this section, we first show that only ordering and shipping delays can explain the bullwhip effect in our model with intelligent agents; then we detail how delays induce this effect.

Several causes of the bullwhip effect were proposed in the literature for real supply chains, but the QWSG is so simple that few of these causes occur in it. From our point of view, only two of the proposed causes of the bullwhip effect, recalled in Section 3.2, can be found in the QWSG and in the Beer Game: demand signal processing and misperception of feedback.

1. Demand signal processing: This cause of the bullwhip effect was proposed by Lee et al. [1997a,b]. They explain that each company uses its incoming orders to forecast its future demand, while this demand can be very different from market consumption because of clients’ forecasts. Since the ordering rules used by our intelligent agents playing the QWSG¹ involve no forecasts, this cause can explain the bullwhip effect in the board version of the QWSG, where human players intuitively forecast their future incoming demand, but not in our simulation.

2. Misperception of feedback: The second possible cause of the bullwhip effect in the QWSG was proposed by Sterman [1989], who studied the behaviour of human players in the “father” of the QWSG, i.e., in the Beer Game. According to Sterman, players do not understand supply chain dynamics, and thus do not exhibit the best behaviour when they place orders.

Only the second cause above could explain why we still had a bullwhip effect when software agents replaced human players. In fact, players’ understanding of the supply chain dynamics was not used directly in our experiments, but the ordering policies that they apply could be designed so that these dynamics are taken into account. When we looked for efficient ordering policies, we found that the cause “misperception of feedback” can be detailed as “ordering and shipping delays”. In other words, we can see ordering and shipping delays as one of the refinements of the cause “misperception of feedback”.

¹ In fact, we assume there is no forecast, because we base the current order on the last demand, but

Since only ordering and shipping delays can explain the existence of the bullwhip effect when agents play the QWSG, we now show why ordering and shipping delays induce the bullwhip effect in our model. Indeed, the ordering scheme that we give to agents has to take this cause into account in order to reduce the bullwhip effect and, if possible, also to minimize company-agents’ individual costs and/or the overall supply chain cost. We tried several ordering schemes (not presented here), but these schemes induced the bullwhip effect and/or did not manage inventory. The problem of designing these schemes is now outlined, which allows us to understand why delays incur the bullwhip effect.

The basic idea for avoiding the bullwhip effect in the QWSG is that, if companies’ orders follow their clients’ demand with a lot-for-lot ordering policy, there is no bullwhip effect, but inventories fluctuate greatly. In other words, either there is a bullwhip effect or inventories fluctuate greatly. This fact is illustrated in Figures 5.1(a) and 5.1(b), which represent a company traversed by an ordering stream and a product stream. In these figures, we assume that each company places orders strictly equal to its demand, following a strict lot-for-lot ordering policy. We now detail Figure 5.1(a) in four points, to show why companies prefer the bullwhip effect to the lot-for-lot policy.

1. The lot-for-lot ordering policy eliminates the bullwhip effect, because each company has the same ordering pattern as its client and thus as the market consumption. Therefore, the two curves Incoming orders and Placed orders are identical. Since the bullwhip effect is measured as the standard deviation of placed orders, we can see that the standard deviation of each company’s orders is exactly the same as the standard deviation of its client’s orders, and therefore as the standard deviation of the market consumption². This explains why a lot-for-lot ordering policy eliminates the bullwhip effect.

2. The considered company tries to fulfill its entire demand, and thus, the two curves Incoming orders and Outgoing transport are the same, that is, as many products are shipped as ordered. The curve Outgoing transport in Figure 5.1(a) is valid as long as no stockout occurs at the considered company.

² We can notice here that companies could smooth the market consumption when they place orders, that is, companies could reduce the bullwhip effect by ordering in a steadier way than the market consumes. In this case, at least one company would absorb order fluctuations by allowing its inventory to fluctuate, and thus, this company would have a higher inventory level than in this dissertation. We do not focus on such a smoothing technique, although it could be interesting future work.

[Figure 5.1: Lot-for-lot ordering policy with (O, Θ) orders. Panels: (a) Lot-for-lot ordering policy; (b) Lot-for-lot ordering policy, market consumption. Each panel depicts a COMPANY with its Inventory and the four curves Incoming orders, Placed orders, Outgoing transport and Incoming transport; the labels O and δ appear in panel (a), and O, Θ and δ in panel (b).]

In short, these two points say that the three curves Incoming orders, Placed orders and Outgoing transport are similar. The third point below says that the fourth curve also has the same pattern, but with a temporal shift.

3. The fourth curve Incoming transport has the same pattern as the three other curves, except that it is delayed by δ in comparison with them. This curve represents the reception of products by the company. This temporal shift corresponds to the ordering and shipping delays, because items ordered by the company are not immediately received. The problem with lot-for-lot orders is that the inventory is not managed, because this temporal shift makes inventory decrease (respectively increase, when we invert the pattern of Incoming orders). In fact, the company ships more (respectively fewer) products than it receives during the ordering and shipping delays.

Note that Incoming transport has the same pattern as the three other curves only when the supplier has no stockouts, because the supplier is assumed to want to fulfill its entire demand, like the considered company.

4. Since every company wants to avoid stockouts (respectively huge inventories) rather than eliminate the bullwhip effect, it does not use the lot-for-lot ordering policy. Instead, it overorders (respectively underorders) to stabilize its inventory, which amplifies the demand variability, because the company overorders (respectively underorders) when the demand increases (respectively decreases). This shows that the bullwhip effect appears each time the market consumption changes, even infinitesimally, if companies want to keep a steady inventory³.
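The first three points can be checked numerically with a small simulation of one company under a strict lot-for-lot policy. The sketch below is only illustrative: the function name, the step in demand from 10 to 15 units, the delay δ = 4 periods and the initial inventory of 50 units are all assumptions, and the supplier is assumed never to run out of stock.

```python
from statistics import pstdev

def simulate_lot_for_lot(demand, delay, initial_inventory):
    """Simulate one company that orders exactly what it is asked to ship.

    demand: incoming orders per period (all of them are shipped).
    delay: ordering plus shipping delay (delta), in periods.
    Returns (placed_orders, inventory_levels).
    """
    placed = list(demand)  # lot-for-lot: Placed orders equal Incoming orders
    inventory = [initial_inventory]
    for t, shipped in enumerate(demand):
        # Incoming transport is Placed orders shifted by `delay`; before
        # that, the pipeline is assumed to hold the initial demand level.
        arriving = placed[t - delay] if t >= delay else demand[0]
        inventory.append(inventory[-1] + arriving - shipped)
    return placed, inventory

demand = [10] * 5 + [15] * 10  # assumed step in market consumption
placed, inventory = simulate_lot_for_lot(demand, delay=4, initial_inventory=50)

print(pstdev(placed) == pstdev(demand))  # True: no order amplification
print(inventory[0], inventory[-1])       # 50 30
```

Orders have exactly the standard deviation of the demand, but the inventory falls by 5 units during each of the 4 periods of delay and never recovers, which is the unmanaged inventory described in the third point.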

We can note here that some of the other causes of the bullwhip effect presented in Section 3.2 induce the bullwhip effect even with a steady demand, while delays amplify order fluctuations but do not induce fluctuations when the demand is steady. Specifically, the bullwhip effect is amplified along the supply chain because, if a retailer overorders to stabilize its inventory, a worse phenomenon takes place at its suppliers: since the demand variation is now bigger, their inventories decrease much more, and thus they must overorder more.
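This amplification from one company to the next can be illustrated with a deliberately simple ordering rule. The rule below is a hypothetical one introduced only for this sketch, not one of the ordering schemes of this dissertation: each company orders its incoming demand plus a one-off correction equal to the delay times the change in demand, and orders cannot be negative. The delay of 4 periods and the step in market consumption from 10 to 11 units are also assumptions.

```python
from statistics import pstdev

def inventory_correcting_orders(incoming, delay):
    """Hypothetical rule: order the demand plus delay * (change in demand).

    The correction compensates the inventory lost (or gained) while the
    new order level travels through the ordering and shipping delays.
    """
    placed = []
    previous = incoming[0]
    for demand in incoming:
        placed.append(max(0, demand + delay * (demand - previous)))
        previous = demand
    return placed

market = [10] * 5 + [11] * 10  # assumed one-unit step in consumption
orders = market
deviations = [pstdev(market)]
for tier in range(3):  # three successive companies, each serving the previous one
    orders = inventory_correcting_orders(orders, delay=4)
    deviations.append(pstdev(orders))

print([round(d, 2) for d in deviations])
print(inventory_correcting_orders([10] * 8, delay=4))  # steady demand: [10, 10, ...]
```

The standard deviation of orders grows at every tier although the market consumption changes by a single unit, whereas a steady demand yields steady orders at every tier.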

As we can see, our problem is not only to reduce the bullwhip effect, because company-agents in the QWSG only have to apply the lot-for-lot ordering policy to eliminate this effect; we also have to manage inventories in order to avoid stockouts and high inventory levels. In our solution, we propose an ordering policy with a unique order amplification for each change in market consumption. Since companies have to know the market consumption, this solution is the same as the one proposed by Lee and his colleagues to improve demand forecast updating, because companies have to share their incoming order information with their suppliers. Precisely, companies signal to their suppliers when they over- or underorder. This information sharing is presented in Figure 5.1(b), in which each company uses a vector (O, Θ) of two orders:

³ This is true when smoothing is not considered. Figure 5.1(a) shows that if companies smooth their demand when they transmit it to their suppliers (in placed orders), their inventory will fluctuate, as stated in the previous footnote. Therefore, a company reducing the bullwhip effect with some smoothing technique has to increase its inventory to avoid stockouts. In other words, either inventory fluctuates to place steady orders, or orders fluctuate to stabilize inventory. This increase of inventory costs the company money, but only its suppliers directly profit from the bullwhip effect reduction.

1. orders O follow the lot-for-lot policy to avoid the bullwhip effect;

2. orders Θ are used to order more or fewer products than O to stabilize the inventory level.
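As a rough illustration of why separating O from Θ can prevent amplification, the sketch below lets each company copy the O it receives and compute its own Θ from changes in O only. This is an assumption made for the example, not the rule of this dissertation: the principles actually governing O and Θ are those presented next, and the function name, the delay of 4 periods and the demand pattern are hypothetical.

```python
def place_vector_orders(incoming_o, delay):
    """Return the (O, Theta) streams placed by one company.

    O is a lot-for-lot copy of the incoming O. Theta is a hypothetical
    one-off correction of delay * (change in O); it may be negative
    when the company orders less than O.
    """
    placed_o = list(incoming_o)
    placed_theta = [0]
    for previous, current in zip(incoming_o, incoming_o[1:]):
        placed_theta.append(delay * (current - previous))
    return placed_o, placed_theta

market = [10] * 5 + [11] * 10  # assumed one-unit step in consumption
o_stream = market
theta_peaks = []
for tier in range(3):  # three successive companies
    o_stream, theta = place_vector_orders(o_stream, delay=4)
    theta_peaks.append(max(theta))

print(o_stream == market)  # True: O carries the market consumption unchanged
print(theta_peaks)         # [4, 4, 4]: the same single correction at every tier
```

Because every company sees the market consumption through O, its correction Θ depends on the change in consumption rather than on its client's overorder, so the correction stays the same size at every tier under these assumptions.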

We now present the two principles ruling the use of O and Θ.

In document DESIGN, SIMULATION AND ANALYSIS OF COLLABORATIVE STRATEGIES IN MULTI-AGENT SYSTEMS: THE CASE OF SUPPLY CHAIN MANAGEMENT (Page 110-118)