relationship with other ongoing negotiations. The point is that, unlike in a true auction, the relationship with other simultaneously submitted offers is not specified up front through a set of rules.
Reservation value
A drawback of the auction-inspired strategy is that it becomes vulnerable whenever groups of buyers experience very little time pressure. Without time pressure, buyers have no incentive to buy soon and could independently decide to initially submit very low offers; consequently, profits will be very low. To circumvent this, we also consider auction-inspired strategies with a reservation value (i.e., a lowest acceptable utility level). A seller agent is never willing to sell below the reservation value. This means we alter the earlier definition of the current highest utility level: it now becomes the maximum of the reservation value and the utility of the best offer among those collected within a certain time interval. An interesting advantage of introducing a reservation value arises when some but not all buyers experience very little time pressure. The auction-inspired strategy can then still utilise the time pressure of the other buyers.
We consider two approaches for determining the reservation value. Either the reservation value is fixed, like the fixed-threshold strategy, or it is time dependent, like the time-dependent threshold strategy. Thus the auction-inspired strategy with a reservation value is actually a combination of the auction-inspired strategy (without reservation value) and either the fixed or time-dependent threshold strategies.
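The altered definition can be illustrated with a minimal sketch, assuming the collected offers are already expressed as seller utilities. The function names, the linear shape of the time-dependent reservation value, and its parameters are hypothetical and chosen only for illustration; the text does not prescribe them.

```python
from typing import Sequence


def current_highest_utility(offer_utilities: Sequence[float],
                            reservation_value: float) -> float:
    """Maximum of the reservation value and the best collected offer.

    With no collected offers, the reservation value itself is returned.
    """
    best_offer = max(offer_utilities, default=float("-inf"))
    return max(reservation_value, best_offer)


def linear_reservation_value(t: int, start: float, end: float,
                             last_round: int) -> float:
    """Hypothetical time-dependent reservation value.

    Moves linearly from `start` (round 0) to `end` (round `last_round`).
    The linear shape is an assumption made for this sketch only.
    """
    if last_round <= 0:
        return start
    fraction = min(max(t / last_round, 0.0), 1.0)
    return start + fraction * (end - start)
```

A fixed reservation value corresponds to passing the same constant in every round, whereas the time-dependent variant would pass `linear_reservation_value(t, ...)` (or any other learned function of the round).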
7.2 Bargaining simulation environment
We apply a simulation environment in order to evaluate the performance and robustness of the above negotiation strategies against many learning buyers. The agents in the simulation are assumed to be boundedly rational: they can learn and adapt their strategies by a process of trial and error, and they do not know the seller’s strategy. The bargaining process is repeated many times, enabling buyers and the seller to learn from past interactions. An evolutionary algorithm is used to model the learning aspect of the agents. A similar approach was used in previous chapters (Chapters 3-5).
7.2.1 The bargaining game
The seller agent negotiates with many buyer agents simultaneously by alternating offers and counter offers, where the buyer agents initiate the negotiations. For our simulations we set a maximum number of r rounds, where r is set sufficiently
134 Bargaining strategies for one-to-many bargaining
large such that it has no significant impact on the results. At the start of the negotiation, buyer agents submit their offers to the seller agent, which responds by either accepting an offer or sending a counter offer in the next round. Offers consist of a single issue, viz. the price of the negotiable good. Negotiation continues until all buyer agents have reached an agreement or the maximum number of rounds is reached, which concludes a so-called bargaining game. We note that buyer agents in the simulation do not leave the negotiations or enter later.
We assume that, since buyers are impatient, buyer agents in the simulation will respond to the seller agent’s counter offer without delay. This is modelled by having the buyer’s counter offer occur in the same round as the seller’s counter offer.
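The round structure described in this section can be sketched as follows. This is only an illustration under simplifying assumptions: the seller uses a fixed threshold (one of the strategies considered in this chapter), rounds are numbered 0 to `max_rounds - 1`, and post-agreement offers and the auction-inspired mechanism are not modelled. The names `play_bargaining_game`, `OfferFn` and `AcceptFn` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

OfferFn = Callable[[int], float]         # round -> price offered by the buyer
AcceptFn = Callable[[int, float], bool]  # (round, seller price) -> accept?


def play_bargaining_game(
    buyers: List[Tuple[OfferFn, AcceptFn]],
    seller_threshold: float,
    max_rounds: int,
) -> Dict[int, Tuple[float, int]]:
    """Return {buyer index: (agreed price, round)} for all agreements."""
    deals: Dict[int, Tuple[float, int]] = {}
    # Buyers initiate: every buyer submits an opening offer in round 0.
    standing = {i: offer(0) for i, (offer, _) in enumerate(buyers)}
    for t in range(1, max_rounds):
        if not standing:
            break  # all buyers have reached an agreement
        for i in list(standing):
            offer, accepts = buyers[i]
            if standing[i] >= seller_threshold:
                # The seller accepts the buyer's standing offer.
                deals[i] = (standing[i], t)
                del standing[i]
            elif accepts(t, seller_threshold):
                # The seller counters with its threshold; the buyer accepts.
                deals[i] = (seller_threshold, t)
                del standing[i]
            else:
                # The buyer replies in the same round with a new offer.
                standing[i] = offer(t)
    return deals
```

Because the buyer's reply is generated inside the same iteration as the seller's counter offer, the sketch reflects the assumption that impatient buyers respond without delay.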
7.2.2 Buyers and their agents
Buyers are interested in buying at most one unit of the offered good in each bargaining game. They can have different preferences regarding their time pressure and valuation of the good, which together characterise the buyer type. For the analysis we assume buyers can be grouped into a finite number of k types. The number of buyer agents of each type participating in a negotiation game varies randomly and is unknown to the seller agent. The seller agent is also uninformed about the identity or type of a specific buyer agent. The actual number of participants of each type is determined independently by a Poisson distribution with average λ.
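As an illustration of how the number of participants per type might be drawn, the sketch below samples k independent Poisson variates with average λ. It uses Knuth's multiplication method so that only the standard library is needed; this choice of sampler, and the names `sample_poisson` and `sample_buyer_counts`, are assumptions of the sketch and not taken from the text.

```python
import math
import random
from typing import List


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) variate using Knuth's multiplication method.

    Suitable for small averages; exp(-lam) underflows for very large lam.
    """
    limit = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def sample_buyer_counts(k: int, lam: float, rng: random.Random) -> List[int]:
    """Independently draw the number of participating buyers for k types."""
    return [sample_poisson(lam, rng) for _ in range(k)]
```

For example, `sample_buyer_counts(2, 3.0, random.Random(42))` returns a list with one non-negative count per buyer type; a count of zero means that no buyer of that type takes part in the game.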
A buyer agent tries to maximise a given utility function for buyer type $i$, $u_i$, which is defined as follows:
$$u_i = (v_i - p)\,\delta_i^{t}, \tag{7.1}$$
where $v_i$ is the buyer’s valuation of the good, $p$ is the negotiated price, $\delta_i$ is the
discount factor used to model the time pressure, and $t$ is the negotiation time. In the simulation, negotiation occurs at fixed time intervals. Therefore, $\delta_i$ is the discrete representation of time pressure and $t$ also indicates the negotiation round. Note that discount factors are commonly used for modelling time pressure, e.g. in the Rubinstein-Ståhl alternating-offers model (see Section 2.3.2). The agents are furthermore assumed to be individually rational (see Def. 4.2): they will neither bid nor accept offers with a negative utility.
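A small worked example of Eq. (7.1), with illustrative numbers that are not taken from the experiments: a buyer with valuation 1.0 and discount factor 0.9 who agrees on a price of 0.6 in round 2 obtains (1.0 - 0.6) · 0.9² = 0.324. The function names below are hypothetical.

```python
def buyer_utility(valuation: float, price: float,
                  discount: float, t: int) -> float:
    """Discounted buyer utility of Eq. (7.1): (v_i - p) * delta_i ** t."""
    return (valuation - price) * discount ** t


def is_individually_rational(valuation: float, price: float) -> bool:
    """True if a deal at this price does not yield negative utility.

    For a positive discount factor, the sign of the utility depends
    only on the difference between valuation and price.
    """
    return valuation - price >= 0.0
```

The same agreement reached one round later is worth a factor 0.9 less to this buyer, which is how the discount factor expresses time pressure.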
Within the simulation, buyer agents are endowed with adaptive time-based strategies to produce offers and evaluate the seller’s offers. Although this is a relatively simple strategy, the adaptive nature of the strategies provides buyer agents with sufficient flexibility to bid effectively in the long run. A strategy consists of a piece-wise linear function, which determines the price level of new offers and is also used as threshold to accept or reject the seller’s offers: if the seller’s offered price is below the threshold, the offer is accepted; otherwise the offer is rejected. A post-agreement offer is automatically accepted if this is beneficial for the buyer.
Figure 7.1: The EA cycle for negotiations with two buyer types and an adaptive seller
We also applied an extended strategy in our experiments, where the threshold and offers are determined by separate piece-wise linear functions. The separation of the two functions enhances the bargaining capabilities of the buyer agent. Results using the two representations are very similar. The outcomes presented in this chapter are obtained using the extended strategy.
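A time-based strategy of this kind can be sketched as a list of (round, price) points with linear interpolation in between. The sketch assumes that a buyer accepts a seller's price that is at or below its threshold (the treatment of equality is an assumption), and the names `piecewise_linear` and `buyer_accepts` are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

Point = Tuple[float, float]  # (round, price)


def piecewise_linear(points: List[Point], t: float) -> float:
    """Interpolate linearly between points sorted by round.

    Rounds before the first or after the last point are clamped to the
    corresponding end value.
    """
    if t <= points[0][0]:
        return points[0][1]
    if t >= points[-1][0]:
        return points[-1][1]
    idx = bisect_right([x for x, _ in points], t)
    (t0, p0), (t1, p1) = points[idx - 1], points[idx]
    return p0 + (p1 - p0) * (t - t0) / (t1 - t0)


def buyer_accepts(threshold_points: List[Point], t: float,
                  seller_price: float) -> bool:
    """Accept the seller's price if it does not exceed the threshold."""
    return seller_price <= piecewise_linear(threshold_points, t)
```

In the extended strategy, one list of points would drive the buyer's offers and a second, separate list would drive `buyer_accepts`; in the simpler variant the same list serves both purposes.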
7.2.3 Seller agent
The seller agent bargains with a number of buyers simultaneously, without knowing the type of these buyers. The seller agent’s utility is equal to the total utility or
profit obtained over all buyers (recall from Section 7.1.2 that we can assume the seller has no time pressure). Production costs are set to zero.
We consider five strategies for the seller agent: fixed threshold, time-based threshold, auction-inspired strategies and two combined strategies. The first two strategies and the combined strategies are adaptive: strategies that maximise total utility are learned using an EA. The time-based threshold strategy is similar to the strategy used by the buyer.
7.2.4 The evolutionary system
Evolutionary algorithms (EAs) are used to produce effective bargaining strategies for the buyer agents and seller agent. The implementation used is described in detail
in Section 1.2. The strategies for buyer agents of each type are produced by separate EAs, which operate in parallel. Furthermore, a separate EA can also be used to produce strategies for the seller agent, in case the seller uses an adaptive strategy. An example of the evolutionary system with two buyer types and an adaptive seller agent is depicted in Figure 7.1. Note that, whereas in previous chapters a single EA was used with several evolving populations, the current implementation applies several (independent) EAs. This enables, for instance, the seller agent to use a different strategy representation than a buyer agent.
The fitness of the strategies is determined by the average utility obtained in a number of bargaining games, which proceed as follows. At the start of each bargaining game, the number of participating buyer agents of each type is determined randomly using a Poisson distribution as described above. Buyer agents are then assigned a randomly selected strategy from either the parent or offspring population of the corresponding type. Similarly, a strategy is selected randomly for the seller agent (in case of an adaptive seller). The bargaining game is played a fixed number of times, re-establishing the number of buyer agents and assigning new strategies at the start of each game.
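The fitness evaluation for a single buyer type might look roughly like the sketch below. It is not the implementation used in the experiments: the pool is assumed to be the union of the parent and offspring populations, strategies are drawn with replacement, and the callables `draw_count` (number of participants in a game) and `play` (utilities obtained by the chosen strategies in one game) are hypothetical placeholders for the Poisson draw and the bargaining game.

```python
import random
from typing import Callable, List, Optional, Sequence, TypeVar

S = TypeVar("S")  # strategy representation, left abstract in this sketch


def average_fitness(
    pool: Sequence[S],
    n_games: int,
    draw_count: Callable[[random.Random], int],
    play: Callable[[List[S]], List[float]],
    rng: random.Random,
) -> List[Optional[float]]:
    """Average utility per strategy over `n_games` bargaining games.

    Strategies that were never selected get a fitness of None.
    """
    totals = [0.0] * len(pool)
    uses = [0] * len(pool)
    for _ in range(n_games):
        # Re-establish the number of buyers and assign fresh strategies.
        n_buyers = draw_count(rng)
        chosen = [rng.randrange(len(pool)) for _ in range(n_buyers)]
        utilities = play([pool[i] for i in chosen])
        for i, utility in zip(chosen, utilities):
            totals[i] += utility
            uses[i] += 1
    return [totals[i] / uses[i] if uses[i] else None
            for i in range(len(pool))]
```

With several buyer types and an adaptive seller, the same bookkeeping would be kept per EA, with each game drawing strategies from every pool.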
Strategy encoding
As mentioned in Section 7.2.2, the buyer agent’s strategy consists of two piece-wise linear functions: an offer and a threshold function. The functions are encoded using real values, where each bending point of a function is encoded by two real values. Additionally, two end points mark the values for the first and last rounds. For example, 8 real values are needed to encode a pair of functions with two line pieces each.
The same representation is used for the seller agent if he uses a time-based threshold strategy. If a fixed threshold is used, only a single real value is needed to encode this. Note that the seller agent uses the same function for both the threshold and for producing offers.
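The count of real values can be checked with a short decoding sketch. The gene order used here (first-round value, then a (round, value) pair per bending point, then last-round value, with the offer function stored before the threshold function) is an assumption made for illustration, as are the function names; bending-point rounds are assumed to lie between round 0 and the last round.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (round, price)


def genes_per_function(pieces: int) -> int:
    """Two values per bending point plus two end-point values."""
    return 2 * (pieces - 1) + 2


def decode_function(genes: Sequence[float], last_round: int) -> List[Point]:
    """Turn one gene block into (round, price) points sorted by round."""
    first, *bends, last = genes
    if len(bends) % 2 != 0:
        raise ValueError("each bending point needs a round and a value")
    inner = sorted(zip(bends[0::2], bends[1::2]))
    return [(0.0, first), *inner, (float(last_round), last)]


def decode_strategy(genome: Sequence[float], pieces: int,
                    last_round: int) -> Tuple[List[Point], List[Point]]:
    """Split a genome into an offer function and a threshold function."""
    n = genes_per_function(pieces)
    if len(genome) != 2 * n:
        raise ValueError(f"expected {2 * n} genes, got {len(genome)}")
    return (decode_function(genome[:n], last_round),
            decode_function(genome[n:], last_round))
```

For two line pieces per function, `genes_per_function(2)` gives 4, so a pair of functions requires 8 real values, in agreement with the example in the text. A fixed-threshold seller would need only a single gene and no decoding of this kind.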