Machine Learning Approaches - The Case for Concurrent Bilateral Negotiation

2.3.2.2 Machine Learning Approaches

Because the environment can change at any time and a complete analysis of the situation is infeasible, no single tactic can be expected to work well in all circumstances. The situation may even be complex enough that devising successful heuristics at design time is difficult. Learning how to behave in different negotiation and market situations might therefore be useful (Zeng and Sycara 1998). In addition, our environment, with potentially hundreds of agents, is probably too large, dynamic and unpredictable for heuristics that would work in every possible contingency. One possible way to cope with this is to allow the individual agents to improve their own performance (Sen and Weiss 1999).

In more detail, the literature on learning in negotiations has concentrated on two main things:

a. recognising the negotiation tactic the opponent uses, and
b. finding a good tactic against it.

These problems are obviously connected. In competitive bilateral negotiations, especially in dynamic environments, the opponent can use very different tactics and there is no single best one that does well against all possibilities. Thus, the best tactic usually depends on the opponent’s choice of tactic, which makes tactic recognition the key. We will assume that once the opponent’s tactic is known, we know an adequate tactic against it. In other words, in the following, we will concentrate on the first problem only.

In tactic recognition, the goal is to recognise the bargaining tactic the opponent uses. This can be easy or very difficult, depending on what information we have available and what tactics the opponents are allowed to use. In particular:

• The quality and amount of available information is essential. In competitive settings the opponents do not explain what they are doing, so their tactic must be deduced from their behaviour, that is, from their previous offers and their reactions to our offers (accept, reject or make a counteroffer). In addition, the environmental variables describing the market situation may be used (if available). However, some of the factors that affect the opponent’s tactic may not be available (for example, its stock situation or the number of negotiations it is currently engaged in). If these unknown factors dominate the tactic, it may well be impossible to guess what happens next. On the other hand, if the most important factors are available to all, tactic recognition may be possible.

• If the opponents are allowed to use only a small number of easily distinguishable tactics (e.g. if there are only two possible time-dependent tactics, with β = 10 and β = 0.1), tactic recognition is trivial. On the other hand, in completely open settings (any tactic allowed) the tactic probably cannot be estimated with comfortable certainty before the negotiation is over, if even then. In this latter case, we might be able to produce an educated guess as to what will happen next given the history of the negotiation, the identity of the opponent, other information we have, and our earlier experiences. However, we can never be sure, since the opponent can change its mind and tactic at any given moment or use a tactic we have never seen before.
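The two-tactic case in the last bullet can be illustrated with a minimal sketch. It assumes a commonly used polynomial form for time-dependent concession (the definition in the referenced section is not reproduced here), and it assumes the deadline, reservation price and opening price are known, which is more than our agents would normally know. All function names and numbers are hypothetical.

```python
def concession(t, deadline, beta, kappa=0.0):
    """Fraction of the price range conceded by time t (assumed polynomial form)."""
    return kappa + (1.0 - kappa) * (min(t, deadline) / deadline) ** (1.0 / beta)


def seller_offer(t, deadline, beta, p_reserve, p_start):
    """Seller opens at p_start and concedes towards p_reserve as t approaches the deadline."""
    return p_start - concession(t, deadline, beta) * (p_start - p_reserve)


def recognise_beta(observed, deadline, p_reserve, p_start, candidates=(10.0, 0.1)):
    """Return the candidate beta whose offer curve is closest (least squares) to the observed offers."""
    def sq_error(beta):
        return sum(
            (price - seller_offer(t, deadline, beta, p_reserve, p_start)) ** 2
            for t, price in observed
        )
    return min(candidates, key=sq_error)


# Hypothetical negotiation: deadline of 20 rounds, price range from 100 down to 60.
fast_seller = [(t, seller_offer(t, 20, 10.0, 60.0, 100.0)) for t in range(1, 4)]
slow_seller = [(t, seller_offer(t, 20, 0.1, 60.0, 100.0)) for t in range(1, 4)]

print(recognise_beta(fast_seller, 20, 60.0, 100.0))
print(recognise_beta(slow_seller, 20, 60.0, 100.0))
```

Under these assumptions, the β = 10 seller has already conceded about 74% of the price range after a single round while the β = 0.1 seller has barely moved, so a few offers are enough to tell the two apart.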

In this work, we are somewhere in between the two extremes in terms of both of these factors. Our consumer agents do not have a lot of information (requirement R6). We assume that parties do not know each other’s deadline but that they may know other parameters (for example quality and reservation price).

Now, there are two broad types of machine learning techniques that can be used to recognise an opponent’s tactic: classification and regression. In classification, the opponent’s tactics are classified into one of a finite number of categories given their behaviour so far (and other relevant factors). The opponents in one group can then be assumed to behave in a similar manner in the future, so that it might be possible to learn one tactic that works well against all of them. Obviously this approach is not optimal in all situations, but it should provide us with some idea about what will happen next. In the literature, classification has not been used to classify opponent tactics, but it has been used to estimate the opponent’s utility function in multi-issue negotiations (Bui et al. 1999; Chajewska et al. 2001; Coehoorn and Jennings 2004)⁴⁹ or the reservation price of the opponent (Zeng and Sycara 1998).

In regression, the goal is to estimate the value of some (continuous) variable or function (for example, the next offer, deadline or reservation price), given the events so far. It is usually assumed that once the regression yields good estimates, a good countertactic is also known. Unlike classification, regression has been used in tactic recognition in negotiation contexts. In particular, Mok and Sundarraj (2005) use regression to estimate the parameters of the time-dependent heuristic tactic the opponent is assumed to be using. After a reasonable degree of accuracy has been achieved, the optimal tactic against the estimated tactic is used. Hou (2004) also uses non-linear regression on the opponent’s previous offers to estimate the opponent’s deadline and reservation price, and then chooses an appropriate countertactic. Hou assumes that the opponent is using one of the heuristic tactics discussed in section 2.2.1.2, but drastically restricts the number of possible parameter values (for example, there are only four different deadlines and four possible reservation prices). The agent first tries to classify the tactic the opponent is using and, once it has a good idea of that, uses regression to estimate the parameters.
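To make the idea of parameter estimation over a restricted parameter set concrete, the following is a minimal sketch. It is not the method of Mok and Sundarraj (2005) or Hou (2004): it replaces non-linear regression with an exhaustive least-squares search over a small grid, and it assumes a polynomial time-dependent seller tactic whose opening price is known. The function names, candidate grids and numbers are all hypothetical.

```python
from itertools import product


def seller_offer(t, deadline, beta, p_reserve, p_start):
    """Assumed polynomial time-dependent tactic for a seller."""
    share = (min(t, deadline) / deadline) ** (1.0 / beta)
    return p_start - share * (p_start - p_reserve)


def fit_tactic(observed, p_start, betas, deadlines, reserves):
    """Exhaustive least-squares fit over a small grid of candidate parameters."""
    def sq_error(params):
        beta, deadline, p_reserve = params
        return sum(
            (price - seller_offer(t, deadline, beta, p_reserve, p_start)) ** 2
            for t, price in observed
        )
    return min(product(betas, deadlines, reserves), key=sq_error)


# Hypothetical seller: beta 0.5, deadline 20, reservation price 60, opening price 100.
history = [(t, seller_offer(t, 20, 0.5, 60.0, 100.0)) for t in range(1, 9)]

estimate = fit_tactic(
    history,
    p_start=100.0,
    betas=(0.1, 0.5, 1.0, 2.0, 10.0),
    deadlines=(10, 20, 30, 40),        # four candidate deadlines
    reserves=(50.0, 60.0, 70.0, 80.0),  # four candidate reservation prices
)
print(estimate)  # (beta, deadline, reservation price)
```

The fit is only as good as the candidate grid. In this example, a seller with deadline 10 and reservation price 90 would produce exactly the same early offers as the one above, so recovering the true parameters depends on that combination being excluded from the grid.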

However, in any realistic setting, the providers would be using a very wide range of negotiation tactics, some of which would be new or very rare and some of which would use information that is not available to us at all. Some of these tactics might also be very difficult to distinguish from others. And in a completely open environment, there is also nothing to stop our opponent from changing its tactic completely in the middle of the negotiation, or from employing random noise or any number of other methods to make tactic recognition an impossible task. Although we might be able to extend the work of Mok and Sundarraj and get reasonably good results in our restricted environment, such ‘perfect’ tactic recognition is not attainable in the general case.

⁴⁹ Bui et al. (1999) work in a cooperative environment and classification is used to predict other agents’ preferences in an effort to reduce communication. Chajewska et al. (2001) try to elicit the opponent’s utility function from the observed negotiation. The setting is competitive, so the estimates are used to increase the learning agent’s own utility. They have existing partial utility functions that are used as classes. In a similar vein, Coehoorn and Jennings (2004) use kernel density estimation to estimate the utility function of the opponent to make efficient multi-issue negotiation trade-offs.

Therefore, we contend that machine learning techniques would be very problematic in even a remotely realistic setting. We think that a more reasonable ‘countertactic’ in such environments would be based on estimates of the opponent’s reservation price and other parameters, not on the offers the opponent makes, because single offers may not convey much useful information in practice; only the fundamentals (the parameters) behind the tactics matter. Also, if we have enough opponents to choose from, we may not have to succeed in each and every negotiation. We may, for example, employ tactics that work against some opponents and fail miserably with others.

However, we do need some information to use any sophisticated strategies in the higher levels of our models. One good approach might be to set an offer to a certain level (based on actual values or estimates of quality or other attributes) and then use empirical data to estimate success probabilities in a given negotiation. In a perfect world, the buyer agent would be able to recognise the tactic the seller uses or even know it in advance (because of many earlier encounters). It would be interesting to see how well our approach works in such cases, and then to remove or limit this information and see what effect that has.
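As a minimal sketch of estimating success probabilities from empirical data, assume the buyer keeps a log of past negotiations recording the offer level used and whether it was accepted. The log format, the function name and the add-one smoothing (used so that rarely tried levels do not get an estimate of exactly 0 or 1) are all assumptions made for illustration.

```python
from collections import defaultdict


def acceptance_rates(history, smoothing=1.0):
    """Estimate P(accept | offer level) from past outcomes, with additive smoothing."""
    counts = defaultdict(lambda: [0, 0])  # offer level -> [accepted, total]
    for level, accepted in history:
        counts[level][0] += int(accepted)
        counts[level][1] += 1
    return {
        level: (acc + smoothing) / (total + 2.0 * smoothing)
        for level, (acc, total) in counts.items()
    }


# Hypothetical log: (offer as a fraction of the estimated value of the service, accepted?)
log = [
    (0.8, False), (0.8, False), (0.8, True),
    (0.9, True), (0.9, True), (0.9, False),
    (1.0, True), (1.0, True),
]

rates = acceptance_rates(log)
for level in sorted(rates):
    print(level, round(rates[level], 2))
```

Such a table could then feed the higher-level strategy, which would weigh the estimated chance of success at each offer level against the utility of a deal at that level.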

So, in our setting, the sellers will have a small number of heuristic tactics and we have a countertactic for each of them. This countertactic is based on the characteristics of the seller (quality) and the general structure of the tactic, and may be used against many types of tactic. However, the tactics will use randomness and/or behavioural aspects, so the regression models we just discussed are not applicable. We also will not make use of any classification schemes (although they might sometimes be useful). Instead, we assume that the buyer might sometimes simply know the tactic the opponent uses in advance (no learning takes place during the negotiation), or at least that it knows the frequencies and types of negotiation tactics. This can be realistic, at least in some circumstances. We hope to show that although information about opponent tactics may indeed be very useful, we can get quite good results even without such information. Of course, our approach still requires that we have an idea of what sorts of tactics there are in the market and their probabilities.⁵⁰ But we believe this might well offer a reasonable approach to bilateral negotiations in open environments, although admittedly we will be using it in a rather restricted setting.

In document Commitment models and concurrent bilateral negotiation strategies in dynamic service markets (Page 76-80)