
5.3 Experimental Methodology

5.5.1 Heuristic Move Pruning

In this chapter, we experimented with heuristics in two general categories: heuristics that drew statistics from the cards in the game, and heuristics that used heat maps to prioritise focussing the search process on promising moves, and therefore maximise play strength. Overall the first category proved the most effective, particularly the simplest heuristics that merely totalled the number of cards. The heat map heuristics were generally ineffective; however, they did show the largest relative improvement from state-extrapolation, likely due to the additional evaluation step providing some much-needed additional guidance after the heat maps’ poor influence.

State-extrapolation improved agent strength only in certain cases, most notably when using heat map heuristics h10 and h11. This is possibly because placing cards around the edges and/or the centre is strategically important (as our own experience would suggest), but the strategic impact is not always immediately apparent from the game state halfway through a player’s move decisions. As discussed earlier, the heat maps alone were not expected to create strong agents, and the application of state-extrapolation to h10 and h11 may have revealed that placing around the edges or centre is a strong move, and thus that these two maps are actually superior to the other heat maps.

5.5.2 Action Selection Mechanisms

The use of RobustRoulette selection mechanisms has a significant effect on the play strength of the ISMCTS agents. This is likely due to the approach of averaging across multiple possible states, which in the case of traditional selection mechanisms would cause a preference for more likely possible states, and potential ignorance of less likely states. Using a roulette technique here allows simulation of a mixed strategy approach to the statistical information presented by ISMCTS. Future work could be focussed on further exploration of these

In terms of creating an agent that was a more entertaining opponent, we created multiple configurations that are worthy of future study, both playing a good game against our unmodified ISMCTS agent and generating some interesting complexity features. The most notable of these appears to be RobustRoulette2n, which showed comparable play strength to the RobustChild agent and generated good complexity (with the exception of the instance of RobustRoulette22 mentioned above). Further research should also include a study with human players, to determine how entertaining the agents actually are to play against, and a further examination of our complexity metrics and their effectiveness.

It is also worth noting that agents which had fewer than 500 iterations available to them started to perform extremely poorly, indicating the flaw in the iteration budget scaling that we proposed originally. These agents also demonstrated behaviour closer to the stronger agents than to the random agent, meaning their behaviour was also relatively simple (and therefore not of interest). A far gentler decline was provided by the agents with the maximum available budget but with the various configuration options we presented. We therefore believe that a well-configured agent would likely make a more appropriate opponent for a game requiring multiple difficulty settings.

Online tuning appears to have been beneficial in scaling the modified agents against opposing ISMCTS agents with differing iteration budgets. This suggests that the technique may be preferable to simply reducing the iteration budget when balancing player strength against an opponent.
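The contrast drawn in this section between a deterministic final move choice and a roulette-style choice can be illustrated with a minimal sketch. The exact definition of the RobustRoulette variants is not reproduced here, so the code below is only a generic roulette-wheel selection over root visit counts, set beside a most-visited ("robust child") choice. The function names, the move labels and the visit counts are all hypothetical.

```python
import random
from typing import Dict, Optional


def robust_child(visits: Dict[str, int]) -> str:
    """Deterministic choice: the root move with the highest visit count."""
    return max(visits, key=lambda move: visits[move])


def roulette_select(visits: Dict[str, int],
                    rng: Optional[random.Random] = None) -> str:
    """Stochastic choice: each root move is picked with probability
    proportional to its visit count (a generic roulette wheel, assumed
    here for illustration; not necessarily the thesis's exact rule)."""
    if rng is None:
        rng = random.Random()
    moves = list(visits)
    weights = [visits[move] for move in moves]
    # random.choices accepts unnormalised weights, but at least one
    # weight must be positive or it raises ValueError.
    return rng.choices(moves, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Hypothetical visit counts at the root after a search.
    root_visits = {"move_a": 600, "move_b": 300, "move_c": 100}
    print(robust_child(root_visits))  # always the most-visited move
    rng = random.Random(0)
    print([roulette_select(root_visits, rng) for _ in range(5)])
```

Under this sketch the deterministic rule always returns the same move for the same statistics, whereas the roulette rule behaves like a mixed strategy weighted by the search statistics, which is the property the discussion above attributes to roulette selection.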

5.5.3 Contributions

Our primary contribution from our work in this chapter is the demonstration that modification of action selection mechanism and the application of heuristic pruning can create a

plexity, yet maintain roughly the same win rate against a specific opponent. This highlights

that agent behaviour in such games has many more dimensions than simple win percentage.

To further clarify, there already exist cases of heuristic pruning and action selection mechanisms, and our work here expands those considerably, but does not purport to invent the concepts themselves.

While the heuristics created are specific to the game Lords of War, the approach to developing these heuristics, the manner in which they are applied (using a hard pruning limit), and the application of roll-forward (see section 5.3.4) and multi-heuristics are all transferable to other domains. The heuristics will be different, but the same techniques can be applied.
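As a domain-independent illustration of applying a heuristic with a hard pruning limit, the following minimal sketch keeps only the best-scoring moves at a node before the search considers them. It assumes that the limit is the number of moves retained and that a higher heuristic score is better. The function name, the move representation and the scores are hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

Move = TypeVar("Move")


def hard_prune(moves: Sequence[Move],
               heuristic: Callable[[Move], float],
               limit: int) -> List[Move]:
    """Return the `limit` moves with the highest heuristic score.

    Moves outside the top `limit` are discarded outright rather than
    merely down-weighted. Ties keep their original relative order,
    because Python's sort is stable.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    ranked = sorted(moves, key=heuristic, reverse=True)
    return ranked[:limit]


if __name__ == "__main__":
    # Hypothetical moves paired with a pre-computed heuristic score,
    # e.g. a card-count difference in the resulting state.
    candidates = [("m1", 1.0), ("m2", 5.0), ("m3", 3.0), ("m4", -2.0)]
    kept = hard_prune(candidates, heuristic=lambda m: m[1], limit=2)
    print(kept)  # [('m2', 5.0), ('m3', 3.0)]
```

Only the heuristic function is domain-specific in this sketch; the pruning step itself is unchanged when a different game or scoring function is substituted.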

As with our previous work on parallelization, the techniques explored here could be

useful in a wide variety of MCTS applications where enhanced performance (both in terms

of playing strength and varied behaviour) would be an advantage. Pruning is a technique

for reducing decision space, and thus accelerating the search process. As such, it would

likely be applicable to similar domains as parallelization, assuming that appropriate pruning

heuristics could be found for the domain in question.

Action Selection Mechanisms could be employed in any environment in which a variety of effective decision making is required. For example, in the games industry, an MCTS process with a modified action selection mechanism could create varied play in oppositional agents. This work is also important to the games industry (and to our industry partner, Stainless Games), as it demonstrates a simple technique for modifying an artificial agent to create different behaviour. This could be used in a game to create a variety of opponents that play in different manners and demonstrate different strengths and weaknesses while maintaining a similar overall playing strength.

these techniques provide in terms of easy modification of play style are too valuable to be

disregarded.

I also see great potential in using the information generated by the MCTS process itself

during the decision process (e.g. determining whether a specific decision is easy or hard

Rule Association Mining for Opponent Modelling in Games

Many games include incomplete or hidden information as a form of challenge to the players; indeed, games such as poker would be somewhat more trivial if such an element was excluded. Card games in which players bring decks of their own construction to play are now relatively commonplace, and are represented both in physical card gaming (e.g. Magic: The Gathering) and in digital gaming (e.g. Blizzard Entertainment’s Hearthstone). In such games, knowledge of the content of an opponent’s deck represents a potentially powerful strategic advantage. This is true of competition outside of the game domain also, as being able to adequately predict the strategy of a potential competitor will likely give significant advantage.

In this chapter we consider a deck of cards to be a multiset consisting of a known number

of cards, each of which has a type identifier. We then use a variety of rule-mining techniques

applied with heuristic knowledge to attempt to predict the content of the deck after observing

a specific number of cards chosen at random. It is important to note also that our game of

choice is sufficiently complex, such that constructing a deck in the manner a human might

is substantially more difficult than prediction using any method we have attempted here.
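To make the multiset view of a deck concrete, the following minimal sketch represents each deck as a count of card type identifiers and evaluates a candidate association rule using the standard support and confidence measures. The card names and example decks are invented for illustration. This section does not describe the specific mining procedure or heuristic knowledge used in the chapter, so none of that is reproduced here.

```python
from collections import Counter
from typing import List


def contains(deck: Counter, cards: Counter) -> bool:
    """True if the deck holds at least the given number of each card type."""
    return all(deck[card] >= count for card, count in cards.items())


def support(decks: List[Counter], cards: Counter) -> float:
    """Fraction of decks that contain the given multiset of cards."""
    return sum(contains(deck, cards) for deck in decks) / len(decks)


def confidence(decks: List[Counter],
               antecedent: Counter,
               consequent: Counter) -> float:
    """Estimate of P(consequent in deck | antecedent in deck)."""
    base = support(decks, antecedent)
    if base == 0:
        return 0.0
    # Counter union (|) takes the maximum count per card type.
    return support(decks, antecedent | consequent) / base


if __name__ == "__main__":
    # Hypothetical training decks; card type identifiers are made up.
    decks = [
        Counter({"card_a": 2, "card_b": 1}),
        Counter({"card_a": 1, "card_c": 3}),
        Counter({"card_a": 2, "card_b": 2}),
        Counter({"card_c": 1}),
    ]
    # Rule: "a deck with two copies of card_a also holds card_b".
    print(support(decks, Counter({"card_a": 1})))  # 0.75
    print(confidence(decks, Counter({"card_a": 2}), Counter({"card_b": 1})))  # 1.0
```

Given a set of observed cards, rules with high confidence whose antecedent is satisfied by the observation suggest which further cards are likely to be in the unseen remainder of the deck.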

fitting cards into the deck that either support that concept or appeal to the constructing

player in some other way. While our techniques here produce similar results, there is no

such central idea for each deck. Each deck is simply constructed from the association rules

without a guiding deck concept.

It is important to note that the ability to model an opponent’s actions exposes more

information to an adversarial agent, and thus allows such an agent to become a stronger

opponent. This applies to both traditional strength of play, and also to an agent which is

trying to entertain a human player, as more information on the actual game state allows more

informed choices, and thus more ability to make moves which affect human entertainment.

This research could also be applied outside the realm of games, to any similar, highly complex, partially observable system with specific rules which govern the system construction. Optimising association rule mining for these complex requirements is clearly of interest as a general advancement of research in this area. The techniques here could easily be converted for use in other fields which have similar complex requirements on sets or multisets, simply by applying heuristic knowledge to data mining and rule generation processes as performed here.

This work was published in the paper “Using Association Rule Mining to Predict Opponent Deck Content in Android: Netrunner” appearing at IEEE CIG 2016 [121].

6.1 Experimental Methodology