5.3 Experimental Methodology
5.5.1 Heuristic Move Pruning
In this chapter, we experimented with heuristics in two general categories: heuristics that
drew statistics from the cards in the game, and heuristics that used heat maps to prioritise
focussing the search process on promising moves, and therefore maximise play strength.
Overall the first category proved the most effective, particularly the simplest heuristics that
merely totalled the number of cards. The heat map heuristics were generally ineffective;
however, they did show the largest relative improvement from state-extrapolation, likely due
to the additional evaluation step providing some much-needed guidance after the heat
maps’ poor influence.
State-extrapolation improved agent strength only in certain cases, most notably when using
heat map heuristics h10 & h11. This is possibly because placing cards around
the edges and/or the centre is strategically important (as our own experience would suggest),
but the strategic impact is not always immediately apparent from the game state
halfway through a player’s move decisions. As discussed earlier, the heat maps alone were
not expected to create strong agents, and the application of state extrapolation to h10 & h11
may have revealed that placing around the edges or centre is a strong move, and thus that
these two maps are actually superior to the other heat maps.
5.5.2 Action Selection Mechanisms
The use of RobustRoulette selection mechanisms has a significant effect on the play strength
of the ISMCTS agents. This is likely due to the approach of averaging across multiple pos-
sible states, which in the case of traditional selection mechanisms would cause a preference
for more likely possible states, and potential ignorance of less likely states. Using a roulette
technique here allows simulation of a mixed strategy approach to the statistical informa-
tion presented by ISMCTS. Future work could be focussed on further exploration of these
In terms of creating an agent that was a more entertaining opponent, we created multiple
configurations that are worthy of future study, both playing a good game against our unmodified
ISMCTS agent and generating some interesting complexity features. The most notable
of these appears to be RobustRoulette2n, which showed comparable play strength with
the RobustChild agent, and generated good complexity (with the exception of the instance
of RobustRoulette22 mentioned above). Further research should also include a study
with human players, to measure how entertaining the agents actually are to play against, and
a further examination of our complexity metrics and their effectiveness.
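The contrast between a deterministic selection rule and a roulette-style rule can be illustrated with a minimal sketch. This section does not define the RobustRoulette variants precisely, so the code below is an assumption rather than the agents' actual implementation: it assumes the search exposes a visit count per root action, that the robust-child rule picks the most-visited action, and that a roulette rule samples in proportion to visit counts. The names `robust_child`, `roulette_select` and the `top_n` parameter are hypothetical.

```python
import random
from typing import Dict, Hashable, Optional


def robust_child(visits: Dict[Hashable, int]) -> Hashable:
    """Deterministic rule: return the action with the highest visit count."""
    return max(visits, key=visits.get)


def roulette_select(
    visits: Dict[Hashable, int],
    top_n: Optional[int] = None,
    rng: Optional[random.Random] = None,
) -> Hashable:
    """Sample an action with probability proportional to its visit count.

    top_n (hypothetical parameter) restricts the candidates to the top_n
    most-visited actions before sampling.
    """
    rng = rng or random.Random()
    ranked = sorted(visits.items(), key=lambda kv: kv[1], reverse=True)
    if top_n is not None:
        ranked = ranked[:top_n]
    actions = [action for action, _ in ranked]
    weights = [count for _, count in ranked]
    if sum(weights) == 0:
        # No statistics to weight by, so fall back to a uniform choice.
        return rng.choice(actions)
    return rng.choices(actions, weights=weights, k=1)[0]
```

Under these assumptions, repeated calls to `roulette_select` on the same statistics behave like a mixed strategy over the well-visited actions, whereas `robust_child` always returns the same move.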
It is also worth noting that agents which had fewer than 500 iterations available to them
started to perform extremely poorly, indicating the flaw in the iteration budget scaling that
we proposed originally. These agents also demonstrated behaviour closer to the stronger
agents than to the random agent, meaning their behaviour was also relatively simple
(and therefore not of interest). A far more gentle decline was provided by the agents with
the maximum available budget but with the various configuration options we displayed. In
this way, we believe that a well-configured agent would likely create a more appropriate
opponent for a game requiring multiple difficulty settings.
The effect of online tuning appears to have been beneficial in scaling the modified agents
against opposing ISMCTS agents using differing iteration budgets. This suggests that this
technique may be preferable to simply reducing the iteration budget when balancing player
strength with an opponent.
5.5.3 Contributions
Our primary contribution from our work in this chapter is the demonstration that modifi-
cation of action selection mechanism and the application of heuristic pruning can create a
plexity, yet maintain roughly the same win rate against a specific opponent. This highlights
that agent behaviour in such games has many more dimensions than simple win percentage.
To clarify, examples of heuristic pruning and action selection mechanisms already exist;
our work here expands on them considerably, but does not claim to invent the
concepts themselves.
While the heuristics created are specific to the game Lords of War, the process of
developing these heuristics, the manner in which they are applied (using a hard pruning
limit), and also the application of roll-forward (see section 5.3.4) and multi-heuristics are all
transferable to other domains. The heuristics will be different, but the same techniques can
be applied.
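As a domain-neutral illustration of applying a heuristic with a hard pruning limit, the following minimal sketch keeps only the best-scoring moves before the search expands them. It assumes that a hard pruning limit means retaining a fixed number of moves ranked by a heuristic score; the function name `hard_prune` and its parameters are hypothetical and are not taken from the implementation used in this work.

```python
from typing import Callable, List, Sequence, TypeVar

Move = TypeVar("Move")


def hard_prune(
    moves: Sequence[Move],
    score: Callable[[Move], float],
    limit: int,
) -> List[Move]:
    """Keep only the `limit` highest-scoring moves (hard pruning).

    `score` is any domain-specific heuristic; higher is assumed to be better.
    Moves outside the limit are discarded and never searched.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    # sorted() is stable, so equally scored moves keep their original order.
    return sorted(moves, key=score, reverse=True)[:limit]
```

Only the `score` function needs to change when moving to another domain, which is the sense in which the technique, rather than the heuristics themselves, is transferable.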
As with our previous work on parallelization, the techniques explored here could be
useful in a wide variety of MCTS applications where enhanced performance (both in terms
of playing strength and varied behaviour) would be an advantage. Pruning is a technique
for reducing decision space, and thus accelerating the search process. As such, it would
likely be applicable to similar domains as parallelization, assuming that appropriate pruning
heuristics could be found for the domain in question.
Action selection mechanisms could be employed in any environment in which a variety
of effective decision making is required. For example, in the games industry, an MCTS
process with a modified action selection mechanism could create varied play in opposing
agents. This work is also important to the games industry (and to our industry partner, Stainless
Games), as it demonstrates a simple technique for modifying an artificial agent to create
different behaviour. This could be used in a game to create a variety of opponents
that play in different manners and demonstrate different strengths and weaknesses while
maintaining a similar overall playing strength.
these techniques provide in terms of easy modification of play style are too valuable to be
disregarded.
I also see great potential in using the information generated by the MCTS process itself
during the decision process (e.g. determining whether a specific decision is easy or hard
Rule Association Mining for Opponent Modelling in Games
Many games include incomplete or hidden information as a form of challenge to
the players; indeed, games such as poker would be somewhat more trivial if such an element
were excluded. Card games in which players bring decks of their own construction to play
are now relatively commonplace, and are represented both in physical card gaming (e.g.
Magic: The Gathering), and in digital gaming (e.g. Blizzard Entertainment’s Hearthstone).
In such games, knowledge of the content of an opponent’s deck represents a potentially
powerful strategic advantage. This is also true
of competition outside of the game domain, as being able to adequately predict the
strategy of a potential competitor will likely give a significant advantage.
In this chapter we consider a deck of cards to be a multiset consisting of a known number
of cards, each of which has a type identifier. We then use a variety of rule-mining techniques
applied with heuristic knowledge to attempt to predict the content of the deck after observing
a specific number of cards chosen at random. It is important to note also that our game of
choice is sufficiently complex, such that constructing a deck in the manner a human might
is substantially more difficult than prediction using any method we have attempted here.
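To make the deck representation concrete, the sketch below models a deck as a multiset of card type identifiers and computes the standard support and confidence measures for an association rule over a collection of decks. It is a minimal illustration under stated assumptions, not the rule-mining procedure used in this chapter: the function names are hypothetical, the card identifiers in the usage note are invented, and the measures here consider only whether a card type is present, ignoring how many copies a deck holds.

```python
from collections import Counter
from typing import Iterable, List

# A deck is a multiset: card type identifier -> number of copies.
Deck = Counter


def support(decks: List[Counter], items: Iterable[str]) -> float:
    """Fraction of decks that contain every card type in `items`."""
    if not decks:
        return 0.0
    wanted = set(items)
    hits = sum(1 for deck in decks if all(deck[i] > 0 for i in wanted))
    return hits / len(decks)


def confidence(
    decks: List[Counter],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Estimate P(consequent present | antecedent present) across the decks."""
    antecedent = set(antecedent)
    base = support(decks, antecedent)
    if base == 0.0:
        return 0.0
    return support(decks, antecedent | set(consequent)) / base
```

For example, with decks `Counter({"A": 3, "B": 2})`, `Counter({"A": 1, "C": 3})`, `Counter({"B": 3, "C": 1})` and `Counter({"A": 2, "B": 1, "C": 1})`, the rule "A implies B" has support 0.5 and confidence 2/3, so observing card A would raise the estimated likelihood that B is also in the deck.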
fitting cards into the deck that either support that concept or appeal to the constructing
player in some other way. While our techniques here produce similar results, there is no
such central idea for each deck. Each deck is simply constructed from the association rules
without a guiding deck concept.
It is important to note that the ability to model an opponent’s actions exposes more
information to an adversarial agent, and thus allows such an agent to become a stronger
opponent. This applies to both traditional strength of play, and also to an agent which is
trying to entertain a human player, as more information on the actual game state allows more
informed choices, and thus more ability to make moves which affect human entertainment.
This research could also be applied outside the realm of games, to any similar, highly
complex, partially observable system with specific rules which govern the system construc-
tion. Optimising association rule mining to these complex requirements is clearly of interest
as a general advancement of research in this area. The techniques here could easily be con-
verted for use in other fields which have similar complex requirements on sets or multisets,
simply by applying heuristic knowledge to data mining and rule generation processes as
performed here.
This work was published in the paper “Using Association Rule Mining to Predict Oppo-
nent Deck Content in Android: Netrunner” appearing at IEEE CIG 2016 [121].