3.2 Experimentation Hardware & Software
3.2.4 Engine Profiling
Initial profiling of the engine was performed using a simple maze search game. The player’s
piece was placed in the top left corner of the grid (0,0) and the target in the bottom right.
Variable sized grids were tested, and the total time for the player’s piece to reach the target
was tracked. Each grid size was tested for 200 games. The graph in figure 3.5 shows the
game completion time before the memory management system was implemented (woMM),
Figure 3.4: Class diagram showing the Memory Management classes
sation (wMM2).
We can see the substantial effects of the memory management system upon game com-
pletion time, particularly after second stage optimisation of the C++ code. The original
unoptimised system completed a 7x7 game in approximately 62 seconds, whereas our opti-
mised system completed the same game in approximately 5 seconds, representing a twelve-
3.3 Summary
In this section we have provided a summary of all games relevant to our work here, accompanied
by references to their rulebooks so that the reader can learn more about those games.
We also outline our work on a C++ MCTS engine, and the
[Figure 3.5 plot. Vertical axis: Game Completion Time (ms), 0 to 70 000. Horizontal axis: Grid Height/Width, 2 to 8. Series: Memory Management & Further Optimisation; Memory Management; No Memory Management.]
Figure 3.5: Simple engine profiling test results, using the SimpleGrid game and varying memory management techniques. Horizontal axis shows the size of the grid used, vertical axis shows the game completion time.
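The profiling procedure behind figure 3.5 can be sketched roughly as follows. This is a minimal illustration in Python rather than the C++ engine itself: the names `play_game` and `profile` are hypothetical, a random-walk player stands in for the engine's search, and reporting the mean time per game is an assumption about how the measurements were aggregated.

```python
import random
import time


def play_game(size, rng):
    """Random-walk the player from (0, 0) to the bottom-right cell; return the move count."""
    x = y = 0
    target = (size - 1, size - 1)
    moves = 0
    while (x, y) != target:
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        # Moves that would leave the grid are clamped, so the player stays in place.
        x = min(max(x + dx, 0), size - 1)
        y = min(max(y + dy, 0), size - 1)
        moves += 1
    return moves


def profile(sizes, games_per_size, seed=0):
    """Return {grid size: mean completion time in ms} over games_per_size games."""
    rng = random.Random(seed)
    results = {}
    for size in sizes:
        start = time.perf_counter()
        for _ in range(games_per_size):
            play_game(size, rng)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        results[size] = elapsed_ms / games_per_size
    return results


if __name__ == "__main__":
    # Grid sizes 2 to 8 and 200 games per size, as in the experiment described in the text.
    for size, mean_ms in profile(range(2, 9), games_per_size=200).items():
        print(f"{size}x{size}: {mean_ms:.3f} ms per game")
```

Running the same harness before and after a change to the engine gives one curve per configuration, which is the comparison plotted in figure 3.5.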
Parallelization of Information Set MCTS
Whilst traditional computer software was written for serial computation, the vast majority
of modern computers, games consoles and even mobile devices have multi-core processors.
Parallel computing is therefore essential to getting the most out of this hardware.
As time is a critical factor, algorithms that use multiple parallel
threads of execution are required to use these processors to their full potential. MCTS is
readily adapted to parallel execution, with several methods having been proposed [33]. The
three proposed methods described below were compared on MCTS (UCT) by Chaslot et
al. [37]. Our work here confirms the original result and expands that comparison to ISM-
CTS.
As the time assigned to an MCTS process increases, more MCTS iterations are performed
and more information on the decision space is gathered, which is likely to
improve the search result. There are problems with providing a large
amount of processor time to an MCTS process, most notably memory limitations which
will become apparent with any reasonably complex game. There will also come a point in
the search where additional time results in diminishing returns (i.e. the final decision will
not be changed by these additional iterations), as no stronger options are identified by the
ongoing search.
where the full state is observable to all players at all times and moves are deterministic, non-
simultaneous and visible to all players. More recent work has applied MCTS to games of
imperfect information. Generally this means games with information asymmetry, i.e. games
where parts of the state are hidden and different parts are hidden from different players. The
class of imperfect information games also includes those with chance events, simultaneous
moves or partially observable moves. This chapter focusses on Information Set MCTS
(ISMCTS) [143, 44]. ISMCTS works similarly to regular MCTS, but each simulated playout of
the game uses a different determinization (a state, sampled at random, which is consistent
with the observed game state and hence could conceivably be the actual state of the game).
More detail on ISMCTS appears in chapter 2.
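To make the notion of a determinization concrete, the following sketch samples a full state consistent with one player's observations in a generic card game. The game, the function name `determinize` and the layout of the state dictionary are all assumptions made for illustration; they are not taken from the engine described in this work.

```python
import random


def determinize(deck, my_hand, visible_cards, opponent_hand_size, rng):
    """Sample one full state consistent with what the searching player can observe."""
    known = set(my_hand) | set(visible_cards)
    # Every card the player cannot see could be in the opponent's hand or in the draw pile.
    unseen = [card for card in deck if card not in known]
    rng.shuffle(unseen)
    return {
        "my_hand": list(my_hand),
        "visible_cards": list(visible_cards),
        "opponent_hand": unseen[:opponent_hand_size],
        "draw_pile": unseen[opponent_hand_size:],
    }


if __name__ == "__main__":
    rng = random.Random(42)
    deck = list(range(10))  # hypothetical ten-card deck, cards labelled 0 to 9
    for _ in range(3):
        # Each simulated playout would start from a freshly sampled determinization.
        state = determinize(deck, my_hand=[0, 1], visible_cards=[2],
                            opponent_hand_size=2, rng=rng)
        print(state["opponent_hand"], state["draw_pile"])
```

The observed parts of the state are identical in every sample. Only the hidden cards are reshuffled, so each sample could conceivably be the actual state of the game.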
Previous work on ISMCTS has focussed solely on the single-threaded version of the
algorithm. This chapter applies parallelization techniques for perfect information MCTS to
ISMCTS. Some parallelization techniques involve multiple threads searching the same tree,
in which case it is necessary to use synchronisation mechanisms such as locks/mutexes to
ensure multiple threads do not update the same part of the tree simultaneously. If threads
spend most of their time waiting for mutexes to be unlocked, the efficiency of the algo-
rithm is diminished. Games of imperfect information tend to have a larger branching factor
than games of perfect information, particularly at opponent nodes. Furthermore, the deter-
minizations in ISMCTS restrict each iteration to a different sub-tree of the overall search
tree, reducing the likelihood that two threads will attempt to take the same branch simulta-
neously. From this we suggest that threads in parallel ISMCTS will spend different amounts
of time waiting on mutexes than in the perfect information case, and the relative efficiency
of tree parallelization will be different. Our measure of efficiency is two-fold. Firstly, we
measure the amount of time taken by the complete MCTS process when assigned a specific
Our focus on parallelization was specified by our sponsor, Stainless Games. However,
our focus changed as discussed in chapter 1, and this work became less of a priority outside
of our first year of work.
The work here focuses specifically on the efficiency of the parallelization techniques
(i.e. the number of iterations within a given time budget), which means that optimality of
decision is not fully explored here. This work was published in the paper “Parallelization of
Information Set Monte Carlo tree search” appearing at IEEE CEC 2014 [123].
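The efficiency measure used here, the number of iterations completed within a given time budget, can be illustrated with a small sketch in which several threads update one shared tree under per-node locks. This is an assumption-laden toy rather than the published implementation: the tree has a single level, a biased coin flip stands in for a playout, and the names `Node`, `worker` and `run` are hypothetical. CPython's global interpreter lock also prevents real speedups, so the sketch shows only the structure of the measurement.

```python
import math
import random
import threading
import time


class Node:
    def __init__(self):
        self.lock = threading.Lock()  # guards visits and total_reward during updates
        self.visits = 0
        self.total_reward = 0.0

    def update(self, reward):
        with self.lock:
            self.visits += 1
            self.total_reward += reward


def ucb1(child, parent_visits, c=0.7):
    # Statistics are read without the lock, so they may be slightly stale.
    if child.visits == 0:
        return float("inf")
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def worker(root, children, win_probs, deadline, seed, counts, index):
    rng = random.Random(seed)
    iterations = 0
    while time.perf_counter() < deadline:
        parent_visits = max(root.visits, 1)
        best = max(range(len(children)),
                   key=lambda i: ucb1(children[i], parent_visits))
        # A biased coin flip stands in for a full simulated playout.
        reward = 1.0 if rng.random() < win_probs[best] else 0.0
        children[best].update(reward)
        root.update(reward)
        iterations += 1
    counts[index] = iterations


def run(num_threads, budget_seconds, win_probs=(0.2, 0.5, 0.8)):
    """Search one shared tree with num_threads threads; return the total iteration count."""
    root = Node()
    children = [Node() for _ in win_probs]
    counts = [0] * num_threads
    deadline = time.perf_counter() + budget_seconds
    threads = [
        threading.Thread(target=worker,
                         args=(root, children, win_probs, deadline, t, counts, t))
        for t in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(counts), root, children


if __name__ == "__main__":
    for n in (1, 2, 4):
        total, _, _ = run(num_threads=n, budget_seconds=0.2)
        print(f"{n} thread(s): {total} iterations in 0.2 s")
```

Comparing the iteration counts across thread counts for a fixed budget gives the kind of efficiency figure discussed above, while saying nothing about the quality of the decision the search would return.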