6.2 Future Work
6.2.3 Surrogate-Assisted Evolutionary Algorithms
Based on the results for SAPEO in section 5.2.1.3, we conclude that model validation deserves further investigation. Even stricter model validation might further improve performance. In this context, it should be investigated where the behaviour switches and SAPEO always falls back on the underlying algorithms, since in these cases no improvements are made either.
An additional result of the analysis is that convergence detection mechanisms, especially those in CMA-ES, could potentially detect whether the number of selection errors is large. If this hypothesis can be verified, the observation could be used either to restart SAPEO or to adapt the strictness of model validation.
As many game optimisation problems also include mixed-integer search spaces (see section 6.2.2), it would also be interesting to investigate how SAPEO performs in conjunction with other surrogate models. Potential candidates are bandit models, which require only minor assumptions and work well for problems with small search spaces but noisy fitness functions [75].
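To illustrate why bandit models suit small search spaces with noisy fitness functions, the following is a minimal sketch of the standard UCB1 rule over a small discrete candidate set. It is not necessarily the model used in [75] and is not integrated with SAPEO. The candidate encoding, the `noisy_win` fitness function and all parameter values are assumptions made purely for illustration.

```python
import math
import random


def ucb1(arms, evaluate, budget, seed=0):
    """Maximise a noisy fitness in [0, 1] over a small set of candidates."""
    if budget < len(arms):
        raise ValueError("budget must allow one evaluation per candidate")
    rng = random.Random(seed)
    counts = [0] * len(arms)   # evaluations spent on each candidate
    means = [0.0] * len(arms)  # running mean fitness of each candidate
    for t in range(1, budget + 1):
        if t <= len(arms):
            i = t - 1  # evaluate every candidate once first
        else:
            # Pick the candidate with the highest upper confidence bound.
            i = max(
                range(len(arms)),
                key=lambda k: means[k] + math.sqrt(2.0 * math.log(t) / counts[k]),
            )
        reward = evaluate(arms[i], rng)
        counts[i] += 1
        means[i] += (reward - means[i]) / counts[i]  # incremental mean update
    best = max(range(len(arms)), key=lambda k: means[k])
    return arms[best], means, counts


def noisy_win(candidate, rng):
    """Hypothetical noisy fitness: 1.0 for a win, 0.0 for a loss."""
    speed, has_shield = candidate
    p_win = 0.05 + 0.15 * speed + (0.3 if has_shield else 0.0)
    return 1.0 if rng.random() < p_win else 0.0


# Hypothetical search space: an integer parameter and a boolean parameter.
candidates = [(speed, shield) for speed in (1, 2, 3) for shield in (False, True)]
budget = 2000
best, means, counts = ucb1(candidates, noisy_win, budget)
```

The rule spends most of the budget on candidates whose estimated fitness is high or still uncertain. This is what makes it robust to noise, but it also restricts it to search spaces small enough that every candidate can be evaluated repeatedly.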
Furthermore, it should be investigated why the surrogate-assisted algorithms tested in our experiments mostly performed below our expectations. It is possible that these algorithms were only intended for a small subset of problems with very low budgets and specific properties. If that is the case, future work could investigate whether performance can be improved overall. If it is not, additional implementations of these algorithms should be tested. One algorithm that should definitely be run is EGO with full global models.
Appendix A Game Evaluation Survey
In the following, we first identify relevant areas of research in the field of Artificial and Computational Intelligence in Games. We then survey work from these areas and classify the described approaches according to our taxonomy as described in section 4.1. The presented work is grouped by type of game (content) that is evaluated. We hope to identify dominant and unexplored methods using this structure. This analysis will be visually supported by displaying the publications in tables based on our taxonomy (cf. Tab. 4.1).
A.1 Characterisation of Game Evaluation AIs
Each arrow in Fig. 4.1 describes an information processing step which can be executed by a human or an AI. In the case of automatic processing, as addressed in this survey, all steps need to be executed by an AI. The employed AIs can be classified using the taxonomy presented in [160]. This is done in the following in order to identify areas with relevant literature.
In terms of the End User (Human) Perspective according to the paper, the paths PLAYER, COMP and STAT (blue arrows) all model player behaviour. PLAYER is intended to predict the actions of a player within the game context, while STAT models the behaviour of a whole group of players in terms of gameplay statistics. In contrast, COMP models player behaviour in the sense that it aggregates gameplay data into statistics (e.g. average final score), thus potentially biasing it by selecting specific statistics.
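The aggregation performed by COMP can be sketched as follows. This is a minimal illustration only: the playtrace format (a dict with `actions`, `scores` and `won`) and the chosen statistics are assumptions, not taken from the surveyed work.

```python
from statistics import mean


def aggregate_playtraces(playtraces):
    """Condense raw playtraces into a few summary statistics (COMP step)."""
    if not playtraces:
        raise ValueError("need at least one playtrace")
    return {
        # Final score of each trace, averaged over all traces.
        "average_final_score": mean(t["scores"][-1] for t in playtraces),
        # Number of actions taken per trace, averaged over all traces.
        "average_length": mean(len(t["actions"]) for t in playtraces),
        # Fraction of traces that ended in a win.
        "win_rate": mean(1.0 if t["won"] else 0.0 for t in playtraces),
    }


# Hypothetical playtraces; the format is an assumption for this sketch.
playtraces = [
    {"actions": ["left", "jump", "right"], "scores": [0, 10, 30], "won": True},
    {"actions": ["right", "right"], "scores": [0, 5], "won": False},
]
stats = aggregate_playtraces(playtraces)
```

Everything not captured by the three chosen statistics, such as the order of actions, is discarded at this point, which is where the selection of statistics can bias the subsequent evaluation.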
The processes depicted in red, i.e. CODE, OUT and PLAY, all describe an evaluation of content in terms of the End User (Human) Perspective. In all cases, the game or content is evaluated in terms of a goal that is defined a priori. While CODE uses a direct evaluation based on an encoding of the content, PLAY and OUT evaluate the content based on further data that is generated from it.
The intended end user of game evaluation is mainly the game designer, but of course producers/publishers are indirectly affected as well. Depending on how the evaluation results are used, researchers have a stake in game evaluation as well.
In case of the red arrows, it is very clear that the methods employed here fall into the research area of player modelling. PLAYER, however, describes some form of player AI which relates to research in nonplayer character (NPC) behaviour learning and search and planning, depending on the game in question. General game AI also ties into this process, as the AI generating playtraces should be as general as possible in order to deal with different levels and rulesets equally well. Additionally, as the AI in case of PLAYER serves as a stand-in for a human playtester, research in believable agents is relevant here as well.

Table A.1: Publications applying game (content) evaluation to grid-based games. Research on platformers is displayed in blue, on dungeons in green and on general arcade games in red.
Columns (feedback): none (NONE), implicit (IMP), explicit (EXP)
Rows (input):
encoding (CODE): [58,92,121,122,129,130] [82] [26,88] [5] [117]
outcome statistics (OUT): [63] [26,59,97,98,135]
gameplay data (PLAY): [46,126] [125] [84] [119]

COMP is the selection of statistics that characterise a playtrace appropriately and is thus most related to player modelling. Similarly, while there are not currently many examples of methods that follow STAT, they could be realised using machine learning
methods and would then most likely fall under player modelling as well.
Finally, research on AI-assisted game design naturally includes various forms of game evaluation. The fields of procedural content generation and computational narrative do not relate directly to any of the AIs in the figure. However, publications using search-based algorithms in both fields often employ processes that are visualised in Fig. 4.1 in order to evaluate the generated content and are thus relevant as well [116, ch. 2].