
2.4 Machine Learning

2.4.4 Reinforcement Learning

Reinforcement learning is described as learning how to map situations to actions so as to maximize a numerical reward [46]. An agent has a set of states and a set of possible actions within each state. The agent then attempts to choose the sequence of actions that will provide the largest reward. The algorithm uses a trial-and-error approach to navigate through the environment. The agent must decide between traveling old paths (exploitation) to obtain a known reward and exploring new paths (exploration) that may lead to different rewards. While navigating through an environment, a reinforcement learning agent must attempt to develop a policy that maximizes its reward [31]. Thus reinforcement learning provides a method of weighing the trade-off between long-term and short-term goals while maintaining online performance.
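
The trade-off between exploitation and exploration described above can be illustrated with a small sketch. The text does not name a specific algorithm, so the example below assumes tabular Q-learning with an epsilon-greedy action choice, which is one common way of realizing the idea. The five-state corridor environment, the function names and all parameter values (alpha, gamma, epsilon) are hypothetical and chosen only for illustration.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Choose a random action with probability epsilon (exploration),
    otherwise the action with the highest estimated value (exploitation)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties at random so that early episodes do not favour one action.
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def step(state, action, n_states):
    """Hypothetical corridor: action 0 moves left, action 1 moves right.
    Reaching the right-most state gives reward 1 and ends the episode."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0):
    """Learn action values by trial and error (assumed parameter values)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = epsilon_greedy(q[state], epsilon, rng)
            next_state, reward, done = step(state, action, n_states)
            # The target mixes the immediate reward with the discounted
            # value of the best action available in the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Map every state to the action with the highest learned value."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

In this sketch the discount factor gamma controls how strongly long-term reward is weighted against short-term reward, and epsilon controls how often the agent leaves a known path to try a new one.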

A reinforcement learner receives input only through the actions an agent takes at a particular time; the response to this input is a cumulative reaction arising from the current situational settings. The response to an action is never identified as being correct or incorrect, so the player will expect an action to produce the same response whenever the situational settings are the same, until they have learned a possible new response.

Reinforcement learning is particularly useful in game design because the training phase is carried out by matching player actions to the rules of the game world, which can be simulated to produce approximations of the actual response. Reinforcement learning does not require examples of correct behavior in order to learn, which is an advantage over supervised learning. In addition, it is difficult to produce an adequate cost function with which to perform unsupervised learning.

Chapter 3 Related Work

This section reviews current research on improving user experience within video games. Our review begins with a general model of user interaction within virtual entertainment known as the User-System Experience (USE) model. Section 3.2 delves into a refined area compatible with the USE model, which is a framework for adaptive game systems (AGS). The AGS framework focuses specifically on challenge and curiosity playability problems that were previously discussed in Section 2.1.6. Section 3.2.1 provides an introduction to the discussion and goals for auto-dynamic difficulty. Section 3.4 introduces methods of adapting the game system from a perspective of playability. The final section provides a detailed review of the three types of game adjustments: player characteristics, level design and non-playable characters.

3.1 User-System Experience (USE) Model

To address the issues of adaptive game systems, we require a model that explains user interaction with the system, with the goal of producing an enjoyable experience. The Person-Artefact-Task (PAT) model provides a framework that focuses on user interaction with a system from the perspective of optimizing the production resulting from a person's work with an artefact. Since the major focus of the PAT model was to optimize production as opposed to experience, Cowley and Black [19] felt it inadequately described a gaming system, which focuses on playability rather than usability. Cowley and Black also felt the presentation of Flow within the PAT model was inaccurate and could not be directly applied to games, in that it could not completely describe all experiences with a system. Thus, Cowley and Black adapted the PAT model to create the User-System Experience model.

The USE model provides an overview of how the user interacts with the system and provides insight into opportunities for adaptation from a usability and playability perspective. The USE model is capable of describing multiple types of usage experience; for example, when participation is low yet the individual is still interested in the game, they may experience a state of telepresence. Unlike other models, the USE model accounts for disinterest, participation, telepresence and variation in the level of Flow. The USE model, seen in Figure 3.1, is composed of three main components: the internal state of the user, the elements of the gameplay system and the usage experience.

Figure 3.1: The USE model separates computer and game system interaction into three sections: the player, interaction with the game system and the experience produced. Adapted from [14].

The internal state of the user can be dissected into three types of personal information. The first type, known as user typology, deals with the personality and player type of the user. User typologies such as Myers-Briggs and the DGD1, discussed in Section 2.2, describe the user's personality, including the preferences for which the experience should be optimized. The second type of player information is physical characteristics, which are unique to each individual and must be measured during game play. Physical characteristics would ideally be initialized to values relating to population means and then adjusted accordingly. Finally, the last type of player information deals with prior system experience. This means the player has gained experience, knowledge or skills through playing other games.
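
The idea of starting a physical characteristic at a population mean and adjusting it from measurements can be sketched as follows. The source does not specify an update rule, so this example assumes a simple exponential moving average; the class name, the `rate` parameter and the reaction-time figures in the usage note are hypothetical.

```python
class CharacteristicEstimate:
    """Running estimate of one measured player characteristic.

    The estimate starts at a population mean and moves toward each new
    measurement. The exponential moving average is an assumed update rule,
    not one taken from the USE model itself.
    """

    def __init__(self, population_mean, rate=0.2):
        if not 0.0 < rate <= 1.0:
            raise ValueError("rate must be in (0, 1]")
        self.value = float(population_mean)
        self.rate = rate

    def update(self, measurement):
        """Shift the estimate a fraction `rate` of the way to the measurement."""
        self.value += self.rate * (measurement - self.value)
        return self.value
```

For instance, an estimate created as `CharacteristicEstimate(300.0, rate=0.5)` for a reaction time in milliseconds would move halfway toward each measurement taken during play, so early values stay close to the population mean and later values reflect the individual player.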

The second component of the USE model is the game play system, which is composed of two portions: Artefacts and In-App tools. Artefacts are external means of communication between the player and the game, such as the game-pad or speakers. The In-App toolset is essentially the game itself; it contains the methods by which the player interacts with and views the game world.

The third component of the USE model deals with user experience. The user experience is defined on two axes: the level of engagement from the player and the level of complexity of the task. At the lowest level of engagement the player is disinterested in playing the game. As the player's participation level increases they can experience telepresence, and with further participation they can experience a state of Flow. The Flow experience can take on two forms: soft Flow or hard Flow. Soft Flow occurs when the player has already mastered portions of the game; they are still engaged in game play, but their experience is enhanced mostly by creating internalized challenges. Hard Flow occurs while the player is still highly involved in the learning process and the challenges are still explicit and require a high level of player skill.

The USE model illustrates user-system interaction with the goal of optimizing experience; however, it does not provide a detailed description of how to perform adaptation within the system. Thus the next section will introduce the required higher-resolution model to illustrate the process of adapting the system while optimizing the player's experience.

In document Game Challenge: A Factorial Analysis Approach (Page 60-64)