Using Graph-Matching to Infer Strategy in Multi-Agent Multi-Team

For SiMAMT, the lowest level of consideration is the action. For each agent, the available actions (e.g., move, cover, shoot, observe) are enumerated by its behavior model. The behavior model tells the agent which actions to take from a particular state, keeping the short-term goals in view. The behavior model is set up by the policy assigned to that agent by its strategy, and the strategy model assigns a policy to each member of the team.
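The layering described above (strategy assigns policies, a policy sets up a behavior, a behavior selects actions from a state) can be sketched roughly as follows. This is only an illustrative sketch: the class names, the state labels, the policy names, and the fallback to observing are assumptions made for the example, not part of SiMAMT as described in the text.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """The example actions named in the text."""
    MOVE = "move"
    COVER = "cover"
    SHOOT = "shoot"
    OBSERVE = "observe"


@dataclass
class Behavior:
    """Maps a state to an action (state labels here are hypothetical)."""
    action_for_state: dict

    def act(self, state):
        # Assumption: fall back to observing when the state is not covered.
        return self.action_for_state.get(state, Action.OBSERVE)


@dataclass
class Policy:
    """A policy sets up the behavior used by the agent it is assigned to."""
    name: str
    behavior: Behavior


@dataclass
class Strategy:
    """A strategy assigns one policy to each member of the team."""
    name: str
    assignments: dict  # player id -> Policy

    def action(self, player, state):
        return self.assignments[player].behavior.act(state)


# Hypothetical policies and states, for illustration only.
aggressive = Policy("aggressive", Behavior({"enemy_visible": Action.SHOOT,
                                            "path_clear": Action.MOVE}))
defensive = Policy("defensive", Behavior({"enemy_visible": Action.COVER}))

strategy = Strategy("sigma_example", {"player_1": aggressive,
                                      "player_2": defensive})
```

With these assumed assignments, `strategy.action("player_1", "enemy_visible")` yields `Action.SHOOT`, while the same state yields `Action.COVER` for `player_2`, which is the sense in which the strategy differentiates agents through their policies.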

The behavior gives the progression or regression of moves by position according to the probabilities defined by the behavior's MDD. Each MDD contains a series of moves. Each move has a position, a probability of choosing that position next, and a speed with which the player will move to that location. For example, a move may say that a player will move to position 5 with a probability of 0.25 and a speed of 0.75, or to position 6 with a probability of 0.75 and a speed of 0.5 (speed is expressed as a fraction of the player's maximum speed). These are the probabilities of the moves under the behavior; they do not mean that the player will move, only that if the player does move, these are the probabilities of moving to each of these locations. The impetus to move is controlled by both the local policy settings and the overall strategy (detailed below). Here again the individuality of the agent shows through the policy and the behavior, because the speed and the probability of the move are both modified by the player's own speed and willingness to move. These moves are chained together in probability clusters to form the bi-directional movement pathway at the core of each behavior. The pathway is bi-directional so that each current move knows both the next moves and the previous ones along the chain. With this information encoded in the behavior model, and with each policy having decision-making factors and a movement pathway, the policy can be incorporated into a strategy.
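To make the move structure concrete, the following is a minimal sketch of the position 5 / position 6 example. The names `Move`, `PathNode`, and `choose_move`, the current position 4, and the specific way willingness and player speed enter the choice (willingness gates whether the player moves at all, player speed scales the move speed) are assumptions for illustration; the text does not specify how these modifiers are applied.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Move:
    """One option in a behavior's movement pathway (hypothetical structure)."""
    position: int
    probability: float  # chance of choosing this position, given that the player moves
    speed: float        # fraction of the player's maximum speed


@dataclass
class PathNode:
    """A node in the bi-directional pathway: it knows next and previous moves."""
    position: int
    next_moves: list = field(default_factory=list)
    prev_moves: list = field(default_factory=list)


def choose_move(options, willingness, player_speed, rng):
    """Return (position, effective_speed), or None if the player stays put.

    willingness:  assumed probability in [0, 1] that the player moves at all.
    player_speed: assumed multiplier in [0, 1] for the player's own speed.
    """
    if rng.random() >= willingness:
        return None  # the player chose not to move this step
    weights = [m.probability for m in options]
    move = rng.choices(options, weights=weights, k=1)[0]
    return move.position, move.speed * player_speed


# The example from the text; the current position (4) is hypothetical.
node = PathNode(position=4,
                next_moves=[Move(5, 0.25, 0.75), Move(6, 0.75, 0.5)])

rng = random.Random(0)
picks = [choose_move(node.next_moves, 1.0, 1.0, rng) for _ in range(1000)]
share_6 = sum(1 for p in picks if p[0] == 6) / len(picks)
```

With willingness fixed at 1.0, roughly three quarters of the sampled moves should go to position 6, matching the stated probabilities. Lowering `willingness` makes the player stay in place more often without changing the relative odds between positions 5 and 6.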

Figure 5.20: SiMAMT Framework: SIE

There are several working elements within the SIE, as shown in Figure 5.20. The first stage consists of finite state automaton (FSA) models that encapsulate the various strategies that the system is aware of (plus one more that is built and modified in real time to account for a strategy that the system has not seen: the n+1 model inherent in the system). The graph matching algorithm compares these FSA models with the data generated by the simulation through the agent's observations to determine how well the observed actions match each individual strategy model. Figure 5.21 shows a sample of these observations, based on the Experiments scenario of 5-vs-5 Professional Speedball Paintball. The agent, denoted by the diamond shape, is located at position 306, looking around the right side of the obstacle towards position 509. The lighter colored enemy agents are observed (near positions 514 and 310), while the other two enemy agents are not observed (near positions 309 and 510). In this instance, the agent would collect two observations, namely that there are two agents currently at those locations, and pass them along to the inference engine. Note that the agent may or may not know which enemy agents these two are; they may or may not be identifiable. In either case the inference engine can still make its inferences, though knowing which agents it is observing helps the inference process. As will be discussed in the Experiments section, the agents in this simulation are unlabeled.

Figure 5.21: Agent Observation Example

This data is collected into the belief network for aggregation and analysis, and the result is then forwarded onward from the belief network. The belief network holds a single model for each known strategy, behavior, or other element at the current level. For example, if the current level is the strategy level and there are 4 known strategies, then there will be 5 models in the belief network (1 for each of the known strategies and 1 more for the model being built as the simulation runs). As the matching progresses, as detailed below, each node in the belief network continually updates its belief that its model is the best match to the one being used by the opposing team. The belief is essentially a percentage match of the observed elements with the known elements for each model.
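A small sketch of this bookkeeping, under stated assumptions, is given here. It treats each model as a set of field positions and defines belief as the fraction of observed positions that also appear in the model. The class `BeliefNetwork`, the strategy names, and the position sets are hypothetical; the actual SIE matches graphs of FSA models rather than plain sets, and the text does not say how the n+1 model is scored, so this sketch only accumulates it.

```python
class BeliefNetwork:
    """Holds one model per known strategy plus one model built from observations."""

    def __init__(self, known_models):
        # Each known model is simplified to a set of positions (an assumption).
        self.known = {name: frozenset(elems) for name, elems in known_models.items()}
        self.learned = set()   # the n+1 model, built as the simulation runs
        self.observed = set()  # everything observed so far

    def model_count(self):
        return len(self.known) + 1

    def observe(self, position):
        self.observed.add(position)
        self.learned.add(position)

    def beliefs(self):
        """Fraction of observed elements matched by each known model."""
        if not self.observed:
            return {name: 0.0 for name in self.known}
        total = len(self.observed)
        return {name: len(self.observed & elems) / total
                for name, elems in self.known.items()}


# Four hypothetical strategies over hypothetical position sets.
net = BeliefNetwork({
    "sigma_1": {306, 307, 308},
    "sigma_2": {310, 311, 512},
    "sigma_3": {310, 514, 509},
    "sigma_4": {514, 515, 516},
})

# The two unlabeled observations from the Figure 5.21 example.
net.observe(514)
net.observe(310)

beliefs = net.beliefs()
best = max(beliefs, key=beliefs.get)
```

In this toy setup the network holds 5 models (4 known plus the one under construction), and after the two observations the hypothetical `sigma_3` matches both of them while `sigma_2` and `sigma_4` each match one. Because the observations are plain positions, the sketch works for unlabeled agents, as in the simulation described above.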

Figure 5.18: Professional Speedball Paintball MDD. (a) Speedball Paintball (PSP) Field; (b) Movements for PSP

For illustration, the diagrams in Figure 5.22 show a sample strategy, σ3, and the corresponding underlying behaviors (derived from each policy) that are being used by an opposing team utilizing this strategy. As the current team makes observations of the field, it notes the locations and movements of the players on the other team. Each additional observation gives the SIE more data with which to update the belief network. In this example, the top figure shows the strategy with no observations. Each subsequent figure shows the progress of this node of the belief network as more observations are made. The result is the data shown in each subfigure, where the beliefs for each individual behavior being observed are aggregated into the total belief that this particular strategy is the one being observed. This output is then sent to the Evaluation Engine for decision making and factoring.
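The aggregation step can be sketched as follows. The text does not state how the behavior-level beliefs are combined, so this sketch assumes a plain arithmetic mean; the behavior names and belief values are made up for illustration and are not data from Figure 5.22.

```python
from statistics import mean


def strategy_belief(behavior_beliefs):
    """Aggregate per-behavior beliefs into one strategy belief.

    Assumption: an unweighted mean; the source does not specify the rule.
    """
    if not behavior_beliefs:
        return 0.0
    return mean(list(behavior_beliefs.values()))


# Hypothetical per-behavior beliefs for one strategy node after
# successive rounds of observation (first round: no observations).
rounds = [
    {"behavior_a": 0.0, "behavior_b": 0.0, "behavior_c": 0.0, "behavior_d": 0.0},
    {"behavior_a": 0.5, "behavior_b": 0.0, "behavior_c": 0.5, "behavior_d": 0.0},
    {"behavior_a": 1.0, "behavior_b": 0.5, "behavior_c": 1.0, "behavior_d": 0.5},
]

progress = [strategy_belief(r) for r in rounds]
```

Under the assumed mean, the strategy belief rises as the individual behavior beliefs rise. A weighted combination (for example, weighting behaviors by how many observations support them) would fit the same interface.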
