Chapter 3 | Multi Agent Systems
3.1 Agent Technology
This section starts by defining agents and provides a brief introduction to agent
technologies. The field of Artificial Intelligence (AI) has evolved rapidly over recent years across its various disciplines (Karaboga et al., 2014; Luxton, 2015; Millington & Funge, 2016;
Weiss, 2013; Wooldridge, 2008; Zhang & Zhou, 2014). Agent technology is one of the research disciplines of interest to this thesis. Various definitions of an agent exist, each based upon the role or function that an agent performs in a problem domain (Carnmarata, McArthur & Steeb, 2014; Franklin & Graesser, 1996; Spiro, Bruce &
Brewer, 2017; Wooldridge, 2008). In the literature, specific reference is made to the concept of an autonomous agent, which is of importance to the CESIMAS model. An autonomous
agent can be defined as a computational entity that operates in a problem domain
(Johnson, Gratch & DeVault, 2017; Kerr & Szelke, 2016; Russell & Norvig, 2010; van Niekerk
& Ehlers, 2016). An agent can perceive its environment and perform actions to effect changes in that environment (Johnson et al., 2017; van Niekerk & Ehlers, 2016; Weiss, 2013;
Wooldridge, 2008) based upon the goals instilled in it (Carnmarata et al., 2014;
Franklin & Graesser, 1996; Karaboga et al., 2014; Luxton, 2015; Millington & Funge, 2016;
Zhang & Zhou, 2014).
At a very high level, agents perceive their environment through sensors, deliberate on the resulting percepts, and execute actions within the environment through actuators, as depicted in Figure 3.1 (Bear & Rand, 2016; Russell & Norvig, 2010; Karaboga et al., 2014;
Velmovitsky, Briot, Viana & de Lucena, 2017). A typical utility-driven agent has some form of deliberation process, referred to as a utility function (Al-Ayyoub, Jararweh, Daraghmeh & Althebyan, 2015; Bear & Rand, 2016; Velmovitsky et al., 2017). The purpose of a utility function is to enable an agent to determine whether a chosen action will yield an acceptable utility or state (Bear & Rand, 2016; Fan, Yang & Zhang, 2016;
McHugh, Yammarino, Dionne, Serban, Sayama & Chatterjee, 2016; Stummer, Kiesling, Günther & Vetschera, 2015). Figure 3.1 depicts the basic structure of a utility agent.
Figure 3.1: Basic structure of a Utility Agent (Russell & Norvig, 2010)
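The perceive, deliberate and act cycle described above can be illustrated with a minimal sketch. The sketch is not taken from the cited literature: the one-dimensional environment, the target state, the three possible actions, and the names UtilityAgent and run are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class UtilityAgent:
    """Toy utility agent (hypothetical names, for illustration only)."""
    actions: Sequence[int]
    transition: Callable[[int, int], int]  # (state, action) -> predicted next state
    utility: Callable[[int], float]        # state -> utility of being in that state

    def deliberate(self, percept: int) -> int:
        # Score each action by the utility of its predicted outcome
        # and return the action with the highest score.
        return max(self.actions,
                   key=lambda a: self.utility(self.transition(percept, a)))


def run(agent: UtilityAgent, state: int, steps: int) -> List[int]:
    """Run the perceive -> deliberate -> act cycle and record visited states."""
    trace = [state]
    for _ in range(steps):
        percept = state                            # sensor: the toy state is fully observable
        action = agent.deliberate(percept)         # deliberation via the utility function
        state = agent.transition(state, action)    # actuator: apply the chosen action
        trace.append(state)
    return trace


TARGET = 3  # assumed goal state
agent = UtilityAgent(
    actions=(-1, 0, 1),
    transition=lambda s, a: s + a,
    utility=lambda s: -abs(TARGET - s),  # utility rises as the agent nears the target
)
trace = run(agent, state=0, steps=5)
print(trace)  # [0, 1, 2, 3, 3, 3]
```

In this sketch the agent moves towards the target and then remains there, because no other action offers a higher utility. The environment is deliberately static and fully known, which is the simplest case an agent can face.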
The internalisation of a utility measure, often referred to as a performance measure (Colson
& Nehrir, 2013; McHugh et al., 2016; Weiss, 2013), enables an agent to consistently act
rationally. From an agent’s perspective, rationality means making the decision that yields the highest possible utility, given a set of percepts, a basic understanding of the state of the environment, and the possible actions that can be executed (Colson & Nehrir, 2013; McHugh et al., 2016; Lesser et al., 2012; Weiss, 2013). The extent to which utility agents can achieve their expected utility depends on how dynamic their operating environment is and on the level of autonomy they exercise. Autonomy refers to an agent’s ability to make decisions based on incomplete and inconsistent information
(Hexmoor, Castelfranchi & Falcone, 2012; Johnson et al., 2017; McHugh et al., 2016; Russell, 2016). To achieve rational autonomy, an agent requires the ability to learn. Learning can be achieved through the implementation of a direct-control-feedback loop, as shown in Figure 2.4. An agent should be able to perceive its environment,
deliberate on a decision, execute the decision and then learn from the outcome, all of which should occur autonomously.
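One possible reading of such a feedback loop is sketched below. It is an assumption-laden illustration, not the loop shown in Figure 2.4: the incremental averaging rule, the learning rate of 0.5, the action names and the fixed outcome utilities are all hypothetical.

```python
from typing import Dict, List, Sequence


class LearningAgent:
    """Keeps a utility estimate per action and refines it from feedback."""

    def __init__(self, actions: Sequence[str], learning_rate: float = 0.5) -> None:
        self.estimates: Dict[str, float] = {a: 0.0 for a in actions}
        self.untried: List[str] = list(actions)
        self.learning_rate = learning_rate  # assumed value, chosen for illustration

    def deliberate(self) -> str:
        # Try every action once, then choose the highest current estimate.
        if self.untried:
            return self.untried.pop(0)
        return max(self.estimates, key=lambda a: self.estimates[a])

    def learn(self, action: str, observed_utility: float) -> None:
        # Move the estimate part of the way towards the observed outcome.
        error = observed_utility - self.estimates[action]
        self.estimates[action] += self.learning_rate * error


# Hypothetical environment: each action always returns the same utility.
OUTCOMES = {"ignore": 0.2, "monitor": 0.5, "patch": 1.0}

agent = LearningAgent(actions=list(OUTCOMES))
history: List[str] = []
for _ in range(6):
    action = agent.deliberate()    # deliberate on a decision
    utility = OUTCOMES[action]     # execute it and perceive the outcome
    agent.learn(action, utility)   # learn from the outcome
    history.append(action)

print(history)  # ['ignore', 'monitor', 'patch', 'patch', 'patch', 'patch']
```

After trying each action once, the agent settles on the action with the highest observed utility without any external instruction. Its estimates are only as good as the outcomes it has seen, which is why a changing environment makes continual learning necessary.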
The agent characteristics described so far seem reasonably easy to implement in an
environment whose state is always known. Unfortunately, implementation is not as easy when the operational environment is dynamic (Kowalski & Sadri, 1999; McHugh et al., 2016; Russell, 2016; Weiss, 2013; Wooldridge, 2008). The problem is further complicated if external variables are in play. CIIP is one such example. As discussed in Chapter 2, CII operate in a dynamic environment where the only constant element is change. To operate within such a dynamic environment, an agent requires a set of capabilities. These capabilities include the ability to internalise utility, a performance measure function that measures the level of utility and includes the ability to deliberate, and, most importantly, the ability to learn from the outcomes of the decisions made.
So far, Section 3.1 has discussed basic agent characteristics and capabilities that are useful if an operational environment is isolated from external inputs and has only a defined number of states. Unfortunately, CIIP is not such an environment. Vulnerabilities are
identified and exploited on a continual basis by both internal and external sources. If agents were deployed to the problem domain of CIIP, they would have to be instilled with a level of intelligence that would enable them to act autonomously and make intelligent decisions on a continual basis.
At a very high level, agent intelligence can be defined as making a decision which yields the highest utility (Bringsjord & Govindarajulu, 2012; McHugh et al., 2016; Russell, 2016;
Sternberg & Kaufman, 2013). Beyond this fundamental description, an agent needs access to some form of knowledge repository so that it can draw on its “memory”
during the deliberation process. Intelligent agents should be able to learn from previous decisions to improve future decisions and the utility that can be obtained (Esteban
& Insua, 2017; Sternberg & Kaufman, 2013; Weiss, 2013; Wooldridge, 2008).
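The role of a knowledge repository during deliberation can be sketched as follows. This is a hypothetical illustration only: the KnowledgeRepository class, the situation and action names, the recorded utility values, and the choice of the mean as the recall rule are assumptions rather than elements of the cited work.

```python
from collections import defaultdict
from statistics import mean
from typing import DefaultDict, List, Sequence, Tuple


class KnowledgeRepository:
    """Stores the utilities observed for each (situation, action) pair."""

    def __init__(self) -> None:
        self._records: DefaultDict[Tuple[str, str], List[float]] = defaultdict(list)

    def remember(self, situation: str, action: str, utility: float) -> None:
        self._records[(situation, action)].append(utility)

    def recall(self, situation: str, action: str, default: float = 0.0) -> float:
        # Average of past outcomes, or a default when nothing is remembered.
        outcomes = self._records.get((situation, action))
        return mean(outcomes) if outcomes else default


def decide(repository: KnowledgeRepository, situation: str,
           actions: Sequence[str]) -> str:
    # Deliberation consults memory: pick the action with the best recalled utility.
    return max(actions, key=lambda a: repository.recall(situation, a))


ACTIONS = ("block", "alert")  # hypothetical actions
memory = KnowledgeRepository()
memory.remember("port_scan", "block", 0.9)
memory.remember("port_scan", "alert", 0.4)
memory.remember("failed_login", "block", 0.2)
memory.remember("failed_login", "alert", 0.7)
memory.remember("failed_login", "alert", 0.9)

choice_scan = decide(memory, "port_scan", ACTIONS)
choice_login = decide(memory, "failed_login", ACTIONS)
print(choice_scan, choice_login)  # block alert
```

The same agent selects different actions in different situations because its decision depends on what it has remembered, not on a fixed rule. For a situation with no recorded outcomes, every action falls back to the default value, so the choice carries no information until experience has been gathered.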
Dynamic environments such as CIIP require more than one agent to be deployed
concurrently within the environment. Section 3.2 discusses Multi Agent Systems (MASs), what they encompass, and some of the desirable characteristics relevant to the problem domain of CIIP.