1.3 Thesis Scope

1.3.3 Problem settings

In this section we introduce and motivate the sub-problems of Problem 1.1 considered in this thesis and key characteristics of the problem settings.

Online planning and adaptivity

Online algorithms produce sequences of decisions based on historical data while also considering the impact of these decisions on overall performance (Karp, 1992; Borodin and El-Yaniv, 1998). This type of decision making is relevant to settings where there is uncertainty as to what will happen in the future, such as in memory caching and network routing. Robotics generally, including the problems considered in this thesis, also falls into this category, as planning decisions need to be made with respect to uncertain estimates of the environment, observations, team behaviour, etc.

One way to address online problem settings is to develop adaptive algorithms, such that each planned action is a function of the most recent observation (Hollinger et al., 2013). Solutions of this form are represented as a policy tree that branches on observations. This formulation is a general representation of sequential decision processes and, if solved optimally, guarantees the best possible performance. However, these benefits come at the cost of significant computational resources, which may not be available onboard robots. Additionally, this approach requires specifying the set of all possible observations in advance, and this set is often unknown or prohibitively large. Dec-POMDP solutions typically take this approach (Oliehoek and Amato, 2016).

In contrast, non-adaptive algorithms plan a fixed sequence of actions that is intended to be executed no matter which observations are received. This simplification typically enables the use of more efficient solution algorithms. It is common to then replan online whenever new observations are received that result in changes to the belief of the world (i.e., when the objective function g changes). The performance gap between an optimal adaptive algorithm and an optimal non-adaptive algorithm is referred to as the adaptivity gap; in certain classes of information gathering problems, this adaptivity gap is bounded (Hollinger et al., 2013).
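The replanning pattern described above can be illustrated with a minimal sketch. Everything in it is a hypothetical toy chosen for illustration, not the algorithms proposed in this thesis: the one-dimensional grid world, the brute-force planner, the sensor that reveals the rewards of adjacent cells, and the names `plan_open_loop` and `run_with_replanning` are all assumptions. The planner selects a fixed action sequence that maximises expected reward under the current belief, the robot executes only the first action, and the plan is recomputed once the observation has updated the belief.

```python
from itertools import product


def clamp(cell, n_cells):
    """Keep a cell index inside the grid."""
    return min(max(cell, 0), n_cells - 1)


def plan_open_loop(belief, position, horizon):
    """Non-adaptive planner: return the fixed sequence of moves (-1 or +1)
    with the highest expected reward under the current belief.
    Each cell's expected reward is counted at most once per plan."""
    best_seq, best_value = None, float("-inf")
    for seq in product((-1, 1), repeat=horizon):
        pos, visited, value = position, set(), 0.0
        for step in seq:
            pos = clamp(pos + step, len(belief))
            if pos not in visited:
                value += belief[pos]
                visited.add(pos)
        if value > best_value:  # strict comparison keeps the first best plan
            best_seq, best_value = seq, value
    return best_seq


def run_with_replanning(true_reward, belief, position, horizon, steps):
    """Execute the first action of each plan, observe, update the belief, replan."""
    true_reward, belief = list(true_reward), list(belief)
    path, collected = [position], 0.0
    for _ in range(steps):
        seq = plan_open_loop(belief, position, horizon)
        position = clamp(position + seq[0], len(belief))
        path.append(position)
        # Assumed sensor model: the true rewards of the current and
        # adjacent cells are revealed on arrival.
        for cell in (position - 1, position, position + 1):
            if 0 <= cell < len(belief):
                belief[cell] = true_reward[cell]
        # Collect the reward at the current cell; it cannot be collected again.
        collected += true_reward[position]
        true_reward[position] = belief[position] = 0.0
    return path, collected
```

For example, with true rewards `[0, 0, 0, 0, 3, 0, 0]`, prior belief `[0, 2, 2, 0, 1, 1, 1]`, start cell 3, a horizon of 2 and 3 steps, the first plan heads left because the prior is optimistic there. The observations then show those cells to be empty, and replanning redirects the robot to the right, where it collects the reward at cell 4.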

In this thesis we propose non-adaptive algorithms with replanning when new observations are received. We feel this is the most suitable approach for these problem settings since our intention is for the algorithms to be computed onboard the robots; however, we acknowledge that the choice between adaptive and non-adaptive algorithms is a contentious issue in the planning community.

Decentralised and centralised coordination

In Chapters 3 and 6, we address formulations of Problem 1.1 in decentralised settings. We broadly define these decentralised settings as follows. We assume each robot r knows the global objective function g(x), but does not know the actions x(r) selected by the other robots. We assume that robots can communicate during planning-time to improve coordination. The communication channel may be unpredictable and intermittent, and all communication is asynchronous. Therefore, each robot will plan based on the information it has available locally. Bandwidth may be constrained and therefore message sizes should remain small, even as the plans grow. Although we do not consider explicitly planning to maintain communication connectivity, this may be encoded in the objective function g(x) if a reliable communication model is available.

In Chapter 4, we address a formulation of Problem 1.1 in a centralised setting. This setting assumes there is a single server that plans on behalf of all robots, which is a reasonable assumption in some contexts. The server has the advantage of having full control over what plan xr all robots will execute, meaning it can plan over the space of joint action sequences x without requiring communication.

plans with respect to the other robot (the target). We assume that the target has already created its plan, and then present a single-robot algorithm for the tracker that considers probabilistic predictions of how the target may execute its plan. This type of planning is referred to as decoupled planning, and is often useful in applications where there is a hierarchy of importance for individual tasks of the robots, as in Chapter 5. In Chapter 6, this particular problem is generalised to a multi-tracker setting and solved in a decentralised manner.

Informative viewpoint regions

The problem we formulate and address in Chapter 4 aims to capture the viewpoint dependency of observation rewards in an efficient manner. Most approaches for active perception, including one of the examples provided in Chapter 3, estimate the value of visiting candidate viewpoints by simulating predicted observations (van Hoof et al., 2014; Wu et al., 2015; Patten et al., 2016). For complex sensor models, these predictions can be computationally expensive, which restricts the capabilities of planning algorithms.

Instead, in Chapter 4 we formulate perception tasks by extracting informative features of the scene to be observed. This is defined using an inverse sensor model that generates a discrete set of overlapping continuous viewpoint regions, with associated rewards, where each feature can be observed. This problem can be thought of as a new generalisation of the orienteering problem (Vansteenwegen et al., 2011; Gunawan et al., 2016). Figure 1.4 illustrates an example outdoor scene that has been decomposed into a problem of this form. One advantage of this formulation is that it allows us to develop non-myopic planners that exploit its characteristics to plan efficiently over continuous space.
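As a rough illustration of this kind of objective, the following sketch evaluates a team plan against a set of viewpoint regions. It is a simplified stand-in under stated assumptions, not the formulation of Chapter 4: regions are modelled as full discs rather than circle segments, paths are lists of waypoints, a region counts as visited if any waypoint of any robot lies inside it, and each region's reward is counted once for the whole team. The names `Region` and `team_reward` are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Region:
    """Hypothetical viewpoint region: a disc with an associated reward."""
    cx: float
    cy: float
    radius: float
    reward: float

    def contains(self, x, y):
        return hypot(x - self.cx, y - self.cy) <= self.radius


def team_reward(regions, paths):
    """Weighted sum of the regions visited by at least one robot.
    Assumption: a region visited several times is rewarded only once."""
    visited = set()
    for path in paths:
        for x, y in path:
            for index, region in enumerate(regions):
                if region.contains(x, y):
                    visited.add(index)
    return sum(regions[index].reward for index in visited)


# Two overlapping regions and one distant region; two robot paths.
regions = [Region(0, 0, 1, 2.0), Region(1, 0, 1, 3.0), Region(5, 5, 1, 4.0)]
paths = [[(0.5, 0.0)], [(1.5, 0.0), (5.0, 5.5)]]
total = team_reward(regions, paths)
```

In this example a single waypoint placed in the overlap of the first two regions collects both rewards, which is the property that makes overlapping continuous regions attractive for planning. The second robot's visit to the already covered region adds nothing.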

Figure 1.4 – Chapter 4 active perception problem formulation. Illustration of the motivating active perception problem. Each object segment (point clouds) is observed by visiting the viewpoint regions (circle segments). Grey cylinders represent positions of two robots. The currently visited viewpoint regions are drawn in bold. Black lines represent the path plans. The aim is to collectively maximise the weighted sum of viewpoint regions visited by the robots.

Mission monitoring

In Chapters 5 and 6 we define and address the mission monitoring problem. Mission monitoring is a supervisory problem where one or more robots or manually driven vehicles track the progress of an autonomous mobile robot or other agent in performing a pre-planned task. There are many examples of such tasks that require monitoring, including undersea surveys, environmental monitoring, autonomous farming and planetary exploration. Monitoring allows for rapid response to failures and to important information that the robot may discover during the progress of its mission (German et al., 2012; Hagen et al., 2008; Yilmaz et al., 2008; Khatib et al., 2016). Additionally, the monitoring vehicle may augment mission capabilities by providing observations from external viewpoints, such as for accurate localisation and navigation (Fallon et al., 2010; Heppner et al., 2013; Klodt et al., 2015; Saska et al., 2014; Kottege and Zimmer, 2011) or online sensor calibration (Bongiorno et al., 2013). The motion of the robot is typically represented by a mission plan, which may be defined probabilistically to take into account uncertain vehicle dynamics, environment models and mission objectives (Karydis et al., 2015; Chiang et al., 2014; Aoude et al., 2013). We consider the case where the monitor vehicles must remain stationary in order to observe or communicate with the robot, which is motivated by marine robotics practices where communication equipment is most efficient while stationary. This problem is important because it is an essential part of employing outdoor robots for certain real-world tasks, such as various underwater missions (German et al., 2012), that depend on timely transmission of sensor observations or system faults. It is also interesting in broader contexts because it applies to systems that must stop periodically to conserve energy (Brockers et al., 2011), to provide imagery taken from stationary viewpoints (Naseer et al., 2013), and for acoustically covert surveillance (Dunbabin and Tews, 2012).

Figure 1.5 – The multi-tracker mission monitoring problem (Chapter 6). A probabilistic prediction model for a robot trajectory (30 min AUV mission) is shown as blue sample trajectories moving upwards through time. A plan for a tracker team (3 surface vessels) is shown in black. Cylinders represent probabilistic monitoring regions at stopping locations. The objective can be interpreted geometrically as maximising the expected overlap between the cylinders and the prediction model.

A geometric interpretation of mission monitoring for the case where there are multiple monitoring vehicles is shown in Figure 1.5. The optimisation problem is for the monitor vehicles to decide where to stop (centre of cylinders), and when to move to the next observation location (height of cylinders), in order to best observe the probabilistic prediction model (blue lines). In Chapter 5 we formulate and address the case where there is one monitoring vehicle, and in Chapter 6 we generalise this problem for cases where there are multiple monitoring vehicles that coordinate. We propose algorithms that exploit geometric characteristics of these problems.
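The geometric interpretation can be made concrete with a small Monte Carlo sketch. It is only an illustration under simplifying assumptions, not the objective used in Chapters 5 and 6: each cylinder is taken to be a hard detection region with a fixed radius (the thesis describes the monitoring regions as probabilistic), the prediction model is given as a set of equally weighted sample trajectories, and the overlap is measured as the fraction of trajectory samples that fall inside some cylinder. The function name `expected_overlap` and the data layout are hypothetical.

```python
from math import hypot


def expected_overlap(samples, stops, radius):
    """Estimate the expected fraction of the mission during which the robot
    is inside a monitoring cylinder.

    samples: list of sample trajectories, each a list of (t, x, y) points.
    stops:   list of (x, y, t_start, t_end), one cylinder per stop;
             (x, y) is the cylinder centre, [t_start, t_end] its height.
    radius:  assumed hard monitoring radius shared by all cylinders.
    """
    total = 0.0
    for trajectory in samples:
        inside = 0
        for t, x, y in trajectory:
            if any(
                t_start <= t <= t_end and hypot(x - sx, y - sy) <= radius
                for sx, sy, t_start, t_end in stops
            ):
                inside += 1
        total += inside / len(trajectory)
    return total / len(samples)


# Two sample trajectories of the monitored robot over four time steps.
samples = [
    [(0, 0.0, 0.0), (1, 1.0, 0.0), (2, 2.0, 0.0), (3, 3.0, 0.0)],
    [(0, 0.0, 0.0), (1, 1.0, 1.0), (2, 2.0, 2.0), (3, 3.0, 3.0)],
]
# One monitor vehicle that stops twice; the gap between stops is travel time.
stops = [(0.5, 0.0, 0, 1), (2.5, 0.0, 2, 3)]
overlap = expected_overlap(samples, stops, radius=1.0)
```

Here the first sample trajectory stays inside the cylinders for the whole mission, while the second diverges after the first time step, so the estimate is the average of the two fractions. Choosing the stop locations and durations to maximise such an estimate corresponds to the decisions of where to stop and when to move described above.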