Problem formulation - Planning Algorithms for Multi-Robot Active Perception

This section formally defines the active perception formulation considered in this chapter, as a subproblem of Problem 1.1. The objective is to plan the paths for a team of robots such that they maximally observe a set of nodes in the environment with varying rewards. Each robot has an associated travel speed and maximum travel budget. Each node may be observed by visiting any point in its associated viewpoint region, represented as a polygon. These nodes, viewpoint regions and rewards may be defined to meet the objectives of the application; in Sections 4.5 and 4.6 we formulate example problem instances for perceiving 3D point-cloud objects and incorporating exploration objectives. In this section we define the problem by considering the objectives that are currently known at a given time instance. Though we are interested in solving this problem in an online setting such that the plans adapt as new information is discovered, and a formulation for these scenarios is developed further in Section 4.6.

4.2.1 Multi-robot team

A team of R robots is denoted R = {r1_{, r}2_{, ..., r}R_}_{. The trajectory of each robot r}i is defined as a sequence of waypoints Xi _{= (x}i

1, xi2, xi3, ...), where each waypoint is

a position within a free space environment xi

j ∈ R2. Each robot ri moves along a straight line between waypoints at a constant speed si, which may be different for each robot. The cost of each robot’s path ci _≥_{0 is the time taken to travel through} the sequence of waypoints Xi_{. Each robot has a cost budget b}i _> _{0, and a set of} robot paths {Xi_}_{is deemed to be feasible if every robot meets its budget constraint,} i.e., ci _{≤ b}i_{, ∀r}i _{∈ R}_{. We address several possible conditions for the start and end} positions of the robots: (1) the start and end positions are unconstrained and free to be selected by the planner, (2) the start positions are unconstrained but the robots must end at their start position, and (3) the start and/or end position is fixed.

4.2.2 Viewpoint regions and rewards

The robots aim to observe a set of N nodes N = {n1_{, n}2_{, ..., n}N_}_{at different locations} within the environment. Every node has a weight wk _>_{0 that defines the reward for} observing the node. Each node nk _{has a continuous set of viewpoints Z}k_{defined as all} points on and within a simple polygon. The robot observes a node if any waypoint of the robot’s path is within the viewpoint region, i.e., ∃xi

j ∈ Xi : xij ∈ Zk. The binary indicator variable ok _{∈ {}0, 1} for each node nk is 1 if the node is observed by one or more robots and 0 otherwise. All robots sense continuously along their paths, which can be taken into account in the above definition by adding additional waypoints along a path at no extra cost. Although we assume the regions are the same for each robot, the algorithm can easily be extended to robot-dependent observations.

The presented solution is applicable for any set of viewpoint regions and rewards. In active perception formulations, these viewpoint regions can be used to represent locations where key points of interest may be observed. The associated reward may encode the importance of observing the point of interest. Later in Section 4.5.1 we present an example definition motivated by object recognition scenarios, but we emphasise that the presented solution algorithm is not specific to this example definition.

We are also particularly interested in formulations for online tasks, where the robots should aim to observe known objectives as well as discover currently unknown objectives. This balance can be achieved using our formulation by introducing new nodes to the set N that represent regions where the robots may be expected to discover new objectives. We formulate this concept further in Section 4.6.

4.2.3 Problem statement

The optimisation problem is to plan the locations of waypoints for each robot and the sequence the waypoints are visited Xi, such that all budget constraints are met and the sum of the observation rewards for the nodes is maximised. More formally, we state the problem addressed in this chapter as follows.

Problem 4.1 (Generalised orienteering problem). Find the set of paths {Xi_} _that maximises P

nk_∈N

okwk and satisfies the constraints ci ≤ bi_{, ∀r}i _{∈ R}_.

We are interested in solving Problem 4.1 by replanning in an online setting. The robots initially plan based on the information available offline. After each action is executed, an observation may result in a change of the nodes, viewpoint regions or rewards. We assume these changes are small and therefore the planner should efficiently adapt its previous solution to address the new objectives.

4.2.4 NP-hardness

The problem is NP-hard and a reduction from the orienteering problem (Vansteenwe- gen et al., 2011) with Euclidean costs can readily be shown by setting the number of robots R to 1 and the viewpoint sets {Zk_}_{as singleton. This result motivates the de-} velopment of a heuristic algorithm to approximately solve the problem in polynomial time.

In document Planning Algorithms for Multi-Robot Active Perception (Page 116-119)