Navigation - Incremental Evolutionary Methods for Automatic Programming of Robot Controllers

The spatial characteristics of an agent and its environment influence the strategy for selecting and performing actions in order to move around the environment and achieve the agent’s goals: the agent navigates in its environment. These strategies, or navigation algorithms, form a separate research subarea. They range from simple maze-exploration strategies, such as wall-following and the left/right-hand rule, to complex stochastic strategies intertwined with map-building, localization, and exploration tasks.

A navigation strategy is deterministic when the agent always chooses the same action in the same situation, and stochastic when the agent’s actions are chosen randomly (or at least include some degree of randomness).

Probably the simplest stochastic strategy is random movement, used for environment exploration or area coverage. The robot moves some distance along a straight line and then turns randomly; at the area boundaries and at obstacles it bounces off or turns randomly. A nice example is one of the first autonomous lawn mower robots, built by Husqvarna [Hicks II and Hall, 2000], which moves randomly on a lawn surrounded by an inductive wire buried a few centimeters under the ground. Such behavior results in virtually the whole lawn, whatever its shape, being mowed without the need for a specific deterministic strategy. The cost of such a solution is lower efficiency. However, since the robot is powered by solar panels, this becomes a less important issue, and (as feedback from customers suggests) the random movement gives the robot some entertainment value.
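The random-movement strategy can be illustrated with a minimal simulation. The sketch below is not a model of the Husqvarna mower: the grid world, the eight driving directions, the `turn_prob` parameter, and the function name `random_bounce_coverage` are all assumptions made for illustration. It only shows how straight runs combined with random turns gradually cover a bounded area.

```python
import random

# Eight compass directions the simulated robot can drive in (an assumption of
# this sketch; a real robot turns by arbitrary angles).
DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1),
              (1, 1), (1, -1), (-1, 1), (-1, -1)]


def random_bounce_coverage(width, height, steps, turn_prob=0.05, seed=0):
    """Return the fraction of grid cells visited after `steps` time steps."""
    rng = random.Random(seed)          # seeded so that runs are repeatable
    x, y = width // 2, height // 2     # start in the middle of the area
    dx, dy = rng.choice(DIRECTIONS)
    visited = {(x, y)}
    for _ in range(steps):
        # Occasionally end the current straight run with a random turn.
        if rng.random() < turn_prob:
            dx, dy = rng.choice(DIRECTIONS)
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            x, y = nx, ny              # keep driving straight
            visited.add((x, y))
        else:
            # Boundary reached: stay in place and pick a new random heading.
            dx, dy = rng.choice(DIRECTIONS)
    return len(visited) / (width * height)
```

Calling `random_bounce_coverage(20, 20, steps)` for increasing values of `steps` shows the coverage approaching 1 only slowly, which reflects the lower efficiency mentioned above.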

A simple deterministic strategy for locating a target at an unknown location is depth-first search. If the location of the target and the map of the environment are known, a simple shortest-path algorithm can be used.
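Both cases can be sketched on a grid map. The representation below is an assumption of this example (a list of strings in which `#` marks an obstacle, with 4-connected moves), as are the function names. Breadth-first search stands in for "a simple shortest-path algorithm", which is valid here because every move has the same cost.

```python
from collections import deque


def neighbors(grid, cell):
    """Yield the free 4-connected neighbors of `cell` (row, column)."""
    r, c = cell
    for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] != "#":
            yield (nr, nc)


def dfs_find(grid, start, is_target):
    """Depth-first exploration; return the first target cell found, or None."""
    stack, seen = [start], set()
    while stack:
        cell = stack.pop()
        if cell in seen:
            continue
        seen.add(cell)
        if is_target(cell):            # the target is recognized only on arrival
            return cell
        for nxt in neighbors(grid, cell):
            if nxt not in seen:
                stack.append(nxt)
    return None


def shortest_path(grid, start, goal):
    """Breadth-first search; return a shortest list of cells, or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:    # walk back along the parent links
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        for nxt in neighbors(grid, cell):
            if nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None
```

Note that `dfs_find` needs only a local test for the target, whereas `shortest_path` requires the goal location and the whole map in advance.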

An interesting class of navigation algorithms deals with avoiding obstacles and constructing a smooth trajectory for a robot without complicated equations. In a 2D environment, a potential-field map is constructed. Each obstacle is a source of a repulsive force vector, whereas the goal is a source of an attractive force. Composing the force vectors at each point yields the direction in which the robot should move at that point. Increasing the repulsive force close to the obstacles ensures that they are avoided, while the attractive force of the target draws the robot towards the goal. An example of such a potential-field map is shown in Figure 2.6.
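The composition of forces can be sketched as follows. The particular force laws are assumptions of this example (a linear attraction and a commonly used textbook form of repulsion that acts only within an `influence` radius), as are the gain parameters `k_att` and `k_rep`. The sketch is not the tool used to generate Figure 2.6.

```python
import math


def potential_field_direction(pos, goal, obstacles,
                              k_att=1.0, k_rep=1.0, influence=2.0):
    """Return the unit vector of the combined force acting at `pos`.

    `pos` and `goal` are (x, y) tuples; `obstacles` is a list of (x, y) points.
    """
    # Attractive force: proportional to the vector pointing at the goal.
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < influence:
            # Repulsive force: zero at the influence radius and growing
            # without bound as the robot approaches the obstacle.
            magnitude = k_rep * (1.0 / d - 1.0 / influence) / d ** 2
            fx += magnitude * dx / d
            fy += magnitude * dy / d
    norm = math.hypot(fx, fy)
    if norm == 0.0:
        return (0.0, 0.0)   # forces cancel: the goal itself or a local minimum
    return (fx / norm, fy / norm)
```

A robot that repeatedly takes a small step in the returned direction follows the vector field. The zero-vector case is where such a robot would stall.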

Figure 2.6: A motor schema for a 2D environment with 4 obstacles, generated according to [Arkin, 1998] using [URL - Schemas]. The robot follows the direction of the vectors in the vector field, which is a composition of the attractive force towards the target and the repulsive forces from the obstacles. Motor schemas are not immune to local minima and cyclic behavior: there are locations where the robot can stall at one point, or even areas which may lead to such points.

Landmarks play a crucial role in most higher-level navigation algorithms. Landmarks (according to [Nehmzow, 2000]) are objects or signs that should be

• recognizable under different light conditions, viewing angles, etc.;

• either stationary throughout the period of navigation, or moving in a way that is known to the navigation mechanism.

The appearance of a landmark should preferably provide some unique navigational information (at least when combined with other sources of navigational information). For instance, the same kind of post on top of every hill carries no information, while a uniquely shaped TV tower provides a useful landmark.

In addition to local landmarks found at various locations in the environment, global landmarks, such as the Sun, stars, or stationary satellites, are very useful, and biological organisms benefit from most of them.

Markov Decision Problems (MDPs) provide an important theoretical basis for a class of navigation and planning algorithms. MDPs are extended finite-state automata in which the transitions between states occur with certain probabilities, under the assumption that the transition probabilities in each state depend only on that state (the Markov assumption). Such a stochastic model allows for modeling the environment, the sensory readings, and the outcome of actuator actions when these are not deterministic. States correspond to locations in the environment, represented as a grid-based or topological map. Alternatively, states can correspond to the states of a dynamic environment, to task completion progress, or to the planning strategy states of an agent (for instance, when modeling the behavior of an animal). In some of the states, the agent can receive a positive or negative reward. The problem is to find a good policy for traversing the state automaton so that the reward is achieved with the highest probability. MDPs are thus closely related to the field of Reinforcement Learning, a method for learning an action-selection policy that achieves the agent’s goal.
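One standard way to compute such a policy is value iteration, sketched below. Note the substitution: the sketch maximizes the expected discounted sum of rewards, which is the usual textbook objective, rather than the probability of reaching a reward. The dictionary layout, the discount factor `gamma`, and the state and action names in the usage note are assumptions of this example.

```python
def value_iteration(transitions, rewards, gamma=0.9, tol=1e-9):
    """Compute state values and a greedy policy for a small MDP.

    transitions[s][a] is a list of (probability, next_state) pairs;
    states with an empty action dictionary are terminal.
    rewards[s] is the reward received in state s (0.0 if absent).
    """
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if actions:
                # Expected value of the best action available in state s.
                best = max(sum(p * values[s2] for p, s2 in outcomes)
                           for outcomes in actions.values())
            else:
                best = 0.0
            new_value = rewards.get(s, 0.0) + gamma * best
            delta = max(delta, abs(new_value - values[s]))
            values[s] = new_value
        if delta < tol:                # values have stopped changing
            break
    # The policy picks, in every non-terminal state, the best-valued action.
    policy = {
        s: max(actions, key=lambda a: sum(p * values[s2] for p, s2 in actions[a]))
        for s, actions in transitions.items() if actions
    }
    return values, policy
```

For example, with a hypothetical state `start` offering a `risky` action (80% chance of reaching `goal`, 20% chance of reaching a penalized state `pit`) and a `safe` two-step detour through `mid`, the computed policy selects the action with the higher expected value.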

Navigation algorithms often use sensors to obtain feedback about the robot’s movements (this is referred to as local navigation in the literature). For instance, rotation sensors can measure how fast the wheels spin, which provides the input for odometry. Using dead reckoning, the agent estimates its location from its own measurements of the wheel revolutions. This information can alternatively be obtained, or supplemented, using distance sensors, a compass, landmark detection, or vision.
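A dead-reckoning update can be sketched for one common case. The text above does not fix a drive geometry, so the differential-drive model (two independently driven wheels on a common axle), the midpoint approximation, and the parameter names `wheel_radius` and `axle_length` are assumptions of this example.

```python
import math


def dead_reckoning_step(pose, left_revs, right_revs, wheel_radius, axle_length):
    """Update the pose (x, y, heading in radians) of a differential-drive robot.

    `left_revs` and `right_revs` are the wheel revolutions measured since the
    previous update (negative when a wheel turns backwards).
    """
    x, y, theta = pose
    circumference = 2.0 * math.pi * wheel_radius
    d_left = circumference * left_revs
    d_right = circumference * right_revs
    d_center = (d_left + d_right) / 2.0          # distance traveled by the axle center
    d_theta = (d_right - d_left) / axle_length   # change of heading
    # Midpoint approximation: use the average heading during the step.
    x += d_center * math.cos(theta + d_theta / 2.0)
    y += d_center * math.sin(theta + d_theta / 2.0)
    return (x, y, theta + d_theta)
```

Because each update builds on the previous estimate, measurement errors such as wheel slip accumulate over time, which is why the estimate is usually supported by the other sensors listed above.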

Once the robot knows how far it has traveled, it can try to locate itself within a map of the environment, follow such a map, or even construct one (this is referred to as global navigation in the literature). An example is the global navigation algorithm used by the robot Xavier [Koenig and Simmons, 1998] for pose estimation in an office environment, which is based on the theory of Partially Observable Markov Decision Problems (POMDPs). The environment is divided into locations (states), and at each time step the robot resides at each location with a certain probability. Given the sensor and motion reports and the directive applied to the actuators, the probability of being at each location at the next discrete step is computed from the prior probabilities and a learned model of the environment.
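The probability update described above can be sketched as a discrete Bayes filter. This is a generic illustration, not the implementation used on Xavier. The list-based belief, the `motion_model` matrix, and the `likelihood` vector are assumptions of this example and would in practice come from the learned model of the environment.

```python
def predict(belief, motion_model):
    """Propagate the belief through one motion step.

    belief[i] is the probability of being at location i;
    motion_model[i][j] is P(next location j | current location i) for the
    directive that was applied to the actuators.
    """
    n = len(belief)
    return [sum(belief[i] * motion_model[i][j] for i in range(n))
            for j in range(n)]


def correct(belief, likelihood):
    """Weight the belief by the sensor report and renormalize.

    likelihood[j] is P(sensor report | location j).
    """
    unnormalized = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("sensor report is impossible under the current belief")
    return [u / total for u in unnormalized]
```

One localization step is then `belief = correct(predict(belief, motion_model), likelihood)`: the motion step spreads the probability mass according to the uncertainty of the movement, and the sensor report concentrates it again.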

In real robot implementations, navigation usually combines multiple sensory inputs (sensor fusion). For example, in [Thrun et al., 1998], the output from sonar sensors, which detect the presence of obstacles, is supported by scene analysis from stereo vision. Thrun et al. demonstrate how the sonar sensors alone tend to overlook objects that absorb sound, while the vision system alone misses obstacles that are not distinguished by their optical properties, such as glass doors or white walls.
