Simultaneous-Location And Mapping (SLAM) - Visual Positioning Systems

2.4 Visual Positioning Systems

2.4.2 Simultaneous-Location And Mapping (SLAM)

SLAM is a relative positioning method that is designed to be used in completely unknown environments. It is in many ways considered the holy grail of robotic positioning and navigation due to its ability to accurately determine a full, relative state estimate for the vehicle with minimal drift[30]. Further, it can be implemented on a wide range of sensors, from basic ultra-sonic range finders[31] to cameras (known as Visual SLAM)[32] and LIDAR[33]. Finally, the approach provides not only positioning data but also a map of the environment that can be used for path planning, obstacle avoidance and other mission tasks.

The SLAM process has been discussed since the 1986, with Smith and Cheeseman’s paper[34] that developed a method for representing spatial uncertainty. SLAM has since been thoroughly reviewed and will thus only be discussed briefly in the literature review. Fundamentally, SLAM relies on the use of landmarks, specific features that can be detected in the scene and subsequently found and associated in later updates.

The first step of SLAM, location, uses these features to determine the vehicle pose in relation to the landmarks. If one momentarily, and naively, assumes that the landmark detection sensor is perfect then the relative movement of the features from one frame to the next is the inverse of the vehicle’s movement. Therefore, it is clear that one can use the movement of landmarks to determine the pose of the vehicle independently of the vehicle’s own sensors and therefore correct the internal navigation system.

The second step of SLAM, mapping, takes the opposite approach where the movement of the vehicle is assumed to be known with perfect certainty but the world is unknown. A mapping update is carried out by requesting a measurement from the landmark sensor. The vehicle then moves a known distance and attempts to detect the same landmarks. Due to uncertainties in the sensing process the detected landmarks will not appear in the same position but data association and fusion processes can be used to refine the location of features and gradually expand the map with new features.

be carried out independently due to the assumptions made in each one. In fact, the two processes are co-dependent; the output from one process feeds directly into the other and can potentially be used to iteratively refine the results of the other. However, SLAM goes one step further by carrying out the two steps simultaneously using a Kalman filter. This allows the designer of a SLAM system to model the vehicle, sensors and other uncertainties and have the Kalman filter[35] gradually refine its estimates of the feature locations as well as the vehicle’s states[36].

The Kalman filter provides an excellent solution to the SLAM problem but it also suffers from significant performance problems since each known feature in the world must be observed as a state in the filter. Thus, as more of the world is explored, the number of states increases and the number of calculation required for each update increases as a cubic function of landmarks. Since the Kalman filter relies on the inversion of a square matrix that consists of each feature the number of calculations increase non-linearly with feature count, causing the filter to rapidly increase in time to update.

There are various techniques to avoid this computational load. For example, the system might represent a history of states rather than explicit landmarks (known as delayed state SLAM [37]). This has the benefit that earlier frames can be per- manently commited and marginalised out of the filter. This manages the size of the filter but causes drift in the long term. Another alternative is to use an information filter information filters are mathematically identical to Kalman filters but use a canonical form to represent the state and covariances instead of the moment form used in the Kalman filter. This gives an information matrix, which is the inverse of the covariance matrix, and an information vector, information matrix multiplied by the state vector) and simplified maths. However, one of the primary computational benefits of the information filter is that the information matrix only needs to be inverted once for each update (during the prediction step), while Kalman filters require an inversion for each landmark that has been observed by the sensor during the correction step.

Finally, SLAM often uses bundle adjustment methods to carry out loop closures[20], allowing the system to correct itself. Bundle adjustments are data association methods that are carried out when the system revisits an area which has previously been

visited. Re-visiting an area allows the SLAM system to correct itself for drift by effectively recognizing well known features and improving the estimated states of system. While bundle adjustment is very useful to improve accuracy for systems that often traverse similar regions, such as a vehicle driving around a city[38], it is a very expensive data association method. A bundle adjustment is carried out by aligning the most recently observed features with all (or a subset of) the previous observation and finding the pose where the error between the two is minimised. This is a very computationally expensive methods, and it requires the SLAM system to maintain the landmark coordinates and covariances for the entire map.

For this reason bundle adjustments are often avoided. In a practical sense, a UAV flying a sortie is unlikely to pass over the same regions very often, which can lead to a very high computational overhead as bundle adjustments need to be attempted over a larger history of observed landmarks.

In document Vision-based Navigation Using Landmark Recognition for Unmanned Aerial Vehicles (Page 49-51)