3.12 Data integration and fusion

3.13.1 Examples based on fusion of RF and image/video data

Most existing localisation methods are based on a single modality. To the best of our knowledge, even in other application domains only a few techniques fuse RF and image sensing. One previous work [47] showed a proof of concept of how WLAN received signal strengths (RSSs) and image matching data could be fused to perform coarse localisation over a small number of locations. Histogram similarity was used for RF-based localisation and a hierarchical vocabulary tree for image-based localisation [47]. A simple fusion function was derived to exploit the strengths of both approaches. Since RSSI values cannot easily differentiate nearby locations, image data is applied to the remaining candidate locations; if the image data fails, the user's motion priors restrict the location detection. The average mean precision of the image-based, RF-based and fused localisation approaches was 82.09%, 77.42% and 88.26%, respectively. The distance error was 3.24 and 2.02 meters for the image-based and WLAN-based localisation methods, respectively.
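The coarse-to-fine idea described for [47] can be illustrated with a minimal sketch. This is not the fusion function of [47]: the use of histogram intersection as the similarity measure, the function names `histogram_intersection` and `fuse_location`, and the thresholds `rf_margin` and `image_threshold` are all assumptions made for illustration.

```python
def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalised RSS histograms (assumed measure)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def fuse_location(rf_scores, image_scores, prior_candidates,
                  rf_margin=0.1, image_threshold=0.5):
    """Hypothetical cascade: RF shortlists, image disambiguates, motion prior is the fallback.

    rf_scores / image_scores map a location label to a similarity score;
    prior_candidates is the set of locations reachable under the motion prior.
    """
    # Step 1: RF keeps every location whose score is close to the best one,
    # because RSS alone cannot reliably separate nearby locations.
    best_rf = max(rf_scores.values())
    shortlist = [loc for loc, s in rf_scores.items() if best_rf - s <= rf_margin]
    if len(shortlist) == 1:
        return shortlist[0]

    # Step 2: image matching decides among the remaining locations.
    best_img = max(shortlist, key=lambda loc: image_scores.get(loc, 0.0))
    if image_scores.get(best_img, 0.0) >= image_threshold:
        return best_img

    # Step 3: image matching failed, so the motion prior restricts the choice.
    reachable = [loc for loc in shortlist if loc in prior_candidates]
    pool = reachable or shortlist
    return max(pool, key=lambda loc: rf_scores[loc])
```

For example, with `rf_scores = {"A": 0.80, "B": 0.78, "C": 0.30}` the RF stage cannot separate A from B, so a confident image match for B selects B, while a weak image match falls back to whichever of the two the motion prior allows.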

As a sequel to that work, and as the basis of this thesis, a more precise WLAN-based algorithm [140], a different vocabulary tree concept for image-based localisation and a novel, more complex and more effective fusion function were developed. The approach is verified on a much larger and more challenging dataset.

Multiple modalities have also been used in the complementary but related challenge of tracking specific objects. In [67] the authors discuss an approach for actively tracking humans in crowds using robots. The system consists of a 360° RFID system and a video camera mounted on a remote directional and zoom control unit. The authors developed a tracking algorithm based on a multisensor control strategy that uses RFID data. A particle filtering method fuses the heterogeneous data to make the tracking more robust. Tracker outputs and RF data serve as the basis for the multisensor control platform (the RF tracking system is shown in Figure 3.9). On a fixed dataset, the average error in robot tracking was 0.8 meters.

Figure 3.9: Eight antennae addressed by an RF multiplexing prototype [67]

Other related work describes an approach for object tracking using a different particle filtering model [124]. It combines a camera-based method that relies on color features of the target with a WiFi-based localisation system, and merges video and WiFi data to obtain position and track targets. Owing to the performance of WiFi and its RSSI characteristics, the method can be used in both outdoor and indoor spaces. To track targets seamlessly, a particle filter merges the two sensors, with a WiFi observation model incorporated into a video particle filtering approach to determine the weight of every particle. The particle filtering observation model therefore consists of two parts: a video-based model that uses color features of the target, and an approximate location system based on the RSSI of WiFi signals that access points transmit to PDAs. The authors show that the fusion outperforms either modality used separately and remains useful when one of the modalities fails. Compared with ground-truth data, the method showed a maximal error distance of 18 meters. The reported precision was 68.63%.
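A two-part observation model of this kind can be sketched as a product of per-particle likelihoods. The sketch below is an illustration under assumptions, not the model of [124]: the color term is taken as a precomputed per-particle similarity score, the WiFi term is assumed to be a Gaussian around a coarse WiFi position fix, and the names `fused_particle_weights`, `weighted_mean` and `sigma_wifi` are hypothetical.

```python
import math


def fused_particle_weights(particles, color_scores, wifi_fix, sigma_wifi=5.0):
    """Weight each (x, y) particle by the product of the available observation models.

    color_scores: per-particle color similarity in [0, 1], or None if video fails.
    wifi_fix: coarse (x, y) WiFi position estimate, or None if WiFi fails.
    """
    weights = []
    for i, (x, y) in enumerate(particles):
        w = 1.0
        if color_scores is not None:
            # Video part: color similarity of the image patch at this particle.
            w *= color_scores[i]
        if wifi_fix is not None:
            # WiFi part: assumed Gaussian likelihood around the coarse RSSI fix.
            d = math.hypot(x - wifi_fix[0], y - wifi_fix[1])
            w *= math.exp(-0.5 * (d / sigma_wifi) ** 2)
        weights.append(w)
    total = sum(weights)
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in weights]


def weighted_mean(particles, weights):
    """Position estimate as the weighted mean of the particle set."""
    x = sum(w * p[0] for p, w in zip(particles, weights))
    y = sum(w * p[1] for p, w in zip(particles, weights))
    return x, y
```

Because a missing modality simply drops out of the product, the same weighting routine keeps working when either the video or the WiFi observation is unavailable, which matches the robustness argument made for the fused tracker.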

One work fuses three different sensors covering three independent, complementary modalities [181]. The system consists of an inertial sensor, a positional sensor and a visual sensor. Visual information is provided by a video camera, acceleration is acquired using an accelerometer, and position information is obtained using an 802.11g receiver. These sensors are low cost and widely available. This ubiquitous platform represents a typical example of a context-aware device that is able to give a real feel of the user's environment through information sent to the user. To obtain a reliable position and achieve the smallest error distance, the three sensors are fused using a discrete Kalman filter. Given a correct initial estimate of the system's position, the average accuracy error does not exceed 8.26 meters.
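The predict/update cycle of a discrete Kalman filter can be sketched in a reduced setting. The example below is a simplification under stated assumptions and not the filter of [181]: it tracks a single coordinate, uses the accelerometer reading as a control input and a WiFi-derived position as the measurement, and leaves out the visual sensor entirely. The class name `PositionKalmanFilter` and the noise values `q` and `r` are hypothetical.

```python
class PositionKalmanFilter:
    """1-D discrete Kalman filter with state [position, velocity] (illustrative only)."""

    def __init__(self, position=0.0, velocity=0.0,
                 p_var=100.0, v_var=1.0, q=0.1, r=4.0):
        self.p = position
        self.v = velocity
        self.P = [[p_var, 0.0], [0.0, v_var]]  # state covariance
        self.q = q  # assumed process noise variance (added to the diagonal)
        self.r = r  # assumed WiFi position measurement variance

    def predict(self, accel, dt):
        """Propagate the state using the accelerometer reading as control input."""
        self.p += self.v * dt + 0.5 * accel * dt * dt
        self.v += accel * dt
        (p00, p01), (p10, p11) = self.P
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]]
        n00 = p00 + dt * (p10 + p01) + dt * dt * p11 + self.q
        n01 = p01 + dt * p11
        n10 = p10 + dt * p11
        n11 = p11 + self.q
        self.P = [[n00, n01], [n10, n11]]

    def update(self, measured_position):
        """Correct the state with a WiFi position measurement (H = [1, 0])."""
        (p00, p01), (p10, p11) = self.P
        s = p00 + self.r              # innovation variance
        k0, k1 = p00 / s, p10 / s     # Kalman gain
        residual = measured_position - self.p
        self.p += k0 * residual
        self.v += k1 * residual
        # P <- (I - K H) P
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]
```

A typical loop calls `predict(accel, dt)` at every accelerometer sample and `update(position)` whenever a WiFi fix arrives, so the high-rate inertial data bridges the gaps between slower, noisier position measurements.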

Another paper proposes an algorithm that fuses WiFi and video camera data for indoor localisation [164]. The algorithm differs from other solutions in that it fuses the sensor data in the measurement model, before an estimated position is calculated from the individual technologies. The purpose of fusing WiFi and video data is to achieve a smaller localisation error than WiFi alone in rooms where there is a camera, while still offering room-level localisation where there are no cameras. Data measured by the sensors is sent to a data aggregator, which stores the incoming sensor data and selects which measurement models to use: a WiFi measurement model, an image measurement model, or both. The sensor data is then sent to a fusion engine, where the particle filter algorithm is applied. The fused approach achieves an error distance less than or equal to 2 meters 67% of the time when a user walks around the test area without interference, and less than or equal to 4.3 meters 87% of the time with interference.
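Fusing in the measurement model, with the models chosen by which sensor data is available, can be sketched as follows. This is an illustrative reading and not the implementation of [164]: the room-level WiFi model with hit probability `p_room`, the Gaussian camera model with `sigma_cam`, the rectangular room layout, and the function names are all assumptions.

```python
import math
import random


def room_of(point, rooms):
    """Return the name of the room containing point, or None.

    rooms maps a name to an axis-aligned box (x0, y0, x1, y1).
    """
    x, y = point
    for name, (x0, y0, x1, y1) in rooms.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return name
    return None


def likelihood(particle, wifi_room, camera_xy, rooms, p_room=0.9, sigma_cam=0.5):
    """Joint measurement likelihood built only from the models that have data."""
    like = 1.0
    if wifi_room is not None:
        # WiFi measurement model (assumed): room-level agreement.
        like *= p_room if room_of(particle, rooms) == wifi_room else 1.0 - p_room
    if camera_xy is not None:
        # Image measurement model (assumed): Gaussian around the camera detection.
        d = math.dist(particle, camera_xy)
        like *= math.exp(-0.5 * (d / sigma_cam) ** 2)
    return like


def resample(particles, wifi_room, camera_xy, rooms, rng):
    """Resample particles in proportion to the fused measurement likelihood."""
    weights = [likelihood(p, wifi_room, camera_xy, rooms) for p in particles]
    if sum(weights) == 0.0:
        return list(particles)
    return rng.choices(particles, weights=weights, k=len(particles))
```

With only a WiFi reading the particles are merely concentrated in the reported room, which corresponds to room-level localisation; adding a camera detection sharpens the same likelihood to a small region inside that room.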

The work in [133] presents a unified approach for a camera tracking system based on an error-state Kalman filter algorithm. The filter uses relative (local) measurements obtained from image-based moving sensors to estimate the change in position over time, as well as global measurements produced by landmark matching against a pre-built visual database and range measurements obtained from RF ranging radios. The results are demonstrated by using the camera poses output by the system to render views from a 3D graphical model built upon the same coordinate system as the landmark database. The localisation distance error did not go below 2.46 meters, with a precision of 73.364%.

In document Dual-sensor fusion for seamless indoor user localisation and tracking (Page 70-73)