Real Data Analysis - Kalman Filter - Computationally Efficient Vision based Robot Control

3.4 Kalman Filter

3.4.4 Real Data Analysis

With the parameters and additional techniques correctly implemented in the simulation environment, the next step was to process real data to fully validate the design. The following setups were used to assess how the filter performs in practice:

• Static setup: robot is always static, target is static;

• Dynamic setup: robot follows a predefined path while the target moves and is visible around 90% of the time.

A total of eight (8) videos were recorded, four (4) for each setup. Frames were extracted from all videos and manually marked for the static setup only, because this procedure is too burdensome for the other cases. The neural network was then fed with the frames (320x240 resolution) and its outputs were logged. For the dynamic setup, data from the robotic system (ego-motion) was also merged with the output of the neural network. The combined data was then processed in MATLAB with the full Kalman filter implementation, and the position and speed estimates were plotted for each video, alongside the residual for the static setup only (it requires the real coordinates of the target). Finally, a separate video was composed and marked with both the raw (neural network) input and the (Kalman) filtered output in order to better observe the results: raw inputs, filtered outputs, and prediction outputs are marked with green, red, and magenta circles, respectively. The prediction outputs correspond to the prediction of the filter when the neural network outputs a false negative. Note that the neural network timing was respected (videos were sub-sampled from 60 fps to 20 fps, i.e. 20 Hz), as was the refresh rate of the ego-motion data (140 Hz), for both setups.
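The timing described above can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB pipeline used for the analysis; the function names are hypothetical, and running the prediction before the update when both events fall on the same instant is an assumption, not something stated in the text.

```python
from fractions import Fraction

VIDEO_FPS = 60      # recording rate stated in the text
NN_RATE_HZ = 20     # neural network rate stated in the text
EGO_RATE_HZ = 140   # ego-motion refresh rate stated in the text


def subsample_indices(n_frames, src_fps=VIDEO_FPS, dst_fps=NN_RATE_HZ):
    """Indices of the frames kept when going from src_fps to dst_fps."""
    step = src_fps // dst_fps  # 60 / 20 -> keep every 3rd frame
    return list(range(0, n_frames, step))


def build_schedule(duration_s):
    """Time-ordered list of (time, kind) events for one video.

    'ego' events drive the prediction step, 'nn' events the update step.
    Assumption: on a tie, 'ego' is placed first so that the prediction
    runs before the update.
    """
    events = []
    for k in range(int(duration_s * EGO_RATE_HZ)):
        events.append((Fraction(k, EGO_RATE_HZ), 0, "ego"))
    for k in range(int(duration_s * NN_RATE_HZ)):
        events.append((Fraction(k, NN_RATE_HZ), 1, "nn"))
    events.sort()  # sorts by time, then by the tie-break priority
    return [(t, kind) for t, _, kind in events]
```

Exact fractions are used for the timestamps so that the 140 Hz and 20 Hz streams line up without floating-point drift; every seventh ego-motion sample coincides with a neural network output.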

Static setup

For the static setup, only data from the neural network is relevant: because the robotic system is not moving, the ego-motion values are zero (0). In this case, the Pre-Kalman filter module always outputs pixel speeds equal to zero (0) as well, and therefore does not impact the filter. Note, however, that the prediction step is still executed at 140 Hz, because it depends on this data. Only the best two (2) results (out of 4) are presented in this section, with a sample frame extracted from each case shown in Figure 66. Both frames are marked with the raw input (green circle) and the filtered output (red circle).
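The multi-rate behavior described above (prediction at 140 Hz, update at 20 Hz, zero control input) can be illustrated with a minimal single-axis sketch. It assumes a constant-velocity state (position, speed) and a position-only measurement; the noise values `q` and `r`, the initial covariance, and the function names are illustrative assumptions, not the parameters of the actual MATLAB implementation. The fading memory and adaptive techniques are left out.

```python
EGO_RATE_HZ = 140   # prediction rate stated in the text
NN_RATE_HZ = 20     # measurement (update) rate stated in the text


def predict(x, P, dt, q):
    """Constant-velocity prediction for one axis; x = (position, speed)."""
    p, v = x
    a, b, c = P  # covariance entries: var(p), cov(p, v), var(v)
    x_new = (p + v * dt, v)
    # F P F^T with F = [[1, dt], [0, 1]], plus white-acceleration noise
    a_new = a + 2 * dt * b + dt * dt * c + q * dt ** 4 / 4
    b_new = b + dt * c + q * dt ** 3 / 2
    c_new = c + q * dt ** 2
    return x_new, (a_new, b_new, c_new)


def update(x, P, z, r):
    """Position-only measurement update (H = [1, 0])."""
    p, v = x
    a, b, c = P
    s = a + r                  # innovation variance
    k_p, k_v = a / s, b / s    # Kalman gain
    y = z - p                  # innovation
    x_new = (p + k_p * y, v + k_v * y)
    P_new = ((1 - k_p) * a, (1 - k_p) * b, c - k_v * b)
    return x_new, P_new


def run_static(z=100.0, seconds=3.0, q=1.0, r=4.0):
    """Static target at pixel z; all numeric values are hypothetical."""
    x, P = (0.0, 0.0), (1e4, 0.0, 1e4)   # vague initial guess
    dt = 1.0 / EGO_RATE_HZ
    ratio = EGO_RATE_HZ // NN_RATE_HZ    # 7 predictions per measurement
    for k in range(int(seconds * EGO_RATE_HZ)):
        x, P = predict(x, P, dt, q)
        if (k + 1) % ratio == 0:
            x, P = update(x, P, z, r)
    return x, P
```

With a constant measurement, the position estimate in this sketch settles on the measured value and the speed estimate decays toward zero, which is the qualitative behavior expected for a static target.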

Figure 66. Marked sample frames extracted from output videos

The position and speed estimates for both cases and all iterations are shown in Figure 67 and Figure 68 below:

Figure 68. Speed estimate for both static cases

In the graphs above, the Y coordinates were negated in order to reflect the actual pixel coordinates on the image, and the estimates (filled circles) are chronologically ordered with a gradient from blue to yellow (Figure 67). Note that the neural network output does not deviate much, and is completely stable for the case in Figure 67b, so the final speed estimates converge to zero. Moreover, both cases are rather similar, and the estimates converge to the measurements (open circles) after approximately 1 second (20 complete iterations).

Because this setup is static, and consequently so is the target's position, the residual can be analyzed further:

Figure 69. Position residuals for both static cases

The residuals of both cases are within ±1𝜎 after initialization, with the Y coordinate slightly outperforming X. The filter not only converges but also behaves exceptionally well for this setup. This setup validates the previously discussed design of the Kalman filter, although the control input is set to zero (0) and the fading memory technique is not properly tested in these cases, because the target is static. Nevertheless, the adaptive filtering was applied in the simulations presented and does not degrade the performance of the filter.
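A ±1𝜎 check of this kind can be sketched as below. The sketch assumes the residual is defined as the estimate minus the manually marked (true) coordinate and that 𝜎 is the square root of the filter's position variance; the function name and the `skip` parameter (iterations discarded as initialization) are hypothetical.

```python
import math


def residual_report(estimates, variances, truth, skip=0):
    """Residuals, 1-sigma bounds, and fraction of samples inside the bounds.

    estimates: filtered positions for one coordinate (px)
    variances: matching position variances from the filter covariance (px^2)
    truth:     manually marked coordinate of the static target (px)
    skip:      number of initial iterations to ignore (initialization)
    """
    residuals = [e - truth for e in estimates]   # assumed sign convention
    sigmas = [math.sqrt(v) for v in variances]
    inside = [abs(r) <= s for r, s in zip(residuals[skip:], sigmas[skip:])]
    return residuals, sigmas, sum(inside) / len(inside)
```

For example, `residual_report([90, 99, 100.5, 100.2], [4, 4, 1, 1], truth=100, skip=1)` ignores the first sample and reports that all remaining residuals lie within their 1𝜎 bounds.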

Dynamic setup

Unlike the previous case, the dynamic setup requires the ego-motion and neural network data to be merged, because both robot and target are moving; thus, the Pre-Kalman filter module is included in the simulation. The robot follows a predefined route, and the target is moved around manually with a string attached to its front. Moreover, in these cases the neural network might not be able to identify and localize the target, due to its accuracy or to occlusions that might last a few frames. Such cases happen frequently in reality and must be treated accordingly in the final implementation. For now, every time they occur, the update step is not executed (i.e. no measurement is available), but the respective output from the prediction step is stored. The final implementation will handle these cases in the same way, but only for a few frames (i.e. measurements); beyond that, the filter is reset.
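The handling of missed detections described above can be sketched as a small decision routine. The threshold `max_misses` and the action labels are hypothetical; the text only states that the prediction is kept "for a few frames" before the filter is reset.

```python
def plan_actions(detections, max_misses=3):
    """Return one action per neural-network cycle.

    detections: sequence of measurements, with None for a false negative
                or an occluded target.
    Actions:
      'init'         - first measurement after start or reset
      'update'       - measurement available, update step executed
      'predict_only' - no measurement, prediction output is stored
      'reset'        - too many consecutive misses, filter is reset
      'idle'         - filter not tracking and no measurement available
    """
    actions = []
    tracking = False
    misses = 0
    for z in detections:
        if z is not None:
            actions.append("update" if tracking else "init")
            tracking = True
            misses = 0
        elif not tracking:
            actions.append("idle")
        else:
            misses += 1
            if misses <= max_misses:
                actions.append("predict_only")
            else:
                actions.append("reset")
                tracking = False
                misses = 0
    return actions
```

In the real data analysis presented here, only the `update` and `predict_only` branches apply, since the reset is reserved for the final implementation.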

Only the best (and clearest) result (out of 4) is presented for this case, with sample frames extracted from the same scenario shown in Figure 70. Figure 70b is marked with the raw input (green circle) and the filtered output (red circle), while Figure 70a is marked with the prediction output (magenta circle).

Figure 70. Marked sample frames extracted from output video

Although only one video is processed, an additional test was performed to further explore the impact of the Pre-Kalman filter module and the response to the ego-motion. The filter was run both with (Figure 71a) and without (Figure 71b) the ego-motion data, and the respective position and speed estimates for each case are shown below:

Figure 71. Position estimate for dynamic setup, considering (a) or not (b) the ego-motion

Figure 72. Speed estimate for dynamic setup, considering (a) or not (b) the ego-motion

Once more, the Y coordinates were negated to reflect the actual pixel coordinates on the image, and the estimates (filled circles) are chronologically ordered with a gradient from blue to yellow. Unlike in the static setup, the neural network output deviates substantially in the dynamic scenario, but it is clearly smoothed by the Kalman filter. Moreover, both cases present similar results except during the first (dark blue) and last (orange/yellow) iterations, because the ego-motion data impacts the Kalman filter. In the beginning, the robot is moving forward, and the pixel coordinates under consideration (initially 0) are expected to move diagonally, from top to bottom, to the left. In the end, the direction is reversed, because the coordinates are on the right side of the image and hence move from bottom to top (considering the image coordinate system). Finally, although not visible in the graphs, the false negatives and occlusion cases are far better predicted when the ego-motion is considered. The speed estimates (Figure 72) reflect the previously discussed behavior of considering or not the control input: when the ego-motion is not considered (Figure 72b), the output of the filter intrinsically combines the target speed (in px/s) and the ego-motion impact, and therefore does not correspond to the target speed alone, as it does in the other case (Figure 72a).
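The difference between the two speed estimates can be made explicit with a simplified single-axis model. This is only a sketch under stated assumptions: the state is taken as position and speed, the control input $u_k$ is taken to be the pixel speed attributed to the ego-motion (the output of the Pre-Kalman filter module), and it is assumed to enter through the position only. The symbols and the form of the input matrix are illustrative and may differ from the formulation used in the actual design.

```latex
\begin{aligned}
\hat{\mathbf{x}}_{k|k-1} &=
\begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix}
\hat{\mathbf{x}}_{k-1|k-1}
+ \begin{bmatrix} \Delta t \\ 0 \end{bmatrix} u_k,
\qquad
\mathbf{x}_k = \begin{bmatrix} p_k \\ v_k \end{bmatrix}, \\
z_k &= \begin{bmatrix} 1 & 0 \end{bmatrix} \mathbf{x}_k + n_k .
\end{aligned}
```

Under this model the observed position changes by roughly $(v_k + u_k)\,\Delta t$ per step. When $u_k$ is supplied, the speed state only has to explain the target's own motion. When the input term is dropped, the constant-velocity model can only account for the same displacement through the speed state, so the estimate tends toward $v_k + u_k$, i.e. target speed and ego-motion impact combined.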

In summary, the designed Kalman filter performed exceptionally well in both the static and dynamic cases, including with the ego-motion data. Moreover, these results validate both the Pre-Kalman module and the Kalman filter itself, in addition to the fading memory technique.
