Continuous tracking and tracking stability

3. Marker-based tracking

3.2 Marker pose

3.2.5 Continuous tracking and tracking stability

Some implementations detect markers separately frame by frame. Applications can boost performance by keeping history information on marker appearance and tracking the markers continuously. Based on the information from the previous frames, the system can identify markers that would otherwise be too small to be identified, for example, or that are partially occluded. In other words, after decoding the marker data once, it is sufficient to detect the marker in the next frames without decoding its content again.

In addition, if an application keeps continuous track of the marker pose, it can filter the pose over time and thus detect outliers (flaws in marker detection) and handle inaccuracy in marker detection. Furthermore, it can use the previous pose as an initial guess for the iterative pose calculation method. With a high frame-rate, the camera normally moves only slightly between frames and this is a good initial guess. Should the frame rate be slow, the pose may change significantly between frames, and the system should calculate a new initial guess. The system may also predict camera movement based on several previous poses and use the predicted pose as an initial guess.
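The prediction and outlier ideas above can be sketched as follows. This is only an illustration under simplifying assumptions: it handles the translation part of the pose only, assumes constant velocity between frames, and the names `predict_position`, `is_outlier` and `max_jump` are hypothetical, not taken from any particular tracking library. Rotation would need separate treatment.

```python
def predict_position(history):
    """Extrapolate the next position from the last two positions.

    Assumes constant velocity between consecutive frames. With fewer
    than two poses in the history, the last known position is returned.
    """
    if len(history) < 2:
        return history[-1]
    prev, last = history[-2], history[-1]
    return tuple(l + (l - p) for p, l in zip(prev, last))


def is_outlier(measured, predicted, max_jump):
    """Flag a measurement that lies farther than max_jump from the prediction."""
    dist = sum((m - p) ** 2 for m, p in zip(measured, predicted)) ** 0.5
    return dist > max_jump


# Camera positions (metres) from the two previous frames.
history = [(0.0, 0.0, 1.0), (0.01, 0.0, 1.0)]

# Predicted position, usable as the initial guess for the iterative pose calculation.
guess = predict_position(history)

# A detection that lands far from the prediction is treated as a flaw in detection.
suspicious = is_outlier((0.5, 0.0, 1.0), guess, max_jump=0.1)
```

The threshold `max_jump` has to reflect the expected camera motion per frame; with a slow frame rate it must be larger, which weakens the outlier test.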

If marker detection fails for some reason and markers are detected frame by frame separately, the augmentation often “jumps” from the correct position to a random position, which is annoying for the user. A continuous tracking system is able to avoid this situation. Furthermore, in continuous marker tracking, markers are less likely to be confused with others, as the system assumes that the marker is near its previous location. In addition, when the marker identity is difficult or impossible to detect because it is too small, for example, a continuous marker system is able to conjecture it based on the historical information.
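A minimal sketch of how a marker identity might be carried over from the previous frame, assuming that each tracked marker is represented by its image-plane centre and that a greedy nearest-neighbour match is good enough. The function name `associate` and the parameter `max_dist` are hypothetical.

```python
import math


def associate(tracked, detections, max_dist):
    """Assign known marker ids to new detections by proximity.

    tracked:    dict mapping marker id -> (x, y) centre in the previous frame
    detections: list of (x, y) centres found in the current frame
    max_dist:   largest accepted movement in pixels between two frames

    Returns a dict mapping marker id -> index into detections. Markers
    without a close enough detection are left out of the result.
    """
    result = {}
    used = set()
    for marker_id, (px, py) in tracked.items():
        best, best_d = None, max_dist
        for i, (x, y) in enumerate(detections):
            if i in used:
                continue
            d = math.hypot(x - px, y - py)
            if d <= best_d:
                best, best_d = i, d
        if best is not None:
            result[marker_id] = best
            used.add(best)
    return result


# Two markers decoded earlier, and three undecoded candidates in the new frame.
tracked = {7: (100.0, 100.0), 12: (300.0, 200.0)}
detections = [(305.0, 198.0), (102.0, 101.0), (600.0, 50.0)]
matches = associate(tracked, detections, max_dist=20.0)
```

Here the identities 7 and 12 are reused for the nearby detections without decoding them again, and the third candidate remains unidentified until it can be decoded.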

Marker edges may be shifted due to a non-optimal threshold value or a blurred image, for example. The corner positions are then shifted correspondingly, which implies that the marker pose is inexact. This imprecision appears as a small oscillation in the augmentation. A continuous marker tracking system is able to reduce this oscillation with proper filtering. As always with filtering, the filtering parameters must be chosen with care to find an optimal solution: if the pose filtering averages too strongly, the system is unable to react to fast movements; if it averages too little, the oscillation remains.
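The trade-off can be illustrated with exponential smoothing of a single pose coordinate. This is a sketch only: exponential smoothing is one possible filter chosen here for brevity, and the function name `smooth` and the weight `alpha` are assumptions of this example.

```python
def smooth(values, alpha):
    """Exponentially smooth a sequence of measurements.

    alpha close to 0 averages strongly (stable but slow to react);
    alpha close to 1 averages little (responsive but jittery).
    """
    out = []
    state = None
    for v in values:
        state = v if state is None else alpha * v + (1 - alpha) * state
        out.append(state)
    return out


# One coordinate of a static marker whose detected position oscillates.
jittery = [10.0, 10.4, 9.6, 10.4, 9.6, 10.4]

# The same coordinate when the marker moves quickly to a new position.
step = [0.0, 0.0, 10.0, 10.0, 10.0, 10.0]

strong_jitter = smooth(jittery, alpha=0.1)  # oscillation nearly removed
weak_jitter = smooth(jittery, alpha=0.9)    # oscillation largely remains
strong_step = smooth(step, alpha=0.1)       # lags far behind the movement
weak_step = smooth(step, alpha=0.9)         # follows the movement closely
```

With strong averaging, the oscillation of the static marker almost disappears, but four frames after the jump the filtered value has covered only about a third of the distance. With weak averaging the behaviour is the opposite.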

The amount of data that a marker can encode depends on the number of cells it contains. Typically, the more cells a marker has, the smaller the physical cells are, if the marker size is kept fixed. However, a bigger physical cell size makes it easier for the application to detect and decode the marker from farther away. Thus, having a large amount of data to encode (i.e. needing a large number of cells per marker) limits the maximum detection distance.
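The relation between cell count and distance can be made concrete with a rough calculation. The sketch assumes an ideal pinhole camera, a marker facing the camera directly, and that decoding needs some minimum number of pixels per cell; the function names and all numeric values are illustrative assumptions, and the marker border is ignored.

```python
def cell_size_px(focal_px, marker_size_m, cells_per_side, distance_m):
    """Approximate width of one cell in the image, in pixels."""
    cell_m = marker_size_m / cells_per_side
    return focal_px * cell_m / distance_m


def max_decode_distance(focal_px, marker_size_m, cells_per_side, min_cell_px):
    """Largest distance at which one cell still covers min_cell_px pixels."""
    cell_m = marker_size_m / cells_per_side
    return focal_px * cell_m / min_cell_px


# Assumed setup: 800 px focal length, 10 cm marker, 4 px needed per cell.
coarse = max_decode_distance(800.0, 0.10, cells_per_side=5, min_cell_px=4.0)
dense = max_decode_distance(800.0, 0.10, cells_per_side=10, min_cell_px=4.0)
# coarse is about 4 m, dense about 2 m.
```

Under these assumptions, doubling the number of cells per side halves the maximum decoding distance.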

A continuous marker tracking system could overcome this limit using super-resolution images. Super-resolution images are high-resolution images integrated over time [88, 89]. The contents of a marker could be defined using a super-resolution image. This way a system could have a higher number of cells in a physically smaller marker and still be able to decode its content. An application using a continuous marker tracking system would need to create a super-resolution image of each marker only once, and thereafter the system could concentrate on following each marker without decoding it again.

The stability of an augmented reality tracking system can be improved with several methods. For example, the Kalman filter (KF), the extended Kalman filter (EKF) [90] and the single constraint at a time (SCAAT) method [91] are used to predict marker or camera movements and to stabilise tracking results. For instance, [92] uses a SCAAT algorithm to compute the estimated pose using infrared beacons in addition to three rate gyros and a GPS sensor, and [93] uses SCAAT-Kalman filtering for real-time tracking with unsynchronised cameras.
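As a minimal sketch of the prediction and stabilisation idea, the following linear Kalman filter tracks one pose coordinate with a constant-velocity motion model. It is not the EKF or SCAAT formulation of the cited works; the function name `kalman_track`, the noise parameters `q` and `r`, and their default values are assumptions of this example.

```python
def kalman_track(measurements, dt=1.0, q=1e-3, r=0.25):
    """Filter one coordinate with a constant-velocity Kalman filter.

    State: position x and velocity v. Covariance: symmetric 2x2 matrix
    stored as p00, p01, p11. q is the process noise variance added to
    both states per step, r is the measurement noise variance.
    """
    x, v = measurements[0], 0.0
    p00, p01, p11 = 1.0, 0.0, 1.0
    estimates = []
    for z in measurements:
        # Predict: move the state forward by one frame.
        x += dt * v
        p00, p01, p11 = (
            p00 + 2 * dt * p01 + dt * dt * p11 + q,
            p01 + dt * p11,
            p11 + q,
        )
        # Update: correct the prediction with the measured position.
        s = p00 + r
        k0, k1 = p00 / s, p01 / s
        residual = z - x
        x += k0 * residual
        v += k1 * residual
        p00, p01, p11 = (1 - k0) * p00, (1 - k0) * p01, p11 - k1 * p01
        estimates.append(x)
    return estimates


# A coordinate that increases by one unit per frame.
ramp = [float(i) for i in range(20)]
filtered = kalman_track(ramp)
```

The predicted state before each update plays the role of the predicted pose, and the ratio of `q` to `r` controls how strongly the filter trusts new measurements over its own motion model.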

3.2.6 Rendering with the pose

The main idea of augmented reality is to present virtual objects in a real environment as if they were part of it. The camera pose is used to render the virtual object in the right scale and perspective. The virtual camera of computer graphics is moved to the same pose as the real camera and virtual objects are rendered on top of the real image (Figure 36).

Figure 36. Augmentation in origin.

If a virtual object is rendered using camera transformation matrix T and camera matrix K, it appears at the origin, in the same orientation as the coordinate axes. If a system wants to augment an object in a different pose, it needs to add object transformation Tobject in the rendering pipeline (Figure 37).
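The effect of the object transformation can be sketched by projecting a single object point through the chain K, T and Tobject. The example assumes homogeneous 4x4 transformations, a 3x3 pinhole camera matrix without lens distortion, and illustrative numeric values; the helper names are hypothetical, and a real graphics pipeline would use its own projection matrix conventions.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def translation(tx, ty, tz):
    """Homogeneous 4x4 translation matrix."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def project(K, T, T_object, point):
    """Project an object-space point to pixel coordinates via K * T * T_object."""
    X = [[point[0]], [point[1]], [point[2]], [1.0]]
    cam = matmul(T, matmul(T_object, X))  # point in camera coordinates
    uvw = matmul(K, cam[:3])              # drop the homogeneous row, apply K
    return (uvw[0][0] / uvw[2][0], uvw[1][0] / uvw[2][0])


# Assumed camera matrix: 800 px focal length, principal point (320, 240).
K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]

# Assumed pose: the marker origin lies 2 m in front of the camera.
T = translation(0.0, 0.0, 2.0)

at_origin = project(K, T, translation(0.0, 0.0, 0.0), (0.0, 0.0, 0.0))
shifted = project(K, T, translation(0.1, 0.0, 0.0), (0.0, 0.0, 0.0))
```

With an identity object transformation the object origin lands on the image of the marker origin; with a 10 cm shift along the marker's x axis it moves 40 pixels to the right in this setup.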

Figure 37. Augmenting in a general location and orientation.

If both the camera and the marker are moving, the tracking system is able to derive the relative pose of the camera and the marker, but the absolute position (relative to the earth coordinates) is unknown. Sometimes it is convenient to use an additional device to orient the object, e.g. upright. The system can do this using an accelerometer, which is a built-in sensor in most new smartphones (e.g. iPhone 4G). The accelerometer provides a gravitation vector in the phone’s coordinate system. The task is to change it to the world coordinate system (change of coordi[…] vector. The application needs to add this rotation R in the rendering pipeline. Similarly, an application could orient augmentations according to coordinate data from a digital compass.
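One way such a rotation R might be computed is sketched below. It assumes that the gravitation vector has already been expressed in the same coordinate system as the object, and that the goal is to turn the object's up axis towards the inverse gravitation vector. The construction uses the standard Rodrigues rotation between two vectors; it is an illustration, not necessarily the derivation used in the source, and all function names are hypothetical.

```python
import math


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def rotation_aligning(a, b):
    """3x3 rotation matrix that turns the direction of a into the direction of b."""
    a, b = normalize(a), normalize(b)
    c = sum(x * y for x, y in zip(a, b))  # cosine of the angle between a and b
    if c < -1.0 + 1e-12:
        # Opposite directions: rotate 180 degrees about any axis perpendicular to a.
        i = min(range(3), key=lambda k: abs(a[k]))
        e = [1.0 if k == i else 0.0 for k in range(3)]
        u = normalize(cross(a, e))
        return [[2 * u[r] * u[s] - (1.0 if r == s else 0.0) for s in range(3)]
                for r in range(3)]
    v = cross(a, b)
    vx = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]
    vx2 = [[sum(vx[r][k] * vx[k][s] for k in range(3)) for s in range(3)]
           for r in range(3)]
    # Rodrigues formula: R = I + [v]x + [v]x^2 / (1 + c)
    return [[(1.0 if r == s else 0.0) + vx[r][s] + vx2[r][s] / (1.0 + c)
             for s in range(3)] for r in range(3)]


def apply(R, v):
    return [sum(R[r][k] * v[k] for k in range(3)) for r in range(3)]


# Assumed example: the object's up axis is z, gravitation points along -y.
up_axis = [0.0, 0.0, 1.0]
gravity = [0.0, -9.81, 0.0]
inverse_gravity = [-g for g in gravity]

R = rotation_aligning(up_axis, inverse_gravity)
upright = apply(R, up_axis)
```

In this example the rotated up axis points along +y, i.e. opposite to the measured gravitation vector.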

Figure 38. Augmenting in an upright pose using an accelerometer. The gravitation vector and the inverse gravitation vector are marked with dashed arrows.
