

4.5 Perception and Representation of Objects

4.5.6 Ontological Textual Object Representation

An appropriate object format for the Long-Term Memory of the robot is a second key feature of the proposed Object Attention System. Such an object information container has to provide all relevant information that is needed to recognize a formerly learned object on the one hand, and a mechanism to store all object information that the user wants to memorize in the robot’s memory on the other hand. The following exemplary XML-based object representation illustrates which content is stored and how it is structured.

 1 <?xml version="1.0" encoding="UTF-8" ?>
 2 <OBJECT>
 3   <ID>10</ID>
 4   <SCORE>0.78125</SCORE>
 5   <TIMESTAMP>1140889113943</TIMESTAMP>
 6   <BESTBEFORE>1148665102531</BESTBEFORE>
 7   <TIMESTAMP_HUMAN_READ>25.2.2006, 18:38:33</TIMESTAMP_HUMAN_READ>
 8   <BESTBEFORE_HUMAN_READ>26.5.2006, 19:38:22</BESTBEFORE_HUMAN_READ>
 9
10   <!-- Position based on global map in [m] -->
11   <GLOBAL_POSITION>
12     <X>0</X>
13     <Y>0</Y>
14     <Z>0</Z>
15   </GLOBAL_POSITION>
16
17   <RELATIVE_POSITION>
18     <!-- (-) -> left of robot : (+) -> right of robot in [deg] -->
19     <ANGLE>0.0838469</ANGLE>
20     <HEIGHT>0.539959</HEIGHT>
21     <DISTANCE>0.96797</DISTANCE>
22   </RELATIVE_POSITION>
23
24   <FEATURES>
25     <COLOR confidence = "0.85">red</COLOR>
26     <OWNER confidence = "0.3">Axel</OWNER>
27     <SOUND>/memory/objects/10.ogg</SOUND>
28     <TYPE confidence = "0.85">laptop</TYPE>
29     <VIEW>/memory/objects/10.ppm</VIEW>
30   </FEATURES>
31   <RELATION confidence = "0.85">left to</RELATION>
32   <RELATED_TO confidence = "0.85">14</RELATED_TO>
33 </OBJECT>

At the beginning, in line 3, the consecutively numbered object ID is specified. It establishes a common ground for the currently processed object reference. Within the Object Attention System, the ID ensures a consistent data assignment under all process conditions. Furthermore, the external communication with other modules, in particular Dialog, Gesture Recognition, Sound Collector, Scene Model, and the Modality Converter, uses this object ID. In the context of speech processing, the object ID could also be used for anaphora resolution, which, however, is currently not supported by the speech processing units.

Next, in line 4, a score value in the range [0, 1.0] is given, which is the mathematical product of all confidence values assigned in the Short-Term Memory (see page 61). It is used to decide whether an object is already known or not. Besides a couple of features that need to match features of already learned objects, an empirically determined value of 0.8 has proven to be a reliable threshold for later object recognition tasks. In other words, all objects whose score value reaches this threshold are considered for the object recognition module.
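The score computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `object_score` and `reaches_threshold` are hypothetical, and the text does not state whether the comparison with the threshold is inclusive, so `>=` is assumed here.

```python
from math import prod

# Empirically determined value quoted in the text.
KNOWN_OBJECT_THRESHOLD = 0.8


def object_score(confidences):
    """Return the product of all confidence values, each expected in [0, 1]."""
    values = list(confidences)
    if not values:
        raise ValueError("at least one confidence value is required")
    for value in values:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"confidence out of range: {value}")
    return prod(values)


def reaches_threshold(score, threshold=KNOWN_OBJECT_THRESHOLD):
    """Assumption: a score equal to the threshold already counts as reached."""
    return score >= threshold
```

Because the score is a product, a single low confidence value pulls the whole score down, which is why one uncertain feature can keep an object below the threshold.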

As small objects in particular, such as cups or books, are likely to be moved to another location from time to time, and the robot is not always aware of these actions, two timestamps are included as well. In line 5, the timestamp at which the object was stored in the Long-Term Memory is specified, while the subsequent Best Before timestamp limits the life cycle of the object. Although an active memory for robots is currently under development, this feature is not used yet. Nevertheless, manifold applications are imaginable, such as an automatic mechanism that lets the robot forget a once stored object. This is useful for keeping the memory consistent as, for instance, it usually does not make sense to store the location of easily perishable fruit for several months. As a second application, the robot can take the initiative and verify on its own whether the object is still at its once learned location. As these timestamps are not easy for humans to interpret due to their POSIX format, the same timestamps are given in human-readable form as well. They have mainly been implemented for manual maintenance tasks performed by the user, but they can additionally be used to let the text-to-speech component read the dates to the user in order to inform them about upcoming update cycles.
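As a minimal sketch of how the two timestamps might be handled, the following Python fragment converts a stored value into the human-readable layout used in the example and checks whether a Best Before date has passed. It rests on assumptions: the example values appear to be milliseconds since the POSIX epoch (interpreted that way, they match the human-readable fields up to a local time zone offset), the output here is in UTC, and the names `format_timestamp` and `is_expired` are hypothetical, since the forgetting mechanism is not implemented in the described system.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_timestamp(timestamp_ms):
    """Format milliseconds since the epoch as 'D.M.YYYY, HH:MM:SS' in UTC."""
    moment = EPOCH + timedelta(milliseconds=timestamp_ms)
    return f"{moment.day}.{moment.month}.{moment.year}, {moment:%H:%M:%S}"


def is_expired(best_before_ms, now_ms):
    """Assumption: an object counts as expired once 'now' is past Best Before."""
    return now_ms > best_before_ms


# Values taken from lines 5 and 6 of the example document.
stored = format_timestamp(1140889113943)
best_before = format_timestamp(1148665102531)
```

A forgetting mechanism could call `is_expired` periodically and either delete the entry or trigger a verification of the object's last known location.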

The following block of the XML document contains the position of an object within an absolute global coordinate system of the environment. It is intended for robotic platforms with navigation and localization capabilities. For instance, a global positioning system helps to assign a unique object position even in an environment with several rooms. However, as the robotic platforms used do not currently support a positioning system, these values are set to zero in the given example. The only positions actually supported by the overall robot architecture are relative ones, given with respect to the robot. Thus, the Object Attention System supports at least these locations, as can be seen in lines 17 to 22, where the values are specified in cylindrical coordinates.
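For illustration, the cylindrical relative position can be converted into Cartesian coordinates in the robot's frame. The sketch below makes several assumptions that the text does not fix: the angle is in degrees as stated in the XML comment, the x axis points forward, the y axis points to the right of the robot (so that positive angles yield positive y), the z axis points up, and `relative_to_cartesian` is a hypothetical helper name.

```python
import math


def relative_to_cartesian(angle_deg, height, distance):
    """Convert (angle, height, distance) to (x, y, z) in the robot frame.

    Assumed axes: x forward, y to the right of the robot, z up.
    """
    angle_rad = math.radians(angle_deg)
    x = distance * math.cos(angle_rad)  # forward component
    y = distance * math.sin(angle_rad)  # positive angle -> right of robot
    z = height                          # height is already a linear offset
    return x, y, z


# Values taken from lines 19 to 21 of the example document.
x, y, z = relative_to_cartesian(0.0838469, 0.539959, 0.96797)
```

Once a platform provides a global pose, the same Cartesian vector could be transformed into the global map frame to fill the `GLOBAL_POSITION` block.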

The last semantic block of the textual object representation contains the learned object features and, as far as available, references to relations to other objects or locations, such as "in front of the windows". In detail, the feature block contains all verbally specified feature types and their values, as well as the confidence values assigned by the Short-Term Memory. Furthermore, the locations and names of learned object views and object sounds are given as well. Here, it has proven to be a great advantage to use references to the actually stored data instead of an encapsulated object representation that includes textual and binary data at the same time. The advantage mainly lies in a compact data representation, which improves memory queries. Additionally, the data can be handled more easily by, e.g., the object recognizer or the dialog system. However, the latter can only use the data if the learning of a view or a sound has been completed successfully beforehand.
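The separation between verbally specified features and references to stored data can be sketched with Python's standard `xml.etree.ElementTree` module. The snippet uses a shortened copy of the example document. It assumes, based on that example only, that elements carrying a `confidence` attribute are verbal features and that elements without one are file references; `read_features` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example document from this section.
EXAMPLE = """<OBJECT>
  <ID>10</ID>
  <FEATURES>
    <COLOR confidence = "0.85">red</COLOR>
    <OWNER confidence = "0.3">Axel</OWNER>
    <SOUND>/memory/objects/10.ogg</SOUND>
    <TYPE confidence = "0.85">laptop</TYPE>
    <VIEW>/memory/objects/10.ppm</VIEW>
  </FEATURES>
</OBJECT>"""


def read_features(xml_text):
    """Split the FEATURES block into verbal features and file references."""
    root = ET.fromstring(xml_text)
    block = root.find("FEATURES")
    features, references = {}, {}
    if block is None:
        return features, references
    for child in block:
        text = (child.text or "").strip()
        confidence = child.get("confidence")
        if confidence is None:
            # Assumption: no confidence attribute means a path to stored data.
            references[child.tag] = text
        else:
            features[child.tag] = (text, float(confidence))
    return features, references


features, references = read_features(EXAMPLE)
```

Keeping only paths in the document means that a query over feature values never has to load image or audio data, which matches the compactness argument made above.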

This XML-based object representation offers a number of advantages over proprietary data formats. Besides the already mentioned flexible usability, it can easily be extended and updated. Nevertheless, a considerable amount of data analysis has to be done before such a document can be generated. Therefore, the implemented processing strategy is illustrated in the following in order to show how the algorithms are applied by the proposed Object Attention System.