Chapter 7 Semantic 3-D Object Annotation
There are many approaches to annotation, and annotation can be applied at various levels. An object may have one or more annotations associated with it. These annotations may be associated with particular parts or regions of the object rather than the whole. In the case of a 2-D image, annotations could point to the sun, sky or beach. For a 3-D object, annotations such as arm, leg or head could be applied to parts of a statue.
There are several approaches to automatically annotating objects. A simple approach is to assign the annotation that occurs most frequently in a reference data set, but this method ignores any other knowledge about the object. Another method is to find the most similar object in the reference set and copy the annotations from that object to the query object (nearest neighbour classification). However, the query object may be very different from every object in the reference set, yet it will still have annotations applied to it. Alternatively, the correct annotations may be spread across several similar objects instead of just one. In these situations the reference objects may not fully represent complex class boundaries, and a more sophisticated approach is needed.
Barnard et al. (2003) give an overview of two general classes of annotation model, Multi-Modal Hierarchical Aspect Models and Mixture of Multi-Modal Latent Dirichlet Allocation. These models attempt to find a mapping between terms and regions. For example, the term sky may map to mostly blue regions, and mostly green regions may map to the term grass. Thus when searching for the term sky, unannotated images with blue regions can also be returned.
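The two simple approaches described above, the most-frequent-annotation baseline and nearest neighbour classification, can be sketched as follows. This is a minimal illustration only: the feature vectors, the Euclidean distance and the optional `max_distance` cut-off are assumptions made for the example and are not taken from the systems discussed in this chapter.

```python
from collections import Counter
import math

# Hypothetical reference set: (feature vector, annotations) pairs.
REFERENCE = [
    ((0.9, 0.1), {"statue", "head"}),
    ((0.8, 0.2), {"statue", "arm"}),
    ((0.1, 0.9), {"vase"}),
]

def most_frequent_annotation(reference):
    """Baseline: return the most common annotation, ignoring the query object."""
    counts = Counter(term for _, terms in reference for term in terms)
    return counts.most_common(1)[0][0]

def nearest_neighbour_annotations(query, reference, max_distance=None):
    """Copy the annotations of the closest reference object.

    If max_distance is given (an assumed extension) and even the closest
    object is further away than that, return an empty set rather than
    forcing annotations onto a dissimilar query.
    """
    features, terms = min(reference, key=lambda item: math.dist(query, item[0]))
    if max_distance is not None and math.dist(query, features) > max_distance:
        return set()
    return set(terms)
```

Without the cut-off, a query far from every reference object still receives the annotations of its nearest neighbour, which is the weakness noted in the text.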
Duygulu et al. (2002) used machine learning techniques to associate a fixed number of terms with regions of an image. They treat the problem as one of language translation, where terms are one language and the features of regions are another. Each region was determined by a segmentation algorithm, and Expectation Maximisation was used to learn the translation from regions to terms. Essentially, the process finds similar regions in different images that are annotated with the same terms. The probability of a term given such a region increases with how strongly the term and region are correlated.
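The translation view can be illustrated with a small Expectation Maximisation loop in the style of a word-alignment model. This is a sketch under stated assumptions rather than the authors' implementation: it assumes region features have already been quantised into discrete tokens (here simply colour names), and the function name, toy data and iteration count are hypothetical.

```python
from collections import defaultdict

def train_translation_table(images, iterations=20):
    """Estimate p(term | region token) with EM.

    images: list of (region_tokens, terms) pairs, one per annotated image.
    Returns a dict keyed by (term, region_token).
    """
    terms = {t for _, image_terms in images for t in image_terms}
    # Start uniform: every term is equally likely for every region token.
    prob = defaultdict(lambda: 1.0 / len(terms))
    for _ in range(iterations):
        counts = defaultdict(float)  # expected (term, region) co-occurrences
        totals = defaultdict(float)  # expected count per region token
        for regions, image_terms in images:
            for term in image_terms:
                # E-step: share each term among the regions of its image,
                # in proportion to the current translation probabilities.
                norm = sum(prob[(term, r)] for r in regions)
                for r in regions:
                    share = prob[(term, r)] / norm
                    counts[(term, r)] += share
                    totals[r] += share
        # M-step: renormalise the expected counts for each region token.
        prob = defaultdict(
            float, {(t, r): c / totals[r] for (t, r), c in counts.items()}
        )
    return prob

# Toy annotated images: region tokens on the left, terms on the right.
images = [
    (["blue", "green"], ["sky", "grass"]),
    (["blue", "yellow"], ["sky", "sun"]),
    (["green"], ["grass"]),
]
table = train_translation_table(images)
```

Because blue regions co-occur with sky in two images, the estimated probability of sky given a blue region ends up higher than that of grass or sun, which mirrors the correlation argument in the text.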
Semantic Web technologies aim to impose a machine-readable structure on any content published on the Web (or elsewhere). Each individual item of data needs a tag of some kind associated with it to say what the information is. Information is structured according to a schema or ontology describing the concepts and the relations between them. This allows any program that understands a particular schema to understand any data structured according to that schema. A schema or ontology can be described in several different formats. Typically RDF (Lassila and Swick, 1999) is used; however, it is unable to model the full range of relationships that may be needed in an ontology. OWL, the Web Ontology Language (McGuinness and van Harmelen, 2004), is based upon RDF and provides a language to fully describe an ontology.
In an effort to help interoperability, several ontologies have been defined. The Dublin Core (DCMI Usage Board, 2004) is one of the better known and is often used as a basic level of interoperability between systems. It is an ontology for describing a resource, which can have elements such as creator, title, description and date.
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#">
  <contact:Person rdf:about="http://www.w3.org/People/EM/contact#me">
    <contact:fullName>Eric Miller</contact:fullName>
    <contact:mailbox rdf:resource="mailto:[email protected]"/>
    <contact:personalTitle>Dr.</contact:personalTitle>
  </contact:Person>
</rdf:RDF>
Figure 7.1: Dublin Core Example
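Figure 7.1 shows an RDF/XML example taken from Manola and Miller (2004) describing contact details for a person. The elements in this example come from the W3C contact vocabulary; Dublin Core elements such as creator and title are attached to a resource in the same way.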
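To make the Dublin Core description of a resource concrete, the sketch below builds the corresponding subject, predicate and object triples in plain Python. The namespace URI is the published Dublin Core element set namespace; the resource URI, the element values and the `describe` helper are hypothetical and serve only as illustration.

```python
# Dublin Core Metadata Element Set, version 1.1, namespace.
DC = "http://purl.org/dc/elements/1.1/"

def describe(resource, **elements):
    """Return (subject, predicate, object) triples for one resource."""
    return [(resource, DC + name, value) for name, value in elements.items()]

# Hypothetical resource and values, for illustration only.
triples = describe(
    "http://example.org/objects/statue-42",
    creator="Unknown sculptor",
    title="Marble statue",
    description="Standing figure with raised arm",
    date="2004-01-01",
)
```

Each triple uses the same subject, so any program that understands the Dublin Core elements can read off the creator or title of the resource without knowing anything else about the collection.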
Typically a group of experts in a particular domain will produce an ontology for that domain. In the case of SCULPTEUR, the CIDOC CRM (Crofts et al., 2001) was used as this describes the cultural heritage domain.
All of this structured content would be of little use if there were no way to search it, and several query languages have been developed. One of the more commonly used languages is RDQL (Seaborne, 2004), although there are several competing languages offering more sophisticated querying. SeRQL is the query language used in Sesame (Aduna BV, No Year), an RDF database system. SPARQL (Prud'hommeaux and Seaborne, 2006), a W3C recommendation, is another.
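At their core these query languages match triple patterns containing variables against the stored triples. The sketch below shows that idea in plain Python; it is not an implementation of RDQL, SeRQL or SPARQL, and the `ex:` identifiers, data values and function names are hypothetical.

```python
def match(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple, or return None."""
    result = dict(bindings)
    for part, value in zip(pattern, triple):
        if part.startswith("?"):
            # A variable: bind it, or check it against an earlier binding.
            if result.setdefault(part, value) != value:
                return None
        elif part != value:
            return None
    return result

def query(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match(pattern, triple, bindings)) is not None
        ]
    return solutions

# Hypothetical data, for illustration only.
TRIPLES = [
    ("ex:statue1", "dc:creator", "ex:alice"),
    ("ex:statue1", "dc:title", "Standing Figure"),
    ("ex:vase1", "dc:creator", "ex:bob"),
    ("ex:vase1", "dc:title", "Blue Vase"),
]

# Roughly: SELECT ?title WHERE { ?obj dc:creator ex:alice . ?obj dc:title ?title }
results = query(
    [("?obj", "dc:creator", "ex:alice"), ("?obj", "dc:title", "?title")],
    TRIPLES,
)
```

The shared variable `?obj` is what joins the two patterns, so only titles of objects created by `ex:alice` are returned.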
Typically RDF is stored in an RDF database which provides one or more query languages to manage the data. Sesame (Aduna BV, No Year) has already been mentioned. It supports both RDQL and SeRQL, and it also allows inference rules to be defined which are applied to data as it is imported into the database. Another database is the triplestore (Harris et al., No Year), which supports the RDQL and SPARQL query languages. Jena (Hewlett-Packard Development Company, No Year) is another database supporting RDQL and SPARQL, and it provides a customisable semantic reasoner which allows inferencing.
There has been some interest in using a Shape Ontology to describe a 3-D object from a collection of known primitive components. García-Rojas et al. (2005) use virtual humans as the basis for their work and use this technique to identify parts of the human body. However, the technique is directed towards a single class of objects (human-shaped objects in this case) rather than a range of different objects.