Programming languages for developing ontologies

CHAPTER 2: Review of ontology, Semantic Web technologies, and existing select

2.2 Designing, developing and editing ontologies

2.2.2 Programming languages for developing ontologies

2.2.2.1 RDF/XML

Resource Description Framework (RDF) is an ontology language for representing metadata and information about resources on the Web, using precise formal vocabularies for access and use over the World Wide Web (WWW) [Pulido et al., 2006]. RDF and RDFS (RDF Schema) provide the foundation for the higher-level semantic language, the Web Ontology Language (OWL). RDF is a standard published by the W3C [Pulido et al., 2006]. More importantly, RDF can be used to represent metadata and information in a way that “computer applications can use and process in a scalable manner” [Yu, 2011]. Hence, RDF is the building block of the Semantic Web.

When developing an ontology, there are two rules about RDF to keep in mind. The first rule states that “knowledge (or information) is expressed as a list of statements, each statement takes the form of subject-predicate-object, and this order should never be changed” [Yu, 2011]. These statements are called triples (Figure 2.2) since they always consist of three components: subject, predicate, and object. Each triple represents a single fact (e.g. Aquifer has Permeability). A collection of triples is called an RDF graph [Yu, 2011]. An RDF graph can be created from a database in which each table identifier becomes a subject, each column identifier becomes a predicate, and each cell value becomes an object. For example, a table called ‘Aquifer’ (translated into an ontology as a class) may have a column called ‘porosity’ (translated into an ontology as a property), which for a specific row has a value of 10% (represented in an ontology as a value of an XSD data type, described in the OWL section). The triple (subject predicate object) is then: Aquifer porosity 10%. In this way all the values in a database can be converted into triples (RDF statements), which are structured information that computers can process by applying the inference rules of the language. In RDF, everything that a subject or an object denotes is called a resource, and each resource has a Uniform Resource Identifier (URI). A URI is used only to identify a resource, unlike a Uniform Resource Locator (URL), which is also used to retrieve the resource [Yu, 2011].
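The table-to-triple mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the namespace `http://example.org/hydro#`, the function name `table_to_triples`, and the table contents are hypothetical, and triples are held as plain tuples rather than in an RDF library.

```python
from typing import Dict, List, Tuple

# A triple is (subject, predicate, object), always in this order.
Triple = Tuple[str, str, str]

# Hypothetical namespace used to give every resource a global name (URI).
BASE = "http://example.org/hydro#"


def table_to_triples(table_name: str, rows: List[Dict[str, str]]) -> List[Triple]:
    """Apply the mapping from the text: table -> subject,
    column -> predicate, cell value -> object."""
    triples: List[Triple] = []
    for row in rows:
        for column, value in row.items():
            triples.append((BASE + table_name, BASE + column, value))
    return triples


# Hypothetical table content, for illustration only.
aquifer_rows = [{"porosity": "10%"}]
graph = table_to_triples("Aquifer", aquifer_rows)

for subject, predicate, obj in graph:
    print(subject, predicate, obj)
```

The subject and predicate are written as URIs because they name resources, while the cell value is kept as a literal.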

Figure 2.2: Visual representation of the triple set.

The second rule in RDF states that “the name of a resource must be global and should be identified by the Uniform Resource Identifier (URI)” [Yu, 2011]. A resource is anything (concrete or abstract) that the subject and object denote. A URI, which is an identifier for a resource, is different from a Uniform Resource Locator (URL), which is used to locate a resource (e.g. a home page) on the Web.

In addition to asserting statements, RDF supports inference. Inference is the derivation of new information from a set of stated (asserted) information. For example, if a class is created for an unconfined aquifer as a subclass of aquifer, then the computer knows that the unconfined aquifer has the same attributes as the aquifer (i.e. hydraulic conductivity, transmissivity, and storativity). When a knowledge base is queried, a substantial amount of additional information is inferred by reasoning from the asserted statements in the underlying ontology. RDF and RDFS can express only a limited set of semantics; more complex expressions (e.g. equivalent classes, unions, and restrictions) are provided by OWL, which is discussed in the next section.
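The subclass example can be illustrated with a small Python sketch that walks up a class hierarchy and collects the attributes of every ancestor. It is not how an RDFS reasoner is implemented; the class names, the attribute names, the function `inferred_attributes`, and the single-parent hierarchy are all assumptions made for illustration.

```python
from typing import Dict, Optional, Set

# Asserted statements (hypothetical names): one subclass link and the
# attributes stated directly on the parent class.
subclass_of: Dict[str, str] = {"UnconfinedAquifer": "Aquifer"}
class_attributes: Dict[str, Set[str]] = {
    "Aquifer": {"hydraulicConductivity", "transmissivity", "storativity"},
}


def inferred_attributes(
    cls: str,
    subclass_of: Dict[str, str],
    class_attributes: Dict[str, Set[str]],
) -> Set[str]:
    """Collect attributes asserted on a class and on all of its ancestors."""
    attributes: Set[str] = set()
    seen: Set[str] = set()
    current: Optional[str] = cls
    # Follow subclass links upward; 'seen' guards against cycles.
    while current is not None and current not in seen:
        seen.add(current)
        attributes |= class_attributes.get(current, set())
        current = subclass_of.get(current)
    return attributes


print(sorted(inferred_attributes("UnconfinedAquifer", subclass_of, class_attributes)))
```

Nothing is asserted directly about `UnconfinedAquifer` except its parent, yet the query returns the three attributes of `Aquifer`; this is the sense in which information is inferred rather than stated.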

2.2.2.2 OWL

Web Ontology Language (OWL) extends RDFS by adding new constructs that allow for greater expressiveness and facilitate interoperability among distributed resources. OWL is a more expressive ontology language than RDFS because it supports applications with much stronger reasoning ability [Yu, 2011] [Pulido et al., 2006] [Horrocks et al., 2003]. Some of the major advantages of OWL over RDFS include OWL's ability to use XSD data types, define complex classes with Boolean combinations, define properties and sub-properties, define classes or properties as equivalent, allow for cardinality constraints, set classes as instances, and allow importing other ontologies.

Moreover, OWL allows new information to be added to a knowledge base without falsifying previous conclusions. OWL has three species: OWL Lite, OWL DL, and OWL Full, and in Protégé one can select the species to use. Based on [Horrocks et al., 2003] [Pulido et al., 2006] [Yu, 2011], these three species (dialects) place different limitations and restrictions on classes, which are explained in turn. OWL Lite supports class and property hierarchies and simple restrictions, allowing us to develop thesauri and simple ontologies. Some constructs (i.e. owl:hasValue, owl:disjointWith, owl:unionOf, owl:complementOf, and owl:oneOf) are not allowed in OWL Lite, and others are restricted: owl:minCardinality and owl:maxCardinality may only take the values 0 or 1, and owl:equivalentClass may only relate named classes and property restrictions. OWL DL is a decidable version of OWL Full, with some limitations (e.g. an OWL DL ontology cannot import an OWL Full ontology through owl:imports and still remain an OWL DL ontology). OWL Full has no such limitations; however, it is not decidable.
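As a rough illustration of how a species restriction can be checked, the Python sketch below reports which constructs used by an ontology would keep it out of OWL Lite. It is an assumption-laden simplification: the set contains only owl:hasValue, owl:disjointWith, owl:unionOf, owl:complementOf, and owl:oneOf, it ignores constructs that are restricted rather than excluded, and the function name and the input list are hypothetical. A real species check, such as the one a tool like Protégé relies on, is considerably more involved.

```python
from typing import Iterable, List

# Constructs treated in this sketch as unavailable in OWL Lite.
OWL_LITE_EXCLUDED = frozenset({
    "owl:hasValue",
    "owl:disjointWith",
    "owl:unionOf",
    "owl:complementOf",
    "owl:oneOf",
})


def constructs_blocking_owl_lite(used_constructs: Iterable[str]) -> List[str]:
    """Return, in sorted order, the used constructs that OWL Lite excludes."""
    return sorted(set(used_constructs) & OWL_LITE_EXCLUDED)


# Hypothetical list of constructs found in an ontology.
used = ["owl:Class", "owl:unionOf", "rdfs:subClassOf", "owl:oneOf"]
blocking = constructs_blocking_owl_lite(used)
print(blocking if blocking else "No excluded constructs found")
```

An empty result only means that none of the listed constructs occur; it does not by itself prove that the ontology is valid OWL Lite.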

2.2.2.3 SWRL

Semantic Web Rule Language (SWRL) is a combination of Web Ontology Language (OWL) sublanguages (e.g., OWL DL or OWL Lite) with the Rule Markup Language (RuleML) [Horrocks et al., 2004] [Mei, 2006]. SWRL was developed for creating rules in ontologies so that they can express more complex relationships among classes and properties [Horrocks et al., 2004]. Each rule consists of an antecedent and a consequent, each made up of atoms; whenever the antecedent holds, the consequent must also hold. An atom is an expression that can be written in the form C(x), P(x, y), or greaterThan(x, y), in which C is an OWL class, P is an OWL property, and x and y are variables or class individuals. For example, Table 2.1 shows a SWRL rule created for AlkalineSample in which an antecedent of two atoms yields a consequent of one atom. The first atom consists of the ph.PHvalue property with two variables: the individual (?x) and the pH value for that individual (?y). The second atom checks whether the value of variable ?y is greater than 7.0. If both conditions are true, then the consequent atom containing the haveAlkalineSample property classifies the individual ?x as an individual of class AlkalineSample, because haveAlkalineSample is the predicate for the AlkalineSample class.

Table 2.1: SWRL rule for the AlkalineSample class.

AlkalineSample ph.PHvalue(?x, ?y), greaterThan(?y, 7.0) -> haveAlkalineSample(?x, true)
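The logic of this rule can be mimicked in plain Python to show how the antecedent and the consequent relate. This is a sketch only, not a SWRL engine: the function name `apply_alkaline_rule`, the sample identifiers, and the assumption that each individual has exactly one pH value are hypothetical.

```python
from typing import Dict, Set


def apply_alkaline_rule(ph_values: Dict[str, float], threshold: float = 7.0) -> Set[str]:
    """Mimic: ph.PHvalue(?x, ?y), greaterThan(?y, 7.0) -> haveAlkalineSample(?x, true).

    ph_values maps each individual (?x) to its pH value (?y). The returned set
    holds the individuals for which the consequent is asserted.
    """
    alkaline: Set[str] = set()
    for individual, ph in ph_values.items():  # first atom binds ?x and ?y
        if ph > threshold:                    # second atom: greaterThan(?y, 7.0)
            alkaline.add(individual)          # consequent: haveAlkalineSample(?x, true)
    return alkaline


# Hypothetical individuals and pH values, for illustration only.
samples = {"sample_A": 8.2, "sample_B": 6.5, "sample_C": 7.0}
print(sorted(apply_alkaline_rule(samples)))
```

Because the comparison is strictly greater than 7.0, a sample with a pH of exactly 7.0 does not satisfy the antecedent and is not classified as alkaline.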

In document Contaminant Hydrogeology Knowledge Base (CHKb) of Georgia, USA (Page 38-42)