
3.3 Ontology Development with Protégé ODE

3.4.2 Interface Design and Search Engine Approach

The Onto-CropBase interface has three main features: a search space, a map for location information, and a results panel. All three are available in every section of the tool. The home page is built around a form-based search engine. Because location data is critical to crop-based knowledge systems, especially those covering underutilized crops, a map interface is provided to display crop location data. The map is embedded using the Google Maps JavaScript API version 3, which supports map features in a web application, including styled maps, place data, 3D buildings, and geocoding. The map information is extracted from the named location's east-west bound longitude and north-south bound latitude elements. The following design approaches make up the Onto-CropBase features:

3.4.2.1 Search Approach.

In the Onto-CropBase search engine, a search begins with a keyword entered in the Search area. The query results are returned as a set of navigational links based on the subjects of the data sources, using the title annotation provided in the corresponding local ontology (see Fig. 4.12). Users can then browse the list of subject titles to explore the remaining information. Clicking on a particular subject presents its RDF assertions as subject-object pairs. For example, in Fig. 4.12, Subject number 3 is a caption for Carbohydrate of Bambara groundnut, and the corresponding object asserted as its value is High (65%).

3.4.2.2 Query Language.

SPARQL, a recursive acronym for SPARQL Protocol And RDF Query Language, is used as the query language. It is a standard query language for retrieving information stored as RDF graphs or triples [167]. A SPARQL query uses one of four query forms (SELECT, CONSTRUCT, ASK, or DESCRIBE), followed by a WHERE clause and, where applicable, a GROUP BY clause; see the example in Listing 3.2. As mentioned earlier, we employ Jena-ARQ [168], a query engine for Jena that supports the SPARQL RDF query language. This allows for federated user queries across the local ontologies, using the corresponding terminologies of concepts found in the global ontology as search phrases. Recall that the local ontologies are linked to the global ontology through their URIs, which are also supplied to the SPARQL queries.

3.4.2.3 Query Design.

Queries entered in the Onto-CropBase search space are embedded in Java code containing a SPARQL query, and the concepts are passed as text strings to the QueryFactory method. The SPARQL SELECT query is first executed on the global ontology (UC-ONTO) to retrieve the concepts matching the search keywords. An OPTIONAL query pattern, as shown in Listing 3.2, is provided to recursively explore all the RDF datasets linked to UC-ONTO through their URIs. The OPTIONAL construct allows the Jena-ARQ query engine to search for all relevant triples, with inferences, from the local RDF ontologies without a query execution failure; that is, it acts as an exception condition in the event that the optional data does not exist. Moreover, the use of OPTIONAL helps to ensure that all non-optional facts are returned from, at least, the global ontology. For example, in the map query, if one of the RDF subjects returned is location data, a setMap object is used to obtain the longitudes and latitudes of the location for display on the map interface. The number of resulting subject-predicate-object (SPO) triples is also calculated for each dataset to determine the number of pages to be returned for each search result. In Onto-CropBase, we set the capacity to 10 triple sets per page so that the map view stays in focus.
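The page calculation described above can be sketched as a ceiling division over the number of returned triples. This is a minimal illustration only: the class name ResultPager and its methods are hypothetical and do not come from the Onto-CropBase code base; only the page capacity of 10 is taken from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Onto-CropBase: splits a result list into pages.
final class ResultPager {
    // Page capacity stated in the text: 10 triple sets per page.
    static final int PAGE_SIZE = 10;

    // Number of pages needed to show tripleCount results (ceiling division).
    static int pageCount(int tripleCount) {
        if (tripleCount < 0) {
            throw new IllegalArgumentException("tripleCount must be non-negative");
        }
        return (tripleCount + PAGE_SIZE - 1) / PAGE_SIZE;
    }

    // Returns the slice of results shown on a 1-based page number.
    static <T> List<T> page(List<T> results, int pageNumber) {
        int pages = Math.max(pageCount(results.size()), 1);
        if (pageNumber < 1 || pageNumber > pages) {
            throw new IllegalArgumentException("page out of range: " + pageNumber);
        }
        int from = Math.min((pageNumber - 1) * PAGE_SIZE, results.size());
        int to = Math.min(from + PAGE_SIZE, results.size());
        return results.subList(from, to);
    }

    public static void main(String[] args) {
        List<Integer> results = new ArrayList<>();
        for (int i = 0; i < 23; i++) {
            results.add(i);
        }
        // 23 results need three pages; the last page holds the remaining 3.
        System.out.println(pageCount(results.size()));
        System.out.println(page(results, 3));
    }
}
```

Keeping the capacity in a single constant means the trade-off between result density and map visibility can be tuned in one place.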

1.  "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
2.  "PREFIX rdf: <http://.../1999/02/22-rdf-syntax-ns#> " +
3.  "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> " +
4.  "PREFIX ucnames: <http://.../ontologies/2015/Naming#> " +
5.  "PREFIX agronomy: <http://.../ontologies/2015/agrono#> " +
6.  "SELECT DISTINCT ?subject ?object " +
7.  "WHERE { " +
8.  "  OPTIONAL { ?subject rdf:value ?object . " +
9.  "    ?subject rdf:type ucnames:" + className + " . } " +
10. "  OPTIONAL { ?subject rdf:value ?object . " +
11. "    ?subject rdf:type agronomy:" + className + " . } }"

Listing 3.1: SPARQL query showing the use of the SELECT and OPTIONAL constructs

The above listing shows example uses of the SELECT and OPTIONAL constructs in a SPARQL query to retrieve distinct subject-object pairs from two local ontologies: ucnames and agronomy. Note that the prefix declarations, lines 1 through 5, allow for abbreviating URIs, so that the short names can be used instead of the full URIs in the query body.
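Because the class name is spliced into the query text, one way to assemble the string is with a small builder that first checks the name. The following is a sketch under stated assumptions: the class ClassQueryBuilder, its build method, and the name check are hypothetical, and the example.org IRIs in main are placeholders standing in for the ontology namespaces that are abbreviated in the listing.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Onto-CropBase: builds the query text of the listing.
final class ClassQueryBuilder {
    // Assumption: class names are plain identifiers (letters, digits, underscores).
    private static final Pattern SAFE_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // prefixes maps a short name to its namespace IRI, in declaration order.
    // The map is expected to declare rdf, ucnames and agronomy; this is not checked here.
    static String build(Map<String, String> prefixes, String className) {
        if (className == null || !SAFE_NAME.matcher(className).matches()) {
            throw new IllegalArgumentException("unsafe class name: " + className);
        }
        StringBuilder q = new StringBuilder();
        for (Map.Entry<String, String> p : prefixes.entrySet()) {
            q.append("PREFIX ").append(p.getKey())
             .append(": <").append(p.getValue()).append(">\n");
        }
        q.append("SELECT DISTINCT ?subject ?object\n");
        q.append("WHERE {\n");
        q.append("  OPTIONAL { ?subject rdf:value ?object . ?subject rdf:type ucnames:")
         .append(className).append(" . }\n");
        q.append("  OPTIONAL { ?subject rdf:value ?object . ?subject rdf:type agronomy:")
         .append(className).append(" . }\n");
        q.append("}");
        return q.toString();
    }

    public static void main(String[] args) {
        Map<String, String> prefixes = new LinkedHashMap<>();
        prefixes.put("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#");
        // Placeholder IRIs: the real ontology namespaces are elided in the source.
        prefixes.put("ucnames", "http://example.org/naming#");
        prefixes.put("agronomy", "http://example.org/agronomy#");
        System.out.println(build(prefixes, "Crop"));
    }
}
```

The returned string could then be handed to the QueryFactory method mentioned earlier. Rejecting names that are not plain identifiers is a conservative choice that keeps a search keyword from altering the structure of the query.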

3.4.2.4 The Query Processing.

As mentioned earlier, Jena-ARQ is employed as a query engine for executing our SPARQL queries. This is achieved through a sequence of five iterative steps as follows:

1. String to Query parsing: the text string passed to the QueryFactory method is parsed from a query string into a Query object.

2. Algebra Generation: the Query object is translated into a SPARQL algebra expression using the algorithm in the SPARQL specification.

3. High-level Optimization: the algebra expression generated in (2) is optimized; this step is called high-level optimization and transformation. Here, a Transformer class applies transform code to convert or replace parts of the algebra expression tree with more efficient expressions. An example of such a transform is replacing an equality filter with a more efficient graph pattern in the algebra expression.

4. Low-level Optimization: the final query plan is determined. This involves deciding the order in which to evaluate the basic graph patterns transformed earlier. This stage can, however, be carried out concurrently with the fifth step.

5. Evaluation of the query plan: the algebra expressions are executed to generate the solution graph patterns, returned as sets of facts in SPO triples.

These steps can be extended and modified to allow searching for different graph-pattern implementations. Moreover, the final step (evaluation of the query plan) can also be enhanced to suit specific application requirements.
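To make the ordering decision in step 4 concrete, the sketch below reorders the triple patterns of a basic graph pattern so that patterns with fewer variables are evaluated first. This is a toy heuristic chosen for illustration and is not a description of the strategy Jena-ARQ actually applies; the class PatternReorderSketch and its nested TriplePattern type are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of pattern reordering; not Jena-ARQ code.
final class PatternReorderSketch {
    // A triple pattern; terms that start with '?' are treated as variables.
    static final class TriplePattern {
        final String subject;
        final String predicate;
        final String object;

        TriplePattern(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        // Counts how many of the three terms are variables.
        int variableCount() {
            int n = 0;
            for (String term : Arrays.asList(subject, predicate, object)) {
                if (term.startsWith("?")) {
                    n++;
                }
            }
            return n;
        }

        @Override
        public String toString() {
            return subject + " " + predicate + " " + object + " .";
        }
    }

    // Returns a reordered copy: fewer variables first. The sort is stable,
    // so patterns with equal counts keep their original relative order.
    static List<TriplePattern> reorder(List<TriplePattern> bgp) {
        List<TriplePattern> ordered = new ArrayList<>(bgp);
        ordered.sort(Comparator.comparingInt(TriplePattern::variableCount));
        return ordered;
    }

    public static void main(String[] args) {
        List<TriplePattern> bgp = Arrays.asList(
            new TriplePattern("?subject", "rdf:value", "?object"),
            new TriplePattern("?subject", "rdf:type", "ucnames:Crop"));
        for (TriplePattern tp : reorder(bgp)) {
            System.out.println(tp);
        }
    }
}
```

In this example the rdf:type pattern moves to the front because it has one variable rather than two, so the more selective pattern narrows the candidate subjects before the rdf:value pattern is matched.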