Web Search Agents and Multi-agent Systems

2.2 The Semantic Web

2.2.6 Semantic Search

2.2.6.3 Web Search Agents and Multi-agent Systems

The use of RDF and OWL markup in Web pages makes more advanced searching of Web content possible through semantically enabled search engines. Several major companies, including Microsoft, have recently been investing in the development of a new breed of search engines called Web search agents. Web search agents do not operate like commercial search engines, which answer queries through database lookups against a pre-built knowledge base. Instead, Web search agents crawl the Web itself searching for RDF and OWL documents, while at the same time providing an interface to the user. They can be programmed to handle user queries, including determining and executing a query plan, and can be designed to initiate tasks in a middleware environment. These applications are typically developed in a Java programming environment because of Java’s powerful server-side programming capability, and because most middleware applications (see sub-section 2.2.7.2) can be readily interfaced with Java. Alesso et al. (2004d) contend that Microsoft’s MSNBot program⁴⁴, which performs agent/robot-like functions and searches the Web to build an index of HTML links and documents, may pose a serious threat to Google. Figure 15 shows the typical workflow of a Web search agent.

⁴⁴ http://search.msn.com/docs/siteowner.aspx
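As a rough illustration of how a crawling agent might recognise RDF and OWL documents on a page, the sketch below scans an HTML string for links that look semantic. It is not taken from any of the systems cited here: the class name `RdfLinkFinder`, the list of file extensions, the MIME type check and the `example.org` URLs are all assumptions made for the example, and a real agent would also need to fetch pages and respect crawl policies.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: these extensions and this MIME type are treated as signs of RDF/OWL content.
RDF_EXTENSIONS = (".rdf", ".owl")
RDF_MIME = "application/rdf+xml"


class RdfLinkFinder(HTMLParser):
    """Collects URLs on a page that appear to point at RDF or OWL documents."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.found = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        href = attr.get("href")
        if tag not in ("a", "link") or not href:
            return
        # A link qualifies if it declares the RDF MIME type or ends in a known extension.
        is_rdf_type = (attr.get("type") or "").lower() == RDF_MIME
        is_rdf_ext = href.lower().split("?")[0].endswith(RDF_EXTENSIONS)
        if is_rdf_type or is_rdf_ext:
            self.found.append(urljoin(self.base_url, href))


def find_rdf_links(html, base_url):
    """Return absolute URLs of candidate RDF/OWL documents referenced by the page."""
    finder = RdfLinkFinder(base_url)
    finder.feed(html)
    return finder.found


# Hypothetical page content used only to exercise the sketch.
page = """
<html><head>
  <link rel="alternate" type="application/rdf+xml" href="/meta/hotel">
</head><body>
  <a href="tours.owl">Tour ontology</a>
  <a href="about.html">About us</a>
</body></html>
"""
links = find_rdf_links(page, "http://example.org/travel/")
```

Here `links` holds the two candidate documents and skips the ordinary HTML page. A production agent would more likely confirm the content type from the HTTP response than rely on file extensions alone.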

A multi-agent system (MAS) is a loosely coupled network of software agents that interact to solve problems that are beyond the individual capacities or knowledge of each problem solver⁴⁵. Bloodsworth and Greenwood (2005) state that by placing Semantic Web technologies at the heart of a multi-agent system it is possible to create a system in which agent behaviour and internal representation are abstracted from the code into an ontology layer. Each agent in the system uses this layer, in addition to instances, to form a knowledge base defining its behaviour. The ontology layer is a mixture of domain-specific and generic ontologies, which structures the behaviour of the multi-agent system. Bloodsworth and Greenwood (op. cit.) believe that such a level of abstraction makes editing the behaviour of agents more convenient, requiring only changes to the domain-specific ontologies without any major changes to the system’s code. This ontology-centric approach encourages reuse, allowing the system to move from one problem domain to another by creating an ontology layer that defines the new environment and system behaviour. These features make the future possibilities of such methods exciting.

⁴⁵ http://www.cs.cmu.edu/~softagents/multi.html
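The idea of keeping agent behaviour outside the code can be sketched in a few lines. The example below is only an analogy, not Bloodsworth and Greenwood's design: plain dictionaries stand in for the generic and domain-specific ontologies, and every name (`GENERIC_LAYER`, `TRAVEL_LAYER`, `LIBRARY_LAYER`, `Agent`, the behaviour labels) is hypothetical.

```python
# Hypothetical stand-ins for an ontology layer: plain dictionaries instead of OWL.
GENERIC_LAYER = {
    "greet": "reply_hello",
}
TRAVEL_LAYER = {
    "find_hotel": "query_accommodation",
    "find_flight": "query_transport",
}
LIBRARY_LAYER = {
    "find_book": "query_catalogue",
}


class Agent:
    """An agent whose behaviour is looked up in data rather than hard-coded."""

    def __init__(self, generic_layer, domain_layer):
        # Domain-specific entries extend (and may override) the generic ones.
        self.knowledge = {**generic_layer, **domain_layer}

    def act(self, request):
        """Return the behaviour defined for a request, or a fallback label."""
        return self.knowledge.get(request, "no_behaviour_defined")


# The same Agent class serves two domains; only the data layer differs.
travel_agent = Agent(GENERIC_LAYER, TRAVEL_LAYER)
library_agent = Agent(GENERIC_LAYER, LIBRARY_LAYER)
```

Moving the system to a new problem domain here means supplying a new dictionary rather than editing the `Agent` class, which loosely mirrors the reuse argument made above.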

Comprehensive designs for a Semantic Web based multi-agent system were presented in Abrahams and Dai (2005a). In this environment individual agent behaviour is driven by intentions that are determined by problem-solving logic coded into the agent. The agents interact to perform tasks such as: 1) crawling the Internet at regular intervals to search for RDF marked-up documents consistent with the domain ontology; and 2) extracting RDF content and storing it in an RDF-enabled database, which forms part of a Jena-supported middleware environment maintained on a Web server. The GUI is accessed remotely by an end user searching for information in the same way as with a conventional search engine. User requests are passed to the Web agents which, in turn, formulate a query plan. Inference is performed on ontology schema information and instance data by the activation of a reasoner, which is a component of the middleware. SPARQL queries are formulated and processed by the agents in conjunction with Jena, and the results are displayed to the end user via the GUI. The multi-agent system is presently under development as part of the Phoenix⁴⁶ research program. The main theme of the PHOENIX project is applications integration through EAI (Enterprise Application Integration) processes and infrastructures to support real-time service-oriented enterprise tasks. The high-level architecture of the Phoenix multi-agent system is presented in Figure 16.

⁴⁶ http://www.staff.vu.edu.au/PHOENIX/phoenix/index1.htm
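The extraction and storage tasks described above can be illustrated with a deliberately small sketch. It uses only the Python standard library rather than Jena, handles just flat `rdf:Description` elements with literal values, and relies on an invented namespace (`http://example.org/travel#`) and invented sample data. The `match` function is a simple triple-pattern lookup standing in for a SPARQL query, and no reasoning is performed.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical domain ontology namespace, used only for this illustration.
TRAVEL_NS = "http://example.org/travel#"

# Invented RDF/XML markup such as a domain agent might find on a Web page.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:tr="http://example.org/travel#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/hotel/1">
    <tr:city>Melbourne</tr:city>
    <tr:rating>4</tr:rating>
    <dc:creator>Webmaster</dc:creator>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml, domain_ns):
    """Return (subject, predicate, literal) triples whose predicate is in domain_ns.

    Only flat rdf:Description elements with literal-valued properties are handled.
    """
    triples = []
    root = ET.fromstring(rdf_xml)
    for description in root.iter("{%s}Description" % RDF_NS):
        subject = description.get("{%s}about" % RDF_NS)
        for prop in description:
            # ElementTree writes qualified names as "{namespace}local".
            if prop.tag.startswith("{%s}" % domain_ns):
                predicate = prop.tag[1:].replace("}", "")
                triples.append((subject, predicate, (prop.text or "").strip()))
    return triples


def match(store, subject=None, predicate=None, obj=None):
    """Triple-pattern lookup; None acts as a wildcard, much like a SPARQL variable."""
    return [t for t in store
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)]


store = extract_triples(RDF_XML, TRAVEL_NS)
hotels_in_melbourne = match(store, predicate=TRAVEL_NS + "city", obj="Melbourne")
```

The Dublin Core property is dropped because its namespace does not match the domain ontology, which is the filtering behaviour the crawling task calls for. In the system described, persistent storage and inference would be handled by Jena and a reasoner rather than an in-memory list.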

The numbers shown in Figure 16 correspond to the following key processes, as described by Dai & Abrahams (2005):

1. Coordination agent instructs domain agents to crawl the Internet to update domain ontologies and search for RDF annotated Web sites.

2. Domain agents search for and download relevant domain ontologies from the Web.

3. The ontologies are sent by the domain agents to the Jena agent, which is responsible for interacting with the Jena middleware application.

4. Having established a connection to the Jena middleware, the Jena agent creates a Jena ontology model and saves the model using Jena’s persistent storage capability linked to a backend database.

5. Domain agents crawl the Internet searching for and downloading Web pages with RDF markup containing a matching namespace to their domain specific ontology.

6. Domain agents extract the RDF annotations from the Web pages and send them to the Jena agent.

7. The Jena agent, having maintained a connection to Jena middleware, writes the extracted RDF markup into the relevant ontology model contained in the persistent storage database.

8. End user issues requests for a travel service via the GUI.

9. GUI accepts the user request, converts the request to an XML form and sends it to the interface agent.

10. Interface agent receives the user request and transforms the task descriptions into technical specifications, which are then passed to the Coordination agent.

11. Coordination agent divides tasks into subtasks, formulates a plan and allocates subtasks to domain agents.

12. Domain agents formulate a number of possible solutions to their specific tasks and convert the solutions into query specifications. The query specifications are each given a ranking based on best match to user request. Specifications are then sent to Jena agent.

13. Jena agent converts the query specifications into SPARQL query language format using parameters and predefined query templates. Jena agent also invokes the Racer reasoner to classify the ontology models which now contain both schema and instance data for each domain. Jena agent then initiates SPARQL queries over the inferred ontology model.

15. Results are sent back to the domain agents.

16. Domain agents sort the query results into their ordered hierarchy and send them to coordination agent.

17. Coordination agent confirms that a solution has been found. It determines how results are to be displayed (order and number of hits etc.) and sends the requirements and results to the interface agent.

18. Interface agent converts the results to HTML, formulates a page layout and passes results to the GUI.

19. GUI displays the results to the end user.
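Steps 12 and 13 above, in which query specifications are ranked and then converted to SPARQL using parameters and predefined templates, might look roughly like the following. This is a sketch under stated assumptions and not the Phoenix implementation: the template text, the `tr:` prefix, the property names, the scoring rule (a count of matching request fields) and the sample data are all invented for illustration.

```python
from string import Template

# Hypothetical predefined query template; the prefix and property names are invented.
HOTEL_QUERY = Template("""PREFIX tr: <http://example.org/travel#>
SELECT ?hotel WHERE {
  ?hotel tr:city "$city" .
  ?hotel tr:rating ?rating .
  FILTER (?rating >= $min_rating)
}""")


def rank_specifications(specs, request):
    """Order candidate specifications by how many request fields they match."""
    def score(spec):
        return sum(1 for key, value in request.items() if spec.get(key) == value)
    # sorted() is stable, so equally scored candidates keep their original order.
    return sorted(specs, key=score, reverse=True)


def to_sparql(spec):
    """Fill the template with parameters from one query specification."""
    return HOTEL_QUERY.substitute(city=spec["city"], min_rating=spec["min_rating"])


# Invented user request and candidate solutions from a domain agent.
request = {"city": "Melbourne", "min_rating": 4}
candidates = [
    {"city": "Sydney", "min_rating": 4},
    {"city": "Melbourne", "min_rating": 4},
    {"city": "Melbourne", "min_rating": 3},
]
ranked = rank_specifications(candidates, request)
query = to_sparql(ranked[0])
```

The best-matching specification is converted first, and the resulting string could then be handed to a SPARQL engine. Plain string substitution is used here for brevity; a real system accepting free-form user input would need to escape or validate parameters before building the query.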

In document TOURISM INFORMATION SYSTEMS INTEGRATION AND UTILIZATION WITHIN THE SEMANTIC WEB (Page 61-65)