User Review Analyses using Semantic Web Ontology Tools


International Journal of Advanced Engineering Science and Technological Research (IJAESTR) ISSN: 2321-1202, www.aestjournal.org @2016 All rights reserved


Gaurav Jaglan, USIT, New Delhi, jaglan.gaurav@gmail.com

Aman Jolly, HMRITM, New Delhi, jollyaman.jolly663@gmail.com

Abstract: In today’s world, data analysis has become an integral part of development. It has applications in many fields, including business, strategy planning, the Semantic Web, robotics, and text-based data processing such as natural language processing and predictive modelling. This paper demonstrates a data analysis and interpretation system built in a Java environment using the Jena API. The aim is to automate data analysis and interpretation using a given ontology, with the intent of contributing to the Semantic Web.

Key words: Jena API, Data Analysis, Ontology, Semantic web, JAVA environment.

Introduction

The Semantic Web is an extension of the current Web (Web 2.0) in that it represents information more meaningfully for humans and computers alike. It enables contents and services to be described in machine-readable form, and enables the annotation, discovery, publication, advertisement and composition of services to be automated. It is built on ontologies, which are considered the backbone of the Semantic Web. In other words, the current Web is transformed from being machine-readable to being machine-understandable.

The Semantic Web [2], which is distributed and heterogeneous, has brought the evolution of the Web to a higher level. There are two visions of the future in the development of the Web, the first being to improve its usability as a medium for collaboration and the second to ensure that its contents can be understood by machines. Providing annotation data will facilitate this second aim.

Tim Berners-Lee, who invented the WWW and has worked on the Semantic Web, states that the latter “is not a separate Web but an extension of the current one, in which information is given a well-defined meaning, better enabling computers and people to work in cooperation.” [1]. Thus, the Semantic Web is distinguished by a more meaningful representation of information for humans and computers, providing a description of its contents and services in machine-readable form; moreover, it enables services to be automatically annotated, discovered, published, advertised and composed. It thereby facilitates interoperability and the sharing of knowledge over the Web. Its main goal is therefore to make information on the Web accessible and understandable by humans and computers.

In fact, both the Semantic Web and Web services are considered to be sets of resources identified by URIs. The difference between them is that Web services use HTTP to display the contents of a page, while the Semantic Web tries to create machine readability by semantically representing the data or information in resources. Numerous tools and applications of Semantic Web technologies have recently become available.

Figure 1: Semantic Web architecture [2]

The layers of architecture represented in Figure 1 are briefly described below:

• URI and Unicode: To identify and locate resources, or indeed anything on the Web, a uniform system of identifiers (URIs) is used. The URI, which is considered to be the foundation of the Web, is used to give a unique name to each resource. Unicode is the standard for computer character representation.

• Extensible Markup Language (XML) is a markup language, which means that it is machine-readable and has its own format. It is widely known in the WWW community because it has a flexible text format and was designed to describe data and to meet the challenges of large-scale e-business and electronic publishing; it plays an important role in exchanging different types of data on the Web. In fact, it is the basis of a rapidly growing number of software development activities. Each document starts with a namespace declaration using XML Namespace.

• The Resource Description Framework (RDF) is the first layer of the Semantic Web. RDF is a framework for using and representing metadata and describing the semantics of information about resources on the Web in a machine-accessible way. It uses URIs to identify Web resources and to describe the relations between these resources, using a graph model. RDF Schema, a simple modelling language, describes classes of resources and the properties between them, and also provides a simple reasoning framework for inferring the types of resources.

• Ontology Vocabulary: a language that provides a common vocabulary and grammar for published data, together with a semantic description of the data, so that ontologies are preserved and kept ready for inference. An ontology describes the semantics of the data, providing a uniform way for different parties to communicate and understand each other.

• Logic and Proof: In the Semantic Web, the building of systems follows a logic which considers the structure of the ontology. A reasoner can be used to check and resolve consistency problems and redundancy in the translation of concepts. A reasoning system is used to make new inferences.

• Trust is the final layer of the Semantic Web. This component concerns the trustworthiness of the information on the Web in order to provide an assurance of its quality.
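The graph model of the RDF layer can be illustrated without any library. The following is a minimal plain-Java sketch, not Jena code: the class TripleDemo, the method objectsOf, the property madeBy and the example.org namespace are hypothetical names chosen for illustration, while the rdf:type URI is the standard one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java illustration of the RDF graph model (hypothetical names, not the Jena API). */
final class TripleDemo {

    /** One RDF statement: subject --predicate--> object, each named by a URI. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Standard URI of rdf:type. */
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    /** Hypothetical namespace for the example resources. */
    static final String EX = "http://example.org/shop#";

    /** Returns the objects of every triple with the given subject and predicate. */
    static List<String> objectsOf(List<Triple> graph, String subject, String predicate) {
        List<String> result = new ArrayList<String>();
        for (Triple t : graph) {
            if (t.subject.equals(subject) && t.predicate.equals(predicate)) {
                result.add(t.object);
            }
        }
        return result;
    }

    /** A three-statement graph about two hypothetical products. */
    static List<Triple> sampleGraph() {
        return Arrays.asList(
            new Triple(EX + "PhoneA", RDF_TYPE, EX + "Product"),
            new Triple(EX + "PhoneA", EX + "madeBy", EX + "AcmeCorp"),
            new Triple(EX + "PhoneB", RDF_TYPE, EX + "Product"));
    }

    public static void main(String[] args) {
        // prints [http://example.org/shop#Product]
        System.out.println(objectsOf(sampleGraph(), EX + "PhoneA", RDF_TYPE));
    }
}
```

Each statement is an edge of the graph, and both the nodes and the edge labels are URIs, which is what lets independent data sources refer to the same resource.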

Ontology

Ontologies [3], which are used in order to support interoperability and common understanding between the different parties, are a key component in solving the problem of semantic heterogeneity, thus enabling semantic interoperability between different web applications and services.

Recently, ontologies have become a popular research topic in many communities, including knowledge engineering, electronic commerce, knowledge management and natural language processing. Ontologies provide a common understanding of a domain that can be communicated between people and between heterogeneous and widely distributed application systems. In fact, they have been developed in Artificial Intelligence (AI) research communities to facilitate knowledge sharing and reuse.

The goal of ontology is to achieve a common and shared knowledge that can be transmitted between people and between application systems. Thus, ontologies [4] play an important role in achieving interoperability across organizations and on the Semantic Web [2], because they aim to capture domain knowledge and their role is to create semantics explicitly in a generic way, providing the basis for agreement within a domain. Ontology is used to enable interoperation between Web applications from different areas or from different views on one area. For that reason, it is necessary to establish mappings among concepts of different ontologies to capture the semantic correspondence between them.

However, establishing such a correspondence is not an easy task.

Because there are many different definitions of ontology, it is very difficult to find a definition that researchers can agree upon. The present research first presents some of these definitions which have been given from different perspectives, and then explores in depth those aspects of these definitions which are related to the topic under investigation.

The primary use of the word “ontology” is in the discipline of philosophy, where it means “the study or theory of the explanation of being”; it thus defines an entity or being and its relationship with and activity in its environment. In other disciplines, such as software engineering and AI, it is defined as “a formal explicit specification of a shared conceptualization” [5].

Jena API for Ontology:

Because Jena is an RDF platform, we restrict ourselves to ontology formalisms built on top of RDF. Specifically, this means RDFS [8], the varieties of OWL [12] and DAML+OIL [32].

While OWL builds on top of the RDF specifications, it is possible to treat OWL as a separate language in its own right, and not something that is built on an RDF foundation; see for example the OWL API [5], which merely uses RDF as a serialisation syntax.

The RDF-centric view treats RDF triples as the core of the OWL formalism. While both views are valid, in Jena we take the RDF-centric view. As such, the ontology support within Jena addresses OWL Full features that are not present in OWL DL, e.g. the ability to use a single URI ref to denote a class, a property and a participant in some other ontological schema.

The Ontology layer defines the interface OntModel which extends the Model interface from the Model API.

Rather than having Java class names that are tightly bound to the language being processed (e.g. DAMLClass, DAMLObjectProperty, etc.), the ontology API is language neutral (thus the classes are OntClass and ObjectProperty).

The statements that the ontology Java objects see depend on both the asserted statements in the underlying RDF graph, and the statements that can be inferred by the reasoner being used (if any).

Figure 3. The statements seen by the OntModel

The asserted statements are held in the base graph. This presents the simple internal interface, Graph. The reasoner, or inference engine, can use the contents of the base graph and the semantic rules of the language, to show a more complete set of statements - i.e. including those that are entailed by the base assertions. This is also presented via the Graph interface, so the model works only with that interface. This allows us to build models with no reasoner, or with one of a variety of different reasoners, without changing the ontology model. It also means that the base graph can be an in-memory store, a database-backed persistent store, or some other storage structure altogether (e.g. an LDAP directory) again without affecting the ontology model.
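The layering described above can be sketched in plain Java. This is an illustration of the idea only and does not use Jena's own Graph or OntModel types: the nested types Graph, BaseGraph and InferredGraph, the helper isInstanceOf and the ex: names are all hypothetical, and the only rule applied is the RDFS subclass rule (an instance of a class is also an instance of its superclasses).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java sketch of asserted vs. inferred statements (not Jena's Graph API). */
final class GraphLayers {

    static final String TYPE = "rdf:type";
    static final String SUBCLASS_OF = "rdfs:subClassOf";

    /** The single interface the ontology layer works with; a statement is [subject, predicate, object]. */
    interface Graph {
        Set<List<String>> statements();
    }

    /** Base graph: holds only the asserted statements. */
    static final class BaseGraph implements Graph {
        private final Set<List<String>> asserted = new LinkedHashSet<List<String>>();

        void add(String s, String p, String o) {
            asserted.add(Arrays.asList(s, p, o));
        }

        public Set<List<String>> statements() {
            return new LinkedHashSet<List<String>>(asserted);
        }
    }

    /** Wraps any graph and adds the statements entailed by the subclass rule. */
    static final class InferredGraph implements Graph {
        private final Graph base;

        InferredGraph(Graph base) {
            this.base = base;
        }

        public Set<List<String>> statements() {
            Set<List<String>> all = new LinkedHashSet<List<String>>(base.statements());
            boolean changed = true;
            while (changed) { // repeat until no new statement is derived
                changed = false;
                List<List<String>> snapshot = new ArrayList<List<String>>(all);
                for (List<String> a : snapshot) {
                    if (!a.get(1).equals(TYPE)) continue;
                    for (List<String> b : snapshot) {
                        // (x type C) and (C subClassOf D) entail (x type D)
                        if (b.get(1).equals(SUBCLASS_OF) && b.get(0).equals(a.get(2))
                                && all.add(Arrays.asList(a.get(0), TYPE, b.get(2)))) {
                            changed = true;
                        }
                    }
                }
            }
            return all;
        }
    }

    /** Client code depends only on Graph, so it is unaffected by whether a reasoner is present. */
    static boolean isInstanceOf(Graph g, String individual, String cls) {
        return g.statements().contains(Arrays.asList(individual, TYPE, cls));
    }

    public static void main(String[] args) {
        BaseGraph base = new BaseGraph();
        base.add("ex:PhoneA", TYPE, "ex:Smartphone");
        base.add("ex:Smartphone", SUBCLASS_OF, "ex:Product");
        System.out.println(isInstanceOf(base, "ex:PhoneA", "ex:Product"));                    // prints false
        System.out.println(isInstanceOf(new InferredGraph(base), "ex:PhoneA", "ex:Product")); // prints true
    }
}
```

Because isInstanceOf sees only the Graph interface, the base graph could be replaced by a different storage structure, or the inferring wrapper by a different reasoner, without touching the client code.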

Implementation and Analysis Procedure

Let A and B be two considered Products. Then, for A:

LA is the consolidated number of likes for A

SA is the consolidated number of shares for A


For B:

LB is the consolidated number of likes for B

SB is the consolidated number of shares for B

Let L be the total number of likes awarded to Products

Let S be the total number of shares awarded to Products

SUCCESS_RATE: the analysed success rate of a specified Product among all the Products mentioned in the ontology library (the Individuals of the Shopping.owl file)

Like_Ratio = LA/L

Share_Ratio = SA/S

Case No.    Likes (L)    Shares (S)    Consideration

1.          LA<0         SA<0          NO

2.          LA<0         SA>0          NO

3.          LA>0         SA<0          YES

4.          LA>0         SA>0          YES

SUCCESS_RATE = Like_Ratio*20 + Share_Ratio*80

- The SUCCESS_RATE is given as a percentage.

- Shares extend the reach of a post; hence Share_Ratio is given a coefficient of 80, four times the coefficient of 20 given to Like_Ratio.
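As a worked example of the formula, the following is a minimal Java sketch under stated assumptions. The class and method names are hypothetical, the input figures are invented for illustration, and the table does not say what happens when LA is exactly 0, so treating that case as not considered is an assumption of this sketch.

```java
/** Sketch of the SUCCESS_RATE computation (hypothetical names, illustrative figures). */
final class SuccessRate {

    static final double LIKE_WEIGHT = 20.0;  // coefficient of Like_Ratio
    static final double SHARE_WEIGHT = 80.0; // coefficient of Share_Ratio, four times the like weight

    /** Consideration rule from the table: a Product is considered only when LA > 0. */
    static boolean isConsidered(long la) {
        return la > 0; // LA == 0 is not covered by the table; excluded here by assumption
    }

    /** SUCCESS_RATE = (LA/L)*20 + (SA/S)*80, expressed as a percentage. */
    static double successRate(long la, long l, long sa, long s) {
        if (l <= 0 || s <= 0) {
            throw new IllegalArgumentException("totals L and S must be positive");
        }
        double likeRatio = (double) la / l;
        double shareRatio = (double) sa / s;
        return likeRatio * LIKE_WEIGHT + shareRatio * SHARE_WEIGHT;
    }

    public static void main(String[] args) {
        // Product A: LA = 50 of L = 200 likes, SA = 30 of S = 60 shares
        System.out.println(successRate(50, 200, 30, 60));  // prints 45.0
        // Product B: LB = 150 of L = 200 likes, SB = 30 of S = 60 shares
        System.out.println(successRate(150, 200, 30, 60)); // prints 55.0
    }
}
```

In this example the two rates add up to 100, because the weights 20 and 80 sum to 100 and A and B together account for all likes and shares.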

EXISTING SYSTEM

Presently, data analysis based on user opinion is performed manually by reading and interpreting reviews, and searching for them on the internet is a tedious task. Most reviews are on social media rather than on e-commerce sites. Keeping track of user reviews, and analysing and interpreting them manually, requires a lot of time and manpower.

PROPOSED SYSTEM

We have developed a system in a Java environment that analyses extracted data using the ontology tools of the Jena API. These tools support decision making because they provide classes, i.e. an exhaustive and usually hierarchical organisation of the knowledge domain, which are used to judge whether the review of an entity is positive or negative. After judging the reviews, the system computes the success rate of the product and shows its consolidated likes and shares. The likes and shares of positive reviews increase the success rate of the entity, whereas the likes and shares of negative reviews decrease it.
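The consolidation step can be sketched as follows. This is a minimal illustration, not the authors' code: the names Consolidator, Post and consolidate are hypothetical, and counting the likes and shares of a negative review with a negative sign is an assumption, suggested by the LA<0 and SA<0 cases in the consideration table.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of consolidating likes and shares per product (hypothetical names). */
final class Consolidator {

    /** One reviewed post: the product it names, the judged polarity, and its counters. */
    static final class Post {
        final String product;
        final boolean positive;
        final long likes;
        final long shares;

        Post(String product, boolean positive, long likes, long shares) {
            this.product = product;
            this.positive = positive;
            this.likes = likes;
            this.shares = shares;
        }
    }

    /** Returns product -> {consolidated likes, consolidated shares}. */
    static Map<String, long[]> consolidate(List<Post> posts) {
        Map<String, long[]> totals = new LinkedHashMap<String, long[]>();
        for (Post p : posts) {
            long[] t = totals.get(p.product);
            if (t == null) {
                t = new long[2];
                totals.put(p.product, t);
            }
            // Assumption: positive reviews add their counters, negative reviews subtract them.
            int sign = p.positive ? 1 : -1;
            t[0] += sign * p.likes;
            t[1] += sign * p.shares;
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Post> posts = Arrays.asList(
            new Post("A", true, 10, 4),
            new Post("A", false, 3, 1),
            new Post("B", false, 5, 2));
        for (Map.Entry<String, long[]> e : consolidate(posts).entrySet()) {
            System.out.println(e.getKey() + " " + Arrays.toString(e.getValue()));
        }
        // prints:
        // A [7, 3]
        // B [-5, -2]
    }
}
```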

FUTURE ENHANCEMENTS

The future plan for this research is to extend it to cover many more topics and to create online tutorials with the help of faculty.

We are also planning to incorporate social media, including social networks and blogs that discuss or participate in Product and Service reviews, prior to the data extraction. This will provide general

The project will have prominent features including:

• Social media Data Extraction.

• Enhancing and updating the Ontology by fetching more Product titles.

• Enhancing the Reviews experience by Synaptic analysis libraries.

In the presented system, a post title cannot contain multiple product names simultaneously:

When Product A and Product B are both included in the same post title, only one of them is considered and the other is ignored. Hence the consolidated value of likes and shares for the other product is distorted or not counted at all. We are therefore planning to build a system capable of handling multiple product names in a single post.
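One possible starting point for this enhancement is sketched below. It is not part of the presented system: the class TitleMatcher, the method productsIn and the product names are hypothetical, and simple case-insensitive substring matching is an assumption that would wrongly match a product name contained inside a longer word.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Sketch of finding every known product named in a post title (hypothetical names). */
final class TitleMatcher {

    /** Returns all known products whose name occurs in the title, in list order. */
    static List<String> productsIn(String title, List<String> knownProducts) {
        String haystack = title.toLowerCase(Locale.ROOT);
        List<String> found = new ArrayList<String>();
        for (String product : knownProducts) {
            // Assumption: case-insensitive substring match; no tokenisation or word boundaries.
            if (haystack.contains(product.toLowerCase(Locale.ROOT))) {
                found.add(product);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> known = Arrays.asList("Phone A", "Tablet B", "Laptop C");
        // prints [Phone A, Tablet B]
        System.out.println(productsIn("phone a and Tablet B are both good", known));
    }
}
```

The likes and shares of such a post could then be credited to every returned product instead of only one.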

Also, a post about a product cannot be neutral. A post title always carries the name of the product and its review. The review is either positive or negative, and the Success Rate is calculated on that basis. We are planning to handle neutral posts in the future as well.

Conclusion

In this paper, we have presented the implementation of a robust and efficient data analysis and interpretation system in a Java environment using the Jena API. We have demonstrated the general steps that could be used for its implementation. We have provided a program that can analyse the Success Rate of the Products and Services available in the ontology. Finally, we hope that this will go a long way in popularizing our work by providing an easy and safe way to analyse Products and Services based on reviews, giving an overview of the favourability of a product in the consumer market.

Snapshots

1. Showing the console for printed values of Individuals


2. Consolidated likes and shares value for products in product list.


3. Success rate for products from the data file (Data.txt)

4. Data.txt file showing extracted data values of posts

Line 1: Total number of posts

Line 2: Post_title, representing the name of the Product and its review (positive or negative)

Line 3: Number of likes

Line 4: Number of shares

Line 6 and onwards: Post_title, likes and shares of the following posts.
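A reader for this layout could look like the following minimal sketch. The names DataFileParser, Post and parse are hypothetical, and the sample titles are invented. Since the description jumps from Line 4 to Line 6, the sketch assumes that blank lines may separate posts and simply skips them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Sketch of a parser for the Data.txt layout described above (hypothetical names). */
final class DataFileParser {

    /** One post record: its title and its like and share counters. */
    static final class Post {
        final String title;
        final long likes;
        final long shares;

        Post(String title, long likes, long shares) {
            this.title = title;
            this.likes = likes;
            this.shares = shares;
        }
    }

    /** Parses the lines of the file: a post count, then title, likes and shares per post. */
    static List<Post> parse(List<String> lines) {
        Iterator<String> it = lines.iterator();
        int count = Integer.parseInt(nextNonBlank(it));
        List<Post> posts = new ArrayList<Post>();
        for (int i = 0; i < count; i++) {
            String title = nextNonBlank(it);
            long likes = Long.parseLong(nextNonBlank(it));
            long shares = Long.parseLong(nextNonBlank(it));
            posts.add(new Post(title, likes, shares));
        }
        return posts;
    }

    /** Returns the next line that is not blank (assumption: blank lines are only separators). */
    private static String nextNonBlank(Iterator<String> it) {
        while (it.hasNext()) {
            String line = it.next().trim();
            if (!line.isEmpty()) {
                return line;
            }
        }
        throw new IllegalArgumentException("unexpected end of data");
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "2", "PhoneA good", "10", "4", "", "PhoneB bad", "3", "1");
        for (Post p : parse(lines)) {
            System.out.println(p.title + " likes=" + p.likes + " shares=" + p.shares);
        }
    }
}
```

For a real file, the lines could be obtained with java.nio.file.Files.readAllLines(Paths.get("Data.txt")) and passed to parse unchanged.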


REFERENCES

[1] T. Berners-Lee, J. Hendler, and O. Lassila, "The Semantic Web", Scientific American, May 2001, pp. 34–36.

[2] D. Tidwell, "Web Services-The Web’s Next Revolution", IBM Web Service Tutorial, 29 Nov. 2000, http://www-106.ibm.com/developerworks/edu/ws-dwwsbasics-i.html.

[3] H. Mihoubi, A. Simonet, and M. Simonet, "An Ontology Driven Approach to Ontology Translation", in Proceedings of DEXA, 2000, pp. 573–582.

[4] D. Fensel, "Ontologies: Silver Bullet for Knowledge Management and Electronic Commerce", Springer, 2001.

[5] T. R. Gruber, "A translation approach to portable ontology specifications", 1993, pp. 199–220.

[6] Data electronically available on www.w3c.org on January 24, 2016.