Available online at www.ijiere.com
International Journal of Innovative and Emerging
Research in Engineering
e-ISSN: 2394 – 3343 p-ISSN: 2394 - 5494
Survey on Recommendation System Using
Semantic Web Mining
Nidhi Madia a, Amit Thakkar b, Kamlesh Makvana c
a Charotar University of Science and Technology, Changa, India, [email protected]
b Asso. Prof., Charotar University of Science and Technology, Changa, India, [email protected]
c Assi. Prof., Charotar University of Science and Technology, Changa, India, [email protected]
ABSTRACT:
Recommender systems have become essential in web applications that offer many services and automatically suggest some of them according to the user’s interests. Collaborative filtering is the best-known approach to building a recommender system. Its major issue is the cold start problem, that is, how to make recommendations to a new user. The performance of such a system depends on a large amount of data. Internet use is increasing rapidly, and people turn to the internet to obtain and share knowledge through many online tools. Today’s web poses many problems related to the size of the data and to the handling of so much unstructured data. One solution is to combine the semantic web and web mining, that is, “semantic web mining”. In this paper we review various approaches to solving the cold start problem in recommender systems for e-commerce applications.
Keywords: Recommender system, ontology, collaborative filtering, cold-start.
I. INTRODUCTION
With the rapid development of the World Wide Web, recommender systems have become an important part of online sites. People can now obtain and share knowledge easily through a variety of online publishing tools, such as weblogs and online forums. The web has thus become a valuable and abundant information source that has a significant effect on users’ lifestyles, especially their purchasing behavior [3]. The rapid growth in the number of web users has given rise to e-business applications. The origin of recommenders can be traced back to approximation theory, cognitive science, information retrieval and management science [4]. A recommender system offers many benefits, such as cross-selling, personalization, keeping customers informed and customer retention. Websites that use recommenders include Amazon, MovieLens, eBay, CDNow and MovieFinder. In the collaborative filtering approach, the system recommends new items to the user by analyzing items purchased by similar users (as on Amazon.com) [4]. In all cases, the main challenge in building an efficient recommender is processing a large amount of data. “Semantic web mining” is one solution for processing this much data.
Collaborative filtering faces several challenges, such as the cold start problem, the sparsity problem and the over-specialization problem.
The major issue is cold start: the recommender cannot draw inferences for users or items about which it does not have sufficient information [4], so it is unclear what to recommend to a new user.
The rest of the paper is organized as follows. Section 2 discusses semantic web mining in depth, together with the semantic web and web mining. Section 3 describes recommender systems and the collaborative filtering technique along with its limitations. Section 4 discusses related work. Section 5 concludes the paper.
II. SEMANTIC WEB MINING
A. Semantic Web
Volume 2, Issue 2, 2015
Figure 1: Layers of the semantic web [1]
As Figure 1 shows, at the bottom is XML, which is suitable for sending documents across the web. XML allows writing structured web documents with a user-defined vocabulary. RDF is a basic data model, similar to the entity-relationship model, for writing simple statements about Web objects (resources). The RDF data model does not rely on XML, but RDF has an XML-based syntax. RDF Schema provides modeling primitives for organizing Web objects into hierarchies. Its key primitives are classes and properties, subclass and sub-property relationships, and domain and range restrictions. It can be viewed as a primitive language for writing ontologies. The Proof layer involves the actual deductive process as well as the representation of proofs in Web languages (from lower levels) and proof validation. The Logic layer is used to enhance the ontology language and to allow the writing of application-specific declarative knowledge. Finally, the Trust layer will emerge through the use of digital signatures and other kinds of knowledge, based on recommendations by trusted clients [21].
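To make the RDF and RDF Schema layers more concrete, the following minimal Python sketch stores statements as (subject, predicate, object) triples and follows subclass links to infer class membership. The resource names (ex:Movie42, ex:SciFiMovie and so on) and the helper functions are hypothetical and chosen only for illustration; a real application would normally rely on an RDF library rather than plain tuples.

```python
# Statements as (subject, predicate, object) triples; all names are illustrative.
TRIPLES = {
    ("ex:SciFiMovie", "rdfs:subClassOf", "ex:Movie"),
    ("ex:Movie", "rdfs:subClassOf", "ex:Product"),
    ("ex:Movie42", "rdf:type", "ex:SciFiMovie"),
    ("ex:Movie42", "ex:hasTitle", "Some title"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls through rdfs:subClassOf links."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def types_of(resource, triples):
    """Return the asserted types of a resource plus the inferred superclasses."""
    direct = {o for s, p, o in triples if s == resource and p == "rdf:type"}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred


print(sorted(types_of("ex:Movie42", TRIPLES)))
```

Although only the type ex:SciFiMovie is asserted for ex:Movie42, the subclass hierarchy lets the sketch conclude that it is also an ex:Movie and an ex:Product, which is the kind of hierarchy-based reasoning RDF Schema is meant to support.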
B. Web Mining
Web mining is “the application of data mining techniques to the content, structure and usage of Web resources”. It is thus “the nontrivial process of identifying valid, previously unknown, and potentially useful patterns” in the huge amount of Web data; the patterns describe the data in concise form and in manageable orders of magnitude. Web mining is an invaluable help in the transformation from human-understandable content to machine-understandable semantics.
Figure 2: categories of web mining
a) Web Content Mining - Web content mining is the process of extracting information from the contents of web documents, which may consist of text, multimedia, or structured records such as lists and tables.
b) Web Structure Mining - Web structure mining is the process of discovering structure information from the web. It can be further divided into two kinds based on the kind of structure information used, i.e. hyperlinks and document structure.
c) Web Usage Mining - Web usage mining analyses users’ clicks recorded by the web server: how users interact with the web site, the web pages visited, the order of the visits, and their timestamps and durations.
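As a rough illustration of web usage mining, the sketch below reconstructs the order of page visits per user from click records and estimates how long each page was viewed. The record layout (user id, timestamp in seconds, page) and the sample data are assumptions made for this example; real server logs need parsing and session splitting first.

```python
from collections import defaultdict

# Hypothetical click records: (user id, timestamp in seconds, page).
CLICKS = [
    ("u1", 0, "/home"),
    ("u2", 5, "/home"),
    ("u1", 30, "/books"),
    ("u1", 90, "/books/42"),
    ("u2", 65, "/music"),
]


def visit_paths(clicks):
    """Group clicks by user and order them by timestamp."""
    paths = defaultdict(list)
    for user, ts, page in sorted(clicks, key=lambda c: (c[0], c[1])):
        paths[user].append((ts, page))
    return dict(paths)


def page_durations(path):
    """Estimate the time spent on each page as the gap to the next click.

    The last page of a path has no following click, so it is omitted.
    """
    return [(page, next_ts - ts) for (ts, page), (next_ts, _) in zip(path, path[1:])]


paths = visit_paths(CLICKS)
for user, path in paths.items():
    print(user, [page for _, page in path], page_durations(path))
```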
C. Semantic Web Mining
The increasing size of databases and the need to find exact information direct attention towards semantic web mining. Semantic Web Mining aims at combining the two areas Semantic Web and Web Mining.
III. RECOMMENDER SYSTEM
With the rapid development of the World Wide Web, people can now obtain and share knowledge easily through many different online tools, such as online forums and websites. Recommender systems have become extremely common in recent years and are used in a variety of applications, covering music, news, books, research articles, search queries, social tags, and many other products. The web has thus become a valuable and abundant information source that has a significant effect on users’ lifestyles, especially their purchasing behaviour [27]. Many Internet users depend on information retrieved from the web to make their purchasing decisions. Even with the support of search engines, the number of retrieved documents is sometimes too large for users to obtain the desired information [25].
Collaborative Filtering Technique
The most mature and most widely used technique for recommender systems (RS) is collaborative filtering (CF). It relies only on opinions explicitly expressed by users about items. The system recommends to the target customer products (or people) that have been evaluated by other people whose interests are similar to those of the target user. Collaborative filtering explores techniques for matching people with similar interests and making recommendations on this basis.
Typically, the workflow of a collaborative filtering system is:
a) A user expresses his or her preferences by rating items of the system. These ratings can be viewed as an approximate representation of the user's interest in the corresponding domain.
b) The system matches this user’s ratings with other users’ and finds the people with most “similar” tastes.
c) Using these similar users, the system recommends items that they have rated highly but that this user has not yet rated.
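The three steps of this workflow can be sketched in a few lines of Python. This is a minimal user-based example under stated assumptions: the ratings are invented, cosine similarity over co-rated items is only one of several possible similarity measures, and the parameters k and min_rating are hypothetical.

```python
from math import sqrt

# Step (a): hypothetical ratings on a 1-5 scale, user -> {item: rating}.
RATINGS = {
    "alice": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 5, "B": 5, "D": 4},
    "carol": {"A": 1, "C": 5, "E": 4},
}


def cosine_similarity(r1, r2):
    """Cosine similarity computed over the items both users have rated."""
    common = set(r1) & set(r2)
    if not common:
        return 0.0
    dot = sum(r1[i] * r2[i] for i in common)
    norm1 = sqrt(sum(r1[i] ** 2 for i in common))
    norm2 = sqrt(sum(r2[i] ** 2 for i in common))
    return dot / (norm1 * norm2)


def recommend(target, ratings, k=1, min_rating=4):
    """Steps (b) and (c): find the k most similar users, then suggest items
    they rated at least min_rating that the target has not rated yet."""
    others = [u for u in ratings if u != target]
    neighbours = sorted(
        others,
        key=lambda u: cosine_similarity(ratings[target], ratings[u]),
        reverse=True,
    )[:k]
    suggestions = set()
    for u in neighbours:
        for item, rating in ratings[u].items():
            if rating >= min_rating and item not in ratings[target]:
                suggestions.add(item)
    return sorted(suggestions)


print(recommend("alice", RATINGS))
```

Note that a user with an empty rating profile has similarity 0.0 to everyone in this sketch, so the neighbour ranking carries no information for such a user.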
Limitations
The collaborative filtering technique has the following challenges:
a) Cold start problem: the cold start problem arises when a new user or item has just entered the system. There are three kinds of cold start problem: the new user problem, the new item problem and the new system problem. It is very difficult to provide recommendations to a new user because very little information about the user is available.
b) Sparsity problem: sparsity has a great influence on the quality of recommendation. Most users do not rate most of the items, so the available ratings are sparse; this is the main reason for data sparsity.
c) Scalability: scalability is the ability of a system to handle an increasing amount of information well. With the vast growth of information on the internet, recommender systems hold very large amounts of data, and handling it under growing demand is a great challenge.
d) Over-specialization: users are restricted to recommendations that resemble the items already defined in their profiles. This prevents users from discovering new items and other options.
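The sparsity problem in item (b) can be quantified as the share of empty cells in the user-item rating matrix. The sketch below computes it for a small invented data set; the function name and the data are assumptions for illustration only.

```python
# Hypothetical ratings: user -> {item: rating}; the catalogue has 5 items.
RATINGS = {
    "alice": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 5, "B": 5, "D": 4},
    "carol": {"A": 1, "C": 5, "E": 4},
}


def sparsity(ratings, n_items):
    """Fraction of the user-item matrix that has no rating."""
    n_users = len(ratings)
    observed = sum(len(items) for items in ratings.values())
    return 1.0 - observed / (n_users * n_items)


print(round(sparsity(RATINGS, n_items=5), 2))
```

Here 9 of the 15 possible ratings are present, so the sparsity is 0.4; in real systems with large catalogues the value is typically far closer to 1.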
IV. RELATED WORK
In 2014, Blerina Lika, Kostas Kolomvatsos and Stathes Hadjiefthymiades in Greece proposed a technique that uses demographic data to solve the new user cold start problem in recommender systems. They developed a three-phase technique to provide predictions for new users. Their mechanism takes demographic data and, based on similarity techniques, finds the user’s ‘neighbors’ [27].
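The idea of finding neighbours from demographic data can be sketched as follows. This is not the authors' three-phase method: the attributes, the simple matching similarity and the weighted-mean prediction are all assumptions chosen to keep the example short.

```python
# Hypothetical demographic profiles and ratings; not data from the cited paper.
PROFILES = {
    "u1": {"age_group": "18-25", "gender": "F", "occupation": "student"},
    "u2": {"age_group": "18-25", "gender": "M", "occupation": "student"},
    "u3": {"age_group": "46-60", "gender": "M", "occupation": "engineer"},
}
RATINGS = {
    "u1": {"item1": 5, "item2": 3},
    "u2": {"item1": 4},
    "u3": {"item1": 1, "item2": 5},
}


def demographic_similarity(p, q):
    """Share of demographic attributes on which two profiles agree."""
    keys = p.keys() & q.keys()
    if not keys:
        return 0.0
    return sum(p[k] == q[k] for k in keys) / len(keys)


def predict(new_profile, item, profiles, ratings, k=2):
    """Similarity-weighted mean rating of the k nearest demographic neighbours."""
    scored = [
        (demographic_similarity(new_profile, profiles[u]), ratings[u][item])
        for u in profiles
        if item in ratings.get(u, {})
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = [(s, r) for s, r in scored[:k] if s > 0]
    if not top:
        return None  # no neighbour with any demographic overlap
    return sum(s * r for s, r in top) / sum(s for s, _ in top)


new_user = {"age_group": "18-25", "gender": "F", "occupation": "student"}
print(predict(new_user, "item1", PROFILES, RATINGS))
```

Because the prediction needs only registration-style attributes, it can be produced before the new user has rated anything.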
Federica Cena, Silvia Likavec and Francesco Osborne describe an approach to the vertical propagation of user interest in an ontology model. They take an ontology as input and employ an ontology-based model in which user interests are stored. Each node of the ontology has an interest value derived from the user’s feedback. Propagated interest values are calculated from nearby nodes: the algorithm traverses the ontology graph vertically and increments the interest of the nodes it visits. Finally, these interest values are used to make recommendations to the new user [10].
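A simplified sketch of vertical interest propagation is given below. It only propagates upwards to ancestors, and the ontology, the decay factor of 0.5 and the function name are assumptions for illustration; the formula used in the cited work is not reproduced here.

```python
# Hypothetical ontology given as child -> parent links.
PARENT = {
    "thriller": "fiction",
    "romance": "fiction",
    "fiction": "books",
    "biography": "books",
}


def propagate(interest, node, feedback, parent=PARENT, decay=0.5):
    """Add feedback to node and a decayed share to each of its ancestors."""
    amount = feedback
    current = node
    while current is not None and amount > 0:
        interest[current] = interest.get(current, 0.0) + amount
        current = parent.get(current)  # move one level up; None at the root
        amount *= decay
    return interest


interest = {}
propagate(interest, "thriller", 1.0)
print(interest)
```

After a single piece of feedback on "thriller", the ancestors "fiction" and "books" also carry some interest, so related concepts can be suggested even though the user never interacted with them directly.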
Chien Chin Chen, Yu-Hao Wan, Meng-Chieh Chung and Yu-Chun Sun proposed a recommendation method for cold start new users that integrates trust and distrust networks with the model-based approach in two stages: the model construction stage and the recommendation stage. First, they use the set of experienced users (i.e., non-cold-start users) to construct a user model. They then partition the experienced users into different clusters. The authors use the PageRank algorithm to identify trustworthy clusters and use an implicit list of experts to make recommendations to the new user [3].
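To show how PageRank can rank nodes of a trust network, the following sketch runs a plain power iteration on a small invented graph. The cluster names, the edges and the damping factor of 0.85 are assumptions; how the cited work builds its trust and distrust networks is not modelled here.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank; graph maps each node to the nodes it trusts."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A node that trusts nobody spreads its rank evenly.
                for t in nodes:
                    new_rank[t] += damping * rank[node] / n
        rank = new_rank
    return rank


# Hypothetical trust links between user clusters.
TRUST = {"c1": ["c2"], "c2": ["c1"], "c3": ["c2"]}
ranks = pagerank(TRUST)
print(max(ranks, key=ranks.get))
```

In this toy graph, cluster c2 receives trust from both other clusters and therefore obtains the highest score.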
Damien Poirier, Françoise Fessant and Isabelle Tellier proposed a method to reduce the cold start problem using semantic web mining. Their work concerns the analysis of textual data, and the task consists in obtaining a user-item-rating matrix. After the data have been acquired, NLP processing is applied for the classification task. Once a large number of user-item-review triplets have been stored, opinion classification is applied to infer a rating for each review. Once the user-item-rating matrix has been built, collaborative filtering can be applied and recommendations can be made [13].
[...] in the movie recommendation domain. Ontology modeling provides a good approach for feature extraction. Neural networks can help perceive a user’s preference and predict a movie rating from the user’s perspective. Using this as a base, the cold start problem can be addressed [12].
Omar NOUALI and Amokrane BELLOUI proposed a new filtering system that consists of a collaborative module, a semantic module and a prediction module. They explored the possibility of using a distance graph in which the nodes represent the users or the resources and the edges represent the distance (with several parameters: collaborative similarity, semantic similarity, number of common evaluations, social distance, etc.) between two users or two resources [15].
Javad Basiri, Azadeh Shakery, Behzad Moshiri and Morteza Zi Hayat proposed a hybrid approach to improve the prediction accuracy of existing recommender systems in the cold-start (new user) condition. In the proposed approach, CF (collaborative filtering), CBF (content-based filtering), a demographic-based method, a fusion of CF and demographic, and a fusion of CF and CBF classifiers are used as the input strategies. For each instance, the results of these learned classifiers are fused by the optimistic exponential class of the OWA (ordered weighted averaging) technique [9].
Hridya Sobhanam and A.K. Mariappan proposed a method that uses association rule mining and clustering to solve the cold start problem. They use the MovieLens dataset from GroupLens and combine two existing approaches in a sequential manner. First, they apply the association rule technique to expand the user profile. They then apply k-means clustering to the matrix of users’ ratings; both the objects and the clusters are represented using fuzzy set theory [4].
Honey Jindal and Sandeep Kumar Singh developed a hybrid recommendation system that uses collaborative and content-based filtering to solve the cold start problem. The system works in two phases, an offline phase and an online phase, using a movie rating dataset. In the offline phase, a rating matrix is constructed from the given dataset and clusters of similar users are created. In the online phase, the user’s demographic information is extracted from the registration details. The matching cluster is found through the rules, and the users in that cluster are identified. The movies rated by the users in that cluster are then identified. Finally, the ratings of those movies are averaged to give a list of recommended movies [20].
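The fusion step can be illustrated with a small ordered weighted averaging (OWA) sketch. It assumes the commonly cited form of the optimistic exponential weights, w1 = alpha, wi = alpha(1 - alpha)^(i-1) for the middle positions and wn = (1 - alpha)^(n-1); the value of alpha and the classifier scores are invented, and the exact configuration used in the cited work is not known from this text.

```python
def optimistic_weights(n, alpha):
    """Optimistic exponential OWA weights (assumed form, see the lead-in).

    The first n - 1 weights are alpha * (1 - alpha)**i for i = 0..n-2 and the
    last weight is (1 - alpha)**(n - 1), so that the weights sum to one.
    """
    if n == 1:
        return [1.0]
    weights = [alpha * (1 - alpha) ** i for i in range(n - 1)]
    weights.append((1 - alpha) ** (n - 1))
    return weights


def owa(values, weights):
    """Ordered weighted average: weights apply to values sorted in descending order."""
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical scores produced by three classifiers for one instance.
scores = [0.2, 0.9, 0.6]
weights = optimistic_weights(len(scores), alpha=0.5)
print(weights, owa(scores, weights))
```

Because the largest weight is attached to the highest score, this class of weights leans towards the most favourable classifier output, which is why it is called optimistic.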
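The online phase described above can be sketched as follows, assuming that the offline phase has already produced the clusters and the rating matrix. The cluster names, the assignment rule and the data are hypothetical and serve only to show the flow from registration details to a ranked list.

```python
from collections import defaultdict

# Hypothetical offline output: user clusters and the rating matrix.
CLUSTERS = {"young_students": ["u1", "u2"], "senior_engineers": ["u3"]}
RATINGS = {
    "u1": {"Movie A": 5, "Movie B": 2},
    "u2": {"Movie A": 4, "Movie C": 5},
    "u3": {"Movie B": 5},
}


def assign_cluster(profile):
    """Hypothetical rule mapping registration details to a cluster."""
    if profile["age"] < 30 and profile["occupation"] == "student":
        return "young_students"
    return "senior_engineers"


def recommend_for_new_user(profile, top_n=2):
    """Average the ratings given by the matched cluster and rank the movies."""
    members = CLUSTERS[assign_cluster(profile)]
    collected = defaultdict(list)
    for user in members:
        for movie, rating in RATINGS[user].items():
            collected[movie].append(rating)
    averages = {m: sum(r) / len(r) for m, r in collected.items()}
    # Highest average first; ties are broken alphabetically.
    ranked = sorted(averages, key=lambda m: (-averages[m], m))
    return ranked[:top_n]


print(recommend_for_new_user({"age": 22, "occupation": "student"}))
```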
V. CONCLUSION
This paper summarizes current techniques for dealing with the cold start problem in recommendation systems and how semantic knowledge can be used to address it. First, it introduces the semantic web and recommender systems and their importance in today’s world and in real applications. Then it summarizes recommendation techniques such as collaborative filtering, content-based filtering, and demographic, knowledge-based and hybrid approaches, along with the limitations of these techniques. Finally, it surveys papers that have already tried to solve this problem.
VI. REFERENCES
[1] Mohammad-Hossein Nadimi-Shahraki and Mozhde Bahadorpour, “Cold-start Problem in Collaborative Recommender Systems: Efficient Methods Based on Ask-to-rate Technique”, Journal of Computing and Information Technology-CTT 22, Faculty of Computer Engineering, Najafabad branch, Islamic Azad University, Najafabad, Iran, 2014.
[2] Chien Chin Chen, Yu-Hao Wan, Meng-Chieh Chung, Yu-Chun Sun, “An effective recommendation method for cold start new users using trust and distrust networks”, ELSEVIER, Department of Information Management, National Taiwan University, Taiwan, 2013.
[3] Hridya Sobhanam, A.K.Mariappan, “Addressing cold start problem in recommender systems using association rules and clustering technique”, International Conference on Computer Communication and Informatics, Chennai, India, Jan, Coimbatore, INDIA, 2013.
[4] Lalita Sharma, Anju Gera, “A Survey of Recommendation System: Research Challenges”, International Journal of Engineering Trends and Technology (IJETT), Volume 4, Issue 5, M-Tech Scholar & Department of Computer Engineering, M.D. University, BSAITM, Faridabad, India, May 2013.
[5] Meng Chen, Cheng Yang, Jiechao Chen, Peng Yi, “A Method to Solve Cold-Start Problem in Recommendation System based on Social Network Sub-community and Ontology Decision Model”, 3rd International Conference on Multimedia Technology ICMT, 2013.
[6] Hui Li, Xinyue Liu, “A Personalized Recommendation System Combining User Clustering and Association Rules with Multiple Minimum Supports”, 2nd International Conference on Future Computers in Education, Lecture Notes in Information Technology, Vols. 23-24, 2012.
[7] C.S. Bhatia, Dr. Suresh Jain, “Semantic Web Mining: Using Ontology Learning and Grammatical Rule Inference Technique”, IEEE, Department of computer engineering, Mewar University, Chittorgarh, 2011.
[8] Javad Basiri, Azadeh Shakery, Behzad Moshiri, Morteza Zi Hayat, “Alleviating the Cold-Start Problem of Recommender Systems Using a New Hybrid Approach”, 5th International Symposium on Telecommunications, IEEE, 2010.
[9] Federica Cena, Silvia Likavec, Francesco Osborne, “Propagating User Interests in Ontology-Based User Model”, Springer-Verlag Berlin Heidelberg, Department of Informatica, University di Torino, Italy, 2011.
[11] Yong Deng, Zhonghai Wu, Cong Tang, Huayou Si, Hu Xiong, Zhong Chen, “A Hybrid Movie Recommender Based on Ontology and Neural Networks”, IEEE/ACM International Conference on Green Computing and Communications & IEEE/ACM International Conference on Cyber, Physical and Social Computing, 2010.
[12] Damien Poirier, Françoise Fessant, Isabelle Tellier, “Reducing the Cold-Start Problem in Content Recommendation Through Opinion Classification”, IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, 2010.
[13] Hakan Yilmaz, “Using Ontology Based Web Usage Mining And Object Clustering For Recommendation”, The Graduate School Of Natural And Applied Sciences Of Middle East Technical University, May-2010.
[14] Omar NOUALI, Amokrane BELLOUI, “Using semantic web to reduce the cold-start problems in recommendation systems”, IEEE, Centre de Recherche sur l’Information Scientifique et Technique, CERIST, Institut National d’Informatique, INI, 2009.
[15] Xu, Yue, Gavin Shaw, and Yuefeng Li. "Concise representations for association rules in multi-level datasets." Journal of Systems Science and Systems Engineering 18.1: 53-70, 2009.
[16] Shaw, G., Xu, Y., Geva, S.: Eliminating Redundant Association Rules in Multilevel Datasets. In: 4th International Conference on Data Mining (DMIN’08). pp. 313–319. Las Vegas, USA, July 2008.
[17] Shaw, G., Xu, Y., Geva, S.: Extracting Non-Redundant Approximate Rules from Multi-Level Datasets. In: 20th IEEE International Conference on Tools with Artificial Intelligence (ICTAI), 340. Dayton, Ohio, USA, November, 2008.
[18] Abd-Elrahman Elsayed, Samhaa R. El-Beltagy, Mahmoud Rafea, Osman Hegazy, “Applying data mining for ontology building”, The Central Laboratory for Agricultural Expert Systems, Giza, Egypt, 2007.
[19] Gerd Stumme, Andreas Hotho, Bettina Berendt, “Semantic Web Mining: State of the Art and Future Directions”, Elsevier, Knowledge and Data Engineering Group, University of Kassel, D-34121 Kassel, 2006.
[20] Semantic Web Primer, 2004.
[21] Qing Li, Byeong Man Kim, “Clustering approach for Hybrid Recommender System”, Proceedings of the IEEE/WIC International Conference on Web Intelligence, IEEE,2003.
[22] Middleton, S.E., Alani, H., Shadbolt, N.R., De Roure, D.C.: Exploiting Synergy between Ontologies and Recommender Systems. In: The Semantic Web Workshop, World Wide Web Conference pp. 41–50. Hawaii, USA May 2002.
[23] Gerd Stumme, Andreas Hotho, “Usage Mining for and on the Semantic Web”, Institute for Applied Computer Science and Formal Description Methods (AIFB), University of Karlsruhe D-76128 Karlsruhe, Germany, 2002.
[24] Chen, L., & Sycara, K., “WebMate: A personal agent for browsing and searching”, In Proceedings of the second international conference on autonomous agents (pp. 9–13), 1998.
[25] Khoi-Nguyen Tran, “Semantic Web Mining”, School of Computer Science, the Australian National University, Canberra, ACT 0200, Australia.