User-Item Recommendation System (UIRS) Using Collaborative Filtering


Neha Verma¹, Devanand² and Bhavna Arora³

¹M.Tech Student, ²,³Ph.D.

¹,²,³Department of Computer Science & Information Technology, Central University of Jammu, Jammu, India.

Abstract

E-commerce has transformed traditional retail. People with busy routines now prefer shopping online to visiting stores in person. The volume of information about products sold online keeps growing, so users struggle to find the products and services that match their tastes and preferences. A recommendation system is a powerful tool that the e-commerce industry uses to present relevant products to e-buyers quickly and with little effort. This paper first discusses various collaborative filtering techniques. It then proposes a recommendation system built on two memory-based collaborative filtering techniques: user-based and item-based filtering. The proposed work uses three attributes, namely user name, item name, and item rating. The model uses user rating data to filter products and users, and it calculates the similarities between users and between items with the Pearson correlation similarity measure to generate recommendations.

Keywords: e-commerce, collaborative filtering, model-based collaborative filtering, memory-based collaborative filtering, recommendation system.

1. Introduction

In the modern digital era, online shopping has gained tremendous popularity. Thousands of items are available on the web for a customer to choose from. Online sellers therefore need a system through which they can promote and recommend their items according to user requirements and taste. Such a system is known as a Recommendation System (RS) [1]. An RS filters redundant information out of large sets of data in order to collect useful knowledge that can then be offered to the user as a recommendation. RS is a subclass of Information Retrieval (IR) that seeks to predict the "rating" a user would give to an item. It is essentially a data filtering tool that uses algorithms and data to recommend the most relevant items to a particular user. In simple terms, an RS is an automated form of a shop salesman who helps you look for items according to your taste and choice. Such systems are well suited to cross-selling and up-selling, and they can recommend personalized content based on past behavior [1].


Figure 1. Components of Recommendation System [1]

An RS reshapes how users interact with the internet while preserving relevance, novelty, and diversity. It delights users and gives them a reason to keep returning to the website. It changes online shopping for inexperienced users by analyzing their behavior to learn what they like, so that they can find any item within a large collection of items. It also enhances e-commerce sales in three ways: converting browsers into buyers, building loyalty, and cross-selling. RS techniques fall into three main groups, listed below [1]:

• Traditional Techniques
    - Content-Based
    - Demographic Method
    - Collaborative Filtering (CF)
    - Knowledge-Based

• Modern Techniques
    - Context-Aware
    - Semantic-Based
    - Cross-Domain Based
    - Peer to Peer
    - Cross-Lingual

• Hybrid Techniques
    - Weighted Method
    - Switching Method
    - Mixed Method
    - Feature Combination
    - Feature Augmentation
    - Cascade Method
    - Meta-Level [1]

2. Collaborative Filtering

Collaborative Filtering (CF) is one of the most efficient traditional recommendation approaches and is often used to construct personalized suggestions on the internet. CF algorithms make automatic predictions about a user's interests by collecting preferences from the past rating values expressed by like-minded users [14]. CF is further categorized into two types, Model-based and Memory-based CF techniques, as shown in figure 2 below:

Figure 2. Collaborative Filtering paradigm [1]

2.1. Model-Based (MDB) Collaborative Filtering

This technique uses machine learning and data mining algorithms to overcome the scalability challenge of memory-based filtering. It is also called the Latent Factor (LF) or Matrix Factorization (MF) technique [15]. MDB can be categorized as follows:

2.1.1. Clustering Technique: It treats the CF problem as a classification problem by keeping similar users and items in the same class. A cluster is a group of data items that are similar to each other and dissimilar to the items in other clusters. Clustering is also known as data segmentation.

2.1.2. Association Technique: It uses association rule discovery algorithms to uncover relationships between co-purchased items, and item recommendations are generated on the basis of the strength of the relationship between items. In simple terms, associations are discovered by finding which items have already been purchased together [16].

2.1.3. Bayesian Network: It is a probabilistic method that represents the association between users and items as a graphical model. Bayesian networks can be used for a wide range of tasks, including prediction, automated insight, anomaly detection, diagnostics, time series prediction, reasoning, and decision making under uncertainty [1].

2.1.4. Neural Network (NN): It is a series of algorithms that attempt to identify the key relationships in a dataset through a procedure that imitates the way the human brain works. Learning methods for NNs include Newton's and quasi-Newton methods, the Levenberg-Marquardt algorithm, and the back-propagation algorithm [1], [17].

2.2. Memory-Based (MMB) Collaborative Filtering

This technique memorizes the user-item rating matrix (RM) and uses the full rating database to discover resemblances between users and items. MMB is easy to implement but faces a memory challenge, as it needs a large amount of space to store the full RM. MMB algorithms are lazy learners and are susceptible to the scalability challenge. MMB can be further categorized into two methods [14]:


2.2.1. User-Based (UB): This filtering technique rests on the assumption that like-minded users who shared similar opinions in the past are likely to rate items similarly in the future. UB uses a user vector to calculate the rating score: it finds the resemblance between users from the rating scores they have given to the items.

2.2.2. Item-Based (IB): This filtering technique can be applied to sizable datasets. The IB method looks for items with similar features, where similarity is derived by analyzing how users have rated each specific item. Here an item vector is used for generating recommendations.
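The difference between a user vector and an item vector can be illustrated with a minimal Python sketch. It assumes ratings are stored as a nested dictionary of the form {user: {item: rating}}; the helper name `to_item_vectors` and the two users `u1` and `u2` are hypothetical and are not taken from the dataset of this paper.

```python
def to_item_vectors(user_vectors):
    """Turn {user: {item: rating}} into {item: {user: rating}}."""
    item_vectors = {}
    for user, scores in user_vectors.items():
        for item, rating in scores.items():
            item_vectors.setdefault(item, {})[user] = rating
    return item_vectors


# Hypothetical ratings, used only to show the two views of the same data.
user_vectors = {
    "u1": {"Sony": 4.0, "Asus": 1.0},
    "u2": {"Sony": 5.0, "Asus": 4.0, "Apple": 4.5},
}

item_vectors = to_item_vectors(user_vectors)
print(user_vectors["u1"])    # user vector: everything u1 has rated
print(item_vectors["Sony"])  # item vector: every rating Sony has received
```

UB compares rows of the first structure with each other, while IB compares rows of the second.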

3. Related Work

In [2], the authors propose a web-based e-learning system that involves two kinds of interaction: 1) between the customer and the machine, and 2) between the machine and the internet. It is built using CF and clustering methods. It looks for relevant data on the internet, then personalizes and presents content based on the machine's deliberation. It also observes user preferences.

In [3], the author presents a theoretical study of using collaborative filtering techniques for a music RS. The work focuses on two CF approaches: a) user-based and b) item-based recommendation. For the experiments, the paper investigates different metrics for finding the resemblance between users and items, such as Euclidean distance, Pearson correlation, and the cosine metric. It also compares different ranking metrics that illustrate the effectiveness of the RS. Variables such as the similarity measure and the scoring function are used to raise the efficacy of the RS. The results are obtained by fixing the variables q and α: the reported non-trivial values are α=0.15 and q=3 (item-based) and α=0.3 and q=5 (user-based).

In another study [4], the authors propose a sentiment-based rating prediction method (RPS) to enhance prediction precision in RS. The method first extracts item attributes from user reviews and calculates each user's social sentiment on items. It then considers the user's own sentimental features, interpersonal sentimental influence, and item reputation to make a precise rating prediction. The model fuses user sentiment similarity, interpersonal sentiment influence, and item reputation similarity into a unified matrix factorization framework to perform the rating prediction task, with the aim of finding effective clues in reviews and predicting social users' ratings.

The implementation of a movie RS using CF algorithms in Apache Mahout is discussed in [5]. The dataset is obtained from the Yahoo Research Webscope database. The model takes movie ratings as input to provide recommendations.

In [6], the authors present a movie RS based on a user similarity metric and opinion mining. It first detects the type of opinion about a movie, i.e. positive, negative, or neutral, and then gives a top-k recommendation. The classification precision is boosted using the NbSVM classifier.

A new CF recommendation algorithm based on dimensionality reduction and clustering techniques is proposed in [7]. It uses the k-means algorithm to cluster similar users and Singular Value Decomposition (SVD) to reduce the dimensionality. The work is done in two stages: a) offline model creation, in which the k-means algorithm and the SVD technique are used to cluster user ratings, reduce the data dimensions, and compute the similarities; and b) online model utilization, in which the model makes accurate recommendations for a given active user.

In [8], the authors develop a model using Object-Oriented Analysis and Design Methodology (OOADM), an enhanced quick sort, and a cosine similarity algorithm in the Python Django framework. The enhanced system runs on Firebase, a real-time, cloud-hosted NoSQL database.

In [9], the authors propose a CF model that uses ordinal user feedback instead of the usual numerical values. The work is based on a pointwise ordinal model that can wrap most CF algorithms, adapting algorithms formulated for numerical values so that they handle ordinal values.

In [10], the authors describe the integration of reputation models into RS to improve the precision of recommendations. The implementations of the reputation and recommender systems are kept separate for generality. The work uses a reputation system to improve the precision of top-n recommendations.

In [11], the authors present a new algorithm based on case-based reasoning (CBR) for a CF-based web page RS. To demonstrate the effectiveness of this work, an experiment was carried out on varied datasets covering 2370 web pages examined by 77 different users. The user profile is built from eight habitual and two content-based attributes. To further optimize the RS, weighted association rule mining techniques were applied along with CBR.

In [12], the authors present a quick response system (QRS) for fashion e-business that uses the CF technique to raise retail productivity. The QRS lets organizations track user needs continuously and plan promptly, so that they can limit unnecessary stock. The system can help improve production of an item through forecasting and suggestions, and reduce goods in stock.

In [13], the authors present an improved user-based CF algorithm that uses users' latent relationships weighting (ULRW) in the rating prediction process. It also addresses the data sparsity challenge to enhance prediction precision. The algorithm extracts the users' latent relationships (ULR) at low computational cost through column sampling that approximates the SVD technique. For rating prediction, the Pearson correlation coefficient is used to find the ULRW values.

In [14], the authors design a CF-based Personalized Top-k RS for Housing (CFP-TR4H) algorithm and a personalized RS built on it, using Nanjing, a city in China, as the setting. The model proposes a "space vector similarity-based" CF technique that attempts to decompose composite items according to their spatial elements and to assign a sensible impression value to each element of an item. It fills the item impression matrix to avoid the data sparsity challenge.

4. Proposed Work

A Recommendation System (RS) is a tool used to recommend items according to the user's taste. The proposed work aims to design a system that recommends items to users on the basis of the ratings they have given to particular items, which reflect their personal interest and taste. The recommendation model is built using the CF technique. In the proposed methodology, a dataset of 50 users and 25 items, containing user name, item name, and item rating, was created manually as a Python dictionary. The approach uses user rating data to filter items and users. First, the target data is selected from the input dataset by removing data sparsity.

Secondly, the resemblance between users and between items is calculated using the Pearson correlation similarity measure. Finally, recommendations are made using collaborative filtering. If a user already exists in the dataset, recommendations are generated with the user-based filtering technique. For a new user, top-n recommendations are generated with the item-based filtering technique. The algorithm for the proposed methodology is as follows:


Algorithm of Proposed Work:

Step 1: Start.

Step 2: Input dataset.

Step 3: If (user & item exist)
            Find similarities between users
            Display recommendations
        Else
            Find similarities between items
            Display top-n recommendations

Step 4: Stop.
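The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: ratings are assumed to be stored as {user: {item: rating}}; the user-based branch is assumed to predict a score as a similarity-weighted average over positively correlated users; the item-based branch is assumed to rank items by their similarity to a `seed_item` supplied for the new user; and the check in Step 3 is reduced to whether the user exists. The function names and the `demo` ratings are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson r for two equal-length score lists; 0.0 when it is undefined."""
    n = len(x)
    if n < 2:
        return 0.0
    sum_x, sum_y = sum(x), sum(y)
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    var_x = n * sum_xx - sum_x ** 2
    var_y = n * sum_yy - sum_y ** 2
    if var_x < 1e-12 or var_y < 1e-12:  # constant scores: r is undefined
        return 0.0
    return (n * sum_xy - sum_x * sum_y) / sqrt(var_x * var_y)


def user_similarity(ratings, a, b):
    """Similarity of two users over the items both have rated."""
    shared = [i for i in ratings[a] if i in ratings[b]]
    return pearson([ratings[a][i] for i in shared],
                   [ratings[b][i] for i in shared])


def item_similarity(ratings, i, j):
    """Similarity of two items over the users who rated both."""
    both = [u for u in ratings if i in ratings[u] and j in ratings[u]]
    return pearson([ratings[u][i] for u in both],
                   [ratings[u][j] for u in both])


def recommend(ratings, user, seed_item=None, n=3):
    """Step 3: user-based branch for a known user, item-based top-n otherwise."""
    if user in ratings:
        totals, weights = {}, {}
        for other in ratings:
            if other == user:
                continue
            sim = user_similarity(ratings, user, other)
            if sim <= 0:
                continue  # assumption: ignore dissimilar users
            for item, score in ratings[other].items():
                if item not in ratings[user]:
                    totals[item] = totals.get(item, 0.0) + sim * score
                    weights[item] = weights.get(item, 0.0) + sim
        # Predicted score = similarity-weighted average of the neighbours' scores.
        return sorted(((totals[i] / weights[i], i) for i in totals), reverse=True)
    # New user: rank the remaining items by similarity to the seed item.
    items = {i for scores in ratings.values() for i in scores}
    ranked = sorted(
        ((item_similarity(ratings, seed_item, i), i)
         for i in items if i != seed_item),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical ratings, not taken from the dataset of this paper.
demo = {
    "A": {"Acer": 4.0, "Asus": 2.0, "Sony": 5.0},
    "B": {"Acer": 4.5, "Asus": 2.5, "Sony": 4.5, "Nokia": 3.0},
    "C": {"Acer": 1.0, "Asus": 5.0, "Sony": 2.0, "Nokia": 4.5},
}
print(recommend(demo, "A"))                          # existing user
print(recommend(demo, "Zed", seed_item="Acer", n=2))  # new user
```

Both branches return (score, item) pairs sorted from best to worst, so the same display code can serve either case.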

5. Experimental Outcome & Discussion

In this work, a manual dataset is created with 50 users and 25 items (mobile brands), containing user name, item name, and item rating. The twenty-five mobile brands considered for recommendation in this work are listed below:

• Acer, Apple, Asus

• BlackBerry

• Gionee, Google

• HTC, Haier, Huawei

• OnePlus, Oppo

• Samsung, Sony, Spice

• LAVA, LG, Lenovo

• Microsoft, Motorola, Micromax

• Intex

• Vivo

• ZTE

• Nokia

• Xiaomi

The results of the model are shown in this section. The working of the model is shown in the following steps:


Figure 3. Dataset used in the proposed model

Figure 3 above shows the dataset used in this research work. The dataset was created manually as a Python dictionary containing 50 users and 25 items (mobile brand names).

Example 1:

'Loura': {'Acer': 4.0, 'ZTE': 3.5, 'Sony': 4.0, 'Asus': 1.0, 'BlackBerry': 4.0, 'Samsung': 4.5, 'Oppo': 5.0, 'HTC': 4.5, 'Micromax': 3.5, 'Gionee': 4.2, 'Nokia': 3.6, 'Haier': 4.5, 'Huawei': 3.0, 'LG': 3.5, 'LAVA': 3.0, 'Xiaomi': 5.0, 'Intex': 4.0, 'Spice': 4.0, 'OnePlus': 3.5, 'Lenovo': 2.5}

Example 2:

'Grayce': {'Apple': 4.5, 'Asus': 4.0, 'BlackBerry': 2.5, 'OnePlus': 4.5, 'Oppo': 3.5, 'Samsung': 4.0, 'Sony': 5.0, 'Spice': 3.5, 'LAVA': 3.0, 'Microsoft': 2.5, 'Intex': 3.5, 'Motorola': 5.0, 'ZTE': 4.0, 'Micromax': 3.3, 'LG': 4.0, 'Gionee': 3.2, 'Google': 2.0, 'HTC': 3.0, 'Haier': 3.5, 'Huawei': 3.0, 'Vivo': 3.4}
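As a short illustration of how such entries can be used, the sketch below loads Examples 1 and 2 into one dictionary and separates the brands that both users have rated (the pairs a similarity measure can compare) from the brands that only Grayce has rated (candidates to recommend to Loura). It assumes the brand keys are spelled without stray spaces, for example 'Oppo', 'LAVA' and 'Micromax'; the variable names are hypothetical.

```python
# Ratings copied from Examples 1 and 2, with brand keys written without stray spaces.
ratings = {
    "Loura": {
        "Acer": 4.0, "ZTE": 3.5, "Sony": 4.0, "Asus": 1.0, "BlackBerry": 4.0,
        "Samsung": 4.5, "Oppo": 5.0, "HTC": 4.5, "Micromax": 3.5, "Gionee": 4.2,
        "Nokia": 3.6, "Haier": 4.5, "Huawei": 3.0, "LG": 3.5, "LAVA": 3.0,
        "Xiaomi": 5.0, "Intex": 4.0, "Spice": 4.0, "OnePlus": 3.5, "Lenovo": 2.5,
    },
    "Grayce": {
        "Apple": 4.5, "Asus": 4.0, "BlackBerry": 2.5, "OnePlus": 4.5, "Oppo": 3.5,
        "Samsung": 4.0, "Sony": 5.0, "Spice": 3.5, "LAVA": 3.0, "Microsoft": 2.5,
        "Intex": 3.5, "Motorola": 5.0, "ZTE": 4.0, "Micromax": 3.3, "LG": 4.0,
        "Gionee": 3.2, "Google": 2.0, "HTC": 3.0, "Haier": 3.5, "Huawei": 3.0,
        "Vivo": 3.4,
    },
}

loura, grayce = ratings["Loura"], ratings["Grayce"]

# Brands rated by both users: only these pairs enter a user-user similarity.
co_rated = sorted(set(loura) & set(grayce))

# Brands rated by Grayce but not by Loura: candidates to recommend to Loura.
unseen_by_loura = sorted(set(grayce) - set(loura))

print(len(co_rated), "co-rated brands")
print("Candidates for Loura:", unseen_by_loura)
```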


Figure 4. Similarity measure of one user with every other user

Figure 4 above shows the similarity scores of one user with every other user; the higher the score, the more similar the two users. The resemblance "r" between users is calculated using the Pearson correlation similarity measure, as given in equation 1.

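$$ r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{\left[N\sum X^2 - \left(\sum X\right)^2\right]\left[N\sum Y^2 - \left(\sum Y\right)^2\right]}} \qquad (1) $$

Where X and Y are the two variables and r is the resemblance (similarity) score,

N is the number of paired scores,

∑X is the sum of the x scores and ∑Y is the sum of the y scores,

∑X² is the sum of the squared x scores and ∑Y² is the sum of the squared y scores,

∑XY is the sum of the products of the paired scores.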
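As a worked illustration of these sums, assume that the measure takes the standard computational form of the Pearson coefficient and apply it to only three of the brands that Loura and Grayce both rated in Examples 1 and 2 (Asus, BlackBerry, Sony). The restriction to three brands is an assumption made to keep the arithmetic short, so the value below is not the similarity score reported in figure 4.

```latex
% X = Loura's scores (1.0, 4.0, 4.0), Y = Grayce's scores (4.0, 2.5, 5.0), N = 3
\begin{aligned}
\sum X &= 9.0, \quad \sum Y = 11.5, \quad \sum X^2 = 33.0, \quad
\sum Y^2 = 47.25, \quad \sum XY = 34.0 \\
r &= \frac{N\sum XY - \sum X \sum Y}
          {\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}} \\
  &= \frac{3(34.0) - (9.0)(11.5)}
          {\sqrt{\left[3(33.0) - 81.0\right]\left[3(47.25) - 132.25\right]}}
   = \frac{-1.5}{\sqrt{18.0 \times 9.5}} \approx -0.115
\end{aligned}
```

A value this close to zero means that, on these three brands alone, the two users' scores are almost uncorrelated.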

Figure 5. Recommendation for Loura


Figure 6. Recommendation for Grayce

Figures 5 and 6 above show the recommendation results (item name, i.e. mobile brand name, with rating score) for the two users.

6. Result & Analysis

This section presents an analysis of the dataset used and the evaluation metrics of the proposed model.

6.1. Analysis of Used Dataset

The dataset was created manually as a Python dictionary, as shown in Examples 1 and 2 above, and consists of 50 users and 25 items (mobile brands). A sample of the dataset is shown in figure 7 below.

Figure 7. Sample of the dataset used


Figure 8. Graphical representation of sample dataset

For a better understanding of the dataset, figure 7 above shows a sample of the dataset in tabular form and figure 8 illustrates its graphical representation.

6.2. Evaluation Metrics

Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are two of the most common evaluation metrics used to measure accuracy. They are given in equations 2 and 3:

• Mean Absolute Error: MAE calculates the average magnitude of the errors in a set of predictions, without considering their direction. It is the average, over the test dataset, of the absolute differences between predicted and actual values, where all individual differences have equal weight.

$$ MAE = \frac{1}{N} \sum_{u,i} \left| r_{u,i} - \hat{r}_{u,i} \right| \qquad (2) $$

• Root Mean Square Error: RMSE is a quadratic scoring rule that also measures the average magnitude of the error. It is the square root of the average of the squared differences between predicted and actual values.

$$ RMSE = \sqrt{\frac{1}{N} \sum_{u,i} \left( r_{u,i} - \hat{r}_{u,i} \right)^2} \qquad (3) $$

Where,

• $r_{u,i}$ is the real rating value of user u to item i

• $\hat{r}_{u,i}$ is the predicted rating value

• $N$ denotes the number of user-item pairs in the dataset [14].
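The two metrics can be computed with a few lines of Python. The sketch below assumes the real and predicted ratings are given as two equal-length lists, one entry per user-item pair; the function names and the sample values are hypothetical and are not the data behind table 1.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean absolute error over paired real and predicted ratings."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error over paired real and predicted ratings."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("need two non-empty lists of equal length")
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical ratings for three user-item pairs.
real = [4.0, 3.0, 5.0]
guess = [3.5, 3.0, 4.0]
print(mae(real, guess), rmse(real, guess))
```

Because the errors are squared before averaging, RMSE penalizes large individual errors more heavily than MAE and is never smaller than MAE on the same predictions.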

The results of the proposed model are presented in table 1. The error rate is quite low, and a smaller error rate signifies that the model is more precise.


Table 1. Result of Working Model

S. No.   Error                     Value

1.       Mean Absolute Error       0.0992

2.       Root Mean Square Error    0.246

7. Conclusion & Future Scope

Recommendation systems now play an important role in the e-commerce industry. The industry faces the problem of giving personalized recommendations that match user taste, so as to increase sales quickly. This work proposes a model that addresses this problem by providing recommendations based on the ratings that users have previously given to items. Collaborative filtering is one of the efficient techniques for recommending products and services. This paper gives a brief explanation of various collaborative filtering techniques, followed by a description of the proposed recommendation system and its results. The results show a low error rate: an MAE of 0.0992 and an RMSE of 0.246.

In future, the work can be extended to a bigger dataset with a greater number of users and items. Moreover, a model can be trained using specific techniques, and the number of parameters can be increased to improve the output. To obtain better recommendations and avoid scalability issues, a hybrid technique that combines different algorithms can also be used.

References

[1] N. Verma, Devanand, and B. Arora, “Experimental Analysis of Recommendation System in e-Commerce”, International Journal of Innovative Technology and Exploring Engineering (IJITEE), vol. 8, no. 3, (2019), pp. 121–127.

[2] T. Y. Tang and G. McCalla, “Smart Recommendation for an Evolving E-Learning System: Architecture and Experiment”, International Journal on E-Learning, vol. 4, no. 1, (2005), pp. 105–129.

[3] E. Shakirova, “Collaborative Filtering for Music Recommender System”, IEEE, (2017), pp. 548–550.

[4] X. Lei, X. Qian, and G. Zhao, “Rating Prediction based on Social Sentiment from Textual Reviews”, IEEE transactions on multimedia, vol. 9210, (2017).

[5] C. M. Wu, D. Garg, and U. Bhandary, “Movie Recommendation System Using Collaborative Filtering”, IEEE 9th International Conference on Software Engineering and Service Science (ICSESS), (2018), pp. 11–15.

[6] R. Nagamanjula and A. Pethalakshmi, “A Novel Scheme for Movie Recommendation System using User Similarity and Opinion Mining”, International Journal of Innovative Technology and Exploring Engineering (IJITEE), vol. 8, no. Issue-4S2, (2019), pp. 316–322.

[7] H. Zarzour, Z. Al-sharif, M. Al-ayyoub, and Y. Jararweh, “Algorithm Based on Dimensionality Reduction and Clustering Techniques”, 9th International Conference on Information and Communication Systems (ICICS) - IEEE, (2018), pp. 1–5.

[8] E. Uko, B. O., and P. O., “An Improved Online Book Recommender System using Collaborative Filtering Algorithm”, International Journal of Computer Applications, vol. 179, no. 46, (2018), pp. 41–48.

[9] Y. Koren and J. Sill, “OrdRec: An Ordinal Model for Predicting Personalized Item Rating”, ACM Conference Recommendation System, (2011), pp. 117.


[10] A. Abdel-Hafez, X. Tang, N. Tian, and Y. Xu, “A Reputation-Enhanced Recommender System”, Springer International Publishing Switzerland, (2014), pp. 185–198.

[11] J. Bhavithra and A. Saradha, “Personalized web page recommendation using case-based clustering and weighted association rule mining”, Springer Science Business Media, LLC, part of Springer Nature, vol. 0123456789, (2018).

[12] K. Chung, C. Song, K. Rim, and J. Lee, “Quick Response System Using Collaborative Filtering on Fashion E-Business”, Springer-Verlag Berlin Heidelberg, (2010), pp. 54–63.

[13] T. T. To and S. Puntheeranurak, “An Enhanced User-Based Collaborative Filtering Recommendation System Using the Users Latent Relationships Weighting Utilization”, Springer-Verlag Berlin Heidelberg, (2014), pp. 153–163.

[14] L. Wang, X. Hu, J. Wei, and X. Cui, “A Collaborative Filtering Based Personalized TOP-K Recommender System for Housing”, Proceedings of the International Conference of MCSA, AISC, Springer-Verlag Berlin Heidelberg, (2013), pp. 461–466.

[15] X. Kong and R. Chen, “A Survey of Collaborative Filtering-Based Recommender Systems: From Traditional Methods to Hybrid Methods Based on Social Networks”, IEEE Access, vol. 6, (2018), pp. 64301–64320.

[16] R. Sharma, D. Gopalani, and Y. Meena, “Collaborative Filtering-Based Recommender System: Approaches and Research Challenges”, 3rd IEEE International Conference on Computational Intelligence and Communication Technology, (2017), pp. 1–6.

[17] M. H. Mohamed, M. H. Khafagy, and M. H. Ibrahim, “Recommender Systems Challenges and Solutions Survey”, International Conference on Innovative Trends in Computer Engineering, Aswan, Egypt - IEEE, (2019), pp. 149–155.

[18] S. Thakial and B. Arora, “Neural Network-based Prediction Model for Job Applicants”, Journal of Computational and Theoretical Nanoscience, vol. 16, no. 9, (2019), pp. 3867-3873.