Web User Behavior Analysis Using Improved Naïve Bayes Prediction Algorithm


ISSN: 2231-2803

http://www.ijcttjournal.org

Page107


B. Harindra Varma, M.Tech (C.S.E), Gudlavalleru Engineering College, Gudlavalleru

K. Ashok Reddy, Assistant Professor (C.S.E), Gudlavalleru Engineering College, Gudlavalleru

S. Narayana, Associate Professor (C.S.E), Gudlavalleru Engineering College

Abstract – With the continued growth and proliferation of Web services and Web-based information systems, the volume of user data has reached astronomical proportions. Analyzing such data with Web Usage Mining can help determine the visiting interests or needs of Web users. Because Web logs grow incrementally, predicting exactly how users browse websites becomes a crucial issue. Web miners need predictive mining techniques that filter out unwanted categories and thereby reduce the operational scope. Markov models and their variations have been used to analyze the Web navigation behavior of users. A user’s link transitions on a particular website can be modeled with first-order, second-order or higher-order Markov models, which can then be used to predict future navigation and to personalize web pages for an individual user. The all-higher-order Markov model holds the promise of higher prediction accuracy and improved coverage compared with any single-order Markov model, but at the cost of high state-space complexity. Hence a Hybrid Markov Model is required to improve operational performance and prediction accuracy significantly. The Markov model is treated as a probability model with which users’ browsing behavior can be predicted at the category level. Bayes’ theorem can likewise be applied to represent and infer users’ browsing behavior at the webpage level. In this research, Markov models and Bayes’ theorem are combined into a two-level prediction model. The Markov model lets the system effectively filter the possible categories of websites, and Bayes’ theorem then helps predict the website accurately. The experiments show that the proposed model achieves a good hit ratio for prediction.

Keywords – Web usage, Hidden Markov, Bayes, Data Mining.

I. INTRODUCTION

The Web is a huge, explosive, diverse, dynamic and mostly unstructured data repository. It supplies an incredible amount of information, and it also raises the complexity of dealing with that information from the different perspectives of users, Web service providers and business analysts. Users want effective search tools to find relevant information easily and precisely. Web service providers want to predict users’ behavior and personalize information, so as to reduce the traffic load and design websites suited to different groups of users. Business analysts want tools to learn the needs of users and consumers. All of them expect tools or techniques that help them satisfy their demands and solve the problems encountered on the Web. Web mining has therefore become a popular, active area and is taken as the research topic for this investigation. Web Usage Mining is the application of data mining techniques to discover interesting usage patterns from Web data, in order to understand and better serve the needs of Web-based applications.

Our task belongs to Web usage mining, which covers tasks concerned with how the Web is used: site accesses are considered, navigation patterns are extracted and prediction is performed. For this kind of mining we use a database in the form of Web log files and generate results from that database. Markov models have been used for studying and understanding stochastic processes, and were shown to be well suited for modeling and predicting a user’s browsing behavior on a website. In general, the input for these problems is the sequence of web pages accessed by a user, and the goal is to build Markov models [2] that can be used to model and predict the web page that the user will most likely access next [3]. In many applications, first-order Markov models are not very accurate in predicting the user’s browsing behavior, since these models do not look far enough into the past to discriminate correctly between the different observed patterns. As a result, higher-order models are often used. Unfortunately, these higher-order models have a number of limitations associated with high state-space complexity, reduced coverage, and sometimes even worse prediction accuracy. One method proposed to overcome this problem is clustering and cloning, which duplicates the states corresponding to pages that require a longer history to understand the choice of link that users made. When the Web log is not yet available, for example because the website is newly launched, the prediction or navigation decision is made on the basis of page rank; our page rank strategy is also used to resolve ambiguity in the model. Our model is therefore built on two basic components, page rank and a variable-length Markov model: ambiguity in the Markov model is resolved on the basis of page rank, and page rank is also used in the initial stage when no Web log file is available.

Markov models have been used for studying and understanding stochastic processes, and are well suited for modeling and predicting a user’s browsing behavior on the Web. In general, the input for these problems is the sequence of web pages accessed by a user, and the goal is to build a Markov model that can be used to predict the user’s Web usage behavior. The state space of the Markov model depends on the number of previous actions used in predicting the next action. The simplest Markov model predicts the next action by looking only at the last action performed by the user. In this model, also known as the first-order Markov model, each action that can be performed by a user corresponds to a state in the model. A somewhat more complicated model computes the prediction by looking at the last two actions performed by the user. This is called the second-order Markov model, and its states correspond to all possible pairs of actions that can be performed in sequence. This approach generalizes to the Nth-order Markov model, which computes the prediction by looking at the last N actions performed by the user, leading to a state space that contains all possible sequences of N actions.

In most applications, the first-order Markov model has low accuracy in making correct predictions, which is why extensions to higher-order models are necessary. The all-higher-order Markov model holds the promise of higher prediction accuracy and improved coverage compared with any single-order Markov model, at the expense of a dramatic increase in state-space complexity.
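The first-order, second-order and Nth-order models described above can be illustrated with a minimal Python sketch. It assumes that sessions are given as lists of page names; the function names `train_markov` and `predict_next` and the toy sessions are hypothetical and chosen only for illustration, not taken from the paper's implementation.

```python
from collections import Counter, defaultdict

def train_markov(sessions, order=1):
    """Count transitions from each state (the last `order` pages) to the next page."""
    counts = defaultdict(Counter)
    for pages in sessions:
        for i in range(order, len(pages)):
            state = tuple(pages[i - order:i])
            counts[state][pages[i]] += 1
    return counts

def predict_next(counts, history, order=1):
    """Return the most frequent next page for the last `order` pages, or None if unseen."""
    if len(history) < order:
        return None
    state = tuple(history[-order:])
    if state not in counts:
        return None
    return counts[state].most_common(1)[0][0]

# Toy sessions, invented for illustration only.
sessions = [
    ["home", "news", "sports"],
    ["home", "news", "sports"],
    ["home", "news", "weather"],
    ["blog", "news", "weather"],
    ["blog", "news", "weather"],
]

first = train_markov(sessions, order=1)
second = train_markov(sessions, order=2)

# The first-order model only sees "news"; the second-order model also sees
# the page visited before it, so the two models can disagree.
print(predict_next(first, ["home", "news"], order=1))
print(predict_next(second, ["home", "news"], order=2))
```

In this toy data the first-order model cannot tell visitors who arrived from "home" apart from those who arrived from "blog", which is the kind of ambiguity that motivates higher-order models. The second-order model returns None for any pair of pages it never observed, which illustrates the reduced-coverage limitation.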

II. LITERATURE SURVEY

Myra Spiliopoulou [1] suggests applying Web usage mining to website evaluation to determine needed modifications, primarily to the site’s design of page content and the link structure between pages. Eirinaki et al. [2] propose a method that incorporates link analysis, such as the page rank measure, into a Markov model in order to provide Web path recommendations. Schechter et al. [3] use a tree-based data structure that represents the collection of paths inferred from the log data to predict the next page access. Chen and Zhang [4] use a Prediction by Partial Match forest that restricts the roots to popular nodes; assuming that most user sessions start on popular pages, branches with a non-popular page as their root are pruned. R. Walpole, R. Myers and S. Myers [5] propose that Bayes’ theorem can be used to predict the user’s most probable next request.

The Hybrid Successive Markov Predictive Model (HSMP) has been used for investigating and understanding stochastic processes, and was shown to be well suited for modeling and predicting users’ browsing behavior in the Web log scenario. In most applications, the first-order Markov model has low accuracy in making correct predictions, which is why extensions to higher-order models are necessary. The all-higher-order Markov model holds the promise of higher prediction accuracy and improved coverage compared with any single-order Markov model, at the expense of a dramatic increase in state-space complexity. Hence, the authors propose techniques for intelligently combining Markov models of different orders so that the resulting model has low state-space complexity and improved prediction accuracy, and retains the coverage of the all-higher-order Markov model.
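One simple way to combine models of different orders is to try the highest order first and fall back to lower orders when a state was never observed. The sketch below shows only this generic fallback idea; it is not the HSMP technique itself, and the names `train_all_orders` and `predict_with_fallback` and the toy sessions are hypothetical.

```python
from collections import Counter, defaultdict

def train_all_orders(sessions, max_order):
    """Build one transition-count table per order 1..max_order."""
    models = {k: defaultdict(Counter) for k in range(1, max_order + 1)}
    for pages in sessions:
        for k in range(1, max_order + 1):
            for i in range(k, len(pages)):
                models[k][tuple(pages[i - k:i])][pages[i]] += 1
    return models

def predict_with_fallback(models, history):
    """Use the highest order whose state was seen in training; otherwise try lower orders.

    Returns (predicted_page, order_used), or (None, 0) when no model covers the history.
    """
    for k in sorted(models, reverse=True):
        if len(history) < k:
            continue
        state = tuple(history[-k:])
        if state in models[k]:
            return models[k][state].most_common(1)[0][0], k
    return None, 0

# Toy sessions, invented for illustration only.
sessions = [
    ["home", "news", "sports"],
    ["home", "news", "sports"],
    ["home", "news", "weather"],
    ["blog", "news", "weather"],
    ["blog", "news", "weather"],
]
models = train_all_orders(sessions, max_order=2)

# Number of states stored per order: a rough view of state-space cost.
state_counts = {k: len(table) for k, table in models.items()}
print(state_counts)
print(predict_with_fallback(models, ["home", "news"]))
print(predict_with_fallback(models, ["shop", "news"]))
```

The fallback keeps the coverage of the lower-order models while preferring the longer history when it is available. The price is that every order's table must be stored, which is the state-space cost the paragraph above refers to.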

Problems in Existing Work:

1) We propose a new two-tier prediction framework to improve prediction time. Such a framework can accommodate various prediction models.

2) We present an analysis study for the Markov model and the all-Kth model.

3) We propose a new modified Markov model that handles the excess memory requirements in case of large data sets by reducing the number of paths during the training and testing phases.

4) We conduct extensive experiments on three benchmark data sets to study different aspects of the WPP using the Markov model, the modified Markov model, ARM, and the all-Kth Markov model. Our analysis and results show that the higher-order Markov model produces better prediction accuracy.

III. PROPOSED SYSTEM

In this section, we propose another improved variation of the Markov model that reduces the number of paths in the model so that it fits in memory and predicts faster [1]. Web prediction is performed on the following data:


HMM BASED BAYES APPROACH:

BAYESIAN CLASSIFICATION

Bayesian classifiers are statistical classifiers. They can predict class membership probabilities, such as the probability that a given tuple belongs to a particular class. Bayesian classification is based on Bayes theorem. Naïve Bayesian classifiers assume that the effect of an attribute value on a given class is independent of the values of the other attributes. This assumption is called class conditional independence. It is made to simplify the computations involved and, in this sense, is considered “naïve.” Bayesian belief networks are graphical models, which unlike naïve Bayesian classifiers, allow the representation of dependencies among subsets of attributes. Bayesian belief networks can also be used for classification.

Bayes’ Theorem

Bayes’ theorem is named after Thomas Bayes, a nonconformist English clergyman who did early work in probability and decision theory during the 18th century. Let X be a data tuple. In Bayesian terms, X is considered “evidence.” As usual, it is described by measurements made on a set of n attributes. Let H be some hypothesis, such as that the data tuple X belongs to a specified class C. For classification problems, we want to determine P(H/X), the probability that the hypothesis H holds given the “evidence” or observed data tuple X. In other words, we are looking for the probability that tuple X belongs to class C, given that we know the attribute description of X. P(H/X) is the posterior probability, or a posteriori probability, of H conditioned on X. For example, suppose our world of data tuples is confined to customers described by the attributes age and income, respectively, and that X is a 35-year-old customer with an income of $40,000. Suppose that H is the hypothesis that our customer will buy a computer. Then P(H/X) reflects the probability that customer X will buy a computer given that we know the customer’s age and income. In contrast, P(H) is the prior probability, or a priori probability, of H. For our example, this is the probability that any given customer will buy a computer, regardless of age, income, or any other information, for that matter. The posterior probability, P(H/X), is based on more information (e.g., customer information) than the prior probability, P(H), which is independent of X. Similarly, P(X/H) is the posterior probability of X conditioned on H. That is, it is the probability that a customer, X, is 35 years old and earns $40,000, given that we know the customer will buy a computer. P(X) is the prior probability of X. Using our example, it is the probability that a person from our set of customers is 35 years old and earns $40,000. P(H), P(X/H), and P(X) may be estimated from the given data, as we shall see below. Bayes’ theorem is useful in that it provides a way of calculating the posterior probability, P(H/X), from P(H), P(X/H), and P(X). Bayes’ theorem is

$$P(H/X) = \frac{P(X/H)\,P(H)}{P(X)}$$
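As a small worked illustration of Bayes' theorem for the customer example, the Python sketch below computes P(H/X) from P(H), P(X/H) and P(X). The function name `posterior` is hypothetical, and the three probability values are invented for illustration; they are not estimated from any data set used in this paper.

```python
def posterior(prior_h, likelihood_x_given_h, evidence_x):
    """Bayes' theorem: P(H/X) = P(X/H) * P(H) / P(X)."""
    if evidence_x <= 0:
        raise ValueError("P(X) must be positive")
    return likelihood_x_given_h * prior_h / evidence_x

# Hypothetical values, invented for illustration only.
p_h = 0.5      # P(H): any given customer buys a computer
p_x_h = 0.2    # P(X/H): a buyer is 35 years old and earns $40,000
p_x = 0.125    # P(X): any customer is 35 years old and earns $40,000

p_h_x = posterior(p_h, p_x_h, p_x)
print(round(p_h_x, 3))  # 0.2 * 0.5 / 0.125 = 0.8
```

With these assumed values, knowing the customer's age and income raises the probability of a purchase from the prior 0.5 to a posterior of 0.8.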

Naïve Bayesian Classification

The naïve Bayesian classifier, or simple Bayesian classifier, works as follows:

1. Let D be a training set of tuples and their associated class labels. As usual, each tuple is represented by an n-dimensional attribute vector, X = (x1, x2, ..., xn), depicting n measurements made on the tuple from n attributes, respectively, A1, A2, ..., An.

2. Suppose that there are m classes, C1, C2, ..., Cm. Given a tuple, X, the classifier will predict that X belongs to the class having the highest posterior probability, conditioned on X. That is, the naïve Bayesian classifier predicts that tuple X belongs to the class Ci if and only if P(Ci/X) > P(Cj/X) for 1 ≤ j ≤ m, j ≠ i. Thus we maximize P(Ci/X). The class Ci for which P(Ci/X) is maximized is called the maximum posteriori hypothesis. By Bayes’ theorem,

$$P(C_i/X) = \frac{P(X/C_i)\,P(C_i)}{P(X)}$$

3. As P(X) is constant for all classes, only P(X/Ci)P(Ci) need be maximized. If the class prior probabilities are not known, then it is commonly assumed that the classes are equally likely, that is, P(C1) = P(C2) = ... = P(Cm). Given data sets with many attributes, it would be extremely computationally expensive to compute P(X/Ci). In order to reduce computation in evaluating P(X/Ci), the naïve assumption of class conditional independence is made. This presumes that the values of the attributes are conditionally independent of one another, given the class label of the tuple (i.e., that there are no dependence relationships among the attributes). Thus,

$$P(X/C_i) = \prod_{k=1}^{n} P(x_k/C_i) = P(x_1/C_i) \times P(x_2/C_i) \times \cdots \times P(x_n/C_i)$$

We can easily estimate the probabilities P(x1/Ci), P(x2/Ci), ..., P(xn/Ci) from the training tuples. Recall that here xk refers to the value of attribute Ak for tuple X. For each attribute, we look at whether the attribute is categorical or continuous valued.

Finally, the existing prediction model [1] is applied for classifying the rules.
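The naïve Bayesian steps above can be sketched in Python for the categorical case. This is a minimal illustration under stated assumptions: probabilities are plain relative frequencies with no smoothing, continuous attributes are not handled, and the names `train_naive_bayes` and `classify` as well as the toy training tuples are hypothetical rather than taken from the paper's experiments.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Estimate P(Ci) and P(xk/Ci) by relative frequency (categorical attributes only)."""
    class_counts = Counter(labels)
    # value_counts[(k, c)][v] = number of class-c tuples whose attribute k equals v
    value_counts = defaultdict(Counter)
    for row, c in zip(rows, labels):
        for k, v in enumerate(row):
            value_counts[(k, c)][v] += 1
    return class_counts, value_counts

def classify(model, x):
    """Return the class Ci maximizing P(X/Ci) * P(Ci), together with that score."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best_class, best_score = None, -1.0
    for c, n_c in class_counts.items():
        score = n_c / total                          # P(Ci)
        for k, v in enumerate(x):
            score *= value_counts[(k, c)][v] / n_c   # P(xk/Ci), independence assumption
        if score > best_score:
            best_class, best_score = c, score
    return best_class, best_score

# Toy training set, invented for illustration: (age group, income level) -> buys computer
rows = [
    ("young", "high"), ("young", "low"),
    ("middle", "high"), ("middle", "low"),
    ("senior", "high"), ("senior", "low"),
]
labels = ["no", "no", "yes", "yes", "yes", "no"]

model = train_naive_bayes(rows, labels)
label, score = classify(model, ("senior", "high"))
print(label, round(score, 4))
```

Because P(X) is the same for every class, the sketch compares only P(X/Ci)P(Ci), as in step 3. Without smoothing, an attribute value never seen with a class forces that class's score to zero, so a practical implementation would usually add a correction for unseen values.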

IV. RESULTS

All experiments were performed on an Intel(R) Core(TM)2 CPU at 2.13 GHz with 2 GB RAM, running Microsoft Windows XP Professional (SP2).

Existing results:

Country:
    Texas -> 23.0
    Florida -> 54.0
    Illinois -> 24.0
    Ontario -> 28.0
    Washington -> 35.0
    Oklahoma -> 53.0
    California -> 29.0
    Oregon -> 26.0
    Alberta -> 41.0
    Kentucky -> 49.0
    North_Carolina -> 18.0
    Georgia -> 26.0
    Pennsylvania -> 24.0
    Indiana -> 55.0
    Virginia -> 25.0
    Australia -> 27.0
    Michigan -> 28.0
    Ohio -> 28.0
    Connecticut -> 17.0
    Rhode_Island -> 41.0
    New_York -> 26.0
    United_Kingdom -> 22.0
    Massachusetts -> 41.0
    Saskatchewan -> 34.0
    Idaho -> 60.0
    Wisconsin -> 17.0
    New_Jersey -> 45.0
    Italy -> 37.0
    South_Dakota -> 23.0
    Louisiana -> 28.0
    Vermont -> 44.0
    Missouri -> 25.0
    Mississippi -> 36.0
    Netherlands -> 28.0
    Kansas -> 28.0
    Alaska -> 69.0
    Minnesota -> 28.0
    Colorado -> 26.0
    Maryland -> 32.0
    Utah -> 28.0
    Nevada -> 27.0
    Washington_D.C. -> 35.0
    Wyoming -> 27.0
    Arizona -> 41.0
    New_Hampshire -> 53.0
    South_Carolina -> 53.0
    Delaware -> 49.0
    Tennessee -> 25.0
    Sweden -> 28.0
    Afghanistan -> 36.0
    Iowa -> 35.0
    British_Columbia -> 53.0
    Arkansas -> 25.0
    Montana -> 41.0
    France -> 26.0
    Alabama -> 39.0
    Kuwait -> 50.0
    Finland -> 49.0
    Switzerland -> 30.0
    New_Zealand -> 19.0
    Belgium -> 30.0
    China -> 25.0
    Spain -> 25.0
    Manitoba -> 16.0
    Maine -> 49.0
    Hong_Kong -> 51.0
    Nebraska -> 44.0
    Germany -> 43.0
    West_Virginia -> 55.0
    Brazil -> 28.0
    New_Brunswick -> 27.0
    Quebec -> 34.0
    Other -> 33.0
    Colombia -> 33.0
    Hawaii -> 28.0
    Japan -> 30.0
    South_Africa -> 35.0
    Portugal -> 30.0
    New_Mexico -> 28.0
    Austria -> 49.0
    India -> 34.0
    Namibia -> 35.0
    Argentina -> 66.0
    Israel -> 31.0
    Ireland -> 32.0
(123/672 instances correct)

Accuracy for single country prediction:

Correctly Classified Instances 16 4.381 %
Incorrectly Classified Instances 656 95.619 %

Proposed Approach Results:

Primary_Language = English && Actual_Time = Other && Race = White && Age = 35.0 ==> Professional
Primary_Language = English && Actual_Time = Other && Community_Membership_Religious > 0 && Who_Pays_for_Access_Self > 0 && Not_Purchasing_Security <= 0 ==> Professional
Primary_Language = English && Community_Membership_Religious > 0 && Community_Membership_Family > 0 ==> Other
Primary_Language = English && Actual_Time = Other && Community_Membership_Religious <= 0 && Race = White && Major_Geographical_Location = USA && Disability_Not_Impaired <= 0 && who > 90441 ==> Computer
Primary_Language = English && Race = White && Actual_Time = Other && Community_Membership_Religious <= 0 && Major_Geographical_Location = USA && Not_Purchasing_No_credit > 0 && Community_Membership_Hobbies <= 0 && Opinions_on_Censorship <= 3 ==> Other
Actual_Time = Other && Community_Membership_Religious <= 0 && Major_Geographical_Location = USA && Not_Purchasing_No_credit > 0 ==> Computer
Age = Not_Say ==> Computer
Not_Purchasing_Not_option > 0 && Not_Purchasing_Cant_find <= 0 ==> Computer
Age = 37.0 && Sexual_Preference = Heterosexual ==> Professional
Age = 37.0 ==> Management
Age = 27.0 && Not_Purchasing_Privacy <= 0 ==> Management
Age = 47.0 ==> Other

Age = 38.0 &&


Age = 30.0 && How_You_Heard_About_Survey_Others <= 0 ==> Management
Age = 45.0 && Gender = Male ==> Professional
Age = 40.0 ==> Computer
Age = 54.0 ==> Education
Primary_Language = English && Age = 24.0 && Community_Membership_Hobbies <= 0 ==> Professional
How_You_Heard_About_Survey_Others <= 0 ==> Professional
Primary_Language = English && Falsification_of_Information = Never ==> Other
Not_Purchasing_Other <= 0 && Gender = Male && Not_Purchasing_Bad_press <= 0 ==> Computer
Community_Membership_Other <= 0 && Major_Geographical_Location = USA ==> Management
Community_Membership_Other <= 0 ==> Professional

: Other

Number of Rules : 89

Time taken to build model: 1.01 seconds

Time taken to test model on training data: 0.04 seconds

=== Error on training data ===

Correctly Classified Instances 617 93.8155 %
Incorrectly Classified Instances 55 6.1845 %

Performance Analysis:

The following figure compares the running time of the existing and proposed approaches.

[Figure: Time (ms). Existing HMM: 23; Proposed Bayes: 11]

The following figure compares the accuracy of the existing and proposed approaches.

Approach                    Correctly (%)   Incorrectly (%)
Existing HMM prediction     20.3            79.69
Proposed Bayes-based HMM    75.95           24.04

V. CONCLUSION AND FUTURE SCOPE

Because of the huge quantity of web page data on many portal sites, it is convenient to group web pages by category. In this paper, users’ browsing behavior is observed at two levels to match the nature of Web usage data. Filtering by category at level one massively reduces the scope of calculation. Next, applying Bayes’ theorem at level two to predict the page a user will browse is more effective and accurate. The experimental results show a good hit ratio at both levels. Compared with the existing approach, the proposed approach gives better predictions over multiple attributes with a lower error rate.

REFERENCES

[1] Mamoun A. Awad and Issa Khalil, “Prediction of User’s Web-Browsing Behavior: Application of Markov Model,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 42, no. 4, August 2012.


[2] V.V.R. Maheswara Rao, “An Efficient Hybrid Successive Markov Model for Predicting Web User Usage Behavior using Web Usage Mining,” International Journal of Data Engineering (IJDE).

[3] S. Schechter, M. Krishnan, and M. Smith, “Using Path Profiles to Predict HTTP Requests,” Computer Networks and ISDN Systems, vol. 30, pp. 457-467, 1998.

[4] X. Chen and X. Zhang, “A Popularity-Based Prediction Model for Web Prefetching,” Computer, pp. 63-70, 2003.

[5] Eugene Charniak, Statistical Language Learning. The MIT Press, 1996.

[6] X. Chen and X. Zhang, “A Popularity-Based Prediction Model for Web Prefetching,” Computer, 36(3):63-70, March 2003.

[7] M. Deshpande and G. Karypis, “Selective Markov Models for Predicting Web Page Accesses,” ACM Transactions on Internet Technology, 4(2):163-184, 2004.

[8] X. Dongshan and S. Junyi, “A New Markov Model for Web Access Prediction,” Computing in Science and Engineering, 4(6):34-39, November/December 2002.