DYNAMIC MODELLING APPROACH FOR WEB USAGE MINING USING OPEN WEB RESOURCES

[Figure: Classification of web mining. Labels in the figure: Web Mining; Content Mining; Structure Mining; Usage Mining; Text Mining; Multimedia Mining; User Access Patterns; Analysis of Custom Website.]

B. NAVEENA DEVI1, O. SREEVANI2

1 Department of CSE, Mahatma Gandhi Institute of Technology, Hyderabad, A.P., India, [email protected]

2 Department of CSE, Mahatma Gandhi Institute of Technology, Hyderabad, A.P., India, [email protected]

Abstract:

Predicting users' browsing behavior is an important technology for e-commerce applications. The prediction results can be used for personalization, building a proper web site, improving marketing strategy, promotion, product supply, obtaining marketing information, forecasting market trends, and increasing the competitive strength of enterprises. In this paper, the proposed system uses open web resources and practical e-business data sets to develop innovative e-business applications within a short period of time and to deliver dynamic and statistical analysis results to the enterprise.

Keywords: data mining, information system, open web APIs, web computing, web mining.

I. Introduction

With the growing popularity of the Internet, web access to ever more data and information is changing the way people work, study and live: efficiency is much higher, and information resources are shared to the greatest possible degree. Web mining combines data mining technology with the web. It is an integrated technology for extracting, from WWW resources, patterns that are of interest, previously unknown and potentially valuable, by repeatedly applying a variety of data mining algorithms to the observed data until a reasonable model is identified. It is also a new research field that applies the theory and techniques of data mining to World Wide Web resources. Researchers have identified three broad categories of web mining: web content mining, web structure mining and web usage mining, as shown in the figure.

Web structure mining: Web structure mining operates on the web's hyperlink structure. This graph structure can provide information about a page's ranking or authoritativeness and can enhance search results through filtering.

Web usage mining: Web usage mining analyses the results of user interaction with a web server, including web logs, click streams and database transactions at a web site or a group of related sites. By analysing web usage log data, web mining systems can discover knowledge about a system's usage characteristics and its users' interests. Such knowledge has various applications, such as personalization and collaboration in web-based systems, marketing, web site design, web site evaluation and decision support.
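
To make the notion of web log data concrete, the following is a minimal sketch of how one line of a server access log might be turned into a structured record. It assumes the log uses the widely used Common Log Format; the class name AccessLogParser, its Entry type and the sample line are illustrative and do not come from the system described in this paper.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Minimal parser for one line of a web server access log (assumed Common Log Format). */
final class AccessLogParser {

    /** One parsed request: who asked for what, when, and with which result. */
    static final class Entry {
        final String host;
        final String timestamp;
        final String method;
        final String path;
        final int status;

        Entry(String host, String timestamp, String method, String path, int status) {
            this.host = host;
            this.timestamp = timestamp;
            this.method = method;
            this.path = path;
            this.status = status;
        }
    }

    // Layout: host ident user [timestamp] "METHOD path protocol" status bytes
    private static final Pattern CLF = Pattern.compile(
        "^(\\S+) \\S+ \\S+ \\[([^\\]]+)\\] \"(\\S+) (\\S+) [^\"]*\" (\\d{3}) (\\S+)");

    /** Returns the parsed entry, or null if the line does not match the assumed layout. */
    static Entry parse(String line) {
        Matcher m = CLF.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new Entry(m.group(1), m.group(2), m.group(3), m.group(4),
            Integer.parseInt(m.group(5)));
    }

    public static void main(String[] args) {
        // Made-up sample line using a documentation IP address.
        Entry e = parse("192.0.2.7 - - [10/Oct/2020:13:55:36 +0000] "
            + "\"GET /products/42 HTTP/1.1\" 200 2326");
        System.out.println(e.host + " requested " + e.path + " with status " + e.status);
    }
}
```

Records of this kind (visitor, time, page, status) are the raw material on which the later mining steps operate.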

II. Background of Web Usage Mining

In this paper we provide an overview of research in the area of web usage mining and its applications in solving e-business problems. Specifically, web usage mining has several applications in e-business, including personalization and targeted advertising. The main areas of research in this domain are preprocessing and the identification of useful patterns from web data using mining techniques, with the help of open source software packages.

The web usage mining process can be divided into three independent tasks: preprocessing, pattern discovery and pattern analysis. The figure shows this process. Preprocessing is the first phase of the web mining process. The usage, content and structure information contained in the various available data sources is converted into a form suitable for the next step, pattern discovery. Pattern discovery is based on methods and algorithms developed in several areas, such as data mining, machine learning and pattern recognition, and is used to understand how users use a web site. Pattern analysis is the final step in the web usage mining process.
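
The three tasks can be illustrated with a toy pipeline. The sketch below is only one possible reading of them, under stated assumptions: preprocessing is reduced to splitting each visitor's requests into sessions by an inactivity timeout, pattern discovery to counting page-to-page transitions, and pattern analysis to keeping transitions above a minimum count. The class UsageMiningPipeline, the Hit type and the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy walk through preprocessing, pattern discovery and pattern analysis. */
final class UsageMiningPipeline {

    /** A single page request: visitor id, time in seconds, requested page. */
    static final class Hit {
        final String visitor;
        final long timeSec;
        final String page;

        Hit(String visitor, long timeSec, String page) {
            this.visitor = visitor;
            this.timeSec = timeSec;
            this.page = page;
        }
    }

    /** Preprocessing: start a new session when a visitor is idle longer than timeoutSec.
     *  Assumes the hits are already sorted by time. */
    static List<List<String>> sessionize(List<Hit> hits, long timeoutSec) {
        Map<String, List<String>> open = new HashMap<>();  // current session per visitor
        Map<String, Long> lastSeen = new HashMap<>();      // time of the visitor's last hit
        List<List<String>> sessions = new ArrayList<>();
        for (Hit h : hits) {
            List<String> current = open.get(h.visitor);
            if (current == null || h.timeSec - lastSeen.get(h.visitor) > timeoutSec) {
                current = new ArrayList<>();
                open.put(h.visitor, current);
                sessions.add(current);
            }
            current.add(h.page);
            lastSeen.put(h.visitor, h.timeSec);
        }
        return sessions;
    }

    /** Pattern discovery: count how often page A is immediately followed by page B. */
    static Map<String, Integer> countTransitions(List<List<String>> sessions) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> s : sessions) {
            for (int i = 0; i + 1 < s.size(); i++) {
                counts.merge(s.get(i) + " -> " + s.get(i + 1), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Pattern analysis: keep only transitions seen at least minCount times. */
    static Map<String, Integer> frequent(Map<String, Integer> counts, int minCount) {
        Map<String, Integer> kept = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= minCount) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up hits from two visitors, sorted by time.
        List<Hit> hits = Arrays.asList(
            new Hit("a", 0, "/home"), new Hit("b", 5, "/home"),
            new Hit("a", 30, "/cart"), new Hit("b", 40, "/cart"));
        System.out.println(frequent(countTransitions(sessionize(hits, 1800)), 2));
    }
}
```

The 1800-second timeout used in main is a common rule of thumb rather than a value taken from this paper; real pattern discovery would typically use richer algorithms such as association rules or clustering.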

III. Public and Free Service Tools for Web Mining

Google: Google provides several kinds of open web APIs for accessing its advertisement, blog, map and web search services. The Google search API is implemented as a web service using the SOAP and WSDL standards. Google provides a Java library that wraps the SOAP API. To access the Google search API service, a developer needs to create a Google account and obtain a license key, which must be included with each query submitted.

The Google Maps API is a JavaScript API that can be used to embed Google Maps in web pages. It is designed mainly for data representation purposes. We can draw markers and lines on the map, or build more sophisticated applications. The Maps API is available for use on any site that is free for consumers to use.

Amazon: The Amazon e-commerce service exposes Amazon's product data and e-commerce functionality. It allows developers, web site owners and merchants to leverage the data and functionality that Amazon uses to power its own e-commerce business. By taking advantage of this API, students can rapidly establish e-business solutions based on well-tested resources that facilitate user participation.

eBay: The eBay platform offers an unprecedented opportunity to build a new eBay business or expand a current business, reach new customers, and create potential new streams of revenue by leveraging the resources of the buyers and sellers on eBay. eBay provides services such as selling and buying, searching marketplaces, customer service functionality and merchandising. eBay provides XML- and SOAP-based APIs.

PayPal: PayPal became famous for allowing individuals and businesses alike to easily accept credit cards and other electronic payments for their eBay auction sales. PayPal is also accepted as a payment service by companies all over the world, which have established PayPal merchant accounts so that customers with a PayPal account can pay for their goods and services online. The PayPal API supports four services: Get Transaction Details, Transaction Search, Refund Transaction and Mass Pay. It has a test environment called the Sandbox, where we can test our programs against a live server. Because money is involved, additional steps must be taken with PayPal to ensure that communications are highly secure.

IV. Dynamic Modelling Approach for Web Usage Mining and Development of Software

Today, with such an overwhelming quantity of data available on the internet, the problems of traditional search tools become apparent. Users often suffer from information overload and have to filter out irrelevant information by themselves. This contradiction between rich data and poor knowledge led to the emergence of web data mining.

In recent years, due to the rapid development of the web, more techniques related to the internet and web applications have been introduced in many universities around the world to better equip students to utilize the power of the web effectively. These include basic courses such as Internet networking, Internet application development, web search engines, and so forth. Web mining is defined as the use of data mining, text mining and information retrieval techniques to extract useful patterns and knowledge from the web.

Most data used for mining is collected from web servers, clients, proxy servers or server databases, all of which generate noisy data. Because web mining is sensitive to noise, data cleaning methods are necessary. However, building a web mining application or a web services application from scratch is not a task that every student can complete in a semester. It is believed that using web APIs simplifies the data acquisition process and enables students to focus on applying web mining algorithms to build value-added applications.
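
As an illustration of what data cleaning can mean for server logs, the sketch below drops requests that usually do not represent a deliberate page view: embedded resources such as images and style sheets, unsuccessful responses, and traffic from crawlers. The suffix list, the crawler markers and the class LogCleaner are assumptions chosen for the example, not rules prescribed by this paper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Illustrative noise filter for web log records. */
final class LogCleaner {

    /** Minimal view of a logged request (hypothetical type for this sketch). */
    static final class Request {
        final String path;
        final int status;
        final String userAgent;

        Request(String path, int status, String userAgent) {
            this.path = path;
            this.status = status;
            this.userAgent = userAgent;
        }
    }

    // Assumed lists; a real deployment would tune both to its own site.
    private static final String[] STATIC_SUFFIXES =
        {".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico"};
    private static final String[] BOT_MARKERS = {"bot", "crawler", "spider"};

    /** True if the request looks like a successful, human page view. */
    static boolean isPageView(Request r) {
        if (r.status < 200 || r.status >= 300) {
            return false;                       // errors and redirects are treated as noise here
        }
        String path = r.path.toLowerCase(Locale.ROOT);
        int query = path.indexOf('?');
        if (query >= 0) {
            path = path.substring(0, query);    // ignore the query string when checking suffixes
        }
        for (String suffix : STATIC_SUFFIXES) {
            if (path.endsWith(suffix)) {
                return false;                   // embedded resource, not a page
            }
        }
        String agent = r.userAgent.toLowerCase(Locale.ROOT);
        for (String marker : BOT_MARKERS) {
            if (agent.contains(marker)) {
                return false;                   // automated crawler traffic
            }
        }
        return true;
    }

    /** Returns only the requests that pass isPageView, in their original order. */
    static List<Request> clean(List<Request> raw) {
        List<Request> kept = new ArrayList<>();
        for (Request r : raw) {
            if (isPageView(r)) {
                kept.add(r);
            }
        }
        return kept;
    }
}
```

When the data comes from a web API instead of raw logs, much of this filtering is unnecessary, which is the simplification the paragraph above refers to.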

V. Practicing Web Mining Using Web Resources

Technical preparation for integration of various web resources: The main challenge of the project was to integrate open resources and design attractive system features. Open source data mining components such as WEKA were integrated to analyze the data retrieved through the open web APIs. The Google search API example was implemented using the Java library provided. The Amazon e-commerce service example was implemented on the REST standard using the Jakarta HttpClient library.

Integration process of API

Many of the open web APIs are built on a web services architecture. Some of them have been wrapped into libraries written in Java, .NET or JavaScript, which hide the details of the web services protocols from developers and make the APIs easier to use. Amazon provides a REST protocol for access. This is a technique for passing data to a web service using one or more arguments as part of a URL; each argument represents an XML node that the client would normally pass in an XML document. This technique is usually called XML-over-HTTP.
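
The idea of passing arguments as part of a URL can be sketched with the standard library alone. The example below builds a query string from a map of arguments and percent-encodes each name and value; the class RestQueryBuilder, the endpoint http://api.example.com/books and the parameter values are placeholders invented for illustration, not a real service.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds a REST-style request URL from an endpoint and a map of arguments. */
final class RestQueryBuilder {

    /** Appends each argument as name=value, separated by '&', after a leading '?'. */
    static String build(String endpoint, Map<String, String> params) {
        StringBuilder url = new StringBuilder(endpoint);
        char separator = '?';
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(separator)
               .append(encode(p.getKey()))
               .append('=')
               .append(encode(p.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    /** Form-encodes one name or value so that it is safe inside a query string. */
    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the arguments in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("Operation", "ItemSearch");
        params.put("Keywords", "science fiction");
        System.out.println(build("http://api.example.com/books", params));
    }
}
```

Encoding each value separately keeps characters such as spaces, commas and ampersands from being mistaken for part of the URL structure.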

An example to start learning about the Google Maps API:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
<script src="http://maps.google.com/maps?file=api&amp;v=2&amp;key=abcdefg" type="text/javascript"></script>
<script type="text/javascript">
// Create the map inside the map_canvas element once the page has loaded.
function initialize() {
  if (GBrowserIsCompatible()) {
    var map = new GMap2(document.getElementById("map_canvas"));
  }
}
</script>
</head>
<body onload="initialize()" onunload="GUnload()">
<div id="map_canvas" style="width:500px;height:300px"></div>
</body>
</html>

Technical preparation for developing e-business applications using web resources: The Amazon e-commerce service example was implemented based on the REST protocol. It illustrates how an e-business application integrates the web APIs into a specific web site to perform selected business operations, depending on the system functionality. Many systems also need a database such as Oracle, Microsoft SQL Server or MySQL to store data.

Example code for the Amazon web service

This application is an Amazon science fiction book portal that features distinct visualizations to help customers find books of interest. The portal data, including book details and customer reviews, were retrieved from Amazon through the Amazon e-commerce service API and loaded into an SQL database. The similarity between each pair of books was precalculated according to information such as publisher, year of publication, rating, ranking and category. An algorithm was designed to visualize similar books related to a search query.

import java.io.IOException;

import org.apache.commons.httpclient.DefaultHttpMethodRetryHandler;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.params.HttpMethodParams;

public class HttpClientMain {
    public static void main(String[] args) {
        HttpClient client = new HttpClient();
        // REST request: every argument travels as a query parameter of the URL.
        // Replace [yourid] with your own subscription ID before running.
        GetMethod method = new GetMethod("http://webservices.amazon.com/onca/xml?"
            + "Service=AWSECommerceService&SubscriptionId=[yourid]"
            + "&Operation=ListLookup&ListType=WishList&ListId=1Y3TY8UXS2N6O"
            + "&ResponseGroup=ListItems,ListInfo"
            + "&Style=http://www.u.arizona.edu/~chunju/text.xsl&ContentType=text/plain");
        // Retry a failed request up to three times.
        method.getParams().setParameter(HttpMethodParams.RETRY_HANDLER,
            new DefaultHttpMethodRetryHandler(3, false));
        try {
            int statusCode = client.executeMethod(method);
            if (statusCode != HttpStatus.SC_OK) {
                // Error happens
                System.err.println("Method failed: " + method.getStatusLine());
            }
            byte[] responseBody = method.getResponseBody();
            // Print the retrieved wishlist
            System.out.println(new String(responseBody));
        } catch (HttpException e) {
            System.err.println("Fatal protocol violation: " + e.getMessage());
        } catch (IOException e) {
            // The original listing breaks off in this handler; the rest is a minimal completion.
            System.err.println("Fatal transport error: " + e.getMessage());
        } finally {
            // Release the connection.
            method.releaseConnection();
        }
    }
}

After gaining experience with the APIs, developers can design their own value-added business applications and may choose to include data mining algorithms and packages in their design. Finally, they need to integrate the web APIs into their web sites to perform selected business operations, depending on the system functionality.
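
As one illustration of such a value-added step, the pairwise book similarity used by the book portal described earlier could be sketched as follows. The paper names the attributes (publisher, year of publication, rating, ranking and category) but not the formula, so the equal weighting, the ten-year window, the 1-to-5 rating scale and the rank ratio below are all assumptions, and the class BookSimilarity and its Book type are hypothetical.

```java
/** Illustrative pairwise similarity between two books, averaged over five attributes. */
final class BookSimilarity {

    /** Hypothetical book record holding only the attributes named in the text. */
    static final class Book {
        final String publisher;
        final int year;
        final double rating;    // assumed scale: 1.0 to 5.0
        final int salesRank;    // assumed: positive, 1 is the best-selling title
        final String category;

        Book(String publisher, int year, double rating, int salesRank, String category) {
            this.publisher = publisher;
            this.year = year;
            this.rating = rating;
            this.salesRank = salesRank;
            this.category = category;
        }
    }

    /** Returns a score in [0, 1]; 1 means the two books agree on every attribute. */
    static double similarity(Book a, Book b) {
        double publisher = a.publisher.equals(b.publisher) ? 1.0 : 0.0;
        double category = a.category.equals(b.category) ? 1.0 : 0.0;
        // Publication years ten or more years apart count as fully dissimilar.
        double year = 1.0 - Math.min(Math.abs(a.year - b.year), 10) / 10.0;
        // Ratings differ by at most 4.0 on the assumed 1-to-5 scale.
        double rating = 1.0 - Math.abs(a.rating - b.rating) / 4.0;
        // Ratio of the better rank to the worse rank, in (0, 1].
        double rank = (double) Math.min(a.salesRank, b.salesRank)
            / Math.max(a.salesRank, b.salesRank);
        // Equal weights are an assumption; a real system would tune them.
        return (publisher + category + year + rating + rank) / 5.0;
    }

    public static void main(String[] args) {
        // Made-up records for demonstration only.
        Book first = new Book("Publisher A", 1995, 4.5, 1200, "Science Fiction");
        Book second = new Book("Publisher B", 2005, 3.5, 2400, "Science Fiction");
        System.out.println(similarity(first, second));
    }
}
```

Because the score is symmetric, it only needs to be computed once per pair of books and can then be stored in the database for the visualization to look up.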

VI. Conclusion and Future Direction

In this paper we examined the use of open web resources for developing web usage mining research. Through these mining projects we observed that, mostly with real-time data obtained through web resources, it was possible to build innovative business applications within a short period of time, which would be difficult if the system were developed from scratch. In future, more in-depth analysis will be conducted on the use of this new instrument in information technology for the development of e-business.
