Portable Extended Cache Memory to Reduce Web Traffic


ATUL GARG

Research Scholar, MMICT & BM,

M.M. University, Mullana, Ambala, Haryana, 133207, India

DR. ANIL KAPIL

Professor, MMICT & BM,

M.M. University, Mullana, Ambala, Haryana, 133207, India

Abstract

In Web-based information systems, network traffic and slow remote servers can lead to long delays in delivering responses. Client memory is widely used to cache data and minimize future interaction with the servers. In this paper, we propose an extended cache memory to store frequently used data, which we observe consists largely of the images in a Web site. We propose that all frequently used data be stored at the user end in this extended cache memory, which may take the form of a pen drive, CD or DVD, or may be saved on the user's machine. The only difference between the traditional way of accessing the Web and ours is that the user has to provide the path of the data that resides in the extended cache memory.

Keywords: Extended Cache Memory, Network Traffic, Bandwidth, Web Performance

1. Introduction

The Web is becoming an important channel for critical information and the fundamental technology for the information systems of the most advanced companies and organizations. Many users rely on the Web for up-to-date personal, professional and business information. The transformation of the World Wide Web from a communication and browsing infrastructure into a medium for conducting personal business and e-commerce is making the quality of Web service an increasingly critical issue [1].

Web pages are becoming more complex for clients because of the inclusion of static and animated pictures, sounds, dynamically generated pages and multimedia components. This not only increases the typical total size of a Web page, but also makes it more resource-intensive to send and retrieve the information. The immediate effects are increased delays in accessing documents and overloading of the network with Web traffic. A Web page consists not of a single HTML file but of a collection of various types of files (e.g. HTML, images, JavaScript, Java Server Pages, Active Server Pages (ASP), cascading style sheets (CSS), Macromedia Shockwave), so the total time needed to download all of the page's components increases. This affects both the download time of Web pages and the network bandwidth required [6].

A direct consequence of the increasing number of clients is an increased load on the servers. In HTTP, this problem is partially handled by caching Web pages at intermediate locations. Ashutosh et al. [2] found that similar caching is possible for Web Services. The motivations for enabling caching in Web Services are as follows [2]:

(1) To decrease the load on the main server.

(2) To decrease the network traffic.

(3) To decrease the response time of a Web service.


the network path to it is congested. While this argument gives the self-interested user a motivation to exploit caches, it is worth noting that widespread use of caches also engenders a general good. If requests are intercepted by nearby caches, then fewer go to the source server, reducing load on the server and network traffic to the benefit of all users [11].

The main aim of our research is to decrease page access time by introducing an external cache. In this paper we provide a framework for an extended cache mechanism that decreases Web traffic and improves Web performance.

2. Related Work

Web sites can be slow for many reasons, but the most prevalent one is the dynamic generation of Web documents. Web pages also contain a large number of images. Both the size and the number of images can affect network and server performance, especially in peak hours when many clients access the same page. Users are not willing to tolerate latencies of more than eight to ten seconds. According to [6], generating a Web page in response to every request takes more time than simply fetching static HTML pages from a server. Dynamic generation of a Web page typically requires issuing one or more queries to a database, so database access times can easily get out of hand when the request load is high [6].

To find out how images influence the access time of Web pages, Cristina et al. [6] ran tests on different sites. To isolate the factors relating strictly to the composition of a Web page, they analyzed the structure of different Web pages from the same site. According to the test results of [6], for one page with a total size of 368.5 KB, images contribute 98.94% and the HTML file only 1.06%, although the page contains just two images. For another site, the total size is 331.6 KB, of which the HTML file accounts for 4.70%, images for 88.53% and other files for 6.77%, with 90 images in all.

These pages also have a large number of images. Across the results collected by Cristina et al. [6], the contribution of images and other files grows as the total size increases. The results also show the influence of the network path and of the performance of the server machine [6].

With the growing popularity of the World Wide Web, caching mechanisms have been proposed to relieve the Internet by reducing page waiting time and WWW traffic. More generally, a cache server can be set up to provide closer-to-home service for users, who reduce their page access time by selecting the cache server as their proxy server [3]. Hiroyuki et al. [3] monitored the page access times for different sites and analyzed their composition with respect to file size and type. Their experimental results show that images represent the largest share of Web page size, and hence account for a considerable proportion of the download time for the page.

According to [13], the very interconnectivity of the Web reduces locality of reference. That is, a user can fetch a document from a company Web site in the UK and then follow a link for the same company that is delivered from the USA. Caching a resource close to a client is a useful way of spreading server load, as well as reducing network latency and load. It is especially applicable to the delivery of resources such as video, which are relatively immutable once published. According to [7, 8, 9, 12], caching data to improve performance is a subject that has been well researched in the contexts of file systems and database systems. Caching is now common practice in the Web architecture community.

It takes the form either of client caches, which store data accessed frequently by that client during a given session (per-user caching), or of proxy caches, which store documents from the original site on behalf of the client (per-site caching).

3. Framework for Experiment


We use one computer as the server and one computer as the client in a network. We run the Apache Tomcat Web server on the server side and access the pages via the Internet. On the client machine we use Web Performance Load Tester [14] to obtain the performance results. Web Performance Load Tester is a suite of testing tools for Web servers, covering load testing, Web page performance analysis, and operating system tuning. The statistical analysis in its Load Testing Module takes the site's performance criteria and determines the number of users the site can handle. It also identifies problems with Web pages at the URL level.

We consider two cases, named Case1 and Case2, each with three Web sites. In Case1 the sites are named C1T1, C1T2 and C1T3; in Case2 they are named C2T1, C2T2 and C2T3. Each site corresponds to a different access method, as shown in Table 1. The sites span a range of sizes, but all of them contain JSP [10], HTML, images, and other types of files. The method is the same in Case1 and Case2, but in Case2 we double the total number of pages and use more images.

The Web sites in the C1T1 and C2T1 category are accessed using the traditional method. All the data is stored at the server, and to send results to the client the applications at the server look up the data in the database available at the server, as shown in Figure 1 (a).

To access the Web sites in the C1T2 and C2T2 category, the client follows the traditional method to reach the main page. The main page cannot use the data in the extended cache memory on the client side. The user then provides the path of the related data available in the extended cache memory on the client side, as shown in Figure 1 (b). We use a pen drive as the extended cache. The Web application provides a box in which the user can link the appropriate data; all the other Web pages then follow that link to display the appropriate data from the extended memory.

To access the Web sites in the C1T3 and C2T3 category, the client does not need to open the browser first. The client clicks on the main page, which is available as an HTML file in the extended memory on the client side, as shown in Figure 1 (c). The client does not need to provide the path of the related data available in the extended cache memory on the client side.
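The path-based lookup behind the extended-cache access methods can be pictured as follows. This is a minimal sketch, not the authors' JSP implementation: the function name `resolve_resource`, the constant `SERVER_BASE` and the example URL are assumptions introduced for illustration. Given the path the user supplied for the extended cache, a heavy resource is taken from that path when it is present and requested from the server otherwise.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import urljoin

# Hypothetical base URL of the site; not taken from the paper.
SERVER_BASE = "http://example.org/site/"


def resolve_resource(relative_path: str, cache_root: Optional[Path]) -> str:
    """Return a file: URI if the resource exists in the extended cache,
    otherwise the URL of the same resource on the server."""
    if cache_root is not None:
        root = cache_root.resolve()
        candidate = (root / relative_path).resolve()
        # Accept only files that really live under the user-supplied root.
        if candidate.is_file() and root in candidate.parents:
            return candidate.as_uri()
    # Fall back to the traditional method: fetch from the server.
    return urljoin(SERVER_BASE, relative_path)
```

With the pen drive mounted at some path (for instance a hypothetical `/media/pendrive`), `resolve_resource("images/logo.png", Path("/media/pendrive"))` would return a `file:` URI when the image exists there and the server URL when it does not.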

| Sr. No. | Diagram Name | Case | Number of Web Pages Used | Web Access Method |
|---|---|---|---|---|
| 1 | Figure 1 (a) | C1T1 | 3 | Whole data available at server (traditional method*) |
| | | C2T1 | 6 | |
| 2 | Figure 1 (b) | C1T2 | 3 | Images, movies and heavy data available at client |
| | | C2T2 | 6 | |
| 3 | Figure 1 (c) | C1T3 | 3 | Front page, images, movies and heavy data available at client |
| | | C2T3 | 6 | |

Table 1

* Traditionally the whole data is available at the Web server.

Figure 1 (a). Components shown: Server (Web Server, Database) and Client (Web Browser).


4. Experimental Results

The performance improvement gained by introducing a cache comes from satisfying requests directly from the cache instead of generating traffic to and from the server. To work effectively, the footprint of a cache should cover a large population of users, which increases the likelihood that two or more users will request the same resource, so that it can be returned at least once from the local cache. The cache server should also be kept as local to the end user as possible, ideally within the local area network of the organization, so as not to flood expensive or low-bandwidth long-distance links with traffic [4].
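The effect of cache footprint on hit rate can be illustrated with a toy calculation. The sketch below is an assumption-laden simplification (an unbounded cache, no expiry, and a hypothetical helper named `hit_ratio`); it only shows that a cache shared by several users who request the same resources answers a larger share of requests locally.

```python
from typing import Iterable


def hit_ratio(requests: Iterable[str]) -> float:
    """Fraction of requests answered from a shared, unbounded cache."""
    cache = set()
    hits = 0
    total = 0
    for url in requests:
        total += 1
        if url in cache:
            hits += 1        # served locally, no traffic to the server
        else:
            cache.add(url)   # first request goes to the server and is stored
    return hits / total if total else 0.0


# One user fetching three distinct resources never hits the cache ...
single_user = ["a.png", "b.png", "c.png"]
# ... while three users fetching the same three resources share it.
three_users = single_user * 3
```

Here `hit_ratio(single_user)` is 0.0, while `hit_ratio(three_users)` is 6/9, since only the first user's three requests reach the server.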

We assume that the data responsible for most of the network traffic, namely images, videos, the first page and any other large files, should be available in advance on the client side. The user can obtain this data either on an authorized extended cache memory provided by the organization or via the Internet. In the latter case, the user follows the traditional process on the first visit, and the organization provides a special link for downloading the heavy data to the user's machine. Once the data is saved at the user end, the user can use it from any memory device. He or she only has to follow the instructions to provide the main link to the heavy data on the main page of the site; the organization's site then links that data with its pages.

Updates may be conveyed to the user from time to time, either manually or automatically by the server. Performance times are measured using Web Performance Load Tester. The summary is given below in Table 2.
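The paper does not specify how updates to the extended cache are detected. As one possible illustration only, the sketch below assumes the server publishes a manifest mapping each cached file's relative path to a SHA-256 digest; the manifest format and the names `file_digest` and `stale_files` are hypothetical. Files that are missing or whose digest differs would need to be downloaded again.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """SHA-256 digest of a file's contents, as a hex string."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def stale_files(cache_root: Path, server_manifest: Dict[str, str]) -> List[str]:
    """Relative paths that are missing from the extended cache or whose
    contents differ from the digest listed in the (assumed) manifest."""
    stale = []
    for relative_path, digest in sorted(server_manifest.items()):
        local = cache_root / relative_path
        if not local.is_file() or file_digest(local) != digest:
            stale.append(relative_path)
    return stale
```

An empty result would mean the extended cache is current; otherwise only the listed files need to be fetched, which keeps update traffic proportional to what changed.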

Figure 1 (b). Components shown: Server (Web Server, Database) and Client (Web Browser, Extended Cache).

Figure 1 (c). Components shown: Server (Web Server, Database) and Client (HTML, Web Browser, Extended Cache).


| Case1 | C1T1 | C1T2 | C1T3 |
|---|---|---|---|
| Total duration | 1:02.71 | 44.26 | 17.62 |
| Pages | 8 | 8 | 6 |
| URLs | 27 | 16 | 13 |
| Images | 19 | 8 | 6 |
| Total size | 393.3 KB | 44.8 KB | 37.8 KB |
| Average page size | 49.2 KB | 5.6 KB | 6.3 KB |
| Total image size | 353.5 KB | 8.9 KB | 7.4 KB |
| Average image size | 18.6 KB | 1.1 KB | 1.2 KB |

Table 2
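The size of the improvement in Table 2 can be made explicit with a short calculation. The sketch below is illustrative: the helper names `duration_seconds` and `reduction_percent` are assumptions, and the durations are read as minutes:seconds (so 1:02.71 is 62.71 s). It expresses each extended-cache test as a percentage reduction relative to the traditional test C1T1.

```python
def duration_seconds(text: str) -> float:
    """Convert 'm:ss.ff' or 'ss.ff' into seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def reduction_percent(baseline: float, value: float) -> float:
    """Percentage by which `value` is lower than `baseline`."""
    return 100.0 * (baseline - value) / baseline


# (total duration, total size in KB) per test, copied from Table 2.
case1 = {
    "C1T1": ("1:02.71", 393.3),
    "C1T2": ("44.26", 44.8),
    "C1T3": ("17.62", 37.8),
}

base_time = duration_seconds(case1["C1T1"][0])
base_size = case1["C1T1"][1]
reductions = {
    name: (
        round(reduction_percent(base_time, duration_seconds(duration)), 1),
        round(reduction_percent(base_size, size), 1),
    )
    for name, (duration, size) in case1.items()
    if name != "C1T1"
}
```

Under this reading, C1T2 cuts total duration by about 29% and total size by about 89%, while C1T3 cuts them by about 72% and 90% respectively.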

The results of each test are different, and the total time to traverse the site differs widely between them. The total duration, total size and total image size are all greatly reduced when using our extended cache memory. We apply the same method for each test in Case2 as in Case1. For the tests of Case2, Web Performance Load Tester provides the summary shown in Table 3.

In the first experiment, we measured the effect of images held at the server side on total duration. In Case1, both extended-cache tests reduce the total size, average page size, total image size and average image size relative to the traditional test.

4.1 Statistical Analysis

We compare the total duration, total Web page size and total image size measured in each test (i.e., C1T1, C1T2, C1T3, C2T1, C2T2 and C2T3), using the results given by Web Performance Load Tester in Table 2 and Table 3. The numbers 1, 2 and 3 on the X-axis represent C1T1 & C2T1, C1T2 & C2T2 and C1T3 & C2T3 respectively.

| Case2 | C2T1 | C2T2 | C2T3 |
|---|---|---|---|
| Total duration | 1:09.06 | 55.37 | 41.43 |
| Pages | 12 | 10 | 9 |
| URLs | 39 | 24 | 22 |
| Images | 28 | 14 | 12 |
| Total size | 8.6 MB | 80.6 KB | 74.9 KB |
| Average page size | 730.5 KB | 8.1 KB | 8.3 KB |
| Total image size | 8.5 MB | 13.3 KB | 11.8 KB |
| Average image size | 1:09.06 | 55.37 | 41.43 |

Table 3
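The average rows in Tables 2 and 3 appear to be totals divided by counts (for example, 393.3 KB over 8 pages gives the 49.2 KB reported for C1T1). As a minimal sketch under that assumption, the averages for the C2T2 and C2T3 columns can be recomputed from the totals and counts in Table 3. The helper name `average_kb` is hypothetical, and the image averages computed here are derived values, not figures reported by the load tester.

```python
def average_kb(total_kb: float, count: int) -> float:
    """Average size in KB, given a total size in KB and an item count."""
    return total_kb / count


# Totals (KB) and counts copied from Table 3 for the extended-cache tests.
case2 = {
    "C2T2": {"total": 80.6, "pages": 10, "image_total": 13.3, "images": 14},
    "C2T3": {"total": 74.9, "pages": 9, "image_total": 11.8, "images": 12},
}

averages = {
    name: {
        "page": round(average_kb(row["total"], row["pages"]), 1),
        "image": round(average_kb(row["image_total"], row["images"]), 2),
    }
    for name, row in case2.items()
}
```

The recomputed page averages (8.1 KB and 8.3 KB) agree with the Average page size row of Table 3, which supports reading the average rows this way.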

Figure 2. Effect on time duration. X-axis: 1 (C1T1, C2T1), 2 (C1T2, C2T2), 3 (C1T3, C2T3); Y-axis: time duration in seconds; series: Case1, Case2.

Figure 3. Effect on total size. X-axis: 1 (C1T1, C2T1), 2 (C1T2, C2T2), 3 (C1T3, C2T3); Y-axis: total size; series: Case1, Case2.


The graphs in Figure 2, Figure 3 and Figure 4 show the total duration, total Web page size and total image size respectively. All three change dramatically across the tests.

These statistics show that when we use the extended memory, the total size handled by the server and the network stays small even after we increase the number of images and other attachments on the site. This reduces the network bottleneck, reduces latency and improves the performance of the site.

Conclusions

This paper has presented a novel approach for enhancing the computational performance of Web service discovery by applying the concept of extended cache memory. We have also implemented and demonstrated a variant of this mechanism, which performs better. The load on Web traffic is reduced, and our experiments indicate the performance boost achieved using these strategies.

Our experiments cover two cases, Case1 and Case2, each comprising three tests. We use the Windows operating system, the Internet Explorer Web browser and Apache Tomcat as the Web server. The results indicate that, in the traditional method, the total access time increases as the size of the images increases. The extended cache memory yields a substantial reduction in total time, total Web size and total image size on the server and the network. Significantly, and counter to intuition, large requests are penalized only negligibly or not at all as a result.

References

[1] Valeria Cardellini, Emiliano Casalicchio, Michele Colajanni, “A Performance Study of Distributed Architectures for the Quality of Web Services,” Proceedings of the 34th Hawaii International Conference on System Sciences, IEEE, 2001.

[2] Ashutosh Dhekne, Sagar Bijwe, “Caching in Web Services,” New Trends in Information Technology.

[3] Hiroyuki Inoue, Kanchana Kanchanasut, Suguru Yamaguchi, “An Adaptive WWW Cache Mechanism in the AI3 Network,” http://www.isoc.org/INET97/proceedings/A1/A1_2.HTM#Intro.

Figure 4. Effect on image size. X-axis: 1 (C1T1, C2T1), 2 (C1T2, C2T2), 3 (C1T3, C2T3); Y-axis: total image size in KB; series: Case1, Case2. Data on the Y-axis is multiplied by 10.


[4] Julie A. McCann, “Adaptivity for Improving Web Streaming Application Performance,” Department of Computing, Imperial College of Science, Technology and Medicine, London, UK.

[5] Swaminathan Sivasubramanian, Guillaume Pierre, and Maarten van Steen, Gustavo Alonso, “Analysis of Caching and Replication Strategies for Web Applications,” Published by the IEEE Computer Society 1089-7801/07/$25.00 © 2007.

[6] Cristina Hava Muntean, Jennifer McManis and John Murphy, “The Influence of Web Page Images on the Performance of Web Servers,” Performance Engineering Laboratory, School of Electronic Engineering, Dublin City University, Glasnevin, Dublin 9, Ireland, Publisher Springer Berlin / Heidelberg, Volume 2093/2001, Monday, January 01, 2001.

[7] Bestavros A. & Cunha C. Server-initiated Document Dissemination for the WWW. In IEEE Data Engineering Bulletin, 19(3), 3-11, 1996.

[8] Azer Bestavros, “WWW Traffic Reduction and Load Balancing through Server-Based Caching,” IEEE Parallel & Distributed Technology: Systems & Technology , Volume 5 , Issue 1, January 1997.

[9] Breslau L., Cao P., Fan L., Phillips G. & Shenker S. (1998). On the Implications of Zipf's Law for Web Caching, Computer Sciences Department, University of Wisconsin-Madison, Technical Report no. 1371, April, 1998.

[10] Phil Hanna, “The Complete Reference JSP 2.0”, Tata McGraw Hill.

[11] David Karger, Alex Sherman, Andy Berkheimer, Bill Bogstad, Rizwan Dhanidina, “Web caching with consistent hashing”, Published by Elsevier Science B.V., 1999.

[12] Giovanni Fulantelli, Riccardo Rizzo, Marco Arrigo and Rossella Corrao, “An Adaptive Open Hypermedia System on the Web”, Springer Berlin / Heidelberg, Volume 1892/2000, pp 305-310, January 25, 2008.

[13] Julie A. McCann, “Adaptivity for Improving Web Streaming Application Performance”, IGI Publishing, Hershey, PA, USA, Pages: 172-191, ISBN: 1-59140-034-1, Year of Publication: 2003.