WEB DELAY ANALYSIS AND REDUCTION BY USING LOAD BALANCING OF A DNS-BASED WEB SERVER CLUSTER


International Journal of Computers and Applications, Vol. 29, No. 1, 2007


Y.W. Bai∗ and Y.C. Wu∗

Abstract

Based on our survey of recent articles, there is little research being conducted into quantitative analysis of the load balancing of Web server clusters. In this paper, we propose a quantitative analysis for DNS-based server clusters. We also propose a two-pass load-balancing method for determining the load balance area of these clusters. The first pass uses the lookup table instead of a complicated computation for obtaining the load balancing of the Web service requests. The second pass also utilizes a lookup table by a precomputed Hessian matrix to obtain the load balancing. In addition, we compare the relative performance of dispatcher-based and DNS-dispatcher-based server clusters using queuing theory, analysis, and simulation, and we compare the measurement results using benchmarks. To increase the simulation performance we have designed a simulation module to promptly locate the load balancing, with a potential improvement of 36.58% over the average system response time.

Key Words

System performance, modelling techniques, measurement techniques, web servers, load balance, system response time

1. Introduction

The information traffic on the World Wide Web is increasing by a factor of about two to three per year, and the average login time on a user’s web page is much longer than it was before. “Web delay,” or system “response time,” is one of the main factors determining the quality of service of a web page; network bandwidth and connection strategy also contribute to the total latency. Many means of reducing web delay have been studied, such as “caching architectures,” “prefetching,” “content distribution networks,” transmission schemes, efficient content, image optimization, and “load balancing” [1–4].

Load-balancing techniques can be categorized into four major classes: client-based approach, dispatcher-based approach, DNS-based approach, and server-based approach. Most of these investigations provide a new design of the architecture and the algorithm to gain performance improvement. However, because not much is known about the quantitative analysis of various architectures, we propose the corresponding queuing model and a prompt method to determine the load balancing of DNS-based web server clusters [4, 5].

∗ Department of Electronic Engineering, Fu Jen Catholic University, Taipei, Taiwan, 242, R.O.C.; e-mail: [email protected], [email protected]

Recommended by Dr. Weimin Zheng (paper no. 202-1883)

Due to the stochastic characteristics of web server operation and internet transmission, web delay can be very random, so it can be difficult to estimate the precise delay. Our research includes finding the key factors used in estimating the average system response time of the DNS-based approach and using one of these load-balancing approaches to reduce web delay [2]. Based on the load-balancing mechanism of a DNS-based approach, we provide a mathematical model to analyze and simulate results, measure the average system response time of a web server cluster, and find some of the dependent factors [1–5]. First, we choose a DNS-based web server cluster because it is commonly used and has not been studied in any previous quantitative analysis. For that web server cluster we have designed a queuing model to conduct a quantitative analysis and to find a mathematical relation between the web delay and various dispatch factors. Based on these results, one can adjust the system operation parameters to obtain the minimum possible web delay.

Second, we use the mathematical software package QNAT (queue network analysis tool) [6–8] to conduct simulation. In addition, we propose a mathematical model based on a queuing model to compute either the web delay or the average system response time E(t). The mathematical model lets us know the characteristics of the web delay E(t). Then we use the QNAT and TK Solver software packages to verify whether our mathematical model is correct. Furthermore, we have designed a new simulation tool to reduce the simulation time by finding the load-balancing condition of a cluster of web servers. We also use a modified version of the Hessian matrix, H[E(t)], to show how we locate the operation region of the load balancing of a cluster of web servers. Finally, we provide a precomputation table to determine how to find load balancing without using excessively complicated computation.


The rest of this paper is organized as follows. In Section 2 the main factors of web delay are described. In Section 3 several load-balancing schemes are given and reasons for choosing the DNS-based architecture are presented. Section 4 describes the queuing model. Section 5 describes the load-balancing scheme of a cluster of web servers on a DNS cluster architecture with a Hessian matrix utilizing two-pass load balancing. Section 6 provides the performance comparison between dispatcher-based and DNS-based architectures. Section 7 shows the experimental results of the average system response time with load balancing using a DNS-based web server cluster. Some conclusions from our research are drawn in Section 8.

2. The Main Factors of Web Delay

2.1 Latency Sources of Systems

When we use a web browser or access a version of Windows, a series of computer system and network processes takes place in the background. Every process may have a random delay time of about a millisecond to a second, depending on different operational situations. For example, in a client-server system one first sends a request to the browser in the website, where the browser will service the accepted request. Most of the web contents must wait for responses from other modules, such as software or hardware. Even in the best situations, time is required to execute their functions, whether limited by the service bandwidth of the link or by device services. So “latency sources” can be the cause of most inherent delays and can make it difficult to improve the performance of common computer networks [1–3].

2.2 DNS Lookup Table

When we go to a website we must remember the domain name, such as http://www.fju.edu.tw. In a typical computer network it is necessary to transform the domain name to its specific IP address. We also find that 20% of the domain name transformations take about 1∼16 seconds due to retransfer time. As the number of clients increases, the DNS lookup table delay time becomes longer. So the DNS lookup is also a prime factor causing web delay [1–3].

2.3 Connection

The next source of web delay is building up the TCP connection to route the web service to the destination based on the IP datagram. A routing algorithm provides a solution to determine how to transfer the service request to the next stage or to the wireless-level. So a router also contributes to the delay time. We further found that about 40% of the connections take about 200∼10,000 milliseconds [1–3].

2.4 Server-Side Processing

Due to the increase of websites and new services, the working load of web servers increases, even for personal business affairs including dynamic content. The dynamic data need to make more I/O and CPU requests to the servers. Therefore the server frequently becomes a bottleneck, and server-side processing also affects the bandwidth that a web session needs [1–3].

2.5 Document Transfer

Depending on content size, available bandwidth, setting of proxy servers and routing, document transfer will be a major factor causing delay time. For example, the transfer time of 1 MB of data will be longer than that of 1 KB of data for the same transmission rate. Therefore the document transfer time can be an uncontrollable factor of delay [2].

3. Previously Proposed Approaches

3.1 Client-Based Approach

The client-based approach routes the request to one of the destination nodes of the web cluster, that is, to a specific server, based on the client-side software. This approach is divided into web client and client-side proxies. The state management of the client-based approach is very difficult when network traffic increases. As the client-based approach is not often used, we will not consider using it [1–3].

3.2 DNS-Based Approach

Due to the increase of web traffic and longer URL names, we hope to make a simple virtual interface to communicate with others. This interface will make it easier for users who are not experts. A DNS-based approach just transforms the site name of the distributed nodes of the web system into an IP address. So, we must design a solution with a couple of steps as shown in Fig. 1 in order to distribute the requests of all clients to the most suitable server. Of course, many intermediate name servers can cache logical-to-IP-address mapping to reduce the amount of network traffic between a client and the cluster DNS. Therefore DNS-based TTL (time-to-live) can be used to determine how much time will be required to find an intermediate name server; otherwise the DNS cluster cannot select a suitable web server. This approach is a good idea, but the cluster DNS server often causes additional delays, as too much traffic passes through the intermediate name server [1–7].


3.3 Dispatcher-Based Approach

To centralize the scheduling request and control client-request routing completely, a dispatcher-based approach has been designed. The routing request among servers is transparent, but this is the same when dealing with addresses at the URL level on DNS. A typical dispatcher has a simple virtual IP address (IP-SVA). The dispatcher defines its own personal address based on its server and its different protocol levels based on distinct structures such as packet rewriting, packet forwarding, or HTTP redirection. The dispatcher selects an algorithm to choose a specific web server to balance incoming requests to minimize processing delay, as shown in Fig. 2 [1–3, 9–12].

Figure 2. Dispatcher-based approach architecture.

3.4 Server-Based Approach

The server-based approach uses a two-level system of dispatching. This approach first distributes the client requests on the web DNS to the web-server nodes. Then every server distributes the request to other system servers again. Its advantage is that it conducts load balancing more effectively. On the other hand, implementation and management are more inconvenient [1, 4].

4. Queuing Models

In real situations packets will be lost due to collision or lack of sufficient buffer. Collision is caused by more than two packets being transferred at the same time on the MAC layer. Lack of sufficient buffer is often caused by the service rate being less than the arrival rate. To observe the effect of lack of sufficient buffer we assume that the MAC layer is ideal in our paper. This means packets will only be lost by lack of sufficient buffer, not by collision. There is one intermediate name server, two DNS servers, and two web servers in our architecture.

To provide a quantitative analysis of the system performance, we propose a corresponding queuing model for the DNS-based architecture. In this model there are two classes of arrival rates, one for the address request and the other for the document request. The domain name server provides service for the address request, and then the client sends the document request to the web server designated by the domain name server. This system is modelled as shown in Fig. 3, and the system parameters are shown in the following.

Figure 3. DNS-based queuing model.

To simulate network delay, we add a queue on the feedback path as shown in Fig. 3, with feedback ratios “i” and “j” to approximate the real situation. The parameters “i” and “j” are the DNS Server 1 and DNS Server 2 packet loss ratios, respectively. For example, if i is 10% and λ₁ is 100 (packets), 10 packets will be fed back from the DNS server to the intermediate name server for retransfer.

Definition of system parameters:

λ₁: Address request rate (job/sec)

λ₂: Document request rate (job/sec)

µ₀: Intermediate name server service rate (job/sec)

µ₁: Cluster domain name server service rate (job/sec)

µ₂: Web server service rate (job/sec)

a: Dispatch ratio, 0 < a ≤ 1

i, j: Feedback ratio, 0 < i, j ≤ 1

λ₂ = kλ₁

E₁(t): The front-end of the domain name server response time (sec)

E₂(t): The back-end of the domain name server response time (sec)

E(t): DNS-based system response time (sec), E(t) = E(n)/λ

E(n) = ρ_n/(1 − ρ_n): The average number of tasks or jobs on queue

To simplify the simulation, we use an infinite buffer length so that there will be no blocking probability. In addition, the TTL scheduling method is round robin.
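The relations E(n) = ρ/(1 − ρ) and E(t) = E(n)/λ listed in the parameter definitions can be illustrated for a single queue. The following is a minimal sketch, assuming one M/M/1 queue with an infinite buffer; the function name `mm1_metrics` and the sample rates (20 and 100 job/sec) are chosen here for illustration only and are not part of the original model description.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Return (rho, E(n), E(t)) for a single M/M/1 queue with infinite buffer."""
    if arrival_rate <= 0 or service_rate <= arrival_rate:
        raise ValueError("need 0 < arrival_rate < service_rate for a stable queue")
    rho = arrival_rate / service_rate      # utilization of the queue
    mean_jobs = rho / (1.0 - rho)          # E(n) = rho / (1 - rho)
    mean_time = mean_jobs / arrival_rate   # E(t) = E(n) / lambda
    return rho, mean_jobs, mean_time


# Illustrative values only: lambda = 20 job/sec, mu = 100 job/sec.
rho, e_n, e_t = mm1_metrics(arrival_rate=20.0, service_rate=100.0)
print(f"rho={rho:.2f}  E(n)={e_n:.4f}  E(t)={e_t:.4f} sec")
```

For a single queue this reduces to E(t) = 1/(µ − λ), which is the form of each term in the response-time equations of Section 5.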

4.1 QNAT Simulation

Usually, three performance indexes are used to compare simulation results. Here we use the system response time E(t) as the comparison index for the simulation.

There are three cases: without network delay, with network delay, and with a variation of feedback and network delay. Among the three we find a way to locate the load balancing by using the friendly queuing simulation tool QNAT [1, 6–8].

To simulate the performance we need to assign a specific set of parameters: an intermediate name server service rate of 100 (job/sec), a cluster DNS service rate of 40∼100 (job/sec), and a web server service rate of 500 (job/sec), with ρ ≤ 1.


4.2 Without Network Delay

In the case of “without network delay” we just ignore the network delay. From Table 1 and Fig. 4 of the simulation results we find that there is a load balance and a minimum system response time when the dispatch ratio a = 0.5, because the service rates of servers 1 and 2 in Fig. 4 are equal.

Table 1

Front-End System Response Time of DNS-Based Architecture (without Network Delay), µ₀ = µ₁ = µ₂ = 100 (job/sec), i = j = 0.2

| Arrival Rate λ₁ (job/sec) | Dispatch Ratio “a” | Front-End System Response Time E₁(t) (sec) |
|---|---|---|
| 20 | 0.1 | 0.0324648 |
| 20 | 0.2 | 0.0317982 |
| 20 | 0.3 | 0.0313268 |
| 20 | 0.4 | 0.0310458 |
| 20 | 0.5 | 0.0309524 |
| 20 | 0.6 | 0.0310458 |
| 20 | 0.7 | 0.0313268 |
| 20 | 0.8 | 0.0317982 |
| 20 | 0.9 | 0.0324648 |

Figure 4. Front-end system response time of DNS-based architecture (without network delay), µ₀ = µ₁ = µ₂ = 100 (job/sec), i = j = 0.2, λ₁ = 20 (job/sec).
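The dependence of the front-end response time on the dispatch ratio can be reproduced from equation (1) of Section 5. The sketch below is illustrative, not the QNAT simulation itself: it assumes a single service rate `mu` for all front-end queues (as in Table 1, where µ₀ = µ₁ = µ₂ = 100) and sets the network-delay term to zero; the function name `front_end_response_time` is introduced here for illustration.

```python
def front_end_response_time(a, lam1, mu, i, j, w_q=0.0):
    """E1(t) following (1): three queueing terms plus the network-delay term 2*Wq."""
    # Common factor (1 - a*i + a*j - j) * mu in the denominators of (1).
    eff = (1.0 - a * i + a * j - j) * mu
    return (1.0 / (eff - lam1)
            + a / (eff - a * lam1)
            + (1.0 - a) / (eff - (1.0 - a) * lam1)
            + 2.0 * w_q)


# Table 1 setting: lambda_1 = 20, mu = 100 (job/sec), i = j = 0.2, no network delay.
ratios = [k / 10 for k in range(1, 10)]
times = {a: front_end_response_time(a, lam1=20.0, mu=100.0, i=0.2, j=0.2)
         for a in ratios}
for a, t in times.items():
    print(f"a={a:.1f}  E1(t)={t:.7f} sec")

best = min(times, key=times.get)
print("minimum at a =", best)
```

Under these assumptions the computed values agree with Table 1 to the printed precision, and the minimum falls at a = 0.5.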

4.3 With Network Delay

Because there is a delay in network transmission, we should add the network delay to the simulation. From Fig. 5 of the simulation results we see that there is also a load balance for the dispatch ratio of 0.5, no matter what the network delay is in the model, as shown in Fig. 3. Fig. 6 shows the front-end system response time with variable feedback and network delay.

Figure 5. The front-end system response time of DNS-based architecture (with network delay), µ₀ = µ₁ = µ₂ = 100 (job/sec), i = j = 0.2.

Figure 6. The front-end system response time E₁(t) (sec) with variable feedback and network delay, λ = 20 (job/sec), µ₁ = 100 (job/sec); µ₁ increases by 10 (job/sec) and i, j decrease by 0.01.

5. Mathematical Models

Based on Fig. 3 we derive E₁(t), E₂(t), and E(t) as shown in (3), (4), and (5). The derivation can be seen in [1]. The basic steps are: finding the utilization ρ of each queue, finding the average number of tasks or jobs E(n) = ρ_n/(1 − ρ_n) on each queue, and then finding the average time E(t) = E(n)/λ for a single task to get through the queue. Hence we obtain the following equations (1) through (4), where W_q represents the network delay.

$$
\begin{aligned}
E_1(t) ={}& \frac{1}{(1 - ai + aj - j)\mu_1 - \lambda_1} + \frac{a}{(1 - ai + aj - j)\mu_1 - a\lambda_1} \\
&+ \frac{1 - a}{(1 - ai + aj - j)\mu_1 - (1 - a)\lambda_1} + 2W_q
\end{aligned}
\qquad (1)
$$

$$
W_q = \frac{1}{\lambda_1}\,\frac{\rho^2}{1 - \rho} = \frac{ai}{(1 - ai + aj - j)\mu_\infty - ai\lambda_1} + \frac{(1 - a)j}{(1 - ai + aj - j)\mu_\infty - (1 - a)j\lambda_1} \qquad (2)
$$

$$
\begin{aligned}
E_1(t) ={}& \frac{1}{(1 - ai + aj - j)\mu_1 - \lambda_1} + \frac{a}{(1 - ai + aj - j)\mu_1 - a\lambda_1} \\
&+ \frac{1 - a}{(1 - ai + aj - j)\mu_1 - (1 - a)\lambda_1} + \frac{ai}{(1 - ai + aj - j)\mu_\infty - ai\lambda_1} \\
&+ \frac{(1 - a)j}{(1 - ai + aj - j)\mu_\infty - (1 - a)j\lambda_1}
\end{aligned}
\qquad (3)
$$

$$
E_2(t) = \frac{2}{\mu_2 - \lambda_2} = \frac{2}{\mu_2 - k\lambda_1} \qquad (4)
$$

If we have the same service rate of the DNS server cluster and a very small network delay, then we can neglect W_q. Hence we obtain the average system response time E(t) as follows:

$$
\begin{aligned}
E(t) ={}& \frac{1}{(1 - ai + aj - j)\mu_1 - \lambda_1} + \frac{a}{(1 - ai + aj - j)\mu_1 - a\lambda_1} \\
&+ \frac{1 - a}{(1 - ai + aj - j)\mu_1 - (1 - a)\lambda_1} + \frac{ai}{(1 - ai + aj - j)\mu_\infty - ai\lambda_1} + \frac{2}{\mu_2 - k\lambda_1}
\end{aligned}
\qquad (5)
$$

5.1 Load Balancing on Cluster DNS Using Hessian Matrix

From (1) we learn that E₁(t) is similar to the model from the previous research for the dispatcher-based architecture [1]. There is one small difference: the network delay. For a given specific set of parameters, the network delay is 0.0000025 (sec), as shown in (6).
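The quoted network delay can be checked numerically from (2). This is a minimal sketch using the parameter values substituted in (6): a = 0.5, i = j = 0.2, λ₁ = 20 (job/sec), and a network service rate of 100000 (job/sec). The function name `network_delay` and the argument name `mu_net` (standing for µ_∞) are illustrative.

```python
def network_delay(a, i, j, lam1, mu_net):
    """Wq following (2): the sum of the two feedback-path terms."""
    # Common factor (1 - a*i + a*j - j) * mu_inf in both denominators.
    eff = (1.0 - a * i + a * j - j) * mu_net
    return (a * i / (eff - a * i * lam1)
            + (1.0 - a) * j / (eff - (1.0 - a) * j * lam1))


# Parameter values as substituted in (6).
w_q = network_delay(a=0.5, i=0.2, j=0.2, lam1=20.0, mu_net=100000.0)
print(f"Wq = {w_q:.7f} sec")
```

With such a large network service rate, the result is a small constant offset compared with the queueing terms of (1).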

Hence we use the Hessian matrix [1] to form a precomputed table instead of a complicated computation for finding the load balancing of the system. From a comparison of the simulation results we also find a fixed difference of 0.0000025 (sec). Hence the system response time E(t) can be represented by (7).

$$
\begin{aligned}
W_q = \frac{1}{\lambda_1}\,\frac{\rho^2}{1 - \rho}
={}& \frac{0.5 \times 0.2}{(1 - 0.5 \times 0.2 + 0.5 \times 0.2 - 0.2) \times 100000 - 0.5 \times 0.2 \times 20} \\
&+ \frac{(1 - 0.5) \times 0.2}{(1 - 0.5 \times 0.2 + 0.5 \times 0.2 - 0.2) \times 100000 - (1 - 0.5) \times 0.2 \times 20} = 0.0000025
\end{aligned}
\qquad (6)
$$

$$
\begin{aligned}
E(t) ={}& \frac{1}{(1 - ai + aj - j)\mu_1 - \lambda_1} + \frac{a}{(1 - ai + aj - j)\mu_1 - a\lambda_1} \\
&+ \frac{1 - a}{(1 - ai + aj - j)\mu_1 - (1 - a)\lambda_1} + \frac{ai}{(1 - ai + aj - j)\mu_\infty - ai\lambda_1} + 0.0000025 + \frac{2}{\mu_2 - k\lambda_1}
\end{aligned}
\qquad (7)
$$

5.2 Two-Pass Load Balancing

DNS-based architecture needs to perform two-pass requests while providing web services. The first pass is the load balancing of the address request, whereas the second is the load balancing of the document request based on the distribution of the address request. We find that the service rate of the front end will affect the feedback ratio and the document request rate. If the feedback ratio of the address request increases and the load balancing of the DNS side is unsatisfactory, the document request rate decreases. Conversely, if the load balancing of the DNS side is sufficient, the feedback ratio is reduced and the document request rate increases. Therefore, according to the service request rate, we adjust the DNS service rate and locate the minimum system response time. The procedure is shown in Fig. 7. Here we will define the relationship between “k” and the feedback ratio “j.” To obtain a simulation example, we provide a specific set of parameters. We set the service arrival rate of the DNS cluster at 100 (job/sec) and λ₂ = 22λ₁.

Figure 7. Two-pass flow chart for locating the load balancing of DNS-based architecture.


In addition, when the service rate decreases every 10 (job/sec), “k” decreases by 1. Hence the value of “k” is located in the interval 22∼16 because the service rate of the DNS cluster ranges from 40 to 100 (job/sec). As in the previous section, we also assume that the feedback ratio is reduced by a step size of 0.01 when the service rate is increased by a step size of 10 (job/sec). Therefore we can obtain (8) and (9) as the relationship of “k”, “j” and service rate of the DNS cluster as shown in Fig. 3.

$$
k = 22 - (0.2 - j) \times 100 \qquad (8)
$$

$$
\mu_1 = 40 + (j - 0.14) \times 100 \qquad (9)
$$

Because we use the same set of cluster DNS, the back-end of web servers has a set of similar parameters. Hence we can rewrite E₁(t), E₂(t), and E(t) as in (10) to (12).

$$
E_1(t) = \frac{1}{(1 - j) \times 100 - \lambda_1} + \frac{1}{(1 - j)(40 + (j - 0.14) \times 100) - 0.5\lambda_1} \qquad (10)
$$

$$
E_2(t) = \frac{2}{\mu_2 - (22 - (0.2 - j) \times 100)\lambda_1} \qquad (11)
$$

$$
E(t) = \frac{1}{(1 - j) \times 100 - \lambda_1} + \frac{1}{(1 - j)(40 + (j - 0.14) \times 100) - 0.5\lambda_1} + \frac{2}{\mu_2 - (22 - (0.2 - j) \times 100)\lambda_1} \qquad (12)
$$

According to the mathematical model and the search procedure for load balancing shown in Fig. 7, we can locate the load balancing quickly. As we examine Table 2, we find that when µ₁ = µ₂ = 90 there is a minimum system response time E(t). According to Table 2 we reduce the system response time by (0.06767926 − 0.0429222)/0.06767926 = 36.58%.

Table 2

System Response Time of DNS-Based Architecture, λ = 20 (job/sec), µ₁ = 40∼100 (job/sec); µ₁ increases by 10 (job/sec) and i, j decrease by 0.01

| Dispatch Ratio “a” | Service Rate of DNS Cluster µ₁ (job/sec) | System Response Time of Front-End of DNS-Based Architecture (E₁(t)) | System Response Time of Back-End of DNS-Based Architecture (E₂(t)) | System Response Time of DNS-Based Architecture (E(t)) |
|---|---|---|---|---|
| 0.5 | 40 | 0.0621237 | 0.00555556 | 0.06767926 |
| 0.5 | 50 | 0.0491827 | 0.00625 | 0.0554327 |
| 0.5 | 60 | 0.0416478 | 0.00714286 | 0.04879066 |
| 0.5 | 70 | 0.0366726 | 0.00833333 | 0.04500593 |
| 0.5 | 80 | 0.0331094 | 0.01 | 0.0431094 |
| 0.5 | 90 | 0.0304222 | 0.0125 | 0.0429222 |
| 0.5 | 100 | 0.028311 | 0.016667 | 0.044978 |
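The search over Table 2 and the 36.58% figure can be reproduced directly from the tabulated values. The sketch below only re-does the arithmetic on the published numbers; it does not re-derive them from the model, and the names `TABLE_2` and `locate_minimum` are introduced here for illustration.

```python
# (mu1, E1(t), E2(t)) rows copied from Table 2 (dispatch ratio a = 0.5).
TABLE_2 = [
    (40, 0.0621237, 0.00555556),
    (50, 0.0491827, 0.00625),
    (60, 0.0416478, 0.00714286),
    (70, 0.0366726, 0.00833333),
    (80, 0.0331094, 0.01),
    (90, 0.0304222, 0.0125),
    (100, 0.028311, 0.016667),
]


def locate_minimum(rows):
    """Return (mu1, E(t)) for the row with the smallest total response time."""
    totals = [(mu1, e1 + e2) for mu1, e1, e2 in rows]
    return min(totals, key=lambda row: row[1])


best_mu1, best_time = locate_minimum(TABLE_2)
baseline = TABLE_2[0][1] + TABLE_2[0][2]   # E(t) at mu1 = 40 (job/sec)
improvement = (baseline - best_time) / baseline
print(f"minimum E(t) = {best_time:.7f} sec at mu1 = {best_mu1} (job/sec)")
print(f"improvement over mu1 = 40: {improvement:.2%}")
```

The front-end time E₁(t) falls and the back-end time E₂(t) rises as µ₁ grows, which is why the total has an interior minimum.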

5.3 Load Balancing of the Whole System Using Hessian Matrix

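Previously we proposed a lookup table method to locate the load balancing for E₁(t). Here we extend a single-pass method to two-pass load balancing due to the dependence of E₁(t) and E₂(t) on each other. For both E₁(t) and E₂(t) we provide a first-pass mechanism that uses the lookup table method to locate the load balancing and then provide the second-pass load balancing by using a precomputed table of the Hessian matrix for a document request [1, 13]. First, we take a partial derivative of (12) with respect to j and λ₁:

$$
\frac{\partial E(t)}{\partial j} = \frac{100}{(100 - 100j - \lambda_1)^2} + \frac{200j - 74}{(-100j^2 + 74j + 26 - 0.5\lambda_1)^2} + \frac{200\lambda_1}{(500 - 2\lambda_1 - 100j\lambda_1)^2} \qquad (13)
$$

$$
\frac{\partial E(t)}{\partial \lambda_1} = \frac{1}{(100 - 100j - \lambda_1)^2} + \frac{0.5}{(-100j^2 + 74j + 26 - 0.5\lambda_1)^2} + \frac{4 + 200j}{(500 - 2\lambda_1 - 100j\lambda_1)^2} \qquad (14)
$$

Second, we take a partial derivative of (13) and (14) with respect to j and λ₁:

$$
\frac{\partial}{\partial j}\!\left(\frac{\partial E(t)}{\partial j}\right) = \frac{20000}{(100 - 100j - \lambda_1)^3} + \frac{60000j^2 - 44400j + 16152 - 100\lambda_1}{(-100j^2 + 74j + 26 - 0.5\lambda_1)^3} + \frac{40000\lambda_1^2}{(500 - 2\lambda_1 - 100j\lambda_1)^3} \qquad (15)
$$

$$
\frac{\partial}{\partial \lambda_1}\!\left(\frac{\partial E(t)}{\partial j}\right) = \frac{200}{(100 - 100j - \lambda_1)^3} + \frac{200j - 74}{(-100j^2 + 74j + 26 - 0.5\lambda_1)^3} + \frac{10000 + 400\lambda_1 + 20000j\lambda_1}{(500 - 2\lambda_1 - 100j\lambda_1)^3} \qquad (16)
$$

$$
\frac{\partial}{\partial \lambda_1}\!\left(\frac{\partial E(t)}{\partial \lambda_1}\right) = \frac{2}{(100 - 100j - \lambda_1)^3} + \frac{0.5}{(-100j^2 + 74j + 26 - 0.5\lambda_1)^3} + \frac{16 + 1600j + 40000j^2}{(500 - 2\lambda_1 - 100j\lambda_1)^3} \qquad (17)
$$

$$
\frac{\partial}{\partial j}\!\left(\frac{\partial E(t)}{\partial \lambda_1}\right) = \frac{200}{(100 - 100j - \lambda_1)^3} + \frac{200j - 74}{(-100j^2 + 74j + 26 - 0.5\lambda_1)^3} + \frac{200 + 800\lambda_1 + 400j\lambda_1}{(500 - 2\lambda_1 - 100j\lambda_1)^3} \qquad (18)
$$

Third, we form a Hessian matrix:

$$
H[E(t)] =
\begin{bmatrix}
\dfrac{\partial}{\partial j}\!\left(\dfrac{\partial E(t)}{\partial j}\right) & \dfrac{\partial}{\partial \lambda_1}\!\left(\dfrac{\partial E(t)}{\partial j}\right) \\[2ex]
\dfrac{\partial}{\partial j}\!\left(\dfrac{\partial E(t)}{\partial \lambda_1}\right) & \dfrac{\partial}{\partial \lambda_1}\!\left(\dfrac{\partial E(t)}{\partial \lambda_1}\right)
\end{bmatrix}
\qquad (19)
$$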
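As an illustrative cross-check of a Hessian-based minimum test, the second derivatives can be estimated numerically rather than analytically. The sketch below swaps in central finite differences for the closed-form expressions (13) to (18), so it is not the authors' procedure. It applies the standard criterion that a 2 × 2 symmetric Hessian is positive definite when both leading principal minors are positive. The function `response_time` evaluates E(t) as written in (12) with µ₂ = 500 (job/sec) from Section 4.1; the evaluation point (j = 0.14, λ₁ = 20) and the step size `h` are assumptions made for this example.

```python
def hessian_2x2(f, x, y, h=1e-4):
    """Central-difference estimate of the 2x2 Hessian of f at (x, y)."""
    fxx = (f(x + h, y) - 2.0 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2.0 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h**2)
    return [[fxx, fxy], [fxy, fyy]]


def is_positive_definite(hess):
    """Both leading principal minors (1x1 and 2x2) must be positive."""
    minor1 = hess[0][0]
    minor2 = hess[0][0] * hess[1][1] - hess[0][1] * hess[1][0]
    return minor1 > 0 and minor2 > 0


def response_time(j, lam1, mu2=500.0):
    """E(t) as written in (12); mu2 = 500 job/sec is taken from Section 4.1."""
    mu1 = 40.0 + (j - 0.14) * 100.0    # relation (9)
    k = 22.0 - (0.2 - j) * 100.0       # relation (8)
    return (1.0 / ((1.0 - j) * 100.0 - lam1)
            + 1.0 / ((1.0 - j) * mu1 - 0.5 * lam1)
            + 2.0 / (mu2 - k * lam1))


# Assumed evaluation point for illustration: j = 0.14, lambda_1 = 20 job/sec.
hess = hessian_2x2(response_time, 0.14, 20.0)
print("numerical Hessian:", hess)
print("positive definite:", is_positive_definite(hess))
```

A finite-difference Hessian is symmetric by construction, so it tests only the curvature of E(t) itself; it is a sanity check on the shape of the surface, not a replacement for the tabulated derivation.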

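To locate the minimum system response time E(t) from (19), the leading principal minors of (19) of order 1 and order 2, |1 ∗ 1| and |2 ∗ 2|, must be greater than zero. If j = 0.14, then:

$$
\begin{aligned}
\frac{\partial}{\partial j}\!\left(\frac{\partial E(t)}{\partial j}\right) ={}& \frac{20000}{(100 - 100 \times 0.14 - \lambda_1)^3} + \frac{60000 \times (0.14)^2 - 44400 \times (0.14) + 16152 - 100\lambda_1}{(-100 \times (0.14)^2 + 74 \times (0.14) + 26 - 0.5\lambda_1)^3} \\
&+ \frac{40000\lambda_1^2}{(500 - 2\lambda_1 - 100 \times (0.14)\lambda_1)^3}
\end{aligned}
\qquad (20)
$$

$$
\begin{aligned}
\frac{\partial}{\partial \lambda_1}\!\left(\frac{\partial E(t)}{\partial j}\right) ={}& \frac{200}{(100 - 100 \times (0.14) - \lambda_1)^3} + \frac{200 \times (0.14) - 74}{(-100 \times (0.14)^2 + 74 \times (0.14) + 26 - 0.5\lambda_1)^3} \\
&+ \frac{10000 + 400\lambda_1 + 20000 \times (0.14)\lambda_1}{(500 - 2\lambda_1 - 100 \times (0.14)\lambda_1)^3}
\end{aligned}
\qquad (21)
$$

$$
\begin{aligned}
\frac{\partial}{\partial \lambda_1}\!\left(\frac{\partial E(t)}{\partial \lambda_1}\right) ={}& \frac{2}{(100 - 100 \times 0.14 - \lambda_1)^3} + \frac{0.5}{(-100 \times (0.14)^2 + 74 \times (0.14) + 26 - 0.5\lambda_1)^3} \\
&+ \frac{16 + 1600 \times (0.14) + 40000 \times (0.14)^2}{(500 - 2\lambda_1 - 100 \times (0.14)\lambda_1)^3}
\end{aligned}
\qquad (22)
$$

$$
\begin{aligned}
\frac{\partial}{\partial j}\!\left(\frac{\partial E(t)}{\partial \lambda_1}\right) ={}& \frac{200}{(100 - 100 \times 0.14 - \lambda_1)^3} + \frac{200 \times 0.14 - 74}{(-100 \times (0.14)^2 + 74 \times 0.14 + 26 - 0.5\lambda_1)^3} \\
&+ \frac{200 + 800\lambda_1 + 400 \times 0.14\lambda_1}{(500 - 2\lambda_1 - 100 \times 0.14\lambda_1)^3}
\end{aligned}
\qquad (23)
$$

From the existing criteria of the minimum values of a Hessian matrix, we have:

$$
\frac{\partial}{\partial j}\!\left(\frac{\partial E(t)}{\partial j}\right) \times \frac{\partial}{\partial \lambda_1}\!\left(\frac{\partial E(t)}{\partial \lambda_1}\right) - \frac{\partial}{\partial \lambda_1}\!\left(\frac{\partial E(t)}{\partial j}\right) \times \frac{\partial}{\partial j}\!\left(\frac{\partial E(t)}{\partial \lambda_1}\right) > 0 \qquad (24)
$$

We find the solution interval 29.41 ≤ λ₁ < 31.25 from the intersection of the solutions from (20) to (24). In addition we obtain both the “j” located in the interval [0.14∼0.2] and the relationship between j, λ₁, and µ₁; then we obtain similar results from (20) to (24) as shown in Table 3.

Table 3

Two-Pass Lookup Table

| Arrival Rate λ₁ (job/sec) | Service Rate of Cluster DNS µ₁ (job/sec) | System Response Time of DNS-Based Architecture |
|---|---|---|
| 29.41 ≤ λ₁ < 31.25 | 40 | Has minimum system response time |
| 27.77 ≤ λ₁ < 29.41 | 50 | Has minimum system response time |
| 26.31 ≤ λ₁ < 27.77 | 60 | Has minimum system response time |
| 25 ≤ λ₁ < 26.31 | 70 | Has minimum system response time |
| 23.81 ≤ λ₁ < 25 | 80 | Has minimum system response time |
| 22.73 ≤ λ₁ < 23.81 | 90 | Has minimum system response time |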
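The precomputed table replaces the derivative computation with a range lookup at run time. A minimal sketch of that lookup follows, using the intervals of Table 3; the names `TABLE_3` and `lookup_service_rate` are illustrative, and returning `None` for arrival rates outside the tabulated range is a choice made here, since the paper does not say how such rates are handled.

```python
# (lower bound, upper bound, mu1) rows copied from Table 3.
# Bounds are on the arrival rate lambda_1 in job/sec; each interval is [low, high).
TABLE_3 = [
    (29.41, 31.25, 40),
    (27.77, 29.41, 50),
    (26.31, 27.77, 60),
    (25.00, 26.31, 70),
    (23.81, 25.00, 80),
    (22.73, 23.81, 90),
]


def lookup_service_rate(lam1):
    """Return the tabulated DNS service rate mu1 for lam1, or None if out of range."""
    for low, high, mu1 in TABLE_3:
        if low <= lam1 < high:
            return mu1
    return None


for rate in (30.0, 26.0, 23.0, 20.0):
    print(f"lambda_1 = {rate:5.2f} job/sec -> mu1 = {lookup_service_rate(rate)}")
```

A linear scan is sufficient for six rows; a larger table could use a sorted list with binary search instead.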

6. Comparison of Dispatcher-Based and DNS-Based Architecture

In the simulation results with a specific set of parameters shown in Fig. 8, where the arrival rate becomes 80 (job/sec), the system response time of dispatcher-based architecture may be very long as compared to the DNS-based architecture [1]. However, when the arrival rate is small, the system response time of dispatcher-based architecture will be very short. Furthermore, when the arrival rate becomes 63 (job/sec), the system response times of dispatcher-based and DNS-based architecture will be equal. Hence, for the small arrival rate in a LAN, we will choose dispatcher-based architecture instead of DNS-based architecture. Conversely, for the large arrival rate on the Internet, we will choose DNS-based architecture instead of dispatcher-based architecture, although a DNS-based architecture has a higher management cost [1].

Figure 8. A comparison of load balancing between dispatcher-based approach and DNS-based approach.

Figure 9. Testing system for the DNS-based web server cluster.

Table 4

System Specification of Web Servers

| | CPU | RAM (MB) | O.S. | Network (Mb/sec) |
|---|---|---|---|---|
| Intermediate Name Server | One P4 1.6 GHz | 768 | Windows 2000 Server | 100 |
| DNS1 | One P4 1.6 GHz | 768 | Windows 2000 Server | 100 |
| DNS2 | One P4 1.6 GHz | 768 | Windows 2000 Server | 100 |
| Web Server1 | One P4 1.4 GHz | 512 | Windows 2000 Server | 100 |
| Web Server2 | One P4 1.4 GHz | 512 | Windows 2000 Server | 100 |

7. Measurement of Load Balancing Using DNS-Based Web Server Cluster

To verify the characteristics of the proposed model shown in Fig. 3, we designed and implemented a testing system with the load-balancing mechanism of a DNS-based web server cluster, as shown in Fig. 9. In addition, we use the measurement tool “webserver stress tool standard edition” to measure the system response time of the web requests. Table 4 shows the system specification [12].

First, we must set the parameters for the testing system as shown in Table 5 with and without load-balancing mechanism.

Second, by measurement, we obtain the system response time with respect to “number of times” as shown in Fig. 10, which compares the settings with and without load balancing. We also learn from the specific measurement results and (25) that we have a potential improvement of 76% with the load-balancing mechanism.

Table 5

Parameter Setting of Webserver Stress Tool

| Number of Users | Measured Time | Max Data Rate | Number of Web Servers |
|---|---|---|---|
| 20 | 40 minutes | 64 kb | 2 |


Figure 10. Comparison of the system response time between with load balancing and without load balancing.

$$
\frac{900 + 870 + 890 + 912 + 850 + 787 + 761 + 862 + 899}{900 + 870 + 890 + 912 + 850 + 787 + 761 + 862 + 899} - \frac{221 + 221 + 217 + 215 + 220 + 219 + 215 + 165 + 170}{900 + 870 + 890 + 912 + 850 + 787 + 761 + 862 + 899} = 76\% \qquad (25)
$$
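The arithmetic in (25) can be reproduced from the nine pairs of measured values. In this sketch the larger series is assumed to be the measurement without load balancing and the smaller series the measurement with load balancing, which is how (25) subtracts them; the text does not state the unit of the samples, so none is assumed here.

```python
# Nine measured response times from (25); units are not stated in the text.
without_lb = [900, 870, 890, 912, 850, 787, 761, 862, 899]
with_lb = [221, 221, 217, 215, 220, 219, 215, 165, 170]

total_without = sum(without_lb)
total_with = sum(with_lb)

# Relative reduction of the summed response time, as in (25).
improvement = (total_without - total_with) / total_without
print(f"sum without = {total_without}, sum with = {total_with}")
print(f"improvement = {improvement:.0%}")
```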

Third, by measurement, we also obtain the system response time with respect to the “dispatch ratio” as shown in Fig. 11, which shows that adjustment of the dispatch ratio can affect the system response time. Because servers 1 and 2 in Fig. 9 are the same machine model, they have the same service rate. Therefore, when the dispatch ratio is 0.5, so that requests are evenly distributed as shown in Fig. 11, we measure the minimum system response time. These results of load balancing are also similar to the simulation results from QNAT shown in the previous sections.

Figure 11. System response time of a testing system of DNS-based architecture.

Fourth, by measurement, we also obtain the system response time with respect to “number of users” as shown in Fig. 12, which shows both results, from the G/M/1 measurement curve and from the M/M/1 model curve. The increase in the number of users increases the system response time because it causes the server system to be busier, thus delaying the service requests. The G/M/1 measurement curve provides the relationship between the system response times with respect to the general arrival distribution.

Figure 12. Comparison of the system response times of M/M/1 model and G/M/1 measurement.

Based on queuing theory, the M/M/1 model of the system response time of DNS-based architecture is shown in (26) [1].

$$
\begin{aligned}
E_1(t) ={}& \frac{1}{(1 - ai + aj - j)\mu_1 - \lambda_1} + \frac{a}{(1 - ai + aj - j)\mu_1 - a\lambda_1} \\
&+ \frac{1 - a}{(1 - ai + aj - j)\mu_1 - (1 - a)\lambda_1} + \frac{ai}{(1 - ai + aj - j)\mu_\infty - ai\lambda_1}
\end{aligned}
\qquad (26)
$$

Based on queuing theory, the G/M/1 model of the system response time of DNS-based architecture is shown in (27) [1].

$$
\begin{aligned}
E_1(t) ={}& \frac{1}{(1 - ai + aj - j)\mu_1 - \lambda_1} + \frac{a}{(1 - ai + aj - j)\mu_1 - a\lambda_1} \\
&+ \frac{1 - a}{(1 - ai + aj - j)\mu_1 - (1 - a)\lambda_1} + \frac{ai}{(1 - ai + aj - j)\mu_\infty - ai\lambda_1}\,\frac{1 + C^2(S)}{2}
\end{aligned}
\qquad (27)
$$

C²(S) in (27) is a variation coefficient. In the M/M/1 model C²(S) is equal to 1, whereas in our G/M/1 measurement C²(S) is larger than 1, as shown in Fig. 12, if the variation of the system response time is larger than the mean of the system response time [1, 8]. The operational characteristics of the testing system make the variation coefficient shift as shown in Fig. 12. Usually the variation coefficient of the testing system is dependent on the network situation, the system software management policy, and the characteristics of the measurement tool. Hence a couple of unknown factors may affect the measurement results. Although the G/M/1 measurement curve lies somewhat away from the M/M/1 model curve because of the large variation coefficient in the G/M/1 measurement, both curves still have a similar trend, as shown in Fig. 12.
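The role of the variation coefficient can be illustrated with a short calculation. The sketch below assumes that C²(S) denotes the usual squared coefficient of variation (variance divided by the squared mean), which is 1 for an exponential distribution; the function names and the sample list are hypothetical and are not measurement data from this paper.

```python
from statistics import mean, pvariance


def squared_variation_coefficient(samples):
    """C^2(S): population variance divided by the squared mean (assumed definition)."""
    m = mean(samples)
    if m == 0:
        raise ValueError("mean must be non-zero")
    return pvariance(samples) / (m * m)


def scaling_factor(c2):
    """The multiplier (1 + C^2(S)) / 2 that appears in (27)."""
    return (1.0 + c2) / 2.0


# Hypothetical response-time samples in seconds, for illustration only.
samples = [0.2, 0.3, 0.25, 1.4, 0.2, 0.9]
c2 = squared_variation_coefficient(samples)
print(f"C^2(S) = {c2:.3f}, multiplier = {scaling_factor(c2):.3f}")
```

When C²(S) = 1 the multiplier equals 1, so (27) reduces to (26); a larger spread in the measured values raises the multiplier above 1.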

8. Conclusion

In this paper we propose a two-pass load-balancing scheme. The first pass is for the address request and the second is for a document request. Based on the corresponding queuing model for DNS-based architecture, we can analyze and simulate the system response time. Furthermore, by utilizing the precomputed table of a Hessian matrix instead of complicated computation, we can easily locate the load balancing. Finally, from comparison of the system response times between DNS-based and dispatcher-based architecture we learn that dispatcher-based architecture is good for LAN applications and DNS-based architecture is good for internet applications.

From analysis, simulation, and measurement results we learn that when load balancing exists due to a specific set of parameters and suitable adjustment, we have the potential to improve the average system response time by 36.58% with the specific set of parameters shown in Section 5.2.

Although burstiness is a special occurrence in network traffic, it sometimes happens in real situations. According to our survey, most papers set a burstiness parameter of arrival rate (λ) or service rate (µ) to add the burstiness factor to M/M/1 or G/M/1 [14–16]. In the future we will refer to [14–16] or others to add or modify some parameters so as to consider the burstiness factor in our model and to verify it by simulation.

References

[1] Y.W. Bai & Y.C. Wu, Web delay analysis and reduction by use of load balancing of a dispatcher-based web server cluster, IASTED Int. Conf. on Parallel and Distributed Computer and Networks, Innsbruck, Austria, 2003, 541–546.

[2] V. Cardellini, M. Colajanni, & P.S. Yu, Dynamic load balancing on web-server system, IEEE Internet Computing, 3 (3), 1999, 28–39.

[3] M. Zari, H. Saiedian, & M. Naeem, Understanding and reducing web delays, IEEE Computers, 34 (12), 2001, 30–37.

[4] H. Bryhni, A comparison of load balancing techniques for scalable web servers, IEEE Networks, 14 (4), 2000, 58–64.

[5] M. Castro, M. Dwyer, & M. Rumsewicz, Load balancing and control for distributed World Wide Web servers, Proc. 1999 IEEE Int. Conf. on Control Applications, 2, Kohala Coast-Island of Hawaii, USA, 1999, 1614–1619.

[6] S. Nadimpalli & S. Majumdar, Techniques for achieving high performance web servers, Proc. Int. Conf. on Parallel Processing, Toronto, 2000, 233–241.

[7] L. Cherkasova & S.R. Ponnekanti, Optimizing a “content-aware” load balancing strategy for shared web hosting service, IEEE Proc. 8th Int. Symp. on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, San Francisco, CA, 2000, 492–499.

[8] D. Manjunath, D.M. Bhaskar, H. Tahilramani, S.K. Bose, & M.N. Umesh, The queueing network analysis tool (QNAT), Proc. 8th Int. Symp. on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 2, San Francisco, 2000, 341–347.

[9] H. Shin, S.H. Lee, & M.S. Park, Multicast-based distributed LVS (MD-LVS) for improving scalability and availability, IEEE Proc. 8th Int. Conf. on Parallel and Distributed Systems, KyongJu City, Korea, 2001, 341–754.

[10] L. Cherkasova, FLEX: Load balancing and management strategy for scalable web hosting service, Proc. 5th IEEE Symp. on Computers and Communications, ISCC 2000, Antibes-Juan Les Pins, France, 2000, 8–13.

[11] L. Cooper & D. Steinberg, Introduction to Methods of Optimization (Philadelphia: W.B. Saunders, 1970).

[12] L. Aversa & A. Bestavros, Load balancing a cluster of Web servers using distributed packet rewriting, Proc. of the IEEE Int. Performance, Computing, and Communications Conf., IPCCC ’00, Phoenix, Arizona, 2000, 24–29.

[13] Http://esd.element5.com.

[14] J. Abate, Asymptotic for steady-state tail probabilities in structured Markov queuing models, Stochastic Models, 10, 2000, 99–134.

[15] J. Rochol & M.H. Diemer, Adaptive description of ATM traffic flows, CLEI Electronic Journal, 55 (2), 2000, 3.

[16] Y. Cheng, X. Ling, L. Cai, W. Song, W. Zhuang, X. Shen, & A.L. Garcia, Statistical multiplexing, admission region, and contention window optimization in multiclass wireless LANs, Proc. 3rd Int. Conf. on Quality of Service in Heterogeneous Wired/Wireless Networks (QShine) 2006, Waterloo, Canada, 2006, 60–65.

Biographies

Ying-Wen Bai is a professor in the Department of Electronic Engineering at Fu-Jen Catholic University. His research focuses on mobile computing and microcomputer system design. He obtained his M.Sc. and Ph.D. degrees in electrical engineering from Columbia University, New York, in 1991 and 1993 respectively. Between 1993 and 1995 he worked at the Institute for Information Industry, Taiwan.

Yi-Chao Wu is currently working toward his Ph.D. at the Graduate Institute of Computer and Communication Engineering, National Taipei University of Technology, Taipei, Taiwan. He received his M.Sc. degree from the Department of Electronic Engineering, Fu-Jen Catholic University, in 2003 and his B.Sc. degree from the Department of Computer Science and Information Engineering, Chung Hua University, in 2001. His research focuses on analysis and improvement of load balancing on web servers, wireless ad-hoc networks, and wireless sensor networks.