Comparing Load Balancing for Server Selection
Using Cloud Services
By
Daksh Gupta
A Project Report Submitted
In
Partial Fulfillment of the
Requirements of the Degree of
Master of Science
In
Computer Science
Supervised By
Professor Xumin Liu
Department Of Computer Science
B. Thomas Golisano College of Computing and Information Sciences
Rochester Institute Of Technology
Rochester, New York
November 11, 2013
The project “Comparing Load Balancing for Server Selection Using
Cloud Services” by Daksh Gupta has been examined and approved by
the following Examining Committee.
__________________________________
Dr. Xumin Liu
Professor
Project Committee Chair
__________________________________
Dr. Rajendra Raj
Professor
Project Reader
_________________________________
Dr. Stanislaw Radziszowski
Professor
Project Observer
Dedication
To my family for their continuous love and support; and to all my professors for their continuous guidance.
Acknowledgements
I am grateful to all the professors who have guided me throughout my pursuit of a Master's degree in Computer Science here at RIT, with special thanks to Professor Xumin Liu for her guidance and help throughout my project work.
ABSTRACT
Comparing Load Balancing For Server Selection Using
Cloud Services
Daksh Gupta
Supervised By: Professor Xumin Liu
Load balancing is a technique in which the requests sent to web servers are distributed so that resource utilization is maximized, response time is minimized, and no server or instance is overloaded. The rate at which enterprise-level applications are expanding challenges the infrastructure to balance the load among the web servers within the enterprise. These enterprise applications face the roadblock of determining the least loaded and best performing servers, from the pool of available servers, for each request sent by a client. To remain competitive, enterprises therefore need effective load balancing within their environment architecture. On top of this, the choice of architecture for the environment depends mainly on cost, reliability, and how quickly it can be set up and managed. To survive in a competitive market, a sensible solution is to use cloud-based services to build the infrastructure. This project helps evaluate which load balancing algorithm should be used with cloud-based services in industry, with attention to the overall performance of the infrastructure/system.
This report focuses on improving the proposed load-balancing algorithm and comparing it against a load balancing algorithm that has already been implemented, and on determining whether the proposed algorithm is better suited. The algorithms will be implemented and their performance evaluated against load, response time, utilization, bps, tps, etc. Virtual instances will be created on cloud services to act as web servers, and web services will be developed to access them. The load balancing algorithms will also be deployed on the cloud and will route the requests. The advantages of the proposed load balancing algorithm will be highlighted.
CONTENTS
Chapter 1
1. Introduction
1.1 Overview
1.2 Background and Definitions
1.3 Goals and Motivation
1.4 Related Work
1.5 Hypothesis
1.6 Problem and Proposed Solution
1.6.1 Problem Statement
1.6.2 Proposed Solution
1.6.3 Road Map
Chapter 2
2. Algorithms
2.1 Dynamic Load Balancer with Least Connections and Fastest Response Time
2.1.1 Algorithm
2.1.2 Description
2.2 Dynamic Load Balancer with Least Connections and Fastest Response Time with Weight Table
2.2.1 Algorithm
2.2.2 Description
Chapter 3
3. Design
3.1 Architecture Diagram
3.2 Integration Diagram of Load Balancing Algorithms
3.3 Service Level Diagram of Load Balancing Algorithm
3.4 Evaluation
3.5 Analysis
Chapter 4
4.1 Implementation
4.1.1 Implementation Strategy
4.1.1.2 Service Detailed Implementation
4.1.2 Objective
4.1.3 SOAP Implementation
4.2 Requirements
4.2.1 Hardware
4.2.2 Software
4.2.3 Metrics
4.2.3.1 Round Trip Time
4.2.3.2 Memory Utilization
4.2.3.3 Cost
4.2.3.4 Scalability
4.2.3.5 TPS
4.2.3.6 BPS
4.2.3.7 Simple Strategy
4.2.3.8 Variance Strategy
4.2.3.9 Thread Strategy
Chapter 5
5. Analysis
5.1 Basic Set Up
5.2 Performance Test
5.2.1 Simple Strategy
5.2.2 Thread Strategy
5.2.3 Variance Strategy (variance 0.5)
5.2.4 Testing Using HP LoadRunner Tool
Chapter 6
6. Conclusion
6.1 Current Status
6.2 Future Work
6.3 Lessons Learned
BIBLIOGRAPHY
Index for Tables
Table 1: Simple Strategy Results Comparison
Table 2: Thread Strategy Results Comparison
Table 3: Variance Strategy Results Comparison
Table 4: HP LoadRunner Results Comparison
Index for Figures
Figure 1: Sample SOAP Request
Figure 2: Low Level Architecture
Figure 3: High Level Architecture
Figure 4: Integration Diagram of Load Balancing
Figure 5: Service Diagram of Load Balancing
Figure 6: SOAP Request
Figure 7: Dynamic Load Balancer with Least Connections and Fastest Response Time WSDL
Figure 8: Dynamic Load Balancer with Least Connections and Fastest Response Time with Weight Table WSDL
Figure 9: Average Time (Strategy Mode)
Figure 10: Maximum Time (Strategy Mode)
Figure 11: TPS (Strategy Mode)
Figure 12: Average Time (Thread Mode)
Figure 13: Maximum Time (Thread Mode)
Figure 14: TPS (Thread Mode)
Figure 15: Average Time (Variance Mode)
Figure 16: Maximum Time (Variance Mode)
Figure 17: TPS (Variance Mode)
Figure 18: 99th Percentile Response Time (HP LoadRunner)
Chapter 1
1. Introduction
1.1 Overview
The number of Internet users is growing at an alarming rate, so traffic on enterprise applications must be balanced in order to provide high performance and high availability [8]. Many load balancing algorithms exist, but each has its own problems. The main goal of load balancing is to route requests among the web servers with minimum response time. Another issue that enterprises face is the high cost of building and managing the web servers on which a load balancer can be deployed. Cloud-based infrastructure/services offer one solution for building web servers, with advantages such as cost effectiveness, ease of management, virtual instances, and scalability. Various load balancing techniques are available in cloud-based services, such as predictive load balancing, random biased, and join idle queue [12].
This project helps evaluate which load balancing algorithm should be used on top of cloud-based services in industry, with attention to the overall performance of the infrastructure/system. This report focuses on improving the proposed load balancing algorithm and comparing it against a load balancing algorithm [1] that has already been implemented, and on determining whether the proposed algorithm is better suited. It compares load balancing techniques on cloud services, which act as web servers, using SOAP requests to web services as the data set. Web services [13] will be implemented, each pointing to one of the load balancing techniques, and the load on them will be increased. The main objective of each load balancing algorithm is to route requests to the closest or best performing web servers. The algorithms will be implemented and their performance evaluated against load, response time, utilization, etc. [17, 18]. Virtual instances will be created on cloud services to act as web servers, and web services will be developed to access them. The load balancing algorithms will also be deployed on the cloud and will route the requests. The advantages of the proposed load balancing algorithm will be highlighted.
1.2 Background and Definitions
The number of Internet users is growing at an alarming rate, so balancing the traffic on enterprise applications is critical to provide higher tps, lower response time, and fewer outliers [5]. This is achieved through load balancing, which means routing requests among web servers. A number of load balancing algorithms exist, but each of them has some issues or disadvantages.
Standard definitions:
Load Balancing:
Load balancing means distributing increased client requests across the instances deployed on the servers so as to avoid delays in response time under increased load. It is mainly important for high-level applications, where the number of requests sent by users can range from thousands to millions and is therefore very difficult to predict. Such applications need many web servers for load balancing. Load balancing algorithms can be categorized as either static or dynamic: round robin, random, etc. are static algorithms, whereas least connection, observed, etc. are dynamic algorithms. [22]
Random:
In random scheduling, each request received from a client is assigned to a randomly chosen server from a given list. Because requests are assigned randomly, there is no mechanism for sharing the load, so some servers become overutilized while others are underutilized. [22]
Round Robin:
The round robin algorithm routes each new request to the next instance or server in the queue. It distributes the requests evenly among all the instances so that the load is shared evenly. This algorithm is better than random, as requests are divided evenly among the servers. Its main disadvantage is that it works well only if all the servers have the same configuration. [22]
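A minimal Java sketch of the round robin idea described above is given here for illustration only. The class name `RoundRobinSelector` and the use of plain server URLs as strings are assumptions and are not taken from the project code.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Round robin selection over a fixed server list (illustrative names, not project code). */
final class RoundRobinSelector {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger(0);

    RoundRobinSelector(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = List.copyOf(servers);
    }

    /** Returns the next server in cyclic order; safe for concurrent callers. */
    String nextServer() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }
}
```

Every server receives the same share of requests regardless of its capacity, which is exactly the limitation noted above when server configurations differ.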
Weighted Round Robin:
This algorithm is an enhancement of round robin in which each instance is assigned a load depending on its processing capability, which is determined by how that instance is behaving. It removes the deficiency of round robin, but it does not consider the processing time each server takes to respond. [22] To remove these deficiencies, dynamic load balancing algorithms are used. [3]
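One simple way to picture weighted round robin is to repeat each server in the rotation as many times as its weight. The Java sketch below does this under the assumption that weights are fixed positive integers supplied by the caller; the class name `WeightedRoundRobinSelector` is hypothetical, and a real balancer would derive the weights from observed instance behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Weighted round robin by expanding each server into 'weight' slots (illustrative sketch). */
final class WeightedRoundRobinSelector {
    private final List<String> schedule = new ArrayList<>();
    private int position = 0;

    /** weights maps a server name to its positive integer weight (assumed to be supplied by the caller). */
    WeightedRoundRobinSelector(Map<String, Integer> weights) {
        for (Map.Entry<String, Integer> entry : weights.entrySet()) {
            if (entry.getValue() <= 0) {
                throw new IllegalArgumentException("weight must be positive: " + entry.getKey());
            }
            for (int i = 0; i < entry.getValue(); i++) {
                schedule.add(entry.getKey());
            }
        }
        if (schedule.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
    }

    /** Returns the next slot in the expanded schedule and advances cyclically. */
    synchronized String nextServer() {
        String server = schedule.get(position);
        position = (position + 1) % schedule.size();
        return server;
    }
}
```

With weights of 2 and 1, the first server receives two of every three requests. The weights stay fixed here, so the sketch shares the limitation described above: it does not react to the response time each server actually shows.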
Dynamic Load Balancing:
Dynamic load balancing continuously monitors the state of the servers and assigns each request to the server that has the least load. There are various dynamic load-balancing algorithms, such as least connection, fastest, and adaptive load balancing.
Fastest Load Balancing:
This method routes the request from the client to the server that has the fastest response time [6]. Its shortcoming is that not every server can respond within seconds, which leads to congestion in the network.
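The fastest method needs some record of how quickly each server has been responding. The Java sketch below keeps a running average per server and returns the server with the lowest average; the class `FastestServerTracker` and its method names are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Tracks average response time per server and picks the fastest (illustrative sketch). */
final class FastestServerTracker {
    // For each server: element 0 holds total milliseconds, element 1 holds the sample count.
    private final Map<String, long[]> stats = new HashMap<>();

    /** Records one observed response time for the given server. */
    void record(String server, long millis) {
        long[] s = stats.computeIfAbsent(server, k -> new long[2]);
        s[0] += millis;
        s[1]++;
    }

    /** Average response time in milliseconds, or positive infinity if nothing was recorded. */
    double averageMillis(String server) {
        long[] s = stats.get(server);
        return (s == null || s[1] == 0) ? Double.POSITIVE_INFINITY : (double) s[0] / s[1];
    }

    /** Returns the server with the lowest average, or null if no samples exist. */
    String fastest() {
        String best = null;
        double bestAverage = Double.POSITIVE_INFINITY;
        for (String server : stats.keySet()) {
            double average = averageMillis(server);
            if (best == null || average < bestAverage) {
                best = server;
                bestAverage = average;
            }
        }
        return best;
    }
}
```

Because every request goes to the current fastest server, that server can attract most of the traffic until its average degrades, which is one way the congestion mentioned above can arise.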
Least Connection Load Balancing:
This method takes its meaning from its name. The load balancer passes a new request from the client to the web server that has the fewest connections at that point in time [3, 22]. This technique works best when all the applications running on the web servers have the same infrastructure. The advantage of this algorithm is also the source of its disadvantage: if two applications have different infrastructure, e.g. one is an HTML application and the other uses J2EE or XML, the connections become a bottleneck, because connections have different round trip times depending on the server from which the request originates. [22]
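The least connection rule can be sketched in Java as a table of active connection counts. The class `LeastConnectionSelector` and the `acquire`/`release` method names below are hypothetical and only illustrate the bookkeeping involved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Least connection selection with explicit acquire/release (illustrative sketch). */
final class LeastConnectionSelector {
    private final Map<String, Integer> active = new LinkedHashMap<>();

    LeastConnectionSelector(Iterable<String> servers) {
        for (String server : servers) {
            active.put(server, 0);
        }
    }

    /** Picks the server with the fewest active connections and counts the new one. */
    synchronized String acquire() {
        String best = null;
        for (Map.Entry<String, Integer> entry : active.entrySet()) {
            if (best == null || entry.getValue() < active.get(best)) {
                best = entry.getKey();
            }
        }
        if (best == null) {
            throw new IllegalStateException("no servers configured");
        }
        active.merge(best, 1, Integer::sum);
        return best;
    }

    /** Marks one connection to the server as finished; the count never drops below zero. */
    synchronized void release(String server) {
        active.computeIfPresent(server, (key, count) -> Math.max(0, count - 1));
    }

    /** Current number of active connections for the server. */
    synchronized int connectionCount(String server) {
        return active.getOrDefault(server, 0);
    }
}
```

The count alone says nothing about how long each connection will last, which is why servers with different round trip times can still become a bottleneck under this rule.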
Observed Load Balancing:
This technique combines the fastest and least connection methods. Web servers are ranked by number of connections and response time [3, 22], and the server with the fewest requests and the best response time gets the new request. Its disadvantage is that no weights are associated with the servers, so a server can start getting overloaded with requests.
Web Service:
A web service is a medium that permits communication between applications independently of the platform and of the language used to program them. It consists of functions that other applications access using XML. This interface mechanism follows SOA (Service-Oriented Architecture). [23]
SOAP:
The Simple Object Access Protocol (SOAP) is the messaging protocol through which applications interact with a web service over a distributed network, based on the service's WSDL. A SOAP message consists of XML that names the function the application is requesting from the web service. The web service endpoint and port are provided to the application so that it can communicate with the web service. The SOAP XML is generated from the WSDL provided by the web service. The WSDL describes the functions that the web service provides and that can be accessed over the network. [17]
Figure 1 shows a sample SOAP request that an application uses to interact with the web service.
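As a rough illustration of the message structure described above, the Java sketch below assembles a SOAP 1.1 envelope as a string and parses it back with the JDK's XML parser. The service namespace, the operation name, and the `arg0` parameter element are placeholders chosen for this sketch and are not the project's actual WSDL contents; in practice the request is generated from the WSDL by a tool.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Builds and parses a minimal SOAP 1.1 request (illustrative; names are placeholders). */
final class SoapEnvelopeSketch {
    /** Namespace of the SOAP 1.1 envelope. */
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    /** Wraps one operation element with a single argument in a SOAP envelope. */
    static String buildRequest(String serviceNamespace, String operation, String argument) {
        return "<soapenv:Envelope xmlns:soapenv=\"" + SOAP_NS + "\""
             + " xmlns:svc=\"" + serviceNamespace + "\">"
             + "<soapenv:Header/>"
             + "<soapenv:Body>"
             + "<svc:" + operation + "><arg0>" + escape(argument) + "</arg0></svc:" + operation + ">"
             + "</soapenv:Body>"
             + "</soapenv:Envelope>";
    }

    /** Escapes the characters that are not allowed in XML element text. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Parses the XML with namespace awareness; wraps parser errors in an unchecked exception. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }
}
```

The operation element inside the Body is what identifies the function being requested, and the namespace on it ties the request to one particular service.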
JAX-WS:
It is a Java API that helps in developing web services and is part of Java EE. It is an open source project. [16]
Cloud Computing:
Cloud computing is a service provided through the Internet. It helps an enterprise obtain software resources such as Apache, WAS, and Java, as well as hardware-level resources, e.g. setting up a firewall for a server. Many enterprises and industries are now moving towards cloud computing, which is not only cheap but also scalable. A further advantage is that these services let industry pay whenever a service is requested rather than paying a lump sum for everything.
Amazon Web Services (AWS)
It is a suite of web services provided by Amazon.com for cloud computing. They have many advantages, such as being scalable, reliable, and cost effective. [24]
Amazon Elastic Compute Cloud (EC2)
It is one of the web services that are part of AWS. It helps in increasing server capacity or in creating many virtual servers that clients can use. [24] Billing is based on how much the virtual servers are used.
1.3 Goals and Motivation
Different enterprises have different requirements, depending upon their needs and budgets. Some enterprises want high performance and availability from their systems without any concern about the cost incurred, while smaller enterprises with limited budgets want to get the most out of their systems. The number of Internet users is growing at an alarming rate, so traffic on enterprise applications must be balanced in order to provide high performance and high availability. Many load balancing algorithms exist, but each has its own problems. The main goal of load balancing is to route requests among the web servers with minimum response time. This report focuses on improving the proposed load-balancing algorithm and comparing it against a load balancing algorithm [1] already implemented on cloud servers, and on determining whether the proposed algorithm is better suited. Web services will be implemented, each pointing to one of the load balancing techniques, and the load on them will be increased. The load balancing algorithms will be developed as web services so that the client can use them as endpoints. The web service will act as a global traffic manager (GTM), which is called by the client, and the algorithm will act as a local traffic manager (LTM), routing the requests to the appropriate cloud servers so that the result gets back to the client in minimum time. The algorithms will be implemented and their performance evaluated against load, response time, utilization, etc. Virtual instances will be created on cloud services to act as web servers, and web services will be developed to access them. The load balancing algorithms will also be deployed on the cloud and will route the requests.
1.4 Related Work
DONAR [2] (Decentralized Server Selection) was developed as a distributed system that takes care of name resolution and chooses, for each client request, an appropriate server that has less load. The algorithm was developed to direct incoming requests from clients to appropriate web servers in order to balance the load between them. It works as follows: a mapping node receives the request from the client and then sends the request to a server. Each mapping node has an optimizer and is decentralized; it listens to other nodes, collects information, and then sends the client's request to an unoccupied or least used resource/server.
High-level architecture applications [1] face performance issues when the load is not distributed evenly among the web servers. One of the solutions designed for this is dynamic load balancing [1]. That paper designs an architecture in which a load balancer observes the distributed load centrally and applies various distributed load reallocation policies [6]. These reduce the load imbalance, and because the load is redistributed in a distributed manner, the design removes the single point of failure, delays, and bottlenecks [1].
This paper [3] discusses the use of adaptive load balancing and explains various load balancing algorithms. The round robin algorithm routes requests equally among the instances; it does not consider how many connections an instance already has. A load connection table is used to check whether a server is loaded, normal, under loaded, or idle. This table helps in determining the server to which a request should be routed. The approach also takes into account the response time of each request in order to find which server is behaving best, so that the client's request can be routed to it. [3]
Qin [4] presented the “Weighted Average Load Balancing” technique, which uses preemptive scheduling. Whenever a new request is received from the client, the algorithm does one of two things: it either adds the request to the present queue or interrupts the task currently in execution. Whenever a job is assigned to a node, the algorithm checks whether the node has become overloaded. If so, it tries to find a new node to which it can transfer the most useful jobs from the overloaded node.
Presently there are various load balancing techniques [6], which fall under the category of either “intra web server” (there is only one web server, and load balancing algorithms are architected to balance load within it) or “inter web server” (there is more than one web server, and load balancing algorithms are designed to distribute load among them). That paper uses an algorithm, VectorDot [6], whose aim is to reduce the imbalance in the web servers. The algorithm finds nodes where an imbalance has occurred and then tries to shift load to available free nodes. An imbalance is detected when the load on a particular node exceeds its threshold.
1.5 Hypothesis
Many papers focus on how to balance the load on high-level applications using dynamic load balancing. My aim is to compare two load-balancing algorithms based on load, response time, utilization, bps, tps, etc. I will describe the use of cloud services, web services, XML, and an improved algorithm that tries to find the server with the least load. My research begins with implementing the algorithms. In the next phase, I will compare them against the parameters defined in the metrics, and in the last phase I will increase the load on the cloud by creating many virtual instances or by using the load testing feature of SoapUI.
1.6 Problem and Proposed Solution
1.6.1 Problem Statement
Different enterprises have different requirements, depending upon their needs and budgets. Some enterprises want high performance and availability from their systems without any concern about the cost incurred, while smaller enterprises with limited budgets want to get the most out of their systems. The number of Internet users is growing at an alarming rate, so traffic on enterprise applications must be balanced in order to provide high performance and high availability. Many load balancing algorithms exist, but each has its own problems. The main goal of load balancing is to route requests among the web servers with minimum response time. This report focuses on improving the proposed load-balancing algorithm and comparing it against a load balancing algorithm [1] already implemented on cloud servers, and on determining whether the proposed algorithm is better suited.
1.6.2 Proposed Solution
The main goal of load balancing is to route requests among the web servers with minimum response time. This report compares load balancing techniques on cloud services. I will use web services, XML, and the cloud to design and implement the algorithms, and will then provide comparison metrics between them in terms of query response time, bps, tps, CPU percentage, etc. under load.
The following steps will be followed:
Step 1:
A JAX-WS web service will be developed that returns the response requested by the client. The SoapUI tool will act as the client for the web service, sending SOAP requests and receiving SOAP responses.
Step 2:
The load balancing algorithms will be developed in Java. Various types of load balancing algorithms exist, but I will use the dynamic load balancer [1] and will try to remove its shortcomings by adding server weight logic to it.
• Dynamic Load Balancer with Least Connections and Fastest Response Time: This paper [1] describes the use of dynamic load balancing and its importance in system-oriented architecture. The algorithm combines the logic of least connections and fastest response time, which helps in reducing the load. Monitoring agents look at the current activity and load, and classify the instances as overloaded, balanced, or under loaded. Depending on its classification, an instance is called and returns the results. As connections keep increasing, this algorithm suffers in the longer run: either new instances need to be created, or all the instances end up balanced or overloaded to almost the same degree, so efficiency drops.
• Dynamic Load Balancer with Least Connections and Fastest Response Time with Weight Table: This algorithm integrates dynamic load balancing [1] with a weight table assigned to the instances [4]. Each instance is given a dynamically assigned maximum number of connections, defined in the weight table. Whenever the number of connections reaches this threshold, the count resets and starts afresh.
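The weight table idea in the second bullet can be pictured with the following Java sketch. It is one possible reading under stated assumptions: each server has a maximum connection count taken from a weight table, the least loaded server that is still below its limit is chosen, and when every server has reached its limit all counts are reset to zero. The class `WeightTableBalancer` and its methods are hypothetical, and response time is left out to keep the sketch short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Connection limits from a weight table with reset on exhaustion (illustrative sketch). */
final class WeightTableBalancer {
    private final Map<String, Integer> maxConnections = new LinkedHashMap<>(); // the weight table
    private final Map<String, Integer> connections = new LinkedHashMap<>();    // current counts

    /** Registers a server with its maximum number of connections (assumed positive). */
    void addServer(String url, int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        maxConnections.put(url, limit);
        connections.put(url, 0);
    }

    /** Chooses a server below its limit; resets all counts when every server is full. */
    String select() {
        if (connections.isEmpty()) {
            throw new IllegalStateException("no servers configured");
        }
        String best = leastLoadedBelowLimit();
        if (best == null) {
            // Every server reached its threshold: start afresh.
            connections.replaceAll((url, count) -> 0);
            best = leastLoadedBelowLimit();
        }
        connections.merge(best, 1, Integer::sum);
        return best;
    }

    /** Returns the least loaded server that still has room, or null if none has. */
    private String leastLoadedBelowLimit() {
        String best = null;
        for (Map.Entry<String, Integer> entry : connections.entrySet()) {
            boolean hasRoom = entry.getValue() < maxConnections.get(entry.getKey());
            if (hasRoom && (best == null || entry.getValue() < connections.get(best))) {
                best = entry.getKey();
            }
        }
        return best;
    }

    /** Current connection count recorded for the server. */
    int connectionCount(String url) {
        return connections.getOrDefault(url, 0);
    }
}
```

With limits of 1 and 2, the first three requests fill both servers, and the fourth triggers the reset and goes to the first server again.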
Step 3:
Amazon Web Services will be used to create EC2 instances, on which the load balancer developed in Step 2 and the web service developed in Step 1 will be deployed. The load balancer algorithm will determine which EC2 instance of the web service should be called to send the request and get the response.
Step 4:
The SoapUI tool will be used to increase the load on the EC2 instances, and the SoapUI load tool will be used to compare the results from the two load balancers for the requests and responses sent. The results will be compared on the basis of performance, bps, tps, round trip time, and CPU utilization [14]. The dynamics of creating instances, implementing the algorithms, and running tasks will be a challenge.
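To make the comparison metrics concrete, the Java sketch below derives average and maximum round trip time, tps, and bps from a list of measured samples. It assumes that tps means completed requests per second and that bps means bytes transferred per second; the class `LoadTestSummary` is hypothetical, and in the project these figures are reported by the load testing tool rather than computed by hand.

```java
/** Summary statistics for one load test run (illustrative; definitions are assumptions). */
final class LoadTestSummary {
    final double averageMillis; // mean round trip time
    final long maxMillis;       // slowest round trip
    final double tps;           // assumed: completed requests per second
    final double bps;           // assumed: bytes transferred per second

    private LoadTestSummary(double averageMillis, long maxMillis, double tps, double bps) {
        this.averageMillis = averageMillis;
        this.maxMillis = maxMillis;
        this.tps = tps;
        this.bps = bps;
    }

    /** Computes the summary from per-request round trip times, total bytes, and test duration. */
    static LoadTestSummary of(long[] roundTripMillis, long totalBytes, double durationSeconds) {
        if (roundTripMillis.length == 0 || durationSeconds <= 0) {
            throw new IllegalArgumentException("need at least one sample and a positive duration");
        }
        long sum = 0;
        long max = Long.MIN_VALUE;
        for (long rtt : roundTripMillis) {
            sum += rtt;
            max = Math.max(max, rtt);
        }
        return new LoadTestSummary(
                (double) sum / roundTripMillis.length,
                max,
                roundTripMillis.length / durationSeconds,
                totalBytes / durationSeconds);
    }
}
```

For example, three requests taking 100, 200, and 300 ms, transferring 3000 bytes in a 1.5 second run, give an average of 200 ms, a maximum of 300 ms, 2 tps, and 2000 bps under these definitions.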
1.6.3 Road Map
The project report is organized into the following sections. Chapter 1 gives the introduction, background, related work, problem statement, and proposed solution. Chapter 2 describes the algorithms and how they work. Chapters 3 and 4 describe the design, approach, and implementation of the project. Chapter 5 describes the experimental results, and Chapter 6 discusses the current status, future work, and conclusion.
Chapter 2
2. Algorithms
Load Balancing:
Load balancing means fine-tuning the traffic among the instances of a distributed application so as to avoid an increase in response time when the load on the servers increases. It is mainly important for high-level applications, where the number of requests sent by users can range from thousands to millions and is therefore very difficult to predict. Such applications need many web servers for load balancing. Load balancing algorithms can be categorized as either static or dynamic: round robin, random, etc. are static algorithms, whereas least connection, observed, etc. are dynamic algorithms. [3, 22]
2.1 Dynamic Load Balancer with Least Connections and Fastest Response Time
2.1.1 Algorithm
LOADBALANCER (request)
Input: request
Output: response
Declare ec2Instances[][], url
url ← selectBestServerInstance(request, ec2Instances)
soapReplyAnswer ← getReply(url, request)
return soapReplyAnswer

addConnection (connection, ec2Instance)
Input: connection[][], ec2Instance[]
Output: connection[][]
i ← 0
j ← 0
for ec2 ← 1 to ec2Instance.length
    do j ← 0
       connection[i, j] ← ec2Instance[i]
       j ← j+1
       connection[i, j] ← 0
       i ← i+1
return connection

selectBestServerInstance (request, ec2Instances)
Input: request, ec2Instances[][]
Output: ec2Instance
Declare connection[][], responseTime[][]
connection ← addConnection(connection, ec2Instances)
Declare leastConnectionURL, leastResponseTimeURL, secondMinimumConnections
i ← 0
j ← 1
for connection ← 1 to connection.length
    do if connection[i, j] = 0
          then connection[i, j] ← 1
               return connection[i, 0]
          else i ← i+1
Sort the ec2Instances of connection in increasing order of connections
leastConnectionURL ← connection[0,0]
Sort the ec2Instances of responseTime in increasing order of average response time
leastResponseTimeURL ← responseTime[0,0]
i ← 0
if leastConnectionURL = leastResponseTimeURL
    then return leastConnectionURL
    else for connection ← 1 to connection.length
             do if connection[i,0] = leastResponseTimeURL
                   then numberOfConnections ← connection[i,1]
                   else i ← i+1
         secondMinimumConnections ← connection[1,1]
         if numberOfConnections = secondMinimumConnections
            then return leastResponseTimeURL
            else return leastConnectionURL
getReply (url, request)
Input: url, request
Output: reply
startTime ← 0, i ← 0, j ← 0
endTime ← 0
averageTime ← 0, responseTime ← 0
startTime ← getCurrentTime
soapReply ← answerFromWebService(url, request)
endTime ← getCurrentTime
responseTime ← endTime - startTime
for response ← 1 to response.length
    do averageTime ← averageTime + response[i,1]
       i ← i+1
responseTime[url][1] ← averageTime
return soapReply

ColumnComparator()
Input:
Output: sorted Array
declare columnToSort;
ColumnComparator(int columnToSort)
    this.columnToSort ← columnToSort
compare(Object o1, Object o2)
    String[] row1 ← (String[]) o1;
    String[] row2 ← (String[]) o2;
    return row1[columnToSort].compareTo(row2[columnToSort]);
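One detail of the comparator above is worth noting: `String.compareTo` orders text lexicographically, so the string "10" sorts before "9". If the sorted column holds connection counts or response times stored as strings, a numeric comparison may be what is intended. The Java sketch below shows such a variant; the class name `NumericColumnComparator` is hypothetical, and it assumes every value in the chosen column parses as a number.

```java
import java.util.Comparator;

/** Sorts String[] rows by one column interpreted as a number (illustrative sketch). */
final class NumericColumnComparator implements Comparator<String[]> {
    private final int columnToSort;

    NumericColumnComparator(int columnToSort) {
        this.columnToSort = columnToSort;
    }

    @Override
    public int compare(String[] row1, String[] row2) {
        // Parse before comparing so that "9" sorts before "10".
        return Double.compare(
                Double.parseDouble(row1[columnToSort]),
                Double.parseDouble(row2[columnToSort]));
    }
}
```

It can be passed to `Arrays.sort(rows, new NumericColumnComparator(1))` to order a table of rows by the values in column 1.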
2.1.2 Description
This balancer tries to find the least loaded server based on the number of connections and the response time, and then sends the request to it. The algorithm checks several conditions to find out which server is performing best at that moment. Monitoring agents look at the connections each server has and the response time associated with it, and after these checks the request is sent to the best server instance. [1]
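The selection rule can be sketched in Java as follows. This is one reading of the pseudocode in Section 2.1.1, not the project's implementation: an idle server is used first; otherwise, if the least connected server is also the fastest, it is chosen; otherwise the fastest server is chosen only when its connection count equals the second lowest count, and the least connected server is chosen in all other cases. The classes `ServerStats` and `BestServerSelector` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Per-server state assumed to be maintained by the monitoring agents. */
final class ServerStats {
    final String url;
    int connections;
    double averageResponseMillis;

    ServerStats(String url, int connections, double averageResponseMillis) {
        this.url = url;
        this.connections = connections;
        this.averageResponseMillis = averageResponseMillis;
    }
}

/** Combines least connections and fastest response time (one reading of the pseudocode). */
final class BestServerSelector {
    static ServerStats select(List<ServerStats> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        // Step 1: any server without connections takes the request immediately.
        for (ServerStats server : servers) {
            if (server.connections == 0) {
                server.connections = 1;
                return server;
            }
        }
        // Step 2: rank by connection count and find the fastest server.
        List<ServerStats> byConnections = new ArrayList<>(servers);
        byConnections.sort(Comparator.comparingInt((ServerStats s) -> s.connections));
        ServerStats leastConnected = byConnections.get(0);
        ServerStats fastest = servers.get(0);
        for (ServerStats server : servers) {
            if (server.averageResponseMillis < fastest.averageResponseMillis) {
                fastest = server;
            }
        }
        // Step 3: prefer the fastest server only if it is close to the least loaded one.
        ServerStats chosen;
        if (leastConnected == fastest) {
            chosen = fastest;
        } else {
            int secondMinimumConnections = byConnections.get(1).connections;
            chosen = (fastest.connections == secondMinimumConnections) ? fastest : leastConnected;
        }
        chosen.connections++;
        return chosen;
    }
}
```

The design choice in step 3 keeps a fast but heavily loaded server from attracting every request, which is the risk of using response time alone.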
2.2 Dynamic Load Balancer with Least Connections and Fastest Response Time with Weight Table
2.2.1 Algorithm
LOADBALANCER (request)
Input: request
Output: response
Declare ec2Instances[][], url
url ← selectBestServerInstance(connection, numberArray, responseTime)
soapReplyAnswer ← getReply(url, request)
return soapReplyAnswer

addConnection (connection, ec2Instance)
Input: connection[][], ec2Instance[]
Output: connection[][]
i ← 0
j ← 0
for ec2 ← 1 to ec2Instance.length
    do j ← 0
       connection[i, j] ← ec2Instance[i]
       j ← j+1
       connection[i, j] ← 0
       i ← i+1
return connection

removeConnection (connection)
Input: connection[][]
Output: connection[][]
connection ← null
connection ← new Connection
return connection

addRandomNmbrToServer (numberArray, ec2Instance)
Input: numberArray[][], ec2Instance[]
Output: numberArray[][]
i ← 0
j ← 0
for ec2 ← 1 to ec2Instance.length
    do j ← 0
       randomInt ← randomGenerator.nextInt(10)
       numberArray[i,j] ← ec2Instance[i]