Exploiting Private and Hybrid Clouds for Compute Intensive Web Applications


Aleksandar Draganov

August 17, 2011

MSc in High Performance Computing The University of Edinburgh

Year of Presentation: 2011


Abstract

Cloud computing changes the way software and hardware are purchased and used.

An increasing number of applications are becoming web based since these are available from anywhere and from any device. Such applications use the infrastructures of large scale data centres and can be provisioned efficiently. Hardware, on the other hand, representing basic computing resources, can also be delivered to match specific demands without the consumer having to actually own it.

The cloud computing model provides benefits for private enterprise environments where a significant physical infrastructure already exists. Private cloud management platforms have been emerging in the last several years, providing new opportunities for efficient management of internal infrastructures and leading to high utilization.

The resources in a private cloud may become insufficient to satisfy the demands for higher computational power or storage capacity. In cases where the local provisioning is not sufficient, the private resources can be extended with resources from public remote infrastructures. The resulting infrastructure appears as one hybrid entity. In other words, a private cloud can be extended to a hybrid cloud by adding resources from public cloud providers to its capacity where required.

This work investigates the use of an open source cloud management platform (OpenNebula [1]) to create a private cloud and to host a compute intensive web application on it by managing a farm of virtual web servers to meet its demands. The benefits of using such an approach along with the issues it raises are explained. The chosen algorithm for the web application (LU factorisation) represents a generalized case of applications where the complexity is O(N³) and the input data size is N². An approach where the computational load is increased is also tested. The results and the analysis of the web applications' performance are based on the number of served requests per second (i.e. throughput). Two scenarios are covered: utilizing a small existing private infrastructure (i.e. a private cloud) and extending the private infrastructure with external resources (i.e. a hybrid cloud) on Amazon Web Services [2].

OpenNebula proved to be easy to use and robust. Its capabilities ensured convenient management of a virtual web server farm within private and hybrid clouds.

The network bandwidths appeared to be the most significant limiting factor for the effective use of the farm influenced by the heterogeneous character of the setup, the virtualization of the network interfaces, their sharing between virtual machines and the size of the input data. However, increasing the execution time (i.e. heavier problems) proved to lessen the impact of those issues.


List of Tables

Table 2.1: EC2 instance types.
Table 3.1: Theoretical IP addresses organization for private cloud
Table 5.1: Standard load application throughputs with LB on VM. Private cloud server farm.
Table 5.2: Standard load application throughputs with LB on physical machine. Private cloud server farm.
Table 5.3: Increased load application throughputs with LB on VM. Hybrid cloud server farm
Table 5.4: Increased load application throughputs with LB on physical machine. Hybrid cloud server farm


List of Figures

Figure 2.1: OpenNebula main components. Reproduced from [1].
Figure 2.2: Horizontally scaled web servers.
Figure 3.1: Standard OpenNebula cloud organization
Figure 3.2: The actual OpenNebula cloud organization used
Figure 3.3: Private cloud web server farm
Figure 3.4: Standard private cloud network configuration
Figure 3.5: The actual private cloud network configuration
Figure 3.6: Hybrid cloud web server farm. Reproduced from [54].
Figure 3.7: Performance comparison between EC2 instances and local VM. Single executions.
Figure 3.8: Performance comparison between EC2 instances and local VM. Thread pool with size 10.
Figure 3.9: Size of data transferred in relation to problem size
Figure 3.10: Throughput generated by 1 web server for N = 50, 150, 250 for different number of running threads
Figure 4.1: Throughput generated by private cloud web server farm for different problem sizes with LB deployed on VM
Figure 4.2: Network bandwidths for private cloud web server farm with VM LB
Figure 4.3: Throughput speedup for private cloud web server farm with VM LB
Figure 4.4: Network bandwidths for private cloud web server farm with LB on a physical machine
Figure 4.5: Throughput generated by private cloud web server farm for different problem sizes with LB deployed on a physical machine
Figure 4.6: Throughput speedup for private cloud web server farm with LB deployed on a physical machine
Figure 4.7: Throughput generated by private cloud web server farm for different problem sizes with LB deployed as VM for increased load application
Figure 4.8: Throughput speedup for private cloud web server farm with VM LB for increased load application
Figure 4.9: Throughput generated by private cloud web server farm for different problem sizes with LB deployed on a physical machine for increased load application
Figure 4.10: Throughput speedup for private cloud web server farm with LB deployed on a physical machine for increased load application
Figure 4.11: Network bandwidths for hybrid cloud web server farm with LB on a physical machine
Figure 4.12: Throughput generated by hybrid cloud web server farm for different problem sizes with LB deployed as VM for increased load application
Figure 4.13: Throughput speedup for hybrid cloud web server farm with LB deployed on a physical machine for increased load application
Figure: Monitoring OpenNebula's VMs through its command line tools
Figure: Monitoring OpenNebula's VMs through Virtual Machine Manager
Figure: Monitoring OpenNebula's hosts through OpenNebula Sunstone
Figure: Monitoring OpenNebula's hosts through Virtual Machine Manager


Abbreviations

AMI Amazon Machine Image

AWS Amazon Web Services

CLI Command Line Interface

CRM Customer Relationship Management

DNS Domain Name System

EBS (Amazon) Elastic Block Store

EC2 (Amazon) Elastic Compute Cloud

ECU (Amazon) Elastic Compute Unit

ERP Enterprise Resource Planning

HTTP Hypertext Transfer Protocol

IaaS Infrastructure as a Service

JVM Java Virtual Machine

KVM Kernel-based Virtual Machine

LB Load Balancer

MAC Media Access Control

NAT Network Address Translation

NIC Network Interface Card

OS Operating System

PaaS Platform as a Service

QoS Quality of Service

RDS (Amazon) Relational Database Service

RR Round-Robin

S3 (Amazon) Simple Storage Service

SaaS Software as a Service

SLA Service Level Agreement

SQS (Amazon) Simple Queue Service

SSL Secure Sockets Layer

VM Virtual Machine

VMM Virtual Machine Manager


Acknowledgements

I would like to thank my supervisor, Dr Charaka J. Palansuriya, for the time spent and for his tireless and patient guidance and support during the project. I am also very grateful to Mr. Maciej Olchowik for all his helpful advice and suggestions. I would also like to acknowledge the efforts of Mr. Craig Morris and Ms. Fiona Bisset from EPCC support in providing me with the hardware resources and help required for the project.

Finally I would like to thank my family and my friends for their infinite support throughout the last 12 months.


Chapter 1 Introduction

There is an ever increasing movement towards adoption of cloud computing within IT departments of academic and business environments. Based on the advances of the virtualization technologies developed so far, the concept of the Cloud offers a convenient way of efficient organization and usage of different computing resources.

Many cloud services have been introduced in recent years - some of them available as public services (public clouds) and others deployed within organizations' private networks (private clouds).

Public cloud services are available in three types of service models: Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS) [3]. IaaS provides control over running Virtual Machines (VMs), networks and any other basic computing resources. PaaS offers a platform where users can deploy their applications if they use specific tools and technologies of the PaaS provider. SaaS is normally a web application running on top of a cloud infrastructure. The consumer only pays for what he uses - an hour of CPU or a gigabyte of memory for the first two service models, and a monthly fee per person for SaaS. Cloud providers often offer different fee schemes so they can match most of the user's requirements.

Private clouds are deployed within the organization's network and managed by its system administrators. They require more work to set up and to maintain, but provide better flexibility and can be used for different purposes - virtual clusters, hosting web applications or deploying VMs used as desktops. Private clouds can be extended to hybrid clouds by adding resources to their capacity from public providers. There are several open source platforms that can be used for managing private clouds.

One of the main characteristics of a cloud platform is its elasticity. From a user's perspective the cloud seems to have unlimited resources that can be provisioned just for the time period required to accomplish a task and then released with minimum effort when they are no longer required. This feature allows quick scaling and better utilization and provisioning of the resources.

This project focuses on the installation and use of an open source cloud management platform to create and manage a private cloud for hosting web applications that require significant computational time. An important aspect of this work is to investigate how the private cloud can be extended to a hybrid cloud when the internal resources are no longer adequate. The focus is on web applications since this is an ever more popular way of providing software, making it available on PCs, tablets, smartphones and any other device that can access it.

The project has 4 main objectives:

• To show that private cloud platforms provide benefits for hosting compute intensive web applications;

• To show that public clouds can be used seamlessly for extending the private resources, thus transforming a private cloud into a hybrid one (i.e. cloud bursting);

• To identify issues that can arise in the usage of private and hybrid clouds for web applications;

• To investigate the possible usage of existing resources for cloud infrastructure.

The OpenNebula [1] software will be utilized for creating a small private cloud that will be used for performing various experiments with a simple web application. The steps of the deployment process will be discussed along with the major issues that arose during the setup. Some of the important capabilities OpenNebula provides will be described. One of these features is the logic that offers control over VMs within Amazon Web Services [2]. The cloud has to be configured to use those external resources in addition to the local setup, increasing its capacity in this way. The web application itself will be deployed on running VMs.

The project does not aim to outline the advantages and disadvantages of any specific web technology or programming language or platform, but some performance aspects of the chosen setup will be discussed.


Chapter 2 Background

2.1 Cloud computing

The cloud computing model has become an important concept in the last few years. Many major companies and independent individuals have given definitions of the Cloud and cloud computing according to their strategies, visions and products [4], but most of them focus on several important characteristics - resources are provided rapidly and on demand, they are highly scalable, and in the case of public services the consumer pays only for what he uses. The resources might be computational power, storage, networks or applications. Therefore IT resources are now elastic - they can be provisioned for the exact duration they are necessary and released when the consumer no longer needs them.

As described in [4], some definitions include other aspects such as Internet-driven service, monitoring and (self-)management, Service Level Agreement (SLA) management, control through an API, and guaranteed Quality of Service (QoS). But those are mostly specific to some platforms or services.

As mentioned earlier there are three main models of cloud computing:

Software as a Service

In general SaaS is a normal web site from a user's perspective, but the consumers of the service often pay a fee to gain access to specific features of the system for a fixed period of time, usually per week, month or year. SaaS software runs in a cloud environment, hence it should be able to serve a number of users together. Popular areas where SaaS software is available are accounting, project management, ERP, CRM, document management and many others [5].

Platform as a Service

The consumers of the service can deploy their applications on top of the cloud platform supported by the provider. The applications must be developed using APIs and programming languages made available by the PaaS provider. Therefore software development is restricted to a specific platform and a limited set of technologies. Since every PaaS provider has its own special features and mechanisms, the developers are creating software intended to run only on a particular platform (i.e., vendor lock-in).

The most popular platforms at the moment are Google App Engine [6] and Microsoft Windows Azure Platform [7].

Infrastructure as a Service

The provider of the service delivers basic resources - computational power, networks, storage, operating systems. Those resources are scalable and elastic. They can be provisioned on demand and made accessible for a minimum time. The consumers of the service have full control over the running virtual machines (VMs), hence they are able to choose all the elements of the software stack that their applications require. But the providers cannot guarantee that the physical resources such as VMs or storage will not fail and they do not provide backup for the data. That is the responsibility of the user.

Other research [4] even differentiates some other services such as Computing, Storage, and Database that usually fit into the models mentioned above, but they are offered as separate products by some cloud providers.

2.2 Main benefits of cloud computing

One of the main features that differentiate cloud computing from traditional computing is that clouds are usually programmable. With PaaS the user is supplied with an API that allows starting, stopping and other operations over workers and storage [8] [9]. In IaaS the API can be used to run, stop, migrate, clone, delete, etc. VMs [8] [11] [12]. In such a context the cloud can indeed be called self-manageable, but it only provides an API that can be used to achieve this. Private cloud management platforms also offer an API for the same purposes as IaaS.

A cloud system can run a number of VMs within a physical machine in complete isolation from each other. As a result applications can be easily provisioned with only the resources they require. In case of under-provisioning another VM can be started in order to handle part of the load generated towards the application, and later destroyed when no longer necessary so the resources can be used for other purposes or applications. Better utilization is an advantage that comes from the virtualization technologies, but cloud platforms provide mechanisms to control many physical nodes from a centralized point, organizing the nodes into server farms and improving the process of their management.

From a user's perspective the IaaS service only provides VMs, storage, etc. All the other details such as virtualization software, host OS, hardware organization, etc. are hidden from the consumer of the service, providing a new level of abstraction.

Undoubtedly better utilization and server consolidation have financial benefits when considering the existing infrastructure that many organizations have. But public IaaS providers ensure that an organization can rent significant external resources instead of purchasing its own. Small or medium-sized businesses can considerably decrease their expenses with such an approach. No capital investment for building an infrastructure, moderate ongoing costs, minimum effort for provisioning new computational power, no staff dedicated entirely to maintaining the hardware setup, and high availability (e.g., Amazon Web Services) are the main benefits for organizations that use IaaS services. A detailed cost comparison between internal IT, managed services, and IaaS approaches is published in chapter 1 of [13].

For bigger organizations that can invest in data centre infrastructure and have the expertise to build and maintain it (e.g., a university) it might be more advantageous to have the resources internally. That is to have a private cloud.

2.3 Virtualization role

The increasing popularity of the cloud computing approach is driven by the advancement of the virtualization technology. It allows different logical machines to share the same hardware but to run isolated from each other. As a result physical hosts can be utilized better and the computing resources can be allocated easily in a very flexible manner. These isolated running operating systems (OS) are called virtual machines (VMs).

A number of virtualization software solutions are available. They usually include a hypervisor, also called a virtual machine manager (VMM). The hypervisor assigns resources to the VMs and lets them operate as if they were running on different machines independently. It runs on the host hardware directly (bare-metal) or as a part of the OS by modifying the kernel or by using the system services. Based on this and on the way the VMs communicate with the physical devices, the virtualization is either full virtualization or para-virtualization [14]. Recently the hardware vendors added a new virtualization feature (Intel VT-x [15], AMD-V [16]) which allows the guest OS to directly interact with the VMM layer [14].
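Whether a host offers the hardware virtualization feature mentioned above can be checked before KVM is installed. The following is a minimal sketch, assuming a Linux host on which the `flags` line of `/proc/cpuinfo` lists `vmx` for Intel VT-x and `svm` for AMD-V. The function name `hardware_virtualization` is hypothetical and is used only for illustration.

```python
def hardware_virtualization(cpuinfo_text):
    """Return 'Intel VT-x', 'AMD-V' or None, based on the CPU flags.

    cpuinfo_text is the content of /proc/cpuinfo on a Linux host.
    The 'vmx' flag indicates Intel VT-x and 'svm' indicates AMD-V.
    """
    for line in cpuinfo_text.splitlines():
        # Only the 'flags' lines carry the feature list.
        if line.startswith("flags") and ":" in line:
            flags = line.split(":", 1)[1].split()
            if "vmx" in flags:
                return "Intel VT-x"
            if "svm" in flags:
                return "AMD-V"
    return None
```

On a real node the function would be called with the text read from `/proc/cpuinfo`. A result of `None` suggests that the extension is absent or disabled in the firmware.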

Popular virtualization hypervisors are VMware vSphere [14], XEN [18], KVM [19], and Microsoft Hyper-V [20]. The last one cannot be used on Linux machines. Red Hat has cut the support for XEN in their new RHEL 6 and has replaced it with KVM.

Therefore the support for XEN for Scientific Linux which is the OS used for the current project is also being dropped. As KVM packages for the latest Scientific Linux 6 are prebuilt and ready to install that makes it a suitable choice for the scenarios in the current dissertation.

Virtualization also applies to other physical resources such as storage and networks.

Though virtualization software abstracts the hardware and provides a flexible and agile way to control it, it is not a cloud itself. A layer that controls it has to be deployed so it can access the entire infrastructure within a data centre and manage all the resources. This layer is called a cloud management platform.


2.4 Choice of cloud management system

The choice of an IaaS platform to be used for this project presents a challenge since software in this field is relatively new and new versions are released in short cycles. For example, OpenNebula had three beta releases between July 2010 and July 2011.

The other widely used cloud platform is called Eucalyptus [21]. Eucalyptus and OpenNebula share their most important features:

• Both projects had first versions in 2008 and are equally mature nowadays;

• Both can be installed and run on the majority of the Linux distributions available;

• Both can use XEN, KVM, and VMware virtualization;

• Both are EC2 compatible;

• Both can be configured to cloudburst using resources in Amazon Web Services (AWS);

• Both have user management, etc.

In [22] different cloud management systems are compared, but they have all changed dramatically since the publication of the article. For example, both OpenNebula and Eucalyptus can exploit EC2 resources and there is now a Graphical User Interface (GUI) available for OpenNebula. The 4caaSt project cloud analysis [23] provides up to date detailed descriptions of existing platforms including OpenStack [24], which is still very new and not as mature as OpenNebula and Eucalyptus.

Eucalyptus is dual licensed and some of the features, such as VMware support and Windows guest OS, are only available in the commercial version. The creators of OpenNebula maintain a blog [25] where practical experience is shared between all the users. OpenNebula has also been proven to work effectively in large scale environments [26].

Based on its proven capabilities and its large and active community, OpenNebula was chosen for the experimental work in this project.

2.5 OpenNebula

OpenNebula is a fully open-source toolkit to build IaaS Private, Public and Hybrid Clouds [1].

Figure 2.1: OpenNebula main components. Reproduced from [1].


The platform consists of two running processes on a front-end or central machine, as shown in Figure 2.1. The first process is called the OpenNebula daemon and it is responsible for controlling the cloud platform modules and the virtual machines. The other process, called the scheduler, decides where to place each virtual machine by accessing the database OpenNebula holds on the front-end. The information stored in the database includes the available resources on the nodes based on the requirements of previously submitted VMs, so the scheduler does not need to contact the hypervisors.
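The idea that placement decisions are made from stored bookkeeping, rather than by querying each hypervisor, can be illustrated with a short sketch. It assumes a simple first-fit policy, which is chosen here only for illustration and is not claimed to be the policy OpenNebula's scheduler applies. The names `Host` and `place_vm` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    free_cpu: float    # free CPU capacity recorded for the host, in cores
    free_memory: int   # free memory recorded for the host, in MB


def place_vm(hosts, cpu, memory):
    """Pick the first host whose recorded free capacity fits the VM.

    The records are updated locally, so later decisions account for
    previously placed VMs without contacting any hypervisor.
    Returns the chosen host name, or None if the VM must stay pending.
    """
    for host in hosts:
        if host.free_cpu >= cpu and host.free_memory >= memory:
            host.free_cpu -= cpu
            host.free_memory -= memory
            return host.name
    return None
```

A VM for which no host has sufficient recorded capacity is not placed, which corresponds to it waiting for the scheduler.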

For each host added to the list of hosts 3 drivers have to be specified so that the daemon can use it. The virtualization driver (VMM driver) is used for communication with the hypervisor installed on the node and to control the VMs running on it. The daemon uses the Transfer Manager (TM) driver to perform operations with the storage system on the host in order to control the VM images. The last driver called the Information Manager (IM) driver allows the daemon to monitor the hosts. Different drivers can be specified and used with different nodes.

Working with OpenNebula requires describing images, virtual networks and VMs in template files that can be later submitted to the daemon. Management of the platform is usually performed through the command line, but there is also a GUI that includes the majority of the functions and replaces the template files - OpenNebula Sunstone.

OpenNebula works with several basic objects through the command line:

• User - represents a user that has access to the system. The command used is oneuser.

• Host - represents a host/node that is controlled by the front-end. A group of drivers has to be specified for each host. The command used is onehost.

• Image - represents an OS image. Its basic parameters are the name of the image, whether it is public and available to all users or not, a short description and the path to the image. Once the image is submitted, OpenNebula copies it to a new location and changes its name. Every image is described with a template file. The command used is oneimage.

• Virtual network - represents a network shared by the VMs. Networks are either ranged or fixed. For ranged networks the only required parameters are the size (B, C), the network address and a bridge. OpenNebula assigns IPs to the VMs automatically. Fixed networks are defined with a set of IPs, possibly MAC addresses, and a bridge. The number of addresses specified is the maximum number of VMs that can use the network. Networks are also described with a template file. The command is onevnet.

• Virtual machine - represents a guest OS or a VM. It is also described with a template file. The basic parameters are CPU, memory, image and network. OpenNebula automatically assigns an IP (if available) to the instance and places it on a host of its choice. Every VM may be in one of several states. The most significant of them are pending (waiting for the scheduler to place it somewhere), running, stopped and failed. The command is onevm.
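The relation between the size of a ranged network and the number of VMs it can serve can be worked out with the standard library. The sketch below assumes that sizes B and C correspond to the classful /16 and /24 prefixes and that every usable host address is available to a VM. Whether OpenNebula reserves further addresses is not covered here. The function name and the example addresses are hypothetical.

```python
import ipaddress

# Prefix lengths assumed for the classful sizes mentioned in the text.
CLASS_PREFIX = {"B": 16, "C": 24}


def ranged_network_hosts(network_address, size):
    """Return the usable host addresses of a class B or class C network."""
    prefix = CLASS_PREFIX[size.upper()]
    network = ipaddress.ip_network(f"{network_address}/{prefix}")
    # hosts() skips the network and broadcast addresses.
    return list(network.hosts())


if __name__ == "__main__":
    addresses = ranged_network_hosts("192.168.0.0", "C")
    print(len(addresses), "addresses, starting at", addresses[0])
```

Under these assumptions a class C ranged network provides 254 addresses, which bounds the number of VMs that can be attached to it.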


The users list on the OpenNebula home page [1] is often updated. The most significant companies and projects that control internal resources with it are CERN, China Mobile, ESA, KPMG, Fermilab, Telefonica, Reservoir, StratusLab, BonFIRE, etc.

The creators of OpenNebula have suggested two ways [27] to increase the internal infrastructure's computing capabilities with Amazon EC2 VMs - for a virtual cluster and for a web server farm. For both of their scenarios they use a VPN channel in order to make the remote nodes a part of the private network. The network latencies caused by the Internet connection and especially by the VPN channel affect the performance, but the results clearly show improvement of the throughput when using EC2 instances. However, for the web server benchmarking they have only performed tests for accessing static files.

2.6 Amazon Web Services (AWS)

Amazon has described several use cases [28] covering many IT services such as application hosting, backup and storage, databases, e-commerce, HPC, web hosting, search engines and on-demand workforce. For each case they offer a group of products that can be exploited for deployment of a high quality service.

The major products and services available are:

• Elastic Compute Cloud (EC2) - provides compute capacity;

• Simple Storage Service (S3) - a scalable storage;

• Elastic Block Store (EBS) - represents block level storage that is designed to be utilized with EC2;

• SimpleDB - non-relational database;

• Relational Database Service (RDS) - highly scalable RDBMS in the cloud;

• Simple Queue Service (SQS) - highly scalable queue;

• CloudWatch - provides more detailed monitoring for the resources being used;

• CloudFront - used for distributed delivery in order to make content easily accessible with low latencies and fast data transfers.

The services exploited in the current project are EBS and EC2. EBS is only used to store the Amazon Machine Image (AMI) so its part is insignificant. The AMI is used as a prototype of a VM. Amazon provides many images that are preconfigured and publicly available for common needs. Once the image is chosen, a key to access the running VM is provided, a security group that acts as a firewall has to be specified, and the type of instance must be selected for the image. The fee paid for a running VM depends on the instance type.


Instance type | Memory | ECUs | I/O Perf. | Storage | Platform | $ per hour
--------------|--------|------|-----------|---------|----------|-----------
Micro | 613 MB | Up to 2 | Low | EBS only | 32/64 bit | 0.035
Small | 1.7 GB | 1 | Moderate | 160 GB | 32 bit | 0.095
Large | 7.5 GB | 4 | High | 850 GB | 64 bit | 0.38
Extra Large | 15 GB | 8 | High | 1690 GB | 64 bit | 0.76
High-Memory Extra Large | 17.1 GB | 6.5 | Moderate | 420 GB | 64 bit | 0.57
High-Memory Double Extra Large | 34.2 GB | 13 | High | 850 GB | 64 bit | 1.14
High-Memory Quadruple Extra Large | 68.4 GB | 26 | High | 1690 GB | 64 bit | 2.28
High-CPU Medium | 1.7 GB | 5 | Moderate | 350 GB | 32 bit | 0.19
High-CPU Extra Large | 7 GB | 20 | High | 1690 GB | 64 bit | 0.76
Cluster Compute Quadruple Extra Large | 23 GB | 33.5 | Very High | 1690 GB | 64 bit | 1.60*
Cluster GPU Quadruple Extra Large | 22 GB | 33.5, 2 x NVIDIA Tesla "Fermi" M2050 GPUs | Very High | 1690 GB | 64 bit | 2.10*

Table 2.1: EC2 instance types.

All instance types available with the EC2 service are shown in Table 2.1. There are configurations for a wide range of requirements, including a cluster with a 10 Gigabit network and a GPU cluster, hence there are resources which are suitable even for HPC applications.

The prices shown are for Linux usage in the data centre in Ireland; however, clusters are only offered in the main data centre in the US. A free tier of services, including micro instances only, is offered to newly registered users for their first months. One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor [2].
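Since the price and the ECU count of each instance type are both listed in Table 2.1, the cost of one ECU for one hour can be derived directly. The sketch below uses a subset of the rows of Table 2.1. The helper name `ecu_prices` is hypothetical, and the ECU count is treated as the only measure of capacity, which is a simplification because memory, storage and I/O performance also differ between the types.

```python
# (instance type, ECUs, $ per hour) copied from Table 2.1
INSTANCES = [
    ("Small", 1, 0.095),
    ("Large", 4, 0.38),
    ("Extra Large", 8, 0.76),
    ("High-CPU Medium", 5, 0.19),
    ("High-CPU Extra Large", 20, 0.76),
]


def ecu_prices(instances):
    """Map each instance type to its price in $ per ECU per hour."""
    return {name: round(price / ecus, 4) for name, ecus, price in instances}


if __name__ == "__main__":
    for name, value in ecu_prices(INSTANCES).items():
        print(f"{name}: ${value} per ECU-hour")
```

For the listed rows the standard types come to $0.095 per ECU-hour and the High-CPU types to $0.038 per ECU-hour, which suggests that the High-CPU types are the more economical choice when CPU capacity is the dominant requirement.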

EC2 services are organized in different data centres called regions. Amazon isolates the regions from each other to achieve greater fault tolerance, improve stability, and help prevent issues within one region from affecting another [29].

A performance comparison between the basic instance types for a geosciences application is described in [30]. The authors achieved good CPU utilization for small, medium, and large instances, but the problem does not efficiently make use of instances that are bigger.

Though all of Amazon's products are advertised as highly available, flexible, and scalable, they do not fit every problem. As shown in [31], both the financial model and the performance of S3 are not suitable for big scale scientific projects since they have different requirements for data usage. However, benchmarking AWS and any other cloud services in general is problem specific, therefore every research effort involving a different application will provide new results.

The AWS Management Console available on the AWS website provides a convenient way to control EC2 instances and the other services Amazon offers. The web interface uses the EC2 API Tools [32] which are also available to the users so they can control the services remotely. These tools provide all the operations as a command line utility. They include many operations such as registering, deploying and terminating an instance, listing all the running instances or all the regions, security group operations, etc. Installing the tools is straightforward - they only require a certificate and a private key so the user can authenticate remotely. Several environment variables have to be set up to specify the key, the certificate and the URL of the region that will be used.
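A small check can confirm that the environment is prepared before any of the tools is invoked. The sketch below assumes that the key, the certificate and the region URL are supplied through the variables `EC2_PRIVATE_KEY`, `EC2_CERT` and `EC2_URL`, which are the names commonly documented for the EC2 API Tools. The exact set should be verified against the installed version. The helper `missing_ec2_settings` is hypothetical.

```python
import os

# Variable names assumed from the EC2 API Tools documentation:
# path to the private key, path to the certificate, region endpoint URL.
REQUIRED_VARIABLES = ("EC2_PRIVATE_KEY", "EC2_CERT", "EC2_URL")


def missing_ec2_settings(environ=None):
    """Return the names of the required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARIABLES if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_ec2_settings()
    if missing:
        print("Not configured:", ", ".join(missing))
    else:
        print("EC2 API Tools environment looks complete")
```

Passing a dictionary instead of the process environment makes the check easy to exercise without touching the real configuration.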

2.7 Web applications

The majority of web sites or web services are deployed within physical or virtual servers. If the capacity that the underlying hardware provides is not enough and cannot reasonably handle peaks, the response time will increase and the throughput of the architecture might be unacceptably low. If the hosting hardware's performance is good enough to serve the peaks, then it is probably over-provisioned when the demand is not high. The elastic dynamic provisioning that cloud computing services offer can be exploited for handling unexpected loads and for keeping the provisioning close to the exact requirements of the application by adding virtual machines representing workers. Later on those added workers can be destroyed with minimum effort.
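The provisioning decision described above reduces to simple arithmetic once the demand and the throughput of a single worker are known. The sketch below assumes that throughput scales linearly with the number of workers, which is an idealisation: shared resources such as the network can limit scaling in practice. The function names are hypothetical.

```python
import math


def workers_needed(demand_rps, worker_rps):
    """Smallest number of web servers whose combined throughput covers demand.

    demand_rps: incoming requests per second
    worker_rps: requests per second one worker can serve
    At least one worker is always kept running.
    """
    if worker_rps <= 0:
        raise ValueError("worker_rps must be positive")
    return max(1, math.ceil(demand_rps / worker_rps))


def scaling_action(running, demand_rps, worker_rps):
    """Positive result: VMs to start. Negative result: VMs that can be destroyed."""
    return workers_needed(demand_rps, worker_rps) - running
```

For example, with 4 running workers that each serve 10 requests per second, a peak of 95 requests per second calls for 6 additional VMs, while a drop to 15 requests per second allows 2 of them to be destroyed.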

The most notable example of exploiting public cloud computing's self-provisioning mechanisms is Animoto [33] [34] [35] [36]. This web application allows its users to generate movies with their pictures and music uploaded from their computer or from another web location. The templates for the movies and a small playlist are available on the Animoto servers so some of the data that the algorithms use is not uploaded by the users. One movie generation takes several minutes of processing, which makes the application very suitable for a web server farm where each request is processed independently. Using EC2, S3 and SQS, Animoto automatically scaled in just a few days from several hundred instances to thousands of instances [34].

Other compute intensive applications that can make effective use of cloud computing are those that do not require big input data or whose data is already on the server, such as data mining, map browsing, etc. However they present another challenge for the developers - distributing huge volumes of data.

2.8 Load balancing

Different load balancing techniques are discussed and compared in [37] in the context of public cloud services. The hardware-based Load Balancer (LB) is very scalable, efficient and robust, but it is usually not offered in public cloud environments and it is typically expensive. Domain Name System (DNS) load balancing is another option, but it also has serious disadvantages: DNS servers cache information for a fixed period of time, which prevents elasticity. Layer 2 optimization approaches are disabled in Amazon EC2 because of security issues. Software load balancing is not very scalable because of CPU and network bandwidth limitations. To overcome these constraints the authors of [37] suggest a client-side load balancing scheme which indeed seems to be very scalable, but it targets AWS. The client accesses a page stored in S3 which is empty but includes JavaScript that holds the logic for the load balancing. The script executed on the client's machine decides which server to ask for the content of the page.

However such a client-side mechanism will not fit into the private cloud platform because the web servers are expected to have private addresses which are not accessible directly from the clients. Layer 2 optimizations will not be possible for cloud bursting with Amazon. DNS cannot deal with short peaks and hardware-based load balancers are still expensive.

For compute intensive applications the CPU limitation of the software LB might not affect the performance, but the network bandwidth will most probably be an issue for a range of problems. Nevertheless, a software LB is the easiest load balancing mechanism to deploy and control within the private cloud, and it allows scaling out to public services.

Figure 2.2: Horizontally scaled web servers.


The classic way to scale web servers is horizontal scaling [36], shown in Figure 2.2. The users request content from a domain or IP address that is associated with the load balancer (LB). The LB redistributes the incoming requests to the workers using a specific algorithm. The web servers can concurrently access shared content, such as a database or file system, if necessary. The shared resources might also be distributed. When a worker sends a response to the client, the traffic goes through the LB again.

2.9 Benchmarking web servers

There are existing tools designed to benchmark web servers. One such benchmark is SPECweb2005 [38] [39], which includes 3 different tests:

- Banking: simulates the load that users would generate on a banking system. It aims at testing SSL connections and is therefore the most CPU intensive of all.

- Ecommerce: simulates an e-commerce website where a mixture of HTTP and SSL connections occur.

- Support: focuses only on HTTP connections and simulates a support website where the users browse but also download big files, up to 40MB.

The benchmark comes with ready-made PHP or JSP applications and the tests are run against them. However, the current project aims at compute-intensive applications that do not fit any of the cases covered by SPECweb2005.

Another powerful tool is httperf [40] [41]. It is designed to generate and sustain server overload [40]. The tool provides many options and it is intended to be universal. However, its main goal is also benchmarking the components of a system rather than the application itself. All the generated results are based on the received responses, but when benchmarking the current application it is also important to know whether all the calculations have been successful and no errors have occurred during the execution.

Therefore a different approach for benchmarking the entire setup with the application has to be considered.


Chapter 3 Web server farm design

In order to build a small private cloud, two physical machines with equivalent configurations were deployed on the University of Edinburgh gigabit network. Each one has a 4-core Intel CPU and 8GB of memory. The first machine (i.e. Node 1 in the setup) is used as the front-end for the cloud (where the OpenNebula software runs) and also as a node, or host, for VMs. The other one (Node 2) only hosts VMs.

3.1 Cloud software stack

OpenNebula requires a number of other tools and technologies in order to be able to start and stop VMs by using the virtualization hypervisor, Kernel-based Virtual Machine (KVM).

Figure 3.1: Standard OpenNebula cloud organization

The standard cluster-like style of organizing the resources used by the cloud platform is shown in Figure 3.1. Normally the OpenNebula daemon and scheduler run on a front-end separately from all the workers. However, the hardware available for the project is limited and dedicating a whole machine to the front-end is not appropriate. Instead, the front-end and one of the hosts are in fact the same machine, as shown in Figure 3.2. The OpenNebula software and the virtualization software run together on node 1.

Figure 3.2: The actual OpenNebula cloud organization used

3.1.1 Intel VT-x

This hardware-assisted Intel Virtualization Technology combined with software-based virtualization solutions provides maximum system utilization [15]. It represents an extension that allows issuing of specific privileged instructions for x86 architectures from the guest OSs through the hypervisor. The hardware virtualization support is required by the KVM kernel modules in order to perform full virtualization. It is available on many Intel processors [42].

3.1.2 Scientific Linux 6 (Host OS)

Scientific Linux is the main Linux distribution used within EPCC. It supports KVM, and all the virtualization software is available as prebuilt packages.

3.1.3 KVM kernel modules

The Kernel-based Virtual Machine (KVM) software includes two loadable kernel modules: a common KVM module and a chip-specific one (kvm-intel or kvm-amd). They extend the OS kernel with functionality that is not available within the OS core, thus transforming it into a hypervisor. When the modules are loaded the hypervisor can host Linux and Windows machines. Each machine has private virtualized hardware: a network interface card (NIC), disk, graphics adapter, etc. [19].

3.1.4 Libvirt

Libvirt [43] is an API, or toolkit, that provides control over hypervisors running on a modern Linux system. The main goal of the project is to provide a common and stable layer sufficient to securely manage domains on a node [43], where a domain is an OS running on a VM. Libvirt can also be managed remotely. In the current setup OpenNebula controls the KVM hypervisor by interacting with the libvirt API on each node separately.

Libvirt also installs a private bridge on the host so that the VMs running on it can be easily connected in a small private Network Address Translation (NAT) based network.

3.1.5 OpenNebula

OpenNebula calls libvirt functions for all the VM operations such as creating, destroying, shutting down, migrating, etc. The OpenNebula daemon running on the front-end must be able to connect to the nodes through password-less SSH. The official installation instructions [44] suggest that all the nodes use a single private key copied to the shared location on the front-end. However, in the current setup it is more secure to have an individual private key for each machine, to allow password-less connections only from the front-end, and not to expose private keys in a public network. For a large number of nodes it is easier to have only one key.

The OpenNebula administrator's home folder on the front-end is exported on the network, mounted on each node and used as the home folder there as well. In this way all the hosts share the same SSH settings and have access to centralized storage for all the running images.

The latest stable version at the time the practical work of the project started was OpenNebula 2.2, so it is the one installed.

3.2 Web server software stack

3.2.1 Java

Java [45] as a language and platform is widely used for web applications and it is also often used for scientific applications, so there are many existing libraries, such as the one chosen for the project, JLapack [46]. Though the Java Virtual Machine (JVM) might have some limitations, especially with its heap memory size, its performance is still sufficient for the current setup.

Furthermore, there are a number of Java web containers to host web applications. Tomcat [47] is an open-source, widely used and robust Java Servlet container that powers many large-scale applications.

3.2.2 Load balancing/proxy server

As discussed in 2.8, a software load balancer is not the most scalable solution, so an efficient server that proxies the incoming traffic is a necessity. Nginx [48] [49] is a stable, secure and high-performing web server and proxy server. It can be easily configured and its ability to handle requests asynchronously makes it very efficient in terms of memory and CPU usage [49]. In the current setup nginx acts as a load balancer.

3.3 Private cloud web server farm setup

Figure 3.3: Private cloud web server farm

The private cloud server farm shown in Figure 3.3 hosts the web application. Clients send requests to the load balancer, which only proxies them to the workers. The LB and the web servers are VMs controlled entirely by OpenNebula: deployed, monitored and later deleted. The OpenNebula scheduler decides where to put each worker. However, the LB has to be placed on a machine that is connected to a public network. Currently both nodes are connected. Another option, in case no node can be accessed from the outside, is to configure a port on the front-end that forwards the traffic to the proxy. The web servers have to be visible to the LB.


3.3.1 Networking

Figure 3.4: Standard private cloud network configuration

The ideal configuration for a cloud would normally include a dedicated private network for the exclusive use of the cloud, so that the administrator has complete control over the network, as illustrated in Figure 3.4. Then, for example, several NICs or dual/quad-port NICs can be added to each node and connected to the network switch so the VMs do not share the same physical Ethernet interface. The front-end is the only machine connected to the public network, but it also has access to the private network for the cloud in order to access each host. IP addresses should be assigned to the front-end, nodes and VMs. An example IP address allocation is listed in Table 3.1.

In such a configuration a major problem will be where the LB is deployed. Several different approaches exist. The LB might be installed on the front-end as a VM or on the physical machine directly. It can be also deployed on any node and a port from the front-end re-directed to the internal IP or an additional Ethernet interface can be added to one of the hosts so the LB VM can use it and connect directly to the public network.

Machine      Address
Network      192.168.0.0
Front-end    192.168.0.2
Node 1       192.168.0.3
Node N       192.168.0.10
VM 1         192.168.0.11
VM 2         192.168.0.12
VM Y         192.168.0.100

Table 3.1: Theoretical IP addresses organization for private cloud
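The allocation in Table 3.1 can be reproduced with a short sketch. It assumes a /24 prefix for the 192.168.0.0 network, 8 nodes and 90 VMs (so that Node N is 192.168.0.10 and VM Y is 192.168.0.100); the prefix length, the counts and the function name plan_addresses are assumptions made for illustration.

```python
import ipaddress


def plan_addresses(network="192.168.0.0/24", nodes=8, vms=90):
    """Assign consecutive addresses to the front-end, the nodes and the VMs."""
    net = ipaddress.ip_network(network)
    base = net.network_address
    # The highest offset used must stay below the broadcast address.
    if 2 + nodes + vms > net.num_addresses - 2:
        raise ValueError("network too small for the requested plan")
    plan = {"Front-end": base + 2}
    for i in range(1, nodes + 1):
        plan["Node %d" % i] = base + 2 + i
    for j in range(1, vms + 1):
        plan["VM %d" % j] = base + 2 + nodes + j
    return plan


plan = plan_addresses()
```

With these assumed counts the front-end, the first and last node, and the first and last VM land on the addresses listed in the table.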

A VM can only use the network through a virtualized NIC, which is the major performance bottleneck when there is intense data transfer into or out of the VM.

Connecting a VM to a local or public network and making it accessible for others is achieved through Ethernet bridging [50] [51] [44]. For each physical host a virtual interface representing a bridge has to be added and logically attached to the physical network interface. Thereby the VMs running on the node will be able to access the local network and the other machines will be also able to see them. However the traffic goes through the bridge which is shared between all the VMs and the host OS.

There is also another approach available for VMs networking. Each machine needs to have access to the local network through User Networking [50] and thus gaining access to a VPN server. As a consequence all the VMs along with any other machine that is able to contact the server can be organized together in a private network with OpenVPN [52], for example. However this approach is not efficient and the encrypted VPN connection represents additional overhead to the network performance. It should be used if the limitations caused by the decreased network bandwidth are not critical.

Full control over the network was impossible for the current project and, due to EPCC policies, the machines for the cloud were deployed within a public network. In this way the hosts can be directly accessed from the Internet and they are not controlled by the LCFG [53] system used by the EPCC support.

Having the nodes for the cloud in a public network causes several issues. First, the IPs cannot be assigned and used freely because they are public. Second, the gigabit network the machines are connected to is used by other people, so its performance can vary over time. Furthermore, the LB receives the incoming traffic and a moment later redistributes the requests to the workers using the same network, which means sharing the same bandwidth.

The actual private cloud web server farm used in this project includes the LB and 7 web servers, organized as 4 VMs on each node, as shown in Figure 3.5. The LB and three of the workers are deployed on node 2 and configured to use the private NAT-based network that libvirt provides (bridge virbr0) in order to avoid using public IPs. The LB has another virtual network interface that connects it to the public network through the public bridge (br0), making it accessible from the Internet. The web servers deployed on the front-end, however, require public IPs in order to receive requests from the LB.

Figure 3.5: The actual private cloud network configuration

Though connected in different ways, the network bandwidth between the LB and the workers running on node 1 and node 2 is similar: 110-130Mbits/sec (see B.3 and B.4). This is the maximum outgoing traffic through a bridge with the bridge-utils package provided by the OS, which means the throughput in this direction is roughly 8 to 9 times lower than the nominal 1Gbit/sec of the network.

Obviously because of the limitations, a specific mapping of the VMs over the hosts has to be done. However, in a standard configuration the only requirement might be for the LB location.

3.4 Hybrid cloud web server farm setup

OpenNebula provides functionality to control EC2 instances through the EC2 API tools. Instances can be easily launched and terminated from the private cloud in the same way the internal VMs are controlled - thus making the cloud hybrid. However OpenNebula does not fully use the EC2 command line tools. Furthermore, OpenNebula does not provide monitoring for the running EC2 instances.

The current setup, shown in Figure 3.6, represents the standard configuration suggested in [54]. The running EC2 instances are accessible remotely, and from the LB in particular, so it can use them as workers and forward requests to them in the same way it does for the local workers. However, the data sent across the Internet might be read by third parties, which could be a problem if the application serves the internal needs of an organization. If that is the case then setting up a VPN should be considered, but as discussed in 3.3.1 the encrypted connections introduce additional overhead. In the current setup it is accepted that the data is not sensitive.


The maximum network bandwidth from the LB to any running EC2 instance is also limited by the bridge to 110-130Mbits/sec, but in this case it can vary since it depends on the Internet connection.

Figure 3.6: Hybrid cloud web server farm. Reproduced from [54].

3.4.1 VM performance comparison

The main issue when involving VMs outside of the private infrastructure to cooperate with local resources is that the external provider probably does not offer instances that exactly match the performance of the internal machines in terms of CPU, memory and network bandwidth. Consequently, the first action to undertake when using the hybrid cloud is to establish which of the EC2 machines shown in Table 2.1 has similar performance to a local web server. A significant difference in the performance of the internal and the external machines might introduce additional load imbalance.

Micro instances, included in the free tier, were not tried. They are intended for less demanding applications that may still consume significant compute cycles periodically, and they allow bursting CPU capacity when additional cycles are available [55]. Though significantly cheaper than the other instances, they are not predictable in terms of CPU and only provide 613MB of memory, which is not enough for the application.

The machines that look similar to the local workers are the small instance and the High-CPU medium instance. In order to compare them with the local worker, the application was run on each of them; the average timings from 20 executions are shown in Figure 3.7.
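Averaging over repeated executions, as done for these comparisons, can be sketched as follows. This is a generic timing harness and not the benchmark code used in the project; the function name average_time_ms and the placeholder task are hypothetical.

```python
import time


def average_time_ms(task, runs=20):
    """Run task() the given number of times and return the mean wall-clock time in ms."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        task()
        total += time.perf_counter() - start
    return 1000.0 * total / runs


# Placeholder workload standing in for one request to the application.
mean_ms = average_time_ms(lambda: sum(i * i for i in range(10000)))
```

Running the same harness against each candidate machine gives directly comparable mean timings per problem size.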


Figure 3.7: Performance comparison between EC2 instances and local VM. Single executions.

The small instance is far from matching the performance of the local worker. The medium instance, though also slower for single executions, has 2 logical CPUs and, when using a thread pool to execute 10 tasks concurrently (Figure 3.8), outperforms the local web server. The difference is not dramatic, so for the purposes of the hybrid cloud the High-CPU medium instance will be used.

The results on Figure 3.8 are also average timings obtained from 20 executions.

Figure 3.8: Performance comparison between EC2 instances and local VM. Thread pool with size 10.

[Figure 3.7 plot: execution time (ms) against problem size (50 to 225); series: local VM, m1.small, c1.medium.]

[Figure 3.8 plot: execution timings (s) against problem size (50 to 225); series: local VM, m1.small, c1.medium.]

3.5 System of linear equations solver

3.5.1 The algorithm

The chosen algorithm consists of 2 separate operations: LU factorisation and partial pivoting. Their complexities are O(N³) and O(N²) respectively. The data grows as N² [56]. Even though it is an unusually efficient algorithm [56], the increased amount of context switching caused by the virtualization should affect the performance.
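To make the structure of the computation concrete, the following is a minimal pure-Python sketch of solving a linear system by LU factorisation with partial pivoting. It is not the JLapack routine used in the project, only an illustration of the same technique; the function name lu_solve is hypothetical.

```python
def lu_solve(a, b):
    """Solve a x = b using LU factorisation with partial pivoting.

    a is a list of N rows of N numbers, b is a list of N numbers.
    The inputs are copied, so the caller's data is left unchanged.
    """
    n = len(a)
    lu = [list(row) for row in a]
    x = list(b)
    for k in range(n):
        # Partial pivoting: choose the row with the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            x[k], x[p] = x[p], x[k]
        # Elimination: store the multipliers of L below the diagonal.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            factor = lu[i][k]
            for j in range(k + 1, n):
                lu[i][j] -= factor * lu[k][j]
    # Forward substitution with the unit lower triangular factor.
    for i in range(n):
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    # Back substitution with the upper triangular factor.
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


solution = lu_solve([[2.0, 1.0, -1.0], [-3.0, -1.0, 2.0], [-2.0, 1.0, 2.0]],
                    [8.0, -11.0, -3.0])
```

The triple loop of the elimination gives the O(N³) cost, while the pivot search and the two substitutions each take O(N²) operations in total.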

Though the problem is not particularly suited to the web, it is simple enough to deploy and test, and it represents a compute load similar to applications such as the image processing algorithms used by image hosting websites, mathematical services such as [57], etc. So this application's performance should be comparable for all O(N³) problems where the input data increases with N².

Such applications differ from standard web applications not only because they require more processing time, but also because they need much larger input data. Figure 3.9 shows how the amount of data sent to the web server grows with the problem size. The data includes the 2-dimensional N² array with the parameters and the 1-dimensional N array representing the Right Hand Side (RHS).
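The growth of the input can be estimated with a small calculation. The count of N² + N values follows from the description above; the 8 bytes per value is an assumption (a binary double) that gives only a lower bound, since the encoding actually used for the HTTP request is not stated here and a text encoding would be larger. The function names are hypothetical.

```python
def payload_values(n):
    """Number of values sent for problem size n: the N x N matrix plus the RHS vector."""
    return n * n + n


def payload_kib(n, bytes_per_value=8):
    """Rough payload size in KiB, assuming a fixed number of bytes per value."""
    return payload_values(n) * bytes_per_value / 1024.0


values_50 = payload_values(50)
values_250 = payload_values(250)
```

Going from N = 50 to N = 250 multiplies the number of values by roughly 25, which is the quadratic growth visible in Figure 3.9.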

Figure 3.9: Size of data transferred in relation to problem size

3.5.2 The web application

The web application consists of a simple Java Servlet [58] which accepts HTTP POST data parameters sent from the clients in the same way files are sent. The input data is organized into arrays and passed to the JLapack methods that solve the system. After the calculations have finished the Servlet sends a response to the client.

[Figure 3.9 plot: data size (kbytes) against problem size (50 to 250).]

The application is built [59] and the generated war file is deployed into the Tomcat container.

3.6 Running the cloud

3.6.1 Preparations

Using the commands OpenNebula provides, several actions were undertaken:

- 3 hosts were added and configured so OpenNebula can access them, deploy and manage VMs. These hosts are node 1 (i.e. the front-end), node 2, and a host that represents the EC2 services.

- 2 networks were added: a private and a public one. The private network is used for the NAT-based communications that take place on node 2, including the LB and workers. The public network connects the LB and the web servers on node 1.

- 1 image was added so the VMs can use it. It is configured to start the application on boot. The same image is used for the LB, where manual adjustments are required to set the list of web servers and reload the configuration of the proxy. The image contains scripts provided by OpenNebula for configuring the network interfaces based on their Media Access Control (MAC) addresses so that each VM can start with networking.

- Several templates were written so they can be used to start VMs both locally and in EC2. Templates are simple text files that OpenNebula uses for describing images, networks and VMs.

Using AWS Management Console a few additional steps were completed:

- An Amazon Machine Image (AMI) was prepared with functionality similar to that of the local image and stored in EBS so it can be used to start new instances. Since EBS storage is charged for, a snapshot of a running instance could instead be made and downloaded locally, but because of the limited duration of the hybrid cloud experiments, and for simplicity, the AMI was stored on EBS.

- The default security group was altered so it can accept connections on the required ports.

3.6.2 Resources allocation

Each host has 8GB of memory with 0.5GB of that reserved for the host itself. The LB is configured to use 1.5GB and the workers 1.87GB each so all the workers have the same resources available.
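The memory budget can be checked with a few lines of arithmetic. The figures come from the text; the placement (4 workers on node 1, the LB plus 3 workers on node 2) is the one described in section 3.3.1, and the helper name memory_used is hypothetical.

```python
HOST_MEMORY_GB = 8.0
HOST_RESERVED_GB = 0.5
LB_GB = 1.5
WORKER_GB = 1.87


def memory_used(workers, with_lb=False):
    """Memory in GB requested by the VMs placed on one host."""
    return workers * WORKER_GB + (LB_GB if with_lb else 0.0)


available = HOST_MEMORY_GB - HOST_RESERVED_GB   # memory left for VMs on each host
node1 = memory_used(4)                          # four workers, about 7.48GB
node2 = memory_used(3, with_lb=True)            # LB plus three workers, about 7.11GB
```

Both placements fit into the 7.5GB left for VMs, while a fifth worker on either host would not.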

Each host has a 4-core CPU, so the most natural configuration is 1 core per VM, to avoid several of them using the same core. Unfortunately, at least one of the workers will have to share a core with the host.


A user might expect a cloud management platform to pin each VM to a particular core of the underlying hardware. However, the only reason OpenNebula requires the CPU parameter seems to be to check that the sum of the CPU power requested by all the VMs does not exceed what is available. For example, if the host has 2 cores and there are 2 VMs running, each submitted with 0.7 CPU, the system will not allow another one with 0.7, but a VM that requires 0.6 will be accepted. In other words, OpenNebula does not map CPU cores to running VMs. KVM, however, represents running VMs as Linux processes, which allows the OS scheduler to schedule them [60], and they can also be controlled manually with virsh [61]. According to the statistics returned by the top command, the physical CPU utilization during the experiments was close to the maximum, which means that the virtualization software and the OS perform a reasonable mapping of VMs to physical resources, so the cloud platform does not need to do it explicitly. Nevertheless, cloud management systems might benefit from a feature that allows more precise control over the mapping of VMs to CPU cores.
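The capacity check described above can be sketched as a simple admission rule. This mirrors the behaviour observed in the example with 2 cores and is not OpenNebula's actual scheduler code; the function name can_admit is hypothetical, and CPU shares are expressed in hundredths of a core to avoid floating-point rounding.

```python
def can_admit(host_cores, running_cpu, requested_cpu):
    """Return True if the requested CPU share still fits on the host.

    running_cpu is a list of the shares of the VMs already running and
    requested_cpu is the share of the new VM, all in hundredths of a core.
    """
    return sum(running_cpu) + requested_cpu <= host_cores * 100


# Two VMs with 0.7 CPU each are already running on a 2-core host.
rejects_070 = not can_admit(2, [70, 70], 70)
accepts_060 = can_admit(2, [70, 70], 60)
```

The rule only compares totals, so it says nothing about which physical core a VM actually runs on.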

3.7 Running the application

3.7.1 Load Balancer

The LB's only responsibility is to transfer the requests to the workers. Two different approaches were attempted. The first one, the LB running on a VM, was already described in section 3.3. This approach exhibits network bandwidth limitations caused by the bridge. The bandwidth between a physical machine on the network representing the client and the LB VM varies between 250Mbits/sec and 600Mbits/sec (see B.1), and from the LB to the workers it is 110-130Mbits/sec (see B.3 and B.4). In order to partially overcome this issue a second approach was tried: installing the LB on the physical host. By eliminating the data transfer from a machine that uses a virtualized network device the bandwidths are better: more than 900Mbits/sec to the LB (see B.2), and from the LB around 350Mbits/sec to the public workers on node 1 (see B.5) and 630-690Mbits/sec to the local workers using the NAT-based network (see B.6). However, the LB deployed on a physical host has one significant disadvantage: if the host fails, another machine has to be configured, instead of only deploying a new Load Balancer VM (LB VM). The LB VM networking can be improved by using several NICs or one multi-port NIC.

The configuration of the nginx proxy (i.e., the LB) is trivial. It requires a list of servers that will handle the incoming traffic. The default module for load balancing includes Round-Robin (RR) and weighted RR algorithms [62]. Load balancing of requests with varying sizes can be accomplished with the fair upstream module [63]. It knows the number of requests served by each server and redistributes the incoming requests to the least loaded server. For the goals of the project using RR is sufficient.
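The difference between the two selection strategies can be illustrated with a minimal sketch. It does not reproduce the nginx implementation or the fair upstream module; the class names and the worker names w1, w2 and w3 are hypothetical.

```python
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class LeastLoaded:
    """Hand out the server with the fewest requests currently in progress."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # Ties are resolved in favour of the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def done(self, server):
        # Called when a request handled by the server has finished.
        self.active[server] -= 1


rr = RoundRobin(["w1", "w2", "w3"])
rr_order = [rr.pick() for _ in range(4)]
```

Round-robin ignores how long each request takes, which is acceptable when all requests have a similar cost; the least-loaded strategy adapts when request sizes vary.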

Adding new servers to the setup does not require stopping the proxy. The configuration is reloaded by sending a signal to the parent (master) process, which starts new worker processes with the updated configuration and gracefully shuts down the old ones.


3.7.2 Web server

Each worker has the application war file deployed in Tomcat, which starts on boot. The Java heap memory size seems to have a major impact on the performance of the application and the web servers. The default size is clearly insufficient for memory demanding applications, so it has to be increased. The more memory that is allocated to the heap, the bigger the capacity of the application.

Usually web servers process a large number of clients' requests concurrently. However, the current application requires more time to generate a response, and if many threads are working concurrently on the server this may worsen the performance.

Figure 3.10: Throughput generated by 1 web server for N = 50, 150, 250 for different number of running threads

Tomcat works with a thread pool that handles the requests. Figure 3.10 illustrates the throughput (number of requests served per second) for the application running on 1 web server. If the application were running on an OS installed directly on the hardware, the increased context switching would have a significant impact on the performance. Despite this, Figure 3.10 shows that the throughput is nearly constant for any number of threads, whatever the problem size, which means that the context switching impact is negligible compared with the virtualization overhead. However, the average time to solve each request is longer with a bigger number of threads, and in a real application such an effect would be undesirable. Furthermore, if many threads run concurrently and solve big problem sizes, the setup will quickly reach the maximum Java heap size and run out of memory. So in order to keep the capacity of the servers high and the average time to solve each request reasonable, a small thread pool size should be chosen. All the other experiments in the project have been performed with 10 threads working on each web server.
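The relation between a constant throughput and a growing response time follows from Little's law: the mean time a request spends in the server equals the number of requests in progress divided by the throughput. The sketch below assumes all pool threads are busy; the throughput of 5 requests per second is a hypothetical value, not a measurement from Figure 3.10.

```python
def mean_response_time(concurrent_requests, throughput):
    """Mean time per request in seconds, by Little's law (W = L / X)."""
    return concurrent_requests / throughput


# Hypothetical constant throughput of 5 requests per second.
time_pool_10 = mean_response_time(10, 5.0)
time_pool_150 = mean_response_time(150, 5.0)
```

Under this assumption a pool of 150 threads makes each request wait 15 times longer than a pool of 10 threads, while the server completes the same number of requests per second.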

[Figure 3.10 plot: throughput (req/s) against size of Tomcat thread pool (2, 10, 25, 50, 100, 150); series: N = 50, N = 150, N = 250.]
