Network-based Measurements on Cloud Computing Services

Vinod Venkataraman, Ankit Shah, Yin Zhang

{vinodv, ankit, yzhang}@cs.utexas.edu

Department of Computer Sciences, The University of Texas at Austin, Austin, TX 78712-0233

Abstract: Cloud computing is widely believed to be a revolution in computing that could soon become an industry standard, altogether replacing the traditional office setup. Because these services are so recent, questions remain about their performance and, consequently, about the billing schemes that service providers apply. This study aims to address some of these doubts by introducing a measurement test for the network performance of these services and comparing it with the performance offered by traditional web-hosting services. To do so, we study the response of one such service, Google App Engine, under light, medium and heavy loads generated by Planetlab nodes, and attempt to infer the overall performance of the system. Our tests indicate that despite offering better round-trip times and throughput, App Engine appears to consistently lose large amounts of the data that it is required to send to the clients. We explore this problem and offer inferences that might explain this erratic behavior.

Index Terms: cloud computing, App Engine, EC2, measurement, round-trip time, bandwidth, throughput, data loss, packet loss

I. Introduction

Cloud is a metaphor for the Internet and an abstraction for the complex infrastructure that it conceals. Cloud computing [4] is a general concept that incorporates software as a service (SaaS), Web 2.0 and other recent, well-known technology trends, in which the common theme is reliance on the Internet for satisfying the computing needs of the users. A SaaS application [9] runs entirely in the cloud, using the servers of the service provider over the Internet. The client is a browser or some other simple client. For example, Google App Engine provides common business applications online that can be accessed from a browser, while the software and the data are stored on Google’s servers.

The potential of cloud computing, if suitably deployed, is undoubted. It is speculated that it may lead to a new revolution in computing that could become the industry standard [7]. Just as very few people today build a house on their own rather than rent one, in the next generation of computing people may prefer a scalable and reliable provider for their computing needs, which minimizes the risk of launching a new application, over building an entire new enterprise for the purpose of launching products.

Our motivation for this measurement study stems from the hype surrounding the cloud computing concept. Although there is much talk about the advantages of using the cloud, there is no existing measurement study that validates these claims. Nor have explicit comparisons been made between the performance of a cloud computing service and that of a traditional web hosting service. Therefore, we study the performance of a cloud computing service under varied load conditions to determine whether the hype is actually justified.

With regard to cloud computing, we can classify measurement studies into two broad categories: computation-based measurements and network-based measurements. Computation-based measurements include CPU/process cycles, memory/storage, and language engine performance (for example, the performance of the Python engine in the case of Google App Engine). These measurements can only be made at the server level and hence are taken by the service providers themselves or by authorized third parties. The main focus of our work is on network-based measurements of the cloud computing service. The three metrics that we analyze for the cloud are round-trip time (RTT), network throughput, and data loss, using the measurement tool httperf [3].

We attempt to verify the claims made by cloud computing service providers with regard to scalability and performance. To gain better insight, we used the Planetlab testbed to measure the performance and scalability of the providers’ servers under varying load conditions. We deployed our image retrieval application on Google’s App Engine and measured the performance in retrieving small (12 kB), medium (350 kB) and large (1 MB) images*. We then tested the performance by sending multiple parallel requests from the Planetlab nodes. We repeated the same tests on an ordinary web host to compare the cloud computing service with a traditional web host. The detailed implementation and the evaluation of the results are discussed later in the paper.

The remainder of the paper is organized as follows. We document the related research in Section II. Section III describes our methodology. Section IV discusses the implementation, followed by the evaluation in Section V. We present future work in Section VI and summarize our findings in Section VII.

*Note that the image size limit of 1 MB is due to Google App Engine’s restriction on file sizes.

II. Related Work

Cloud computing is a relatively new concept, and the current services are still nascent. As a result, only a very limited amount of literature is available in the area. Furthermore, no clear standards exist in this industry, and hence each service provider has its own definitions of resource usage. As discussed earlier, measurements on a cloud computing service can be broadly classified into computation-based measurements and network-based measurements. Our work falls in the latter category. Complementary to our research are computation-based measurements, which include CPU/process cycles, memory, storage, and language engine performance. These require access to the servers of the service provider, and hence are carried out by the providers themselves or by authorized third parties such as Hyperic. Hyperic’s CloudStatus service (www.cloudstatus.com) is an authorized third-party measurement service for cloud platforms such as Google’s App Engine and Amazon’s EC2. CloudStatus continuously monitors the performance of Google App Engine’s datastore, memory cache and engine, and reports the health of the system.

III. Methodology

This section presents an overview of the current state of cloud computing systems, the kinds of measurement tests that can be performed on these systems, and the techniques we use for structuring our test model.

A. Current state of Cloud Computing Services

There are a number of cloud computing services in the market today, offering a range of capabilities, from limited but powerful tools such as Google App Engine [1] to the complete server solution that Amazon EC2 [5] offers.

Amazon was one of the first companies to launch a cloud computing service for the general public, and it continues to have one of the most sophisticated and elaborate sets of options. For an application, developers can create virtual machines from Amazon Machine Images (AMIs) with the Elastic Compute Cloud (EC2). For storing data, objects of up to 5 GB can be stored in the Simple Storage Service (S3). Amazon has also built a limited database on top of S3. The deployed instances can communicate among themselves with the Simple Queue Service (SQS), a message-passing API. Most of the tasks performed on the Amazon cloud need a command line. Amazon has a large set of tools with sophisticated security options for sending orders to a collection of instances, all run from the command line.

There are now numerous cloud computing services providing features comparable to Amazon EC2. Some of these competitors include Mosso [10], GoGrid and AppNexus.

Google App Engine offers services that are, in some ways, exactly opposite to Amazon’s services. While EC2 provides root privileges, App Engine does not even allow file writes, or much flexibility in storing files and folders. Google removes the file write feature from Python, presumably as a quick measure to avoid security holes. However, App Engine does provide its own datastore, complete with a custom-built query language termed Google Query Language (GQL) that is modeled on SQL. All data writes and accesses are expected to be performed using this datastore. Although this seems restrictive at first sight, the framework provided by App Engine supplies the core features on which powerful, content-rich web applications can be built. Further, App Engine provides users with tools to debug the application on a local machine as well. The API that App Engine provides makes it ideal for individual developers and small groups who can code simple database front-ends using Python, although they are likely to expand it for larger enterprises eventually.

B. Measurement Tests

The importance of tests on these services is beyond debate: they are the sole means of describing a pricing model for a cloud computing system. Currently, due to the vast differences in standards between cloud computing services, there is no standard pricing scheme for a system. Most services, however, do use variations of some popular metrics for pricing. These include CPU utilization, storage space, memory usage, network bandwidth, etc.

Tests on cloud computing systems can broadly be categorized into two types:

1. Computation-based tests

2. Network-based tests

The former category deals with tests on the actual computational performance of the machines used to run the application on the cloud. Some of these are standardized, such as storage space and memory usage. However, each vendor specifies its own limits and mechanisms for computing CPU utilization. Google App Engine specifies this in terms of megacycles used, a term that remains unclear to us. Amazon EC2 computes this metric in terms of the hours a machine instance has been deployed and the number of such instances used. However, this category of tests remains out of reach for independent researchers, as it requires root access to the server itself. Such tests are usually performed either by the company itself or by authorized third parties. One such authorized third party is Hyperic Inc., which monitors the performance of both EC2 and App Engine in real time and publishes the results on its website CloudStatus [6].

The latter category is the one that our work focuses on. These tests measure the network performance of requests handled by applications deployed on a cloud. They include metrics such as round-trip time, network throughput, data loss, bandwidth, delay, latency, and many others. Of these, we chose to test the first three. A brief description of each of these metrics is provided below:

Round-trip Time (RTT): RTT is the time that elapses between sending a message to a remote host and the arrival of its reply back at the source. The choice of this metric is natural: it gives the delay that a client accessing a web application experiences between submitting her query and receiving its output.

Network Throughput: The average rate of successful data transfer through a network connection is known as network throughput. It is important to distinguish this term from network bandwidth, which is the capacity of a given system to transfer data over a connection. Although providers base their billing on bandwidth and not throughput, from a client’s perspective throughput is more important, as it determines the data rate she receives for her request.

Packet/Data loss: Packet loss occurs when one or more packets of data traveling across a computer network fail to reach their destination. Loss can be measured either as a loss rate, the amount of data (in bytes or packets) lost per unit of time, or simply as loss, the amount of data in bytes lost during a transfer. This metric is important because it quantifies how much of the data sent by the server the client actually received.

It is important to note that none of these metrics alone can provide a general picture of the performance of the cloud computing service. This will be demonstrated in Section V, where each metric is analyzed individually.
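The first two metrics can be illustrated with a minimal sketch. The `Transfer` record, its field names, and the sample values are hypothetical and are not taken from our measurements; dividing the received bytes by the full round-trip time is one simple way to approximate throughput, not necessarily how a measurement tool reports it.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    """One request/reply pair as seen by the client (hypothetical record)."""
    sent_at: float        # seconds, when the request left the client
    received_at: float    # seconds, when the full reply had arrived
    bytes_received: int   # payload bytes that actually arrived


def round_trip_time(t: Transfer) -> float:
    """Elapsed time between sending the request and receiving the reply."""
    return t.received_at - t.sent_at


def throughput(t: Transfer) -> float:
    """Average rate of successful transfer, in bytes per second."""
    return t.bytes_received / round_trip_time(t)


# Illustrative values only, not measurements from this study.
sample = Transfer(sent_at=0.0, received_at=0.5, bytes_received=350_000)
rtt = round_trip_time(sample)
rate = throughput(sample)
```

Note that bandwidth, the capacity of the link, does not appear in this computation: throughput depends only on what was actually delivered and how long it took.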

C. Testing Model

Our test model for measuring the network performance of the cloud computing service begins by deploying a web application on both the cloud and an ordinary web hosting service. The deployed application may be tailored to the service it is deployed on, but its performance is assumed to be similar in both cases, as the time spent accessing the database is negligible in comparison to the transfer times over the network.

Next, we test the performance of the two services under light, medium and heavy loads, using Planetlab nodes to launch parallel requests. We use Planetlab to generate parallel requests, rather than sending the same number of requests from a single node, because requests from a single node tend to be serialized, which does not really test the ability of the service to handle simultaneous requests.

Finally, we measure the performance of the service using httperf [3] to obtain values for the round-trip time, network throughput, and data loss.
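The difference between serialized and parallel requests can be sketched with a small simulation. This is not the test harness used in the study: `fake_request` is a hypothetical stand-in that sleeps instead of contacting a server, and each worker thread plays the role of one Planetlab node.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(delay: float = 0.05) -> float:
    """Stand-in for one image request; sleeps instead of using the network."""
    start = time.perf_counter()
    time.sleep(delay)
    return time.perf_counter() - start


def run_serial(n: int) -> float:
    """Issue n requests one after another, as a single node would."""
    start = time.perf_counter()
    for _ in range(n):
        fake_request()
    return time.perf_counter() - start


def run_parallel(n: int) -> float:
    """Issue n requests at once, one worker standing in for each node."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        list(pool.map(lambda _: fake_request(), range(n)))
    return time.perf_counter() - start


serial_elapsed = run_serial(8)
parallel_elapsed = run_parallel(8)
```

In the serial case the server would only ever see one outstanding request, whereas in the parallel case all eight overlap, which is the condition the test model is designed to create.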

Fig. 1. Round Trip Time: (a) Small Image (12 kB), (b) Medium Size Image (350 kB), (c) Large Image (1 MB).

IV. Implementation

This section describes the details involved in the implementation of the test model specified in the previous section.

A. The Web Application

The web application deployed on Google App Engine was a simple image retrieval application. The front end, coded in Python using Google App Engine’s API [2], performs the tasks of retrieving a collection of static images from an online source, recoding and storing these images in the datastore, and presenting an HTML page through which a user can request an image over HTTP, whether from a browser or from any other client.

For the regular web host, ordinary GET requests to locations referencing static images are used.

B. Measurement Tests

The httperf binaries were deployed on 100 Planetlab nodes around the world and were used to perform tests under various load conditions. These included sending single and multiple requests from individual nodes separately, as well as sending single and multiple requests from multiple nodes in parallel. For the statistics presented in the next section, the 100 Planetlab nodes were programmed to send 1, 10 and 100 requests each in parallel and to log the results of these requests, so as to subject the servers to a wide spectrum of loads while staying under the maximum bandwidth allowed by these servers.

From the data recorded in these logs, the values for round-trip time, average network throughput and percentage of data loss were obtained.
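As a rough sketch of how such logs might be reduced to metrics, the snippet below extracts a few fields from an httperf-style summary. The sample text and its numbers are made up for illustration, the exact line layout is assumed to follow httperf's usual summary format, and treating response time plus transfer time as the end-to-end reply time is a simplifying assumption rather than the procedure used in the study.

```python
import re

# Illustrative httperf-style summary; the numbers are made up.
SAMPLE_OUTPUT = """\
Total: connections 100 requests 100 replies 88 test-duration 12.500 s
Reply time [ms]: response 210.4 transfer 95.1
Net I/O: 245.3 KB/s (2.0*10^6 bps)
Errors: total 12 client-timo 12 socket-timo 0 connrefused 0 connreset 0
"""


def parse_summary(text: str) -> dict:
    """Pull request/reply counts, reply time, and net I/O from a summary."""
    total = re.search(r"requests (\d+) replies (\d+)", text)
    reply = re.search(
        r"Reply time \[ms\]: response ([\d.]+) transfer ([\d.]+)", text
    )
    net = re.search(r"Net I/O: ([\d.]+) KB/s", text)
    timeouts = re.search(r"client-timo (\d+)", text)
    return {
        "requests": int(total.group(1)),
        "replies": int(total.group(2)),
        # Assumption: response + transfer approximates the end-to-end time.
        "reply_time_ms": float(reply.group(1)) + float(reply.group(2)),
        "net_io_kb_per_s": float(net.group(1)),
        "client_timeouts": int(timeouts.group(1)),
    }


summary = parse_summary(SAMPLE_OUTPUT)
```

Comparing `requests` with `replies`, together with the client timeout count, gives a first indication of how many requests were not fully served.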

V. Evaluation

In this section, the results of the experiments conducted are evaluated and interpreted.

A. Metrics evaluation

A.1 Round-Trip Time (RTT)

The round-trip time gives the total end-to-end time, and hence is an important metric in evaluating the performance of the cloud computing service. Fig. 1 shows that for the small image (12 kB), the performance of Google App Engine is slightly better than that of the traditional web host. However, the picture becomes clearer for the medium-sized image (350 kB). The second graph shows that the RTT for Google App Engine is an order of magnitude lower than that of the traditional web host. The third graph shows that for the large image (1 MB), the RTT of App Engine is a couple of orders of magnitude lower than that of the traditional web host. Thus, given the RTT results alone, the hype around using the cloud seems justifiable. However, we evaluate the results over other metrics to get a better understanding of performance and scalability in the cloud.

Fig. 2. Throughput: (c) Large Image (1 MB).

A.2 Network Throughput

We now discuss throughput, and hence the scalability of an application deployed on the cloud computing service. Scalability means that the throughput available to an application should increase as the load (the number of requests) on the application increases. Fig. 2 shows that in the case of the small image (12 kB), there is not much difference between Google’s App Engine and the traditional web hosting service, Synthasite. In the case of the medium-sized image (350 kB), Google App Engine clearly delivers higher throughput than Synthasite, although Synthasite seems to scale well under increasing load. The difference in throughput and scalability is more pronounced in the case of the large image (1 MB), where Synthasite scales ideally for increased loads whereas Google’s App Engine does not seem to scale. From these results we conclude that Google’s cloud computing platform delivers higher throughput than a traditional web host, but does not seem to scale as well as the traditional web host, Synthasite in this case.

A.3 Data Loss

Having seen the impressive RTT results, it is time for a reality check. Data/packet loss is measured as a percentage and gives the amount of data that is not accounted for when the RTT is calculated. For example, suppose that an image of x kB is requested, with 100 requests sent per Planetlab node, and that y kB is actually returned to the node for each request. The percentage of data loss is then given by [1 - (y/x)] * 100. We expect the data/packet loss to be as low as possible, ideally 0. Fig. 3 shows that even for the small image (12 kB), Google App Engine gives an error rate of 12

Fig. 3. Data Loss: (c) Large Image (1 MB).
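The loss percentage can be made concrete with a short sketch, taking loss as the share of the expected payload that never reached the client. The function name and the example figures (100 requests for a 12,000-byte image, 75 of them answered in full) are hypothetical and are not results from this study.

```python
def data_loss_percent(expected_bytes: int, received_bytes: int) -> float:
    """Percentage of the expected payload that never reached the client."""
    if expected_bytes <= 0:
        raise ValueError("expected_bytes must be positive")
    return (1 - received_bytes / expected_bytes) * 100


# Hypothetical node: 100 requests for a 12,000-byte image, 75 full replies.
expected = 100 * 12_000
received = 75 * 12_000
loss = data_loss_percent(expected, received)
```

A value of 0 means every requested byte arrived; a value of 100 means nothing was received at all.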

B. Result Interpretation - What really happens?

The values in Section A.3 indicate that App Engine performs exceedingly poorly under heavy loads, contrary to Google’s claims. On consideration, we realized that these results may not be indicative of App Engine’s actual performance, as our usage of httperf may not have been appropriate for this setting. This is elaborated as follows.

At zero load, App Engine does not dedicate much server resource to an application, letting a single server monitor the application. When this server is subjected to an extremely heavy load, it appears to accept a connection for, and at least partially service, every request that arrives for the application, regardless of number and size. In the meantime, it appears to call for assistance from the other servers in the cluster in order to distribute the load efficiently.

This would probably result in a delay in servicing a request for the client. When the client is a measurement tool like httperf, called with its normal parameters, the client assumes that the server has completed the request and effectively times out before the backup servers arrive to continue processing the requests. With a more robust client such as a browser, on the other hand, a slightly longer delay is permissible.

In order to test the above conjecture, we would have to conduct further tests using httperf while varying the time-out period.
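One way such a follow-up experiment could be set up is sketched below: the same httperf run is repeated with progressively longer time-outs, so that any drop in data loss can be attributed to the client waiting longer. The host name, request path, and time-out values are hypothetical placeholders; only the command lines are built here, nothing is executed.

```python
import shlex


def httperf_command(server: str, uri: str, num_conns: int, timeout_s: float) -> list:
    """Build an httperf argument list for one run with a given timeout."""
    return [
        "httperf",
        "--server", server,
        "--port", "80",
        "--uri", uri,
        "--num-conns", str(num_conns),
        "--timeout", str(timeout_s),
    ]


# Hypothetical host, path, and timeout values chosen for illustration.
commands = [
    httperf_command("example.appspot.com", "/image?size=large", 100, t)
    for t in (1, 5, 10, 30)
]
printable = [shlex.join(cmd) for cmd in commands]
```

Each entry in `printable` is a shell-ready command line that could be dispatched to a Planetlab node; if the conjecture holds, the measured loss should fall as the time-out grows.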

C. Effect of Geographic Location

To test the effect of the client’s geographic location on the performance of App Engine, we examined results from a set of 16 nodes located in Europe, Asia, and South and North America. These tests indicated that the slowest performers were nodes in developing countries with poor network bandwidth. Although these clients fared poorly with the ordinary web host, they managed to complete the entire transfer of data. Their performance with App Engine, on the other hand, was so poor that the requests did not even complete and essentially timed out. The inference is that countries with lower bandwidth availability should stick to ordinary local web hosts for clients in their own countries, and may use App Engine to serve clients abroad.


VI. Future Work

We believe that a commercial giant like Google would not market a product that performs so poorly on the critical issue of data loss. Hence, part of the future work is to explore the data loss phenomenon further by adjusting the timeout value of httperf so that it waits for Google’s servers as they come in to balance the increasing load. There are also metrics other than those we have studied that can help assess the performance of a cloud computing service, for example latency, i.e., the time between the image being requested and the image being retrieved. In addition, our application has been deployed only on Google’s App Engine. We could deploy it on any other cloud computing service by making use of its API, which would help us compare the performance of the various service providers. Finally, we need to test the performance of cloud computing services for some real-time applications.

VII. Conclusion

In this paper, we have introduced the concept of network-based measurement on cloud computing services and justified its importance from a client’s point of view. To this end, we have developed a vendor-independent network-based measurement test for cloud computing services, using freely available open-source tools and resources for testing and validation. We have tested the performance of Google App Engine under varying load conditions using Planetlab, and compared it quantitatively to the performance of an ordinary web host handling the same requests.

This paper presented a preliminary idea of the kinds of tests that can be performed on cloud computing services, and leaves room for further testing.

VIII. Acknowledgments

We would like to thank Professor Yin Zhang, our project guide, for his invaluable advice and feedback, without which this project would not have been a success. We are also thankful to Professor Mike Dahlin and Han Hee Song for their help with Planetlab.

References

[1] Google App Engine, http://appengine.google.com/

[2] Documentation for Google App Engine, http://code.google.com/appengine/docs/

[3] Httperf tool, http://www.hpl.hp.com/research/linux/httperf/

[4] Wikipedia - Cloud Computing, http://en.wikipedia.org/wiki/Cloud_computing/

[5] Amazon Elastic Cloud, http://aws.amazon.com/ec2/

[6] Hyperic CloudStatus, http://www.cloudstatus.com/

[7] Rajkumar Buyya, Chee Shin Yeo, and Srikumar Venugopal, Market Oriented Cloud Computing: Vision, Hype and Reality for delivering IT Services as Computing Utilities

[8] Twenty One Experts define Cloud Computing, http://cloudcomputing.sys-con.com/node/612375/print/

[9] Cloud Computing: The Evolution of Software-as-a-Service, http://knowledge.wpcarey.asu.edu/article.cfm?articleid=1614

[10] Mosso Cloud Hosting, http://www.mosso.com/

[11] What cloud computing really means, http://www.infoworld.com/article/08/04/07/15FE-cloud-computing-reality_1.html