The candidate confirms that the work submitted is their own and the appropriate credit has been
given where reference has been made to the work of others.
I understand that failure to attribute material which is obtained from another source may be
considered as plagiarism.
(Signature of student) ___________________________
A Cloud Computing directory service
Jack Rollinson
BSc (Hons) Computer Science
Session (e.g., 2000/2009)
Summary
This report outlines an investigation into Cloud Computing and the services which make use of this paradigm. It begins with a thorough analysis of the present state of Cloud Computing and, with it, a classification of the three main areas: Software as a Service, Platform as a Service and Infrastructure as a Service. This classification is then used as the basis for a database which acts as a directory of Cloud Computing services, accessed through a Graphical User Interface. The process is then evaluated along with the solution it helped create, and the implications the directory has for Cloud Computing resource brokering are discussed.
Acknowledgements
To the project supervisor Peter Dew, for his continued support and guidance – even through tough times. To Mo Hadji for his wise words and valued help. To Mark Walkley, who provided valuable guidance and reassurance. But especially to my parents, because without their help and support there is no way this project would have got this far.
Contents
Summary
Acknowledgements
Chapter 1 – Introduction
1.1 – Aim
1.2 – Objectives
1.3 – Minimum Requirements and Extensions
1.4 – Deliverables
1.5 – Relevance to Degree Programme
1.6 – Initial Project Schedule
1.7 – Report Structure
Chapter 2 – Background Research
2.1 – Overview
2.2 – What is Cloud Computing?
2.3 – Why Cloud Computing?
2.4 – Cloud Computing Services
2.4.1 – Software-as-a-Service
2.4.2 – Infrastructure-as-a-Service
2.4.3 – Platform-as-a-Service
2.5 – Problem and Related Work
Chapter 3 – Project Methodology and Preparation
3.1 – Introduction
3.2 – Choosing the Methodology
3.3 – Initial Analysis Methodology
3.4 – Software Development Methodology
3.4.1 – Methodologies Considered
3.4.2 – Methodology Chosen
3.5.1 – Development Language
3.5.1.1 – Development Languages Considered
3.5.3 – Development Language Conclusion
3.5.2 – Database Technologies
3.5.2.1 – Database Technologies Considered
3.5.2.2 – Database Technologies Conclusion
Chapter 4 – Analysis and Classification
4.1 – Introduction
4.2 – Finding Relevant Information
4.3 – Software-as-a-Service Analysis
4.3.1 – Software-as-a-Service Classification
4.3.2 – Classification Example
4.4 – Platform-as-a-Service Analysis
4.4.1 – Platform-as-a-Service Classification
4.4.2 – Classification Example
4.5 – Infrastructure-as-a-Service Analysis
4.5.1 – Infrastructure-as-a-Service Classification
4.5.2 – Classification Example
Chapter 5 – Design
5.1 – Introduction
5.2 – Requirements
5.2.1 – Scenario One
5.2.2 – Scenario Two
5.2.3 – Scenario Three
5.2.4 – Summary of Scenarios
5.2.5 – Requirements
5.3 – Database Design
5.3.1 – Data Modelling
5.3.2 – Database Design
5.4 – Graphical User Interface Design
5.4.1 – Layout of Interface
5.4.2 – Usability
5.4.3 – Data Entry
5.4.4 – Screen Layouts
Chapter 6 – Implementation
6.1 – Introduction
6.2 – Increment One – Database Implementation
6.3 – Increment Two – Start Up Screen and SaaS
6.3.1 – Layout of Start Up Screen
6.3.2 – Functionality of Start Up Screen
6.3.3 – SaaS Implementation
6.3.3.1 – Requirements Screen Layout
6.3.3.2 – Requirements Screen Functionality
6.3.3.3 – Feedback Screen Layout
6.3.3.4 – Feedback Screen Functionality
6.3.3.5 – More Information Screen Layout
6.3.3.6 – More Information Screen Functionality
6.4 – Increment Three – PaaS
6.4.1 – PaaS Implementation
6.4.1.1 – Requirements Screen Layout
6.4.1.2 – Requirements Screen Functionality
6.4.1.3 – Feedback Screen Layout
6.4.1.4 – Feedback Screen Functionality
6.4.1.5 – More Information Screen Layout
6.4.1.6 – More Information Screen Functionality
6.5 – Increment Four – IaaS
6.5.1 – IaaS Implementation
6.5.1.2 – Requirements Screen Functionality
6.5.1.3 – Feedback Screen Layout
6.5.1.4 – Feedback Screen Functionality
6.5.1.5 – More Information Screen Layout
6.5.1.6 – More Information Screen Functionality
Chapter 7 – Testing
7.1 – Introduction
7.3 – User and Integration Testing
7.4 – Conclusion
Chapter 8 – Evaluation
8.1 – Introduction
8.2 – Evaluation Criteria
8.3 – Aims and Objectives
8.4 – Evaluation of Methodologies Used
8.5 – Evaluation of Analysis and Classification
8.6 – Evaluation of Directory Service
8.6.1 – Evaluation of Technologies
8.6.2 – Evaluation Against Requirements
8.6.3 – User Evaluation
8.7 – Implications on Resource Brokering
8.8 – Future Enhancements
8.9 – Conclusion
References
Appendix A – Personal Reflection
Appendix B – Project Schedule
Appendix C – Classification Examples
Appendix D – Classification Examples
Appendix E – Classification Examples
Appendix G – UML Diagrams
Appendix H – Detailed ER Diagram
Appendix I – System Interface Layout
Appendix J – Screen Layouts
Appendix K – SQL Scripts
Appendix L – Additional Implementation Details
Appendix M – Testing
Chapter 1 – Project Overview
1.1 Aim
The aim of this project is to discuss and assess Cloud Computing and the services that use computing in the Cloud, in order to provide a classification of Cloud Computing and the services that utilise this paradigm. This classification will aim to incorporate all areas of Cloud Computing in its current state, and will lead to the discovery of suitable differences and similarities between services that operate ‘in the cloud’.
With this classification, there is scope to design and implement a database representing this classification, coupled with a designed and implemented Graphical User Interface (GUI), to act as a directory of Cloud Computing services. The directory will store a wealth of different services and will provide a user with suitable suggested services that satisfy various user-specified requirements. Following this, the project will discuss the implications of this directory for a resource brokering process and the feasibility of resource brokering in Cloud Computing in general.
1.2 Objectives
There are several objectives that need to be fulfilled throughout this project. Initially, it is fundamental to understand what Cloud Computing is, what it represents now and what its likely future is. A firm grasp of what is a new, complex and dynamic area of computing is paramount to the success of the further objectives and requirements, and of the project as a whole. In addition, it is necessary to become familiar with the different areas of Cloud Computing and the services in each area, to gauge what they offer and the commonalities and differences between services, which form the basis of the classification. It is also important that any field-specific terms, for example ‘Service Level Agreements’ (SLAs), are fully explored and understood so that any classification based on new terms is not misguided.
A further objective of this project is to understand the use of resource brokers in other areas of computing and how these can be related to use in the cloud. This is a very important step as it will form a part of the evaluation. In addition, it is necessary to select the most appropriate development methodology and technologies in order to produce the directory database and software. The final objective is to produce this directory software, which will provide the user with suggested Cloud Computing services based on various, differing, specified requirements.
1.3 Minimum Requirements and Extensions
The Minimum Requirements of this project:
• A classification scheme for Cloud Computing services (will be shown in report).
Regular meetings with the project supervisor resulted in these requirements being altered and refined on a number of occasions. This meant that they would be better specified and a more appropriate project produced.
The extensions are:
• Design, build and test a Cloud Computing directory service for the selection of the most suitable Cloud vendor and service to meet a user's requirements (is a deliverable).
• Evaluation of the Cloud Computing directory service and its implications on resource brokering.
1.4 Deliverables
The main deliverable is a solution to act as a directory of Cloud Computing services, containing a wealth of these services based on the previously developed classification of Cloud Computing and its services. This comes in the form of a database and GUI application that enables a user to select specific requirements for a service in a particular area of Cloud Computing, after which the software will respond with the most suitable service(s) based on those requirements. This should provide a ‘complete’ solution to the problem at hand, and will enable the discussion of the implications the directory service will have on Cloud Computing resource brokering.
1.5 Relevance to Degree Programme
This project covers many areas relevant to my Computer Science degree. Cloud Computing itself is an aspect of distributed computing, which was briefly covered in modules in this area. The project involves database design and implementation, which has been covered in all three years of my degree programme. Object-Oriented programming, which has been taught continuously throughout the degree, also forms a main part of this project. Finally, as this project incorporates a GUI, Human Computer Interaction (HCI) fundamentals and techniques are covered.
1.6 Initial Project Schedule
The initial project schedule is specified in Appendix B 1. Project scheduling was imperative to this project due to the time constraints provided. There was a substantial amount of work to be completed for every milestone and this work had to be tightly scheduled to definite deadlines, thus ensuring aspects of the project were completed in an appropriate, timely manner. The schedule is represented in text form to clearly show the order and duration of each task and milestone.
The original schedule had to be revised due to changes in the way the project was defined, changes to the requirements, and the developer being unable to keep in line with the schedule for some of the specified milestones. As the original schedule underwent many changes, a revised schedule of events and milestones was created. The revised schedule, showing how the project was actually completed, is specified in Appendix B 2.
1.7 Report Structure
The structure of the report starts with the overview of the project, the problem and the aims and objectives for the project. The structure from then on is similar to the project lifecycle. The background research is outlined, followed by the discussion and selection of the appropriate methodologies and technologies used in the project. Succeeding this is the critical analysis and classification of Cloud Computing and services. Then, there is the design, implementation and testing of the directory service produced. Finally, a thorough evaluation of the directory service and process followed is described, the achievements and limitations of the directory service in the changing ‘climate’ of Cloud Computing are outlined and the implications the directory service has on computational resource brokering in Cloud Computing are explored.
Chapter 2 - Background Research
2.1 Overview
This chapter discusses exactly what Cloud Computing is and how it has emerged to be regarded as one of the most promising developments in the computing world to date. It also discusses its advantages and how it could radically change the way IT services are approached and delivered. Specific cloud computing services, how they work and what they offer are then researched to explore the state of the art in current cloud technologies and services.
Finally, the background research concludes with a review of brokering in other computing models and a consideration of how ideas from there could be used to inspire the creation of a directory of cloud computing services, based on a classification of these services with the aim that the directory would provide suggested services to the user based on their specified requirements.
2.2 What is Cloud Computing?
Many computing paradigms have emerged in recent years and have set out to innovate and sometimes even revolutionise the way computing is done in the modern day. The first real step towards what we have now was Cluster Computing, which is in essence a group of interlinked computers working together to perform a single task, interacting in a way that resembles a single machine working. This is done by combining the different computers’ processing power, memory and so on, so that a task can be completed in a much shorter time than if one machine were to process it. This idea of distributed computing spawned another major computing paradigm: Grid Computing.
Grid Computing, like Cluster Computing, is a form of distributed computing, but it is used specifically to solve very large-scale, resource-intensive problems in science, engineering and commerce [1]. It does this via the use of many geographically dispersed high-performance resources, which can be of many different types. Resource-intensive problems that are otherwise not (efficiently) solvable with single machines or clusters of machines can be solved collaboratively through the use of Grid Computing. A resource in this instance is a component of a computer system that can be made available to perform these distributed tasks. These resources could represent access to RAM (Random Access Memory), hard disk storage, access to stored data or processor usage, amongst other components.
Another paradigm which has taken shape recently is Utility Computing. Utility Computing can be viewed in much the same way that we pay for utility services in society: service providers supply utilities such as gas, electricity and water and charge pro rata for what is used [2]. Utility Computing therefore represents a model where users pay service providers for computing resources when they need them. A new paradigm which uses some elements of Grid Computing and Utility Computing is Cloud Computing.
Cloud Computing, due to its broad, encompassing nature, can be defined in many ways. [3] states that the term Cloud Computing describes both a platform and a type of application. It is a platform in that a Cloud Computing platform provisions and configures servers when needed. It is a type of application in that cloud applications are applications that are extended to be accessible on the internet. While this is true, it does not describe the entirety of the term. [4] states that the concept of cloud computing is basically the usage of vast computing resources that reside on the internet, but I feel this is too basic. [1] states that “A Cloud is a type of parallel and distributed system consisting of a collection of interconnected and virtualised computers that are dynamically provisioned and presented as one or more unified computing resources based on service-level agreements established through negotiation between the service provider and consumers”, and thus cloud computing is computing which incorporates this cloud, which is solely accessible through the internet. This is a very good definition, and the key point is that Cloud Computing is performed through the Internet. As Cloud Computing is such a new and rapidly developing technology, there is still widespread debate over a set-in-stone definition, but I will use the one presented above from [1] as a basis for this project.
Cloud Computing is more than a combination of Cluster, Grid and Utility Computing. Some Cloud Computing services do rely on the power of Grids, and some are charged for in the same way as utilities. In essence, however, Clouds can be seen as next-generation data centres with nodes “virtualized” in different ways through hypervisor technologies such as VMs (Virtual Machines), which are made available to users when they need them [1]. The VMs, together with physical machines, perform specific internal tasks for a user based on a service level agreement (SLA) which is agreed beforehand. To explain, the data centres can be visualised as centres containing vast numbers of physical machines (supercomputers and computing servers) that provide the resources for tasks. A Virtual Machine can be seen as software on a machine that acts like hardware, i.e. multiple ‘instances’ of virtual machines can be started on one physical machine and execute different tasks as if they were physical machines [5]. This allows flexibility in that more than one machine can respond to a variety of differing service requests.
There are different ways in which virtualisation can take place [6]. One example is Operating System (OS) level virtualisation, which means that several ‘virtual’ operating systems can be executed on one machine. Virtualisation can also occur at Application level, which means that specific applications are ‘kept separate’ from the underlying hardware and operating system so that applications that are foreign to the underlying hardware and operating system can be run. Server virtualisation is a similar concept, but here the physical attributes of a server are masked from its users. Platform virtualisation describes the act of masking the physical characteristics of a computing platform from the users and replacing it with a virtual platform on a VM [6]. With virtualisation, virtual machines are completely isolated from one another, so if one crashes the others are not affected. Also, computing resources are held in a pool and allocated to each virtual machine when it needs them, in a controlled manner [7].
SLAs are agreements between the user and the Cloud Computing service provider that define what service is to be performed, at what cost and how failures to provide this service will be dealt with.
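The three elements of an SLA named above (the service, its cost and the handling of failures) can be pictured as a small record. The following Python sketch is purely illustrative: the field names, the uptime metric and the flat credit rule are assumptions made for this example and are not taken from any real provider's agreement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevelAgreement:
    """Hypothetical SLA record; every field name is an illustrative assumption."""
    service: str            # what service is to be performed
    price_per_hour: float   # at what cost (currency unit left abstract)
    promised_uptime: float  # assumed quality metric, as a fraction, e.g. 0.999
    credit_rate: float      # assumed remedy: fraction of the bill refunded on failure

    def credit_due(self, measured_uptime: float, amount_billed: float) -> float:
        """Return the refund owed if the promised uptime was not met."""
        if measured_uptime >= self.promised_uptime:
            return 0.0
        return amount_billed * self.credit_rate


sla = ServiceLevelAgreement("storage", 0.10, 0.999, 0.25)
print(sla.credit_due(0.9995, 100.0))  # promise met, so no credit is owed
print(sla.credit_due(0.99, 100.0))    # promise missed, so a credit is owed
```

A real agreement would normally define several metrics and tiered remedies; a single metric is used here only to keep the idea visible.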
2.3 Why Cloud Computing?
As computing becomes ever more ubiquitous, there is a demand to have access to data, files, resources and applications 24 hours a day, from both a personal and a business point of view. As Cloud Computing is performed through the internet (which itself is usable 24 hours a day), Cloud Computing can meet this demand. It allows data to be stored in, and accessed from, a Cloud on the internet. It also allows an application to be hosted, provided and used as a service on the internet, all of the time.
Furthermore, as with Grid Computing, Cloud Computing can allow access to CPU cycles, memory and hard disk space on very large, very powerful computers. This ties in with the notion of a very ‘thin client’: the need for personal and business computers to be powerful will diminish as storage, access, hosting and processing can be done by a Cloud. Business computer infrastructures are therefore likely to become less and less complex, as large data centres with more complex, more powerful infrastructures will be able to service various tasks. Although users will typically be charged per use of different services in the cloud, the cost-effectiveness of being able to use a thin client is clear.
The services and resources used have become independent of location, and connections, services and resources can be accessed ‘on-demand’. As Clouds are run in data centres on large numbers of very powerful physical machines using instances of virtual machines, application developers can run applications that are scalable (can easily grow), high-performance (work very efficiently) and reliable (fail very rarely) [8].
2.4 Cloud Computing Services
Since the inception of Cloud Computing, many companies, large and small, have developed a variety of Cloud Computing services for public use. Cloud Computing aims to deliver IT services as computing utilities [1]. Although definitions are sometimes blurry and sometimes overlap, these services can typically be categorised as follows:
• Software as a Service (SaaS)
• Infrastructure as a Service (IaaS)
• Platform as a Service (PaaS)
2.4.1 Software as a Service
Cloud Computing is closely associated with Web 2.0 (social trends in using the World Wide Web for information sharing and collaboration) in that it provides a means for hosting and accessing online applications, which are known as ‘Software as a Service’ or SaaS. Examples of existing SaaS include tools for project management (Clarizen [9]), tools for customer relationship management (Salesforce.com [10]) and office applications (Google Apps [11]). Software provided as an online service enables users to access, edit, delete and upload information related to that particular service from any device with an internet connection. As the software is provided on a Cloud server, there is no need to install and execute applications on the client’s (user’s) machine.
SaaS applications are built on top of SaaS platforms that host the software. The SaaS platform handles the hardware and software resources for the service, allowing the software to run. Many other SaaS providers exist; details of these are presented later in this report, when Cloud Computing services are detailed and classified as part of producing the solution.
2.4.2 Infrastructure as a Service
Previously named ‘Hardware as a Service’, IaaS describes a model in which computer processing capacity (use of CPU cycles, RAM, network equipment, storage in the data centre space and other hardware) is purchased in a similar style to the Utility Computing explained previously. A user will specify which infrastructure component to use, what amount to use, how long to use it for and, sometimes, requirements for reliability and security. The IaaS provider will then provide the desired service based on a particular Service Level Agreement (SLA) that the provider defines. A major example of this is the Elastic Compute Cloud (EC2) provided by Amazon [12].
EC2 allows users to ‘rent’ computers from Amazon to run their own applications. Users of EC2 can purchase processing power online on the basis of specific processor cores, storage space and data transfer. For example, a user can choose a Small Instance with 1.7 GB of memory, 1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit, where 1 EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor) and 160 GB of instance storage on a 32-bit platform, or one of various other instances with differing processing power, memory, storage and platforms. The ability to configure, increase or decrease the infrastructure requirements (in EC2 and other services) very quickly prior to usage, together with the flexibility derived from the fact that each user can choose the specification of each individual instance of computing power purchased, is an advantage of IaaS.
There are other advantages in using IaaS: it removes up-front costs by removing the need for users to run their own costly, powerful machines and machine components, and allows them to use a large data centre’s extremely powerful infrastructure [13]. Many other IaaS providers exist; details of these are presented when Cloud Computing services are detailed and classified as part of producing the solution. Infrastructure as a Service encompasses all compute processing, servers, networking components, storage and database requirements in Cloud Computing.
2.4.3 Platform as a Service
PaaS is seen as a step up from SaaS in that the PaaS model allows for the creation and delivery of web applications and services, all done through the internet. A PaaS should in essence support the whole application development life-cycle: not just developing, deploying and hosting applications, but also testing them and maintaining them after deployment. PaaS can be seen as SaaS combined with IaaS, with an extra incentive for users in that the whole development life-cycle can be realised online in a single platform. A drawback of PaaS is that, depending on which provider is chosen (for example Force.com from SalesForce [14]), you are constrained to learning and using the proprietary development language of this provider, and the application itself is then tied to that language. However, the benefits of PaaS are clear to see. Not only does it allow for the development and deployment of services but, as the PaaS provider offers its infrastructure to host the application, the provider’s powerful data centre brings increased efficiency, security and reliability and removes the cost and problem of maintaining these aspects from the users themselves.
2.5 Problem and Related Work
The problem for this project, then, arises from the fact that there are three different types of cloud computing service on offer: SaaS, IaaS and PaaS. Within each type there are many services, which can differ greatly or be quite similar to one another. Also, the number of providers of Cloud Computing services, and hence the number of cloud computing services, is rapidly increasing. So, how does a user, personal or business, decide which cloud computing service provider and cloud computing service to use for a particular job or task?
In other paradigms such as Grid Computing, the focus is on the aggregation of resources to solve large-scale problems. More relevantly, Grid Computing is concerned with an equivalent to Infrastructure as a Service in Cloud Computing, in that it harnesses the power of many different types of machines and hardware. In this way, resources are merely ‘compute resources’ which are always of the same type, i.e. they have certain processing power, memory capability, network components, architecture and so on. Resources can therefore be brokered, selected and secured using a resource broker that selects the best resources for a particular job based on certain specified requirements and other Quality of Service requirements. A user states their requirements, and the resource broker finds suitable resources based on those requirements and on a ‘knowledge base’ which holds the attributes of what a particular resource offered by a resource provider will provide (including values based on the previous reliability of the resource, if it has been used before). An SLA is then negotiated between the resource provider and the user. Such an approach is proposed in [15].
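The selection step of such a broker can be sketched in a few lines of Python. This is a minimal illustration of the general idea only, not the method proposed in [15]: the attribute names, the example knowledge base and the rule of preferring the most reliable resource (then the cheapest) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Resource:
    """One knowledge-base entry (hypothetical attributes)."""
    provider: str
    cpu_ghz: float
    memory_gb: float
    cost_per_hour: float
    reliability: float  # fraction of earlier jobs completed successfully


@dataclass(frozen=True)
class Requirements:
    """What the user asks for (hypothetical attributes)."""
    min_cpu_ghz: float
    min_memory_gb: float
    max_cost_per_hour: float
    min_reliability: float


def select_resource(knowledge_base: List[Resource],
                    req: Requirements) -> Optional[Resource]:
    """Return the most reliable resource that meets every requirement."""
    candidates = [
        r for r in knowledge_base
        if r.cpu_ghz >= req.min_cpu_ghz
        and r.memory_gb >= req.min_memory_gb
        and r.cost_per_hour <= req.max_cost_per_hour
        and r.reliability >= req.min_reliability
    ]
    if not candidates:
        return None  # nothing suitable, so there is nothing to negotiate over
    # Assumed ranking: highest reliability first, lowest cost as tie-breaker.
    return max(candidates, key=lambda r: (r.reliability, -r.cost_per_hour))


# Made-up knowledge base, for illustration only.
KNOWLEDGE_BASE = [
    Resource("site-a", 2.0, 4.0, 0.50, 0.95),
    Resource("site-b", 3.0, 8.0, 0.80, 0.99),
    Resource("site-c", 1.0, 2.0, 0.10, 0.999),
]

chosen = select_resource(KNOWLEDGE_BASE, Requirements(2.0, 4.0, 1.00, 0.90))
print(chosen.provider if chosen else "no suitable resource")
```

In a full broker the chosen resource would then be secured and an SLA negotiated with its provider; the sketch stops at selection.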
However, in Cloud Computing an approach such as that specified in [15] is not possible. This is because there are many more types of ‘resource’ and service, which fall under many different categories, and for some of these (for example SaaS and PaaS) the ‘selecting’ and ‘securing’ of resources is not applicable. Furthermore, with cloud computing in its current state, such a broker would not be applicable even for the IaaS category of services. As explained in [1], there would need to be a market-oriented approach to Cloud Computing for the use of a broker to be viable. As of today, even state-of-the-art Cloud Computing service providers do not support negotiation of Quality of Service (QoS) between users and providers to establish SLAs. Instead, only some Cloud Computing providers include the use of SLAs in their technologies. SaaS and PaaS offerings may provide a certain QoS, but they do not support negotiation or user specification of one. Even with an IaaS offering like Amazon EC2, a user can specify certain requirements, but only within fixed parameters, and the price is pre-set and non-negotiable. Using a resource broker like that in [15] for IaaS is out of the question in the current state of Cloud Computing: automatic resource selection, acquisition and usage would not be possible, as services like Amazon EC2 require the use of their website for logging in, selecting a service and making payment.
The need for the resource brokering process in Grid Computing comes from the fact that there are compute-intensive tasks to be done which can be carried out on geographically distributed infrastructure. The problem is then deciding, for a particular task, which compute resource to use at any one time. Factors [18] such as performance (the capabilities of the infrastructure used), access availability or restriction, cost of execution and resource reliability are integral to performance on the Grid, and a resource broker should take these factors into account when deciding which resources to use. Although this is only appropriate for IaaS (the closest parallel in Cloud Computing to Grid Computing), some of these factors should be used as the inspiration for a classification of IaaS services. Part of a resource broker’s ‘job’ is to suggest resources for a particular task, and the directory service for IaaS should go some way towards doing this. This part of the classification should therefore be done with aspects of resource brokering in mind, both because the IaaS part of the directory service will carry out some functionality similar to that of a traditional resource broker and because the implications of the directory service for resource brokering will form part of the evaluation.
A future broker, for a time when Cloud Computing has been further developed, is presented in [1]. This approach requires current (specifically IaaS) Cloud Computing providers to develop interoperability so that a single job could use multiple resources at different sites. It also requires the development of a Global Cloud Market where storage and compute resources are given a set price (that can change) based on market conditions, user demand and the current level of utilisation of the resource. A broker in this instance would act in a similar fashion to a broker in real-world markets: it would gauge the price of resources from the current market conditions, buy certain compute capacities from providers and then sub-lease these to end-users. Throughout, brokers and providers would be bound to users' requirements in the form of SLAs, which would be defined in terms of metrics, and penalties would ensue for not meeting the agreements. However, as this level of interoperability and market-oriented cloud computing is not (yet) a reality, an implementation of this is not possible.
So, there remains the problem: how does a user decide which cloud computing service provider and cloud computing service to use for a particular job or task? The focus of this project will be to address this problem. It can be done with a Cloud directory service that selects the most appropriate cloud computing service from a wealth of services and providers, based on the type of service and the requirements the user specifies.
In preparation for the design and implementation of a directory service, an analysis and classification of each area of Cloud Computing needs to be done so that the important features by which the services can be differentiated can be decided in each case. This step is the most important: if the classification is incorrect or vague, the directory service will also be incorrect and unsuitable.
There are SaaS directories located on the web, such as [16] and [17], where a user can use different categories to find a particular service they may need. Although these categorisations do provide a faster means to find the most appropriate service for a certain need, they are neither consistent nor thoroughly thought through, and are certainly not perfect: IaaS and PaaS offerings can sometimes be found in amongst the SaaS offerings.
It appears there is no existing tool that will select a cloud computing service, be it IaaS, PaaS or SaaS, for a user depending on particular requirements for this service. My directory service will use the implementation of a classification directory similar to those seen in [16] and [17], but for all three of the distinct types of Cloud Computing services. This means that a user will be able to find out quickly which is the most appropriate Cloud Computing service for the job that they wish to achieve, differentiating between the different types of service and also between many of the services themselves.
Chapter 3 - Project Methodology and Preparation
3.1 Introduction
This chapter will discuss and detail the possible methodologies and technologies to be used in this project to manage and develop the classification and directory of Cloud Computing services. Possible methodologies and technologies which were not chosen are examined and explanations are provided to illuminate the process of choice.
3.2 Choosing the Methodology
Utilising the correct methodology is fundamental to being able to manage a project properly. It is the process that will guide the developer through the different stages of the project in a feasible, suitable order. [19] states that one of the reasons many projects fail or have a negative outcome is the lack of a suitable project management methodology. This project can effectively be seen as ‘split into two parts’, with a detailed analysis leading to a classification followed by the design and implementation of a directory service. As this is so, it seems reasonable that the project methodology be split in two parts as well. A bespoke methodology in this instance seems appropriate, firstly with a thorough analysis of Cloud Computing services leading to the classification scheme and secondly following a suitable software project management methodology for the construction of the directory service.
Using this two-part methodology is most suitable for this style of project, as any detailed requirements for the directory service, or indeed any discussion with regard to its content, design and implementation, can only be settled after the analysis and classification scheme have been completed. Therefore, incorporating the detailed analysis for the classification into the software project management methodology does not seem appropriate. Even though some of these methodologies may contain an ‘analysis’ phase, these tend to be more focused on analysis of existing systems or requirements of end-users and do not fit the purpose of the much broader, more detailed analysis that will be done to produce the classification scheme.
3.3 Initial Analysis Methodology
To create a classification scheme for Cloud Computing services, as much relevant information as possible regarding different services in the three areas of Cloud Computing needs to be obtained, understood and digested so that effective decisions can be made on the main features, commonalities and differences of the services. As Cloud Computing is such a new field there is little information available of this nature to aid this process. Also, as Cloud Computing is such a rapidly changing field, much of the information about Cloud Computing services resides on the internet, often comes in the form of suppositions and can also involve conjecture. With this in mind, and as this classification scheme focuses on the key features of the actual services within each area of Cloud Computing, it seems suitable to propose a methodology of a detailed analysis of the websites of the Cloud Computing service vendors themselves. Studying a large number of these websites will allow the developer to identify key features, features which are common across services in a particular area, and important differences between services within their respective areas of Cloud Computing. This analysis will also give insight into how particular services in the cloud operate, giving the developer further understanding of the area which will only serve to enhance the classification scheme.
The analysis can be seen to be divided into three sections, one for each area of Cloud Computing. Services and vendors will be analysed fully, one by one, firstly for Software-as-a-Service, then for Platform-as-a-Service and finally for Infrastructure-as-a-Service. Splitting the analysis up in this way means that the focus can be put entirely on one area at a time and the most suitable classification scheme at each step can be provided without confusion.
3.4 Software Development Methodology
Various methodologies that will govern and help manage the production of the directory service software are to be considered for their suitability to this project. Each methodology has its positive and negative aspects, and it is probable that more than one would address the demands of this project.
3.4.1 Methodologies Considered
Waterfall Model
The waterfall model can be seen [20] as the classical model used in developing systems. It is split into set, ordered stages, each of which has to be completed before the next is started. This seems a logical approach to the creation of a system, but its inflexibility shows if, at a later stage, extra work is found to be needed in an earlier stage. The model would be advantageous in that it is very simple, and the stages of analysis, design, implementation and testing are separated so that each can be focused on fully in turn. However, this project covers three areas of Cloud Computing, each with differing requirements, and so it does not seem reasonable to specify all requirements at the start of the model lifecycle in this way.
Rapid Application Development
Rapid Application Development [21] is a model that aims to speed up the process of the
in that earlier stages can be returned to and refined depending on changing needs and requirements of the system. The emphasis on ‘rapid’ means that this methodology is focused on a very fast way of developing a system. Prototypes are created at each stage of development so that end-users can be involved in the process, to make sure that the ‘look’ of the system is in line with their requirements. However, this speed of development comes at a price: functionality of the system is sometimes ignored or missed out because of the concentration on the quickness of development. In this project, there are no real ‘end-users’ to communicate with, and the developer is creating a proof-of-concept system to the developer’s own liking. This means that, effectively, the developer is the end-user and there is no need to prototype.
Incremental Development Model
Another methodology that is widely used is an incremental methodology. An incremental methodology [22] means that, rather than delivering the system as a whole at the end of the process, the implementation is broken down into increments, each of which delivers part of the required functionality of the system. Each increment should be tested on its own and, if it is not the first increment, then tested for its integration into the system. Splitting the process up in this way means that significantly differing parts of a system can be handled on their own and not all at once, as would be required with the Waterfall model.
3.4.2 Methodology Chosen
It is decided that the methodology to be incorporated in this project is the Incremental Development Model. As with the analysis, which can be seen to be split into three parts, each focused on a different area of Cloud Computing, it seems logical that the software development follow this pattern. Using this incremental methodology means that the development workload is split into smaller sections, each with its own functionality, that can be focused on one at a time. Using separate increments is advantageous not only because each area can be handled separately according to its functionality, but also because earlier increments can be used as prototypes, providing inspiration and guidance for the later increments in terms of how they look, their style and how they operate. In the design, requirements can be obtained based on what such a directory service system ‘should do’ and, mainly, on the classification schemes for each area produced in sequence by the analysis in Chapter 4. These requirements can then be assigned to separate increments and the design of each increment (and therefore the whole system architecture) can be completed. Following this, the implementation will follow the stages of incremental development, with each increment fully implemented and tested, leading to the final system. Figure 3.1 shows the adapted incremental methodology for this project.
Figure 3.1 – Incremental Methodology used. Adapted from [22].
3.5 Choosing Project Technologies
The software development part of this project involves the design and implementation of a Cloud Computing directory service for the selection of suitable vendor(s) and service(s) to meet a user’s requirements. The optimal way to produce this would be to incorporate the classification scheme for each area into a database and develop a Graphical User Interface (GUI) application that allows a user to specify their requirements and then uses the database to provide the most suitable service for those requirements. With that in mind, it is important to select the database and development language best suited to this project and developer.
3.5.1 Development Language
The application built in this project is a directory service which will include the use of a GUI and database connectivity. It is wise, then, to make a choice of development programming language based on these factors. Much of the time in this project will be spent programming and creating the directory service based on the classification schemes developed in Chapter 4, and so it is crucial that the most appropriate development language is employed.
3.5.1.1 Development Languages Considered
Python
Python [23] is a relatively high-level object-oriented programming language that is multi-purpose and used to build many types of application. Python is a very popular language and is widely used, from small user-built applications to very large business-scale applications. Python would be appropriate for this task because it enables GUI applications through modules such as ‘Tkinter’ and ‘wxPython’ [24], which are easily integrated into a Python application. Python also offers rich database connectivity functionality with mxODBC (open database connectivity). Python seems a reasonable choice for this project due to its extensive documentation to aid development, GUI functionality and offer of database integration.
C++
C++ [25] is a middle-level object-oriented programming language that is also multi-purpose and is widely used to implement efficient applications. Programs built in C++ are often quite short in comparison to other languages due to its brevity, which would save time for the developer. C++ has some GUI functionality with application frameworks from ‘Qt’ such as ‘Qt3’ [26]. C++ also offers database functionality, which would be necessary in this project. However, many [27] believe that C++ is “not very fit for GUI programming”.
Java
Java [28] is an object-oriented programming language that is famous for its rich, broad functionality and portability. Java supports GUI application development with its extensive Swing framework and supports database connectivity with JDBC (Java Database Connectivity). Java is a practical choice for this project because of its fitting functionality, its extensive documentation and the fact that the developer has been taught and has used Java over the past three years, and also has experience with Swing GUI development and with incorporating JDBC into applications.
3.5.1.2 Development Language Conclusion
After examining the development languages described above, Java is the obvious choice for implementing the directory service in this project. Based on the advantages discussed above, most notably that the developer’s experience in Java is far greater than that in C++ and Python, it is the ideal language to be used to solve this problem. Due to the limited timescale of this project, learning and developing new skills in an unfamiliar development language would not seem well-judged. Also, there would be no clear advantage to using either of the other languages, and so Java will be used in this project.
3.5.2 Database Technology
There are many database technologies available for use, each of which is tailored towards a different kind of solution. However, these technologies are tailored differently mainly so that their functionality can be exploited for fairly large-scale tasks in business and other areas. Most database technologies will support the fundamentals and integration with programming platforms and applications. As the database implemented in this project will not be too complex, most database technologies would effectively serve their purpose for this project. However, it is important to select one database technology which will support all necessary tasks for this project and be easily integrated with any application built.
3.5.2.1 Database Technologies Considered
MySQL
MySQL [23] is an open-source relational database technology that is widely used for many types of application, from very simple to complex applications in business. It is readily available and supports integration with many programming languages and applications. Although MySQL is written in the C and C++ programming languages, it does offer rich functionality with Java and JDBC through the use of the Connector/J driver [24]. MySQL would be a reasonable choice for this project due to its ease of integration with the chosen programming language, reliability and functionality.
PostgreSQL
PostgreSQL [25] is another open-source relational database technology which would be fit for purpose in this project. As with MySQL, it offers compatibility with Java and JDBC through a PostgreSQL JDBC driver. PostgreSQL supports all fundamental data types that would be needed in this application and would also be a reasonable choice for this project.
Apache Derby
Apache Derby is an “open source relational database implemented entirely in Java” [26]. It is based on Java, JDBC and SQL standards, which will all be made use of in this project. Apache Derby’s integration with the application built would be straightforward, as there is a JDBC driver that supports interactions between Java programs and an Apache Derby database. It is currently distributed by Sun as Java DB with the focus on ease of integration with Java applications. Apache Derby is therefore a straightforward, reasonable choice for use in this project. Perhaps a defining advantage of utilising this database technology is that the developer has prior experience with using it.
3.5.2.2 Database Technology Conclusion
The Apache Derby relational database management system will be used for this project. Although the other technologies examined could also have been used successfully, it was decided that this was the most suitable choice because the developer has previous experience with it, so no learning curve is required with regard to use of the database technology or its integration with the Java application. Due to the limited time that is available to the developer in this project, selecting a familiar technology is advisable.
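As a rough illustration of why little integration effort is expected, the sketch below shows how a Java application might open an embedded Derby database through JDBC. The class name, method names and the database name `cloudDirectory` are hypothetical and are not taken from the project; the URL form `jdbc:derby:<name>;create=true` follows Derby's embedded connection convention, and the Derby library is assumed to be on the classpath whenever `open` is called.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper; the names are illustrative and not the project's own.
final class DerbyConnectionSketch {

    // Builds the JDBC URL for an embedded Derby database.
    // ";create=true" asks Derby to create the database if it does not yet exist.
    static String embeddedUrl(String databaseName, boolean createIfMissing) {
        String url = "jdbc:derby:" + databaseName;
        return createIfMissing ? url + ";create=true" : url;
    }

    // Opens a connection; this needs the Derby jar on the classpath at run time.
    static Connection open(String databaseName) throws SQLException {
        return DriverManager.getConnection(embeddedUrl(databaseName, true));
    }
}
```

Keeping the URL construction separate from the connection call means the former can be checked without a database being present.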
Chapter 4 – Analysis and Classification
4.1 Introduction
This chapter is concerned with the analysis of Cloud Computing services and what these services provide leading to the outlining and structuring of a classification scheme that focuses upon each of the three distinct areas of Cloud Computing. An inherent initial classification is that a Cloud Computing service will reside in one of the three areas. A service will either be classified as Software-as-a-Service (SaaS), Platform-as-a-Service (PaaS) or Infrastructure-as-a-Service (IaaS). Consequently, further analysis and hence further classifications are split into three separate sections for the three different areas.
When conducting this investigation and thorough analysis of services, it is important for the developer to be as objective as possible. Classifications should be based on facts and features derived from studying the services themselves and not the developer’s opinion or presumption about a service. The analysis and classification process is made difficult by the fact that, because Cloud Computing is such a new field, there are very few journals, books or other resources that provide helpful insight into possible classification schemes. This means that the process of analysis has to be focused largely on the official websites of the Cloud Computing services themselves. These can often contain promotional and persuasive material about the advantages of the service and what it offers, as the websites are sales tools as well as sources of relevant information.
The purpose of conducting this analysis is to derive a classification scheme for each area of Cloud Computing. The purpose of the classification scheme is to act as a basis on which the implemented directory service can provide a user with options for requirements and suggest suitable services based on these requirements. Because of this, the classification has to be focused on the commonalities and differences between services within each area so that services can ultimately be distinguished from one another.
This chapter will show the process by which relevant services are identified and how they are analysed; what the key features, differences and commonalities are between services in each area, with justifications; and with examples, the classification schemes in each area derived from this analysis. The requirements of each of the classification schemes are clearly to provide suitable criteria for services within each area of Cloud Computing, containing key common features and differences between them. These classifications will then form the basis of the database used in the next part of the solution, the directory service.
In order to conduct an analysis of the websites of the Cloud Computing services it is necessary to first find the relevant information that is to be analysed. For SaaS there are several website-based repositories of services, where vendors of a service can place a link to the website of their service so that prospective users can discover them. [16], [17] and [32] are examples of these. However, these sites tend not to contain the well-known vendors of widely used SaaS offerings. To ensure these are included, a search strategy of finding websites, articles or reports about SaaS and providers is employed. Although there are not many of these available, articles such as [33] outline some examples of well-known providers such as Google and Salesforce. A similar strategy was used to find PaaS and IaaS providers. An online collection of services can also be found at [34], helping to identify more providers to be analysed. Explicit searching through search engines and use of online encyclopaedias aided the identification of extra providers and services. This was particularly useful for finding PaaS offerings, as this is the newest area of Cloud Computing and there is very little relevant information available.
4.3 Software-as-a-Service Analysis
For the first part of the analysis, Cloud Computing services which are classified as SaaS are examined. This section is concerned with the analysis of services that come under this category.
Upon starting the analysis of SaaS providers and offerings it is logical to begin with some of the most reputable vendors in the area. In finding resources for this analysis, it became immediately obvious that Google [11] and Salesforce [10] offer important services and are pioneers in the area. When analysing providers such as these, the first feature of SaaS to be noticed is that different vendors offer different types of software. Using the examples from above, Google offer Google Apps, which offers functionality akin to traditional office suites, with word processing and email capabilities amongst others, whereas Salesforce’s SaaS offering is Customer Relationship Management (CRM) software for use in business. Even within Google there is another SaaS offering, Google Docs [35], which offers collaborative office tools and storage. The first key feature within SaaS, then, is what type of software is offered by the provider. When looking at further information and websites from ‘smaller’ companies providing SaaS, it is obvious that there are many different types of SaaS offerings provided in Cloud Computing.
With these different types of SaaS offering, it is clear in the analysis that particular services are suitable for and aimed at a particular type of user. Most types of SaaS found are aimed at business users, with types of software ranging from CRM to Human Resources Management (HRM) and software for e-Commerce. There are a variety of offerings designed for home use by users not in business, but the demand for SaaS in business is currently much higher, where there are clear advantages of faster
the ability to access and ‘do business’ from anywhere with an internet connection. Consequently, another key feature for SaaS is whom the service is suitable for. Initially, an obvious distinction is that one service may be aimed at business and another may be aimed at home-users. However, it is not as clear-cut as that. It is noticed in the analysis that SaaS vendors often provide more than one offering of the same type of business software depending on the business. For instance, Salesforce [10] offer three separate ‘editions’ of their CRM SaaS, each suitable for a different size of business. The key feature, accordingly, is not merely whether the offering is aimed at business or home use but also what size of business the service is aimed at. Salesforce’s three offerings are aimed at small businesses, medium-sized and more complex businesses, and large enterprise businesses respectively. This distinction between a small business, a medium-sized business, a large business and home users can be applied to all services investigated in the SaaS analysis.
Another feature of SaaS offerings is whether or not they are priced for their use, how they are priced and, importantly, how much is charged for the use of the service. The analysis of services showed that some services charge for use and some (very few) do not. For example, Google Docs is free to use, whereas Salesforce and Google Apps, amongst many others, charge for use of their software. Within the SaaS offerings that are charged for, there are two main models of pricing that are evident throughout the analysis. Offerings are either charged by how many users in a particular organisation use the service per month (per user per month), or there is a much higher charge for the entire company per month (per company per month). There were also a very small number of vendors that charged for the entire company per year, but this was often when the price for use of the service was very low.
As a result, the pricing model is a key feature that can differentiate between services. Another feature is the actual price of the service within the pricing model. However, not all vendors are up front with their pricing information, and some request that the user obtain a quote. Where pricing information was available, prices tend to be particularly low when they are charged per user per month, by the nature of the pricing model, and a lot higher when charged per company per month or per year. Price is obviously an important feature for differentiating between services, particularly those of the same type.
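The difference between the two main pricing models can be made concrete with a small calculation. The following is a minimal sketch only: the class and method names are hypothetical, and any prices passed in are illustrative figures rather than quotes from a vendor.

```java
// Illustrative only: names are hypothetical and prices are not vendor quotes.
final class SaasPricingSketch {

    enum PricingModel { PER_USER_PER_MONTH, PER_COMPANY_PER_MONTH }

    // Monthly cost of an offering for an organisation with the given number of users.
    static double monthlyCost(PricingModel model, double price, int users) {
        switch (model) {
            case PER_USER_PER_MONTH:
                return price * users;
            case PER_COMPANY_PER_MONTH:
                return price; // flat charge, independent of the number of users
            default:
                throw new IllegalArgumentException("Unknown pricing model: " + model);
        }
    }

    // Smallest number of users at which the flat per-company charge
    // is no more expensive than paying per user.
    static int breakEvenUsers(double pricePerUser, double pricePerCompany) {
        return (int) Math.ceil(pricePerCompany / pricePerUser);
    }
}
```

With a hypothetical charge of 10 per user per month against a flat 200 per company per month, the flat charge is the equal or cheaper option from 20 users upwards, which is one reason the size of the business matters when comparing offerings of the same type.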
Many services also offer a Free Trial to the user or business. This is a period of time when the service can be used without charge to ‘trial’, with or without obligation, before a particular user or business decides to commit to paying for the service. For offerings which do not charge, this feature is clearly not applicable. Free Trials vary in length, the most common being 30 days, offered by services such as SugarCRM [36] and Google Apps, with some vendors such as the accounting SaaS eFinancials [37] offering trials as long as 6 months. Consequently, whether or not a service offers a free trial is another feature that can be used to discriminate between services.
An important feature of some SaaS providers, and a key characteristic of Cloud Computing in general, is whether or not the provider offers a Service Level Agreement (SLA). With SaaS, SLAs are important because users can be paying for a service, and relying on a remote service ‘in the Cloud’ to do some task or even entire business processes. One concern over SaaS is that it is “vulnerable to vendor closure” [38]. Particularly for businesses migrating entire business processes into the Cloud with SaaS, SLAs are very important. SLAs can specify the ‘promised’ or ‘guaranteed’ uptime, latency (response times) and packet loss (data lost over the network connection to the SaaS provider) [39], amongst other metrics, as well as the financial (or other) guarantee of refund or return if the SLA is not met. The SLA is an agreement between user and SaaS provider which entitles the user to compensation, financial or otherwise, if a previously agreed service level is not met. Some SaaS vendors do not currently provide SLAs. SLAs are a very influential feature of SaaS offerings, and not only should the ‘presence’ of an SLA be included in the classification but the details of the particular SLA, which will differ from vendor to vendor, should also be included.
There are many other pertinent features of SaaS offerings that, although important, occur very infrequently and so would not be viable for inclusion in a classification. There is also information relating to some of the features, such as the length of a free trial and whether, by signing up for a free trial, a user is obligated to pay for the service after the time is up. Other information can include the minimum number of users that can be signed up for a ‘per user per month’ payment SaaS, and whether or not technical support is available at all times. Information may also be specific to the particular type of software. None of this information can easily be used to differentiate between services, but it is nonetheless important information about a service. As the ultimate goal of the classification scheme is to form the basis of a database for a directory service, additional pertinent information such as this should be included. Such information should be categorised as ‘additional information’ and will form a part of the classification.
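Returning to the uptime guarantee mentioned among the SLA metrics above, it can help to see what a guaranteed percentage means in practice. The sketch below converts a guaranteed uptime into the downtime permitted in a 30-day month; the class and method names are hypothetical, the 30-day month is a simplifying assumption, and no figure here is taken from a real vendor's SLA.

```java
// Illustrative arithmetic only; names and the 30-day month are assumptions.
final class SlaUptimeSketch {

    static final int MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60; // 43,200 minutes

    // Minutes of downtime permitted in a 30-day month for a guaranteed uptime percentage.
    static double allowedDowntimeMinutes(double guaranteedUptimePercent) {
        return MINUTES_PER_30_DAY_MONTH * (100.0 - guaranteedUptimePercent) / 100.0;
    }

    // True if the measured downtime exceeds what the guarantee permits,
    // i.e. the point at which compensation under the SLA might become due.
    static boolean isBreached(double guaranteedUptimePercent, double measuredDowntimeMinutes) {
        return measuredDowntimeMinutes > allowedDowntimeMinutes(guaranteedUptimePercent);
    }
}
```

For a hypothetical guarantee of 99.9% uptime this works out at roughly 43 minutes of permitted downtime per 30-day month, so an hour-long outage would breach the agreement while a half-hour outage would not.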
4.3.1 Software-as-a-Service Classification
From the features specified and explained in section 4.3, the following classification scheme is proposed for SaaS offerings:
• Type of Software – there are many types, ranging from business Supply Chain Management to Office Suites for home-users.
• Whom the Software is suitable for – whether that be a home-user or a small, medium or large sized business.
• Pricing Model – most commonly ‘per user per month’ or ‘per company per month’.
• Presence of SLA – whether or not the provider offers a Service Level Agreement.
• SLA Details – the details of a particular Service Level Agreement.
• Presence of a Free Trial – whether or not the provider offers a Free Trial period for use of the SaaS.
• Additional Information – any other pertinent information regarding the particular SaaS offering.
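The classification scheme above could be represented in Java roughly as follows. This is a minimal sketch under stated assumptions, not the project's implementation: the type names, field names and the `select` method are hypothetical, the fields simply mirror the listed categories, and matching is done in memory rather than against a database.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical in-memory model of the SaaS classification scheme.
final class SaasClassificationSketch {

    enum Audience { HOME_USER, SMALL_BUSINESS, MEDIUM_BUSINESS, LARGE_BUSINESS }

    // One directory entry; each field corresponds to a category in the scheme.
    record SaasOffering(String name, String softwareType, Audience suitableFor,
                        String pricingModel, boolean hasSla, String slaDetails,
                        boolean hasFreeTrial, String additionalInformation) { }

    // Returns the offerings matching the requested software type and audience,
    // optionally restricted to those that offer an SLA.
    static List<SaasOffering> select(List<SaasOffering> directory, String softwareType,
                                     Audience audience, boolean slaRequired) {
        return directory.stream()
                .filter(o -> o.softwareType().equalsIgnoreCase(softwareType))
                .filter(o -> o.suitableFor() == audience)
                .filter(o -> !slaRequired || o.hasSla())
                .collect(Collectors.toList());
    }
}
```

Treating the type of software and the intended audience as hard filters, with the remaining fields returned as descriptive detail, reflects the distinction drawn above between features that differentiate services and additional information that merely describes them.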
4.3.2 Classification Example
Figure 4.1 in Appendix C shows some examples of how a small number of the various SaaS offerings analysed fit into the classification scheme and how they would be differentiated between.
4.4 Platform-as-a-Service Analysis
For the next part of the analysis, Cloud Computing services which are classified as Platform-as-a-Service (PaaS) are examined.
Analysing PaaS offerings is more difficult than analysing the SaaS offerings in the first part of the analysis. PaaS is the newest area of Cloud Computing and is also possibly the most diverse. A classification scheme for PaaS is more difficult to define because there are far fewer providers concerned with this kind of service. Furthermore, many websites of PaaS offerings contain little up-front information and many offerings are still in development or ‘Beta’ stages. For example, the developer found no examples of Service Level Agreements in the PaaS offerings investigated, although some providers promise that they are forthcoming as they develop their PaaS. However, it was still possible to create a solid classification scheme based on the analysis presented in this section.
The first feature noticed when examining information was that a Platform is designed to support the development and deployment of one of two types of application: general web applications or business applications. Most PaaS providers, such as Google App Engine [45], support web applications of many types and purposes, but some providers, for example Force [14], specialise in supporting the development of complex business applications. The business applications are web applications, but specifically provide functionality for businesses. Therefore a key feature of PaaS offerings is what kind of application they allow the user to develop and deploy.
When analysing certain websites which claim to offer Platform-as-a-Service, it can be noticed that some offerings allow the user to develop the application on the vendor’s servers and infrastructure as well as eventually deploying the application there. Other PaaS offerings, such as GigaSpaces [46], only allow the user to develop and test their application on their own machine before deploying it to the vendor’s infrastructure. It is important to make the distinction between services on their
they only ‘partially’ support the process, which is the case with GigaSpaces. As a consequence of this, the level of PaaS support is another key feature which can help differentiate between providers.
As PaaS is concerned with the developing of applications, a user of PaaS would need to know what languages and development techniques they can use to develop their application. Different PaaS offerings allow a user to develop in various languages and, as this is an important concern for a user of a PaaS and may be a deciding factor in using one Platform over another, the development languages and techniques offered by a PaaS are a significant feature of a particular provider. Google App Engine currently only allows users to develop applications in the Python language, whereas with the Microsoft Azure Platform, users can develop applications in Java, Ruby and PHP [47] amongst others. Some PaaS providers specify that developers have to use the vendor’s own development languages. For example, Force [14] support development in their own languages, Apex and Visualforce.
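To illustrate how supported languages could act as a selection criterion, the sketch below stores the languages reported in the paragraph above and looks up the platforms that list a requested language. The class and method names are hypothetical, and the data is limited to the examples given in the text, so it is not an exhaustive or current list of what each platform supports.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical lookup; the data mirrors only the examples given in the analysis.
final class PaasLanguageSketch {

    // Languages per platform as reported in the analysis (not exhaustive).
    static Map<String, List<String>> supportedLanguages() {
        Map<String, List<String>> languages = new LinkedHashMap<>();
        languages.put("Google App Engine", List.of("Python"));
        languages.put("Microsoft Azure Platform", List.of("Java", "Ruby", "PHP"));
        languages.put("Force", List.of("Apex", "Visualforce"));
        return languages;
    }

    // Platforms that list the requested development language, in insertion order.
    static List<String> platformsSupporting(String language) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : supportedLanguages().entrySet()) {
            for (String supported : entry.getValue()) {
                if (supported.equalsIgnoreCase(language)) {
                    matches.add(entry.getKey());
                    break; // one match per platform is enough
                }
            }
        }
        return matches;
    }
}
```

A user who can only develop in one language would use such a lookup as a hard filter, whereas a user comfortable with several languages might treat it as a preference.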
Another feature which would be important to the user of a PaaS is the set of protocols and interfaces through which the Platform allows an application to work. Users developing an application may want it to follow the Representational State Transfer (REST) style or use the Simple Object Access Protocol (SOAP) for addressing or exchanging information, or may simply want the web application to be accessed through standard HTTP requests. Some PaaS offerings, such as Rollbase [48], offer users developing an application the use of REST and SOAP through APIs (Application Programming Interfaces). As this would be of great importance to a prospective user, it is to be seen as a key feature of PaaS offerings.
As with SaaS, the user is often charged for use of the service. However, many PaaS offerings are still at an early stage and detailed pricing information is not available. Some PaaS providers do specify pricing information, but their pricing models differ widely and there is no standardised way of pricing across the PaaS market. A PaaS called LongJump [49], for example, prices according to how many users use the built application in a month, whereas Wolf Frameworks [50] prices on consumption, with the bandwidth and storage the deployed application uses determining the price. Other offerings, such as Appjet [51], are entirely free for both developing and hosting. As there is no single pricing model, and as prices themselves are often only obtainable via a quote or after an application has been deployed, it seems reasonable to include pricing information as a separate category in the classification. Although this may not be used to differentiate between services explicitly, general pricing information is important enough to be included as a feature of PaaS services.
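The two charging models described above can be sketched as simple functions. This is only an illustration of the shape of each model: the function names are hypothetical and every rate is a parameter supplied by the caller, since no actual vendor rates are given here.

```python
def per_user_price(active_users: int, rate_per_user: float) -> float:
    """Per-user model: the monthly charge grows with the number of users."""
    return active_users * rate_per_user


def consumption_price(
    bandwidth_gb: float,
    storage_gb: float,
    bandwidth_rate: float,
    storage_rate: float,
) -> float:
    """Consumption model: the charge follows the bandwidth and storage used."""
    return bandwidth_gb * bandwidth_rate + storage_gb * storage_rate
```

Because the inputs to the two models are different quantities (user counts against resource usage), prices cannot be compared directly without knowing how an application will be used, which is one reason pricing is recorded as descriptive information in the classification.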
Many services also offer the user a ‘Free Trial’ of varying length. This is a period of time during which the user can develop and deploy their application for free where they would usually have to pay a fee. As with SaaS, users who want to try a service before committing to payment would want a provider to offer a free trial and, because of this, the presence or absence of a free trial is a feature that can be used to differentiate between PaaS offerings.
Finally, any other appropriate information specific to a vendor offering a PaaS that would be useful to include will be specified, as with SaaS, under an ‘additional information’ category. This category will be useful in the directory service to be produced based on the classification scheme.
4.4.1 Platform-as-a-Service Classification
From the features specified and explained in section 4.4, the following classification scheme is proposed for PaaS offerings:
• Type of Application Supported – this will either be a Web Application, Business Application or both.
• Level of Support – whether development and deployment are done on vendor’s server.
• Development Languages – the languages and techniques supported by the Platform for development. These are wide ranging.
• Protocol Support – the Protocols by which the user can specify the application to function: SOAP, REST or simply through HTTP.
• Pricing Information – information related to the pricing of the PaaS.
• Presence of a Free Trial – whether or not the provider offers a Free Trial period for use of the PaaS.
• Additional Information – any other pertinent information regarding the particular PaaS offering.
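As an illustration of how this scheme could be held as a record in a directory, the following is a minimal sketch in Python. The class and field names are assumptions chosen to mirror the seven categories above and are not the schema used by the directory itself. The example entries contain only facts stated in section 4.4, and any field that was not discussed is left as `None` or empty.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class ApplicationType(Enum):
    WEB = "Web Application"
    BUSINESS = "Business Application"
    BOTH = "Both"


@dataclass
class PaaSOffering:
    """One directory entry; field names are illustrative, one per category."""
    name: str
    application_type: Optional[ApplicationType] = None
    # True when development and deployment are both done on the vendor's servers.
    full_support: Optional[bool] = None
    languages: list[str] = field(default_factory=list)
    protocols: list[str] = field(default_factory=list)
    pricing_information: Optional[str] = None
    free_trial: Optional[bool] = None
    additional_information: Optional[str] = None


def supports_language(offerings: list[PaaSOffering], language: str) -> list[str]:
    """Return the names of the offerings that list the given language."""
    wanted = language.lower()
    return [o.name for o in offerings
            if wanted in (lang.lower() for lang in o.languages)]


# Entries populated only with what section 4.4 states about each provider.
EXAMPLES = [
    PaaSOffering("Google App Engine", languages=["Python"]),
    PaaSOffering("Force", languages=["Apex", "Visualforce"]),
    PaaSOffering("GigaSpaces", full_support=False),
    PaaSOffering("Rollbase", protocols=["REST", "SOAP"]),
    PaaSOffering("Appjet", pricing_information="Free for developing and hosting"),
]
```

A call such as `supports_language(EXAMPLES, "python")` shows how a single category can be used to differentiate between providers.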
4.4.2 Classification Example
Figure 4.2 in Appendix D shows some examples of how a small number of the various PaaS offerings analysed fit into the classification scheme and how they would be differentiated between.
4.5 Infrastructure-as-a-Service Analysis
For the final part of the analysis, Infrastructure-as-a-Service (IaaS) providers are examined and presented here.
The first noticeable feature of IaaS providers is that most offer services of varying degree to fit a user’s requirements. A vendor typically offers at least four different configurations of infrastructure. These configurations vary in three main areas: the amount of CPU power (in GHz), the amount of memory (in GB of RAM) and the amount of storage (in GB). Different offerings within a particular IaaS vendor provide varied proportions of the three. For example, Amazon EC2 [12] provides a ‘Small’ offering, specifying that a user would obtain a CPU of 1.1 GHz, 1.7 GB RAM and 160 GB storage and, amongst others, an ‘Extra Large’ offering, specifying a CPU of 8.8 GHz, 15 GB RAM and 1690 GB storage. These differences are not confined to a single vendor: different IaaS vendors offer varying packages, each with different configurations of CPU power, memory and storage. This is clearly a way to differentiate between services and is a key feature in the classification of IaaS.
As a user is obtaining infrastructure and utilising it as they would their own on-site infrastructure, they need to run a particular Operating System (OS) on it. Different IaaS vendors allow for differing Operating Systems, but typically providers can be categorised as offering a Windows OS, a Linux OS or a choice of either. Some services, for example Joyent [52], offer a different OS such as OpenSolaris, but typically Windows, Linux or both are offered. It would also be important for a user to know the specific versions and distributions of these Operating Systems that can be utilised in an IaaS offering, for instance ‘Windows Server 2008’ or ‘Ubuntu’ [12], and this is another key feature that can be used to differentiate between IaaS providers and offerings.
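The CPU, memory and storage figures quoted above for Amazon EC2 can be used to sketch how a directory might match a user’s requirements against the configurations a vendor offers. The following Python is a minimal illustration only: the class and function names are hypothetical, and the two entries use the figures given in the text rather than a complete or current list of EC2 offerings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IaaSConfiguration:
    vendor: str
    name: str
    cpu_ghz: float
    memory_gb: float
    storage_gb: float


# The two Amazon EC2 offerings described in the text.
EC2 = [
    IaaSConfiguration("Amazon EC2", "Small", 1.1, 1.7, 160),
    IaaSConfiguration("Amazon EC2", "Extra Large", 8.8, 15, 1690),
]


def meets_requirements(config: IaaSConfiguration, cpu_ghz: float = 0,
                       memory_gb: float = 0, storage_gb: float = 0) -> bool:
    """True if the configuration offers at least the requested capacity."""
    return (config.cpu_ghz >= cpu_ghz
            and config.memory_gb >= memory_gb
            and config.storage_gb >= storage_gb)


def smallest_match(configs: list[IaaSConfiguration],
                   **requirements: float) -> Optional[IaaSConfiguration]:
    """Return the least powerful configuration that still meets the requirements."""
    matches = [c for c in configs if meets_requirements(c, **requirements)]
    return min(matches,
               key=lambda c: (c.cpu_ghz, c.memory_gb, c.storage_gb),
               default=None)
```

For example, `smallest_match(EC2, memory_gb=4)` selects the ‘Extra Large’ offering, because the ‘Small’ offering provides only 1.7 GB of RAM.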
To speed up the process of performing tasks or hosting applications, and to make their infrastructure easier to use, vendors often lease virtual servers with some software pre-installed and ready to use immediately. This software typically covers the common needs of a user of on-demand infrastructure, such as development languages, software frameworks and database functionality. Some services, such as FlexiScale [53], do not offer pre-installed software and instead the user has to import virtual images of this software to make use of it. Clearly the presence of software that can be readily used on the infrastructure is something a user would be concerned with, and it is therefore a key feature of IaaS offerings.
When a user is ‘renting’ Virtual Machines (VMs) from an IaaS provider, the user needs a way to program and pass commands to these VMs. An Application Programming Interface (API) is used to do this. There are certain protocols by which this API can be passed messages and be operated. The two main protocols are REST and SOAP, as they are in Section 4.6 and the PaaS analysis. Users of an IaaS may have a preference as to which protocol their VM makes use of ([54] identifies that 85% of Amazon use is done through REST, although SOAP is also available) and this is a way to distinguish between different providers. Also, VM APIs are programmed and interacted with in particular programming languages supported by the IaaS provider. Most IaaS vendors provide a Java based API but different services offer different lang