3.4 Experiments and Results
4.2.2 Volunteer Systems and Cloud Computing
The previous work outlined has highlighted the research and implementation challenges surrounding ad hoc cloud computing as well as initial results showing the promise of such a paradigm.
Andrzejak et al. explore the idea of allowing a web-service provider, for example Dropbox, to use resources from the non-dedicated hosts they serve [54]. This aims to reduce the number of dedicated servers a web-service provider owns or provisions from cloud services, in turn reducing costs. The authors restrict the use of their proposed approach to the web service domain, where non-dedicated hosts are primarily used for their processing capabilities. This is due to the limited bandwidth and sporadic availability of non-dedicated hosts.
Andrzejak et al. identify that the availability of non-dedicated resources is a primary problem; however, they assume that a web service has fault tolerance and redundancy mechanisms in place to cope with highly volatile non-dedicated hosts. We develop and present our solution to this problem later in this chapter. Those authors focus on a number of other research challenges.
Firstly, Andrzejak et al. review techniques for predicting the short-term availability of non-dedicated hosts in order to help predict their long-term availability. Secondly, they outline a method to identify optimal combinations of dedicated and non-dedicated hosts to reduce costs or reduce the number of migrations performed when non-dedicated hosts fail. Processes previously running on a failed non-dedicated host are restored by migrating the process’s data to another non-dedicated host. Unlike our ad hoc cloud platform, this is performed periodically and not when the failure occurs;
the details of this migration and restoration process are not described. Finally, the authors investigate the trade-offs between using a greater number of dedicated hosts or non-dedicated hosts.
In order for a web-service provider to utilise resources from a non-dedicated host, the host’s availability is first calculated based on monitoring data collected over several weeks. Well-known machine learning predictors such as Last Value, Naïve Bayes and Gaussian predictions are used to group the hosts according to their predicted short-term availability; a non-dedicated host is, however, deemed to be available if 100% of its CPU is free, which in many cases will not occur.
Non-dedicated hosts in the group with the lowest rank number (i.e. those that are most likely to be available) are used first by the web-service provider, before the remaining groups are exploited in order of decreasing predicted availability; the authors consider the case where non-dedicated hosts are placed in groups ranked from 1 to 4. Andrzejak et al. claim that combining simple prediction mechanisms with host ranking allows accurate long-term predictions to be made. Their results show that the average highest and lowest error rates of long-term availability prediction are approximately 21% and 14% respectively. Furthermore, as the number of non-dedicated hosts increases, the probability of meeting availability guarantees decreases, and vice versa.
In addition, increasing the level of data redundancy decreases the number of dedicated hosts required to meet availability guarantees. Our implementation of the ad hoc cloud does not utilise redundancy but instead takes a reactive approach to deal with non-dedicated host failures. We assume that incorporating task redundancy into the ad hoc cloud would further increase the success rate of task completion.
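The grouping and selection steps described above can be illustrated with a minimal sketch. It is not the method of Andrzejak et al.: the availability score (the fraction of monitored intervals in which a host was available), the equal-width rank bins and all function names are assumptions made purely for illustration, and only the Last Value predictor and the use of four rank groups are taken from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostHistory:
    name: str
    # One entry per monitoring interval: True if the host was available.
    available: List[bool]


def last_value_prediction(history: List[bool]) -> bool:
    """Last Value predictor: the next interval is predicted to equal the latest one."""
    return history[-1]


def availability_score(history: List[bool]) -> float:
    """Assumed score: fraction of monitored intervals in which the host was available."""
    return sum(history) / len(history)


def assign_rank(score: float, num_ranks: int = 4) -> int:
    """Map a score in [0, 1] to a rank; rank 1 holds the most available hosts.

    Equal-width bins are an assumption, not taken from the cited work.
    """
    bin_index = min(int((1.0 - score) * num_ranks), num_ranks - 1)
    return bin_index + 1


def select_hosts(hosts: List[HostHistory], count: int) -> List[HostHistory]:
    """Use rank 1 hosts first, then the remaining groups in rank order."""
    ordered = sorted(
        hosts,
        key=lambda h: (assign_rank(availability_score(h.available)), h.name),
    )
    return ordered[:count]
```

Under these assumptions, a host that was available in every monitored interval falls into rank 1 and is selected before any host from ranks 2 to 4.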
Andrzejak et al. then propose an optimisation method to either reduce costs for the web-service provider or reduce the number of migrations performed. The authors assume a web-service provider’s dedicated resources are served from a cloud provider such as Amazon EC2 and therefore incur costs of 10 US cents per hour for each dedicated host; this is comparable to a particular Amazon EC2 instance. Similarly, data transferred between non-dedicated and dedicated hosts is charged at a rate of 10 US cents per GB.
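The pricing assumptions above can be made concrete with a small worked calculation. The sketch below only combines the two quoted rates; the function name, the simple additive form of the cost and the example workload are assumptions for illustration and do not reproduce the authors’ optimisation method.

```python
# Rates quoted in the text, expressed in US cents.
DEDICATED_HOST_CENTS_PER_HOUR = 10
TRANSFER_CENTS_PER_GB = 10


def total_cost_cents(dedicated_hosts: int, hours: float, transferred_gb: float) -> float:
    """Assumed additive cost: hosting charge plus data-transfer charge, in cents."""
    hosting = dedicated_hosts * hours * DEDICATED_HOST_CENTS_PER_HOUR
    transfer = transferred_gb * TRANSFER_CENTS_PER_GB
    return hosting + transfer


# Hypothetical workload: 25 dedicated hosts for 24 hours with 50 GB migrated.
example_cents = total_cost_cents(dedicated_hosts=25, hours=24, transferred_gb=50)
```

For this hypothetical workload the hosting charge is 25 × 24 × 10 = 6000 cents and the transfer charge is 50 × 10 = 500 cents, giving 6500 cents (65 US dollars). The trade-off the authors optimise arises because adding dedicated hosts raises the first term while fewer migrations lower the second.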
Andrzejak et al. find that for a group of rank 1 non-dedicated hosts, the optimal number of dedicated hosts required to meet availability guarantees while also reducing costs is 25 dedicated hosts from a set of 55. For a group of rank 4 non-dedicated hosts, the total number of hosts a web service uses must increase to 62 in order to keep costs as low as possible. By only considering rank 1 non-dedicated hosts, 44 dedicated hosts from a possible set of 52 are required to minimise the number of migrations. In the case of rank 4 non-dedicated hosts, the migration rate is lowest when more dedicated hosts are used, in particular, 49 dedicated hosts from a set of 52.
Our concept of the ad hoc cloud involves only one dedicated host and a potentially unlimited number of non-dedicated volunteer hosts; the work of Andrzejak et al. is therefore relevant in showing that volunteer hosts have great potential to perform tasks that are typically executed on dedicated hosts. Those authors’ work presents an approach that complements our own, particularly regarding the calculation of a host’s short- and long-term availability as well as reducing the number of migrations between volunteer hosts. We detail our solutions to these problems as well as a method to handle volunteer host failures later in this chapter.
Mori et al. discuss their sophisticated ad hoc cloud computing environment, named SpACCE, that is tailored for application sharing and distributed collaboration [170].
Their idea is based on creating a cloud environment by offering services from an ad hoc server, called CollaboTray, that may, at any time, migrate to another node in the network. An example service outlined is Microsoft PowerPoint. The server may migrate if the node currently hosting the server experiences an increase in utilization or would otherwise reduce the performance of the service delivered to the clients. If an application requires more capacity to execute effectively, other clients can be converted into servers to prevent the total server resource capacity from diminishing.
Due to the ad hoc nature of their project, our goals are similar; namely, how to effectively co-exist with user processes, deal with dynamic hosts and migrate components between hosts. Their results show that large performance latencies can occur if the server does not have 40% of the CPU available to use. This means that applications that are resource intensive will be unable to utilise CollaboTray. In order to migrate CollaboTray, it is first closed, its state is then transferred to another node and finally it is restarted; we use a similar process to migrate and restore virtual machines between hosts.
However, CollaboTray does not use virtualization, hence the security of the system is questionable if the server is migrated to an untrustworthy node. There is also no concept of host reliability, which will result in poor application performance if the server is migrated to an unreliable node. Our implementation of the ad hoc cloud provides solutions to the shortcomings mentioned as well as additional features such as effective monitoring and scheduling.
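The close, transfer and restart sequence described for CollaboTray can be sketched as follows. This is an illustrative model rather than the SpACCE implementation: the Node and Server types, the choice of the node with the most free CPU as the target, and the use of the reported 40% CPU figure as a migration trigger are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumed trigger, motivated by the reported latencies below 40% available CPU.
MIN_FREE_CPU = 0.40


@dataclass
class Node:
    name: str
    free_cpu: float  # fraction of the CPU available to the server, 0.0 to 1.0


@dataclass
class Server:
    host: Node
    state: Dict[str, str] = field(default_factory=dict)
    running: bool = True


def pick_target(nodes: List[Node], current: Node) -> Optional[Node]:
    """Choose the other node with the most free CPU, if any meets the threshold."""
    candidates = [n for n in nodes if n is not current and n.free_cpu >= MIN_FREE_CPU]
    return max(candidates, key=lambda n: n.free_cpu, default=None)


def migrate(server: Server, target: Node) -> Server:
    """Close the server, transfer its state, then restart it on the target node."""
    server.running = False                      # 1. close
    snapshot = dict(server.state)               # 2. transfer (a copy stands in for the network)
    return Server(host=target, state=snapshot)  # 3. restart on the new node


def maybe_migrate(server: Server, nodes: List[Node]) -> Server:
    """Migrate only when the current host falls below the assumed CPU threshold."""
    if server.host.free_cpu >= MIN_FREE_CPU:
        return server
    target = pick_target(nodes, server.host)
    return migrate(server, target) if target is not None else server
```

Note that the sketch, like CollaboTray as described, has no notion of host reliability or trust: the target is chosen on free CPU alone, which is precisely the limitation discussed above.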
Cunsolo et al. argue that cloud computing is a computational model directed towards businesses, therefore restricting its usefulness for scientific purposes [85]. Those authors propose an alternative to the data centre model where individual users are able to donate their resources to form a unified cloud. As this is similar to volunteer computing, they name this Cloud@Home. By merging volunteer and cloud computing, users may either offer their resources for free to an OpenCloud or buy and sell resources from a cloud called HybridCloud. These two cloud models are then able to exist independently, link with one another, or link to other public and private cloud computing platforms.
In their proposal, the authors identify that resource management, security, reliability and Quality of Service (QoS) are some of the key challenges to overcome.
Resources are managed centrally and security is provided by virtualization, data encryption and secure transmission protocols. Reliability, however, is based on negotiations with volunteer users specifying their contribution; a volunteer host could nevertheless leave at any time, and no mechanisms for recovery are proposed. In turn, no QoS guarantees can be made. Furthermore, the authors do not specify, among other things, the volunteer system to be used and how this could be transformed into a cloud platform.
Wu et al. create a private cloud based on BOINC for the purpose of executing parallel and distributed simulation tasks [221]. Much of their focus is on scheduling tasks to nodes within the system by using BOINC as a dispatcher according to the authors’ own load-balancing algorithms. Although no reference is made to how their architecture is in fact a cloud or how BOINC is part of their architecture, the authors do note that scheduling and infrastructure monitoring are important components within private clouds.