Virtualisation - Scalable and responsive real time event processing using cloud computing

The core objective of virtualisation is to provide performance isolation and to manage scheduling priority, memory demand, network access and disk access for all the concurrently running resources spawned on a single piece of hardware. The enabling factor for cloud computing is the virtualisation techniques provided by hypervisors such as Xen [Barham et al.], VMware ESXi, Hyper-V and KVM. These hypervisors allow multiple operating systems and applications to run simultaneously on the same hardware. To offer linear scalability to applications, VMware provides the following:

• Resource utilization
• Scalability
• High availability

9.2.1 Scalability

Technologies to manage scalability in cloud computing as part of a software-defined environment (SDE) are an emerging area of research. Resource-aware and workload management solutions enable dynamic allocation of virtual machine infrastructure in accordance with the incoming event streams, adapting proportionately to the arrival rate of the streams. Automated resource management can be built on the reconfiguration algorithms in Section §7.2, focusing on major issues such as

• Cost of the cloud computing utilization
• Policy driven resource scheduling
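The idea of adapting the virtual machine pool proportionately to the arrival rate of the event streams can be illustrated with a minimal sketch. It is not the reconfiguration algorithm of Section §7.2; the function names, the per-VM capacity figure and the headroom factor are all assumptions introduced here for illustration.

```python
import math


def required_vms(arrival_rate: float, per_vm_capacity: float,
                 headroom: float = 0.2) -> int:
    """Estimate how many VMs are needed for a given event arrival rate.

    arrival_rate    -- observed events per second (assumed to be measured)
    per_vm_capacity -- events per second one VM can process (assumption)
    headroom        -- spare fraction kept for bursts (assumption)
    """
    if per_vm_capacity <= 0:
        raise ValueError("per_vm_capacity must be positive")
    if arrival_rate <= 0:
        return 1  # keep one VM alive so that new events can be accepted
    return max(1, math.ceil(arrival_rate * (1 + headroom) / per_vm_capacity))


def scaling_decision(current_vms: int, arrival_rate: float,
                     per_vm_capacity: float, headroom: float = 0.2) -> int:
    """Return the number of VMs to add (positive) or release (negative)."""
    return required_vms(arrival_rate, per_vm_capacity, headroom) - current_vms
```

A monitoring loop could call `scaling_decision` at regular intervals and hand the result to whatever provisioning interface the deployment uses.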

9.2.1.1 Cost of cloud computing utilization

Spot instances are an increasingly popular service in cloud computing. They are the excess computing nodes or instances of the elastic compute cloud facility, made available in real time through bidding. Innovative applications can be built on the reconfiguration algorithms. At regular intervals, an application should register its internal demand for computing resources and search for the cheapest available compute resource in real time. Increasing the share of application execution that runs on the cheapest available computing node would drastically reduce the expense of event processing. Quantitative research projects that process hundreds of terabytes of data need many hours of compute power. The most prevalent option is to use on-demand instances. To utilize spot instances, however, custom-built algorithms that bid optimally and consume cost-effective resources are essential. The cost-centric reconfiguration algorithms can also be extended to batch processing scenarios: to map multi-hour jobs into smaller sub-jobs, queue them based on their memory and I/O requirements, monitor the spot price of the compute resource and run the outstanding jobs in the most cost-effective way. Cloud computing has already shrunk multiple years’ work to a few hours; the automated hire of spot instances could further reduce the cost of resource utilization drastically.
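The batch scenario described above (queue the sub-jobs by memory and I/O requirements, watch the spot price, run each sub-job on the cheapest suitable resource) can be sketched as follows. This is an illustrative sketch under stated assumptions, not a bidding algorithm from this work: the `SubJob` and `Offer` types, the `io_intensity` score and the `max_bid` threshold are hypothetical, and the price list is assumed to come from some external price feed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SubJob:
    name: str
    memory_gb: float
    io_intensity: float  # hypothetical unitless score, higher = more I/O


@dataclass(frozen=True)
class Offer:
    node_type: str
    memory_gb: float
    price_per_hour: float  # current price, assumed to come from a price feed
    is_spot: bool


def cheapest_offer(job: SubJob, offers: List[Offer],
                   max_bid: float) -> Optional[Offer]:
    """Pick the cheapest offer that fits the job.

    Spot offers are only eligible while their price is at or below max_bid;
    on-demand offers are always eligible and act as the fallback.
    """
    eligible = [o for o in offers
                if o.memory_gb >= job.memory_gb
                and (not o.is_spot or o.price_per_hour <= max_bid)]
    return min(eligible, key=lambda o: o.price_per_hour, default=None)


def plan(jobs: List[SubJob], offers: List[Offer],
         max_bid: float) -> List[Tuple[str, Optional[Offer]]]:
    """Queue sub-jobs (largest memory, then heaviest I/O, first) and map
    each one to the cheapest eligible offer at the current prices."""
    queue = sorted(jobs, key=lambda j: (j.memory_gb, j.io_intensity),
                   reverse=True)
    return [(j.name, cheapest_offer(j, offers, max_bid)) for j in queue]
```

Re-running `plan` whenever the price feed changes gives the periodic "search for the cheapest available compute resource" behaviour; a sub-job falls back to an on-demand offer when every suitable spot price exceeds the bid.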

9.2.1.2 Policy driven scheduling

Automated resource management should choose the appropriate servers on which to place the event processing networks based on the policies designed by the application. For example, in the indoor localization of elderly people inside a care home, the personal data about the individual sensors used by the residents reside on a private server. The encrypted private information on health conditions or identity is kept behind private access in a private cloud. The movement or activity-log history of the elderly residents is a stream of events from multiple sensors, which can be processed in a public cloud. The utilization frequency of each virtual machine, together with the context of events, privacy requirements, cost, security criteria and the data management policy, could act as decision-making criteria for managing the resources in an automated manner. For example, the computing nodes can be made available to hire as determined by the algorithm’s security and privacy policy. The configuration scheduler in Section §7.2 can be extended by incorporating additional features related to the security and privacy of the underlying applications.
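The care-home example can be expressed as a small placement rule. The sketch below assumes a single policy (streams carrying personal data stay in the private cloud, all other streams go to the cheapest permitted node); the types and names are hypothetical and are not part of the configuration scheduler of Section §7.2.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Cloud(Enum):
    PRIVATE = "private"
    PUBLIC = "public"


@dataclass(frozen=True)
class Stream:
    name: str
    contains_personal_data: bool


@dataclass(frozen=True)
class Node:
    name: str
    cloud: Cloud
    cost_per_hour: float


def allowed(stream: Stream, node: Node) -> bool:
    """Assumed policy: personal data never leaves the private cloud."""
    return node.cloud is Cloud.PRIVATE or not stream.contains_personal_data


def place(stream: Stream, nodes: List[Node]) -> Optional[Node]:
    """Choose the cheapest node that the policy permits, or None."""
    candidates = [n for n in nodes if allowed(stream, n)]
    return min(candidates, key=lambda n: n.cost_per_hour, default=None)
```

With one private and one cheaper public node, a health-record stream is placed on the private node while the movement-event stream is placed on the public one. Further criteria such as utilization frequency or security level could be added as extra conditions in `allowed`.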

In an automated scenario, a deployment may require more than one environment: a private cloud, a public cloud or both. The configuration scheduler can be defined with a directory of event processing channels through which multiple networks can be interconnected. A network interface can be provided to connect with the individual computing nodes in the system. In the example in Figure 9.1, machines A and B can communicate with each other, and machines C and D can communicate with each other, but machine A cannot communicate with machines C or D unless it goes through B, which has an interface in both networks. The Network X.X.X.0 could be a private network operating internally within an enterprise and the Network Y.Y.Y could be a group of computing nodes operating in the public cloud. The configuration scheduler in this case maintains the directory of available computing nodes in both the private and the public cloud. During the scaling up of the network, the highest preference can be given to computing nodes available in the same network and under the same security policy. This helps the system to utilize the computing nodes optimally within the internal environment; once sufficient internal resources are no longer available, external nodes can be hired on a pay-as-you-go basis. As part of creating the network of computing nodes, multiple virtual networks can be created based on the requirements. All subsequent networks can be set up dynamically based on the incoming event stream traffic. As part of automation, the virtual machines, or the images of EPNs hosted in them, can be live-migrated as in Clark et al. (2005). Live migration combined with automated reconfiguration could reduce the service time and improve the overall system performance.

Figure 9.1: Event Processing Channels of communication
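The connectivity rule for machines A to D, and the preference for same-network nodes during scale-up, can be sketched as a small directory lookup. The labels "X" and "Y" stand in for the two networks of the example, and the function names are assumptions made for illustration; the actual directory maintained by the configuration scheduler may look quite different.

```python
from collections import deque
from typing import Dict, List, Optional, Set

# Directory of machines and the networks they have an interface in.
# B is the only machine attached to both networks.
INTERFACES: Dict[str, Set[str]] = {
    "A": {"X"},
    "B": {"X", "Y"},
    "C": {"Y"},
    "D": {"Y"},
}


def can_talk_directly(a: str, b: str,
                      interfaces: Dict[str, Set[str]] = INTERFACES) -> bool:
    """Two machines communicate directly if they share a network."""
    return bool(interfaces[a] & interfaces[b])


def route(src: str, dst: str,
          interfaces: Dict[str, Set[str]] = INTERFACES) -> Optional[List[str]]:
    """Breadth-first search for a shortest hop-by-hop path, or None."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in sorted(interfaces):
            if nxt not in seen and can_talk_directly(path[-1], nxt, interfaces):
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def rank_for_scale_up(requester: str, candidates: List[str],
                      interfaces: Dict[str, Set[str]] = INTERFACES) -> List[str]:
    """Order candidates so that same-network nodes come first.

    sorted() is stable, so candidates keep their original relative order
    within the same-network and other-network groups.
    """
    return sorted(candidates,
                  key=lambda n: not can_talk_directly(requester, n, interfaces))
```

Here `route("A", "C")` passes through B, matching the description above, and `rank_for_scale_up` gives internal nodes priority before nodes in the other network are considered for hire.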

9.3 Resource Utilisation

The applications running inside one virtual machine are expected to be independent of the co-located applications running inside another virtual machine. Work on performance isolation [Koh et al. (2007), Nathuji et al. (2010)] and CPU fair sharing [Kazempour et al. (2010), Liu et al. (2010)] in virtualised environments addresses the resource contention issues. However, real-time streaming events add complexity to performance isolation and resource contention because of the dynamism the system exhibits in its event producers, event consumers, state management and context. The assumption of performance isolation in virtualisation fails under high I/O request rates and heavy CPU-intensive jobs. The resource utilisation policies can be extended to target each hypervisor and the constraints associated with it. This will lead to some generic policy formulation to handle resource utilisation.
