Analysis of Scheduling based Cloud Computing


Abstract— Cloud computing is the latest distributed computing paradigm, and it offers tremendous opportunities to solve large-scale systematic problems. However, it presents various challenges that must be addressed before it can be used efficiently for workflow applications. Even though the workflow scheduling problem has been widely studied, very few initiatives are tailored to cloud environments. One of the fundamental issues in this environment is task scheduling. Traditional task scheduling approaches cannot measure the cost of cloud resources accurately, because the tasks on cloud systems differ widely from one another. Cloud task scheduling is an NP-hard optimization problem, and many meta-heuristic algorithms have been proposed to solve it. A good task scheduler should adapt its scheduling strategy to the changing environment and to the types of tasks.

Keywords—CloudSim, Scheduling, Deployment Model

I. INTRODUCTION

Over the past 15 years, the Internet has developed very quickly. The cost of storage and the power consumed by computers and hardware are increasing. The storage space in data centers cannot meet our needs, and the systems and services of the original Internet cannot solve these problems, so new solutions are needed. At the same time, large enterprises have to study their data sources thoroughly to support their business, and this collection and analysis must be built on a new platform. With the popularity of the Internet, many Internet-based business applications are widely used, while scientific computing has only just started. As past events show, computing in its purest form has changed hands multiple times. Most data are stored on local networks with servers that may be clustered and share storage. This approach has had enough time to develop into a stable architecture, and it provides decent redundancy if deployed correctly.

Cloud computing is the next natural step in the evolution of on-demand information technology services[4] and products. To a large extent cloud computing is based on virtualized resources.

The idea of cloud computing is based on a very fundamental principle: the reusability of IT capabilities. What cloud computing adds, compared with the traditional concepts of grid computing, distributed computing, utility computing, or autonomic computing, is a broadening of horizons across organizational boundaries.

"A paradigm in which information is permanently stored in servers on the Internet and cached temporarily on clients that include desktops, Entertainment centers, table computers, notebooks, wall computers, handhelds, etc."

In theory, cloud computing promises availability of all required hardware, software, platforms, applications, infrastructure and storage with nothing more than an Internet connection. Cloud computing is an emerging paradigm in the computer industry in which computing is moved to a cloud of computers. It has become one of the buzzwords of the industry. The core concept of cloud computing is, quite simply, that the vast computing resources that we need will reside somewhere out there in the cloud of computers, and we will connect to them and use them as and when needed.

Fig. 1

Computing can be described as any activity of using and/or developing computer hardware and software. It includes everything that sits in the bottom layer, i.e. everything from raw compute power to storage capabilities. Cloud computing ties together all these entities and delivers them as a single integrated entity under its own sophisticated management. Cloud is a term used as a metaphor for wide area networks (like the Internet) or any such large networked environment. It came partly from the cloud-like symbol used to represent the complexities of networks in schematic diagrams. It represents all the complexities of the network, which may include everything from cables, routers, servers and data centers to all other such devices. People can access the information that they need from any device with an Internet connection, including mobile and handheld phones, rather than being chained to the desktop. It also means lower costs, since there is no need to install software or hardware.

Netrika#1, Sheo Kumar*2
#PG Scholar, Department of Computer Science & Engineering, SDDIET, Haryana, India
*Assistant Professor and Head, Department of Computer Science & Engineering, SDDIET, Haryana, India

II. RELATED WORK

Chun-Wei Tsai et al. present a novel heuristic scheduling algorithm, called the hyper-heuristic scheduling algorithm (HHSA), to find better scheduling solutions for cloud computing systems. The proposed algorithm employs diversity detection and improvement detection operators to dynamically determine which low-level heuristic to use in finding better candidate solutions. To evaluate its performance, the study compares the proposed method with several state-of-the-art scheduling algorithms, all implemented on CloudSim (a simulator) and Hadoop (a real system). The results show that HHSA can significantly reduce the makespan of task scheduling compared with the other scheduling algorithms evaluated, on both CloudSim and Hadoop.

Luca Ferretti et al. observe that the cloud database paradigm depends on strong guarantees of service availability, scalability and security, but also of data confidentiality. Any cloud provider assures the security and availability of its platform, while the implementation of scalable solutions to guarantee the confidentiality of information stored in cloud databases is an open problem left to the tenant. Existing solutions address some preliminary issues through SQL operations on encrypted data. The authors propose the first complete architecture that combines data encryption, key management, authentication and authorization solutions, and that addresses the issues related to typical threat scenarios for cloud database services. Formal models describe the proposed solutions for enforcing access control and for guaranteeing the confidentiality of data and metadata. Experimental evaluations based on standard benchmarks and real Internet scenarios show that the proposed architecture also satisfies scalability and performance requirements.

Tingting Wang et al. note that task scheduling is a fundamental issue in cloud computing. In popular cloud systems, the computing resources are usually connected by a LAN (some use InfiniBand). In a virtualized cloud system, the computing nodes are different kinds of ordinary PCs, servers, and even high-performance clusters, on which VMs are set up. Several VMs share one physical server. The basic unit of resource is the processing ability of a virtual machine. Generally speaking, the resources a VM owns are fixed, and the number of VMs that a physical node can host is fixed as well. For example, one CPU core corresponds to one VM. Task scheduling aims to find suitable VMs for tasks.

Jianhua Gu et al. present a scheduling strategy for load balancing of VM resources based on a genetic algorithm. Using historical data and the current state of the system, the strategy applies a genetic algorithm to compute in advance the influence that deploying the needed VM resources will have on the system, and then chooses the solution with the least impact. In this way it achieves the best load balancing and reduces or avoids dynamic migration. The authors also introduce a variation rate to describe the load variation of the system's virtual machines, and an average load distance to measure the overall load balancing effect of the algorithm.

III. DEPLOYMENT MODELS

1. Private cloud. The cloud infrastructure is provisioned for exclusive use by a single organization comprising multiple consumers (e.g., business units). It may be owned, managed, and operated by the organization, a third party, or some combination of them, and it may exist on or off premises.

2. Community cloud. The cloud infrastructure is provisioned for exclusive use by a specific community of consumers from organizations that have shared concerns (e.g., mission, security requirements, policy, and compliance considerations). It may be owned, managed, and operated by one or more of the organizations in the community, a third party, or some combination of them, and it may exist on or off premises.

3. Public cloud. The cloud infrastructure is provisioned for open use by the general public. It may be owned, managed, and operated by a business, academic, or government organization, or some combination of them. It exists on the premises of the cloud provider.

4. Hybrid cloud. The cloud infrastructure is a composition of two or more distinct cloud infrastructures (private, community, or public) that remain unique entities, but are bound together by standardized or proprietary technology that enables data and application portability.

5. Virtual Private Cloud - Also known as a "dedicated cloud" or "hosted cloud," this model results in a self-contained cloud environment hosted and managed by a public cloud provider, and made available to a cloud consumer.

6. Inter-Cloud - This model is based on an architecture comprised of two or more inter-connected clouds.

7. Combined cloud - Two clouds that have been joined together are more correctly called a "combined cloud". A combined cloud environment consisting of multiple internal and/or external providers "will be typical for most enterprises". By integrating multiple cloud services, users may be able to ease the transition to public cloud services while avoiding issues such as PCI compliance.

Fig. 2

IV. CLOUD SIMULATORS

1. CloudSim: CloudSim is a new, generalized and extensible simulation toolkit that enables seamless modeling, simulation, and experimentation of emerging cloud computing systems, infrastructures and application environments for single and internetworked clouds. Existing distributed system simulators were not suited to evaluating the performance of cloud provisioning policies[6], services, application workloads, models and resources under varying system and user configurations and requirements. CloudSim was built to overcome this limitation. In simple terms, CloudSim is a development toolkit for the simulation of cloud scenarios. It is not a framework, as it does not provide a ready-to-use environment for executing a complete scenario with a specific input. Instead, users of CloudSim have to develop the cloud scenario they wish to evaluate, define the required output, and provide the input parameters. CloudSim originated in the Cloudbus project at the University of Melbourne, Australia, and supports system and behavior modeling of cloud system components such as data centers, virtual machines (VMs) and resource provisioning policies. It implements generic application provisioning techniques that can be extended with limited effort.

2. CDOSim: CDOSim is a cloud deployment option (CDO) simulator that can simulate the response times, SLA violations and costs of a CDO. A CDO is a combination of decisions concerning the selection of a cloud provider, specific runtime adaptation strategies, the deployment of components to virtual machine instances, and the configuration of those instances. Component deployment to virtual machine instances includes the possibility of forming new components out of existing ones. The configuration of a virtual machine instance refers to its instance type. CDOSim can simulate cloud deployments of software systems that were reverse engineered to KDM models. It represents the user's perspective rather than the provider's, and it allows the integration of fine-grained models.

3. TeachCloud: TeachCloud is a cloud simulator made specifically for educational purposes. It provides a simple graphical interface through which students and scholars can modify a cloud's configuration and perform simple experiments. TeachCloud uses CloudSim as its underlying platform and introduces several enhancements on top of it, such as:

1. Developing a GUI toolkit.

2. Adding the cloud workload generator to the CloudSim simulator.

3. Adding new modules related to SLA and BPM.

4. Adding new cloud network models such as VL2, BCube, Portland and DCell.

5. Introducing a monitoring outlet for most of the cloud system components.

6. Adding an action module that enables students to reconfigure the cloud system and study the impact of such changes on the total system performance.

4. iCanCloud: iCanCloud is a cloud simulator based on SIMCAN. In simple terms, iCanCloud is a software simulation framework for large storage networks. It can predict the trade-off between the cost and the performance of a particular application on specific hardware, in order to inform users about the costs involved. It focuses on policies that charge users in a pay-as-you-go manner. iCanCloud has a full graphical user interface from which experiments can be designed and run, but existing software systems can only be modeled manually. It also allows parallel execution of one experiment over several machines.

5. SPECI: Simulation Program for Elastic Cloud Infrastructures (SPECI) is a simulation tool for analyzing and exploring the scaling and performance properties of future data centers (DCs). Its aim is to simulate the performance and behavior of a data center, given its size and middleware design policy as inputs. SPECI is intended to give insight into the expected performance of DCs when they are designed and before they are built. This matters because the size of data centers that provide cloud computing services is increasing, and some properties of the middleware that manages these data centers will not scale linearly with the number of components. Discrete event simulation (DES) is a type of simulation in which events are ordered in time, maintained in an event queue by the simulator, and each processed at its given simulation time. SPECI uses an existing DES package for Java. It is composed of two packages: one for data center layout and topology, and one with the components for experiment execution and measurement. The experiment part of the simulator builds upon SimKit, which offers event scheduling as well as random distribution drawing.
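The event-queue mechanism behind discrete event simulation can be sketched in a few lines of Python. This is only an illustration of the general technique and does not reproduce the SPECI or SimKit APIs; the class name `EventQueue` and its methods `schedule` and `run` are hypothetical.

```python
import heapq
import itertools


class EventQueue:
    """Minimal discrete-event core: events wait in a time-ordered queue."""

    def __init__(self):
        self.now = 0.0                       # current simulation time
        self._heap = []                      # entries: (time, sequence, action)
        self._sequence = itertools.count()   # tie-breaker keeps insertion order

    def schedule(self, delay, action):
        """Queue `action` to fire `delay` time units after the current time."""
        if delay < 0:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(self._heap, (self.now + delay, next(self._sequence), action))

    def run(self, until=float("inf")):
        """Process events in time order until the queue is empty or `until` is passed."""
        while self._heap and self._heap[0][0] <= until:
            self.now, _, action = heapq.heappop(self._heap)
            action(self)  # an action may schedule further events


if __name__ == "__main__":
    log = []
    sim = EventQueue()
    sim.schedule(5.0, lambda s: log.append((s.now, "job finished")))
    sim.schedule(2.0, lambda s: log.append((s.now, "job started")))
    sim.run()
    print(log)
```

Events are processed in order of their simulation time rather than in the order they were scheduled, and the simulation clock jumps directly from one event to the next.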

6. GroudSim: GroudSim is an event-based simulator for scientific applications on grid and cloud environments. It needs only one simulation thread and is built on a scalable, simulation-independent discrete-event core. It concentrates mainly on IaaS, but it is easily extendable to support additional models such as PaaS, DaaS and TaaS. By integrating GroudSim into the ASKALON environment, users can simulate their experiments from the same environment used for real applications. GroudSim provides a comprehensive set of features for complex simulation scenarios, ranging from simple job executions on leased computing resources to the calculation of costs and background load on resources. Simulations can be parameterized and are easily extendable by probability distribution packages for failures, which normally occur in complex environments. Experimental results demonstrate the improved scalability of GroudSim compared to a related process-based approach.

7. DCSim: DCSim (DataCenter Simulator) concentrates on virtualized data centers that offer IaaS to multiple tenants, and it serves as a simulator for evaluating and developing data center management techniques. Data centers are becoming increasingly popular for the provisioning of computing resources, and their cost and operational expenses have skyrocketed with the increase in computing capacity.

V. SCHEDULING IN CLOUD COMPUTING

An essential requirement in a cloud computing environment is scheduling the current jobs so that they execute within the given constraints. Cloud computing is not only about technological improvements; it is also about how IT is provisioned and used[5], including the scheduling of data centers. The main goal of scheduling is to maximize resource utilization and minimize the processing time of tasks. The scheduler should order the jobs so as to balance improving the quality of service against maintaining efficiency and fairness among the jobs[5]. An efficient job scheduling strategy must aim for a low response time, so that submitted jobs execute within a stipulated time and resources are reallocated in time. As a result, clients can submit more jobs to the cloud, which ultimately accelerates the business performance of the cloud system.

Four job scheduling policies are considered here: Random, Round Robin (RR), Minimum Completion Time and Opportunistic Load Balancing. These algorithms are among the most common and frequently used for job scheduling in cloud computing.

A. Random Resource Selection Algorithm

The idea of the random algorithm is to assign the selected jobs to the available virtual machines (VMs) at random. The algorithm does not take the status of a VM into account, whether it is under heavy or light load. This may result in the selection of a VM under heavy load, so that the job has to wait a long time before it is served. The complexity of this algorithm is quite low, as it needs no overhead or preprocessing. It takes two input sets: the cloudlets (i.e., jobs) and the available VMs.

Index = random() * (NoVM - 1)

where Index is the index of the selected VM, random() returns a random value between 0 and 1, and NoVM is the total number of available VMs. The proposed system is designed to work with virtual machines, which abstract an operating system and the applications running on it from the hardware. The underlying hardware infrastructure services related to clouds are modelled in the simulator by a Datacenter element that handles service requests. These requests are application elements sandboxed within VMs, which need to be allocated a share of processing power on the Datacenter's host components.
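The random selection rule given by the Index formula can be sketched in Python as follows. This is a minimal illustration, not CloudSim code: the function names `random_vm_index` and `assign_randomly` are hypothetical, and `random.randrange` is used to draw an integer index directly, which is an assumption that replaces scaling and rounding the value returned by random().

```python
import random


def random_vm_index(no_vm, rng=random):
    """Return a VM index drawn uniformly from 0 .. no_vm - 1."""
    if no_vm <= 0:
        raise ValueError("at least one VM is required")
    return rng.randrange(no_vm)


def assign_randomly(cloudlets, no_vm, seed=None):
    """Map every cloudlet to a randomly chosen VM index, ignoring VM load."""
    rng = random.Random(seed)  # a fixed seed makes a run reproducible
    return {cloudlet: random_vm_index(no_vm, rng) for cloudlet in cloudlets}


if __name__ == "__main__":
    print(assign_randomly(["c0", "c1", "c2", "c3"], no_vm=3, seed=42))
```

Because the load of each VM is never consulted, two consecutive jobs may land on the same VM even while other VMs are idle, which is the weakness described for this policy.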

Fig. 3

B. Round Robin Algorithm

The Round Robin algorithm focuses on distributing the load equally across all resources. Using this algorithm, the scheduler allocates one VM to a node in a cyclic manner, which makes round robin scheduling in the cloud very similar to process scheduling. The scheduler starts with a task and moves on to the next task after a VM has been assigned to it. This is repeated until all the nodes have been allocated at least one VM, and then the scheduler returns to the first task again. Hence the scheduler does not wait for the resources of a node to be exhausted before moving on to the next task. For example, if three jobs are to be scheduled on three VMs, each job is allotted one VM, provided all the nodes have enough available resources to run the VMs. Although the algorithm is very simple, it places an additional load on the scheduler to decide the size of the quantum, and it has a longer average waiting time and low throughput.

Index = (Index + 1) mod NoVM
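The cyclic update of the index can be illustrated with a short Python sketch. The function name `round_robin_assign` and its parameters are hypothetical; the sketch only shows how the modulo rule walks through the VMs and wraps around, and it does not model quanta or node capacity.

```python
def round_robin_assign(cloudlets, no_vm, start=0):
    """Assign cloudlets to VM indices cyclically: index = (index + 1) mod no_vm."""
    if no_vm <= 0:
        raise ValueError("at least one VM is required")
    assignment = []
    index = start % no_vm
    for cloudlet in cloudlets:
        assignment.append((cloudlet, index))
        index = (index + 1) % no_vm  # wrap around to the first VM after the last
    return assignment


if __name__ == "__main__":
    # Four jobs on three VMs: the fourth job wraps around to VM 0.
    print(round_robin_assign(["a", "b", "c", "d"], no_vm=3))
    # [('a', 0), ('b', 1), ('c', 2), ('d', 0)]
```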

Fig. 4

C. Minimum Completion Time Algorithm

The Minimum Completion Time job scheduling algorithm attempts to allocate the selected job to the available VM that offers the minimum completion time, taking its current load into account. The main criteria for choosing the VM are the processor speed and the current load of each VM. The algorithm first scans the available VMs to determine the most appropriate machine for the job. It then dispatches the job to that VM and starts execution.

Index = arg min { v.getready() + cl.length / v.speed | v ∈ VML }

Here v ranges over the list of available VMs (VML), v.getready() is the time at which v becomes ready, cl.length is the length of the cloudlet, and v.speed is the processing speed of v.
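A minimal Python sketch of the Minimum Completion Time rule follows. The `Vm` dataclass and the functions `mct_select` and `mct_schedule` are hypothetical stand-ins for the simulator's objects; the sketch assumes that a VM's ready time grows by length / speed whenever a cloudlet is dispatched to it.

```python
from dataclasses import dataclass


@dataclass
class Vm:
    speed: float             # processing speed, e.g. in MIPS (assumed unit)
    ready_time: float = 0.0  # time at which the VM finishes its current load


def mct_select(vms, cloudlet_length):
    """Return the index of the VM with the smallest ready time + execution time."""
    return min(
        range(len(vms)),
        key=lambda i: vms[i].ready_time + cloudlet_length / vms[i].speed,
    )


def mct_schedule(vms, cloudlet_lengths):
    """Dispatch cloudlets one by one, updating each chosen VM's ready time."""
    plan = []
    for length in cloudlet_lengths:
        i = mct_select(vms, length)
        vms[i].ready_time += length / vms[i].speed
        plan.append(i)
    return plan


if __name__ == "__main__":
    vms = [Vm(speed=1000.0), Vm(speed=500.0)]
    print(mct_schedule(vms, [1000.0, 500.0, 1000.0]))  # [0, 1, 0]
```

In the usage example the second cloudlet goes to the slower VM because the faster one is already busy, which shows how the rule weighs current load against processor speed.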

Fig. 5

D. Opportunistic Load Balancing Algorithm

This algorithm attempts to dispatch the selected job to the available VM that has the lowest load compared with the other VMs. The idea is to check the current load of each VM before sending the job; the VM with the minimum load is then selected to run the job. In other words, the algorithm assigns a task to the machine that becomes available next, without considering the execution time of the task on that machine. When multiple machines become available at the same time, one is selected arbitrarily.

Index = arg min { v.getready() | v ∈ VML }
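The Opportunistic Load Balancing rule can be sketched in Python in the same spirit. The function names `olb_select` and `olb_schedule` are hypothetical, ready times are tracked in a plain list, and ties are broken by taking the lowest index, which is one possible reading of "arbitrarily selected".

```python
def olb_select(ready_times):
    """Return the index of the VM that becomes available first."""
    return min(range(len(ready_times)), key=ready_times.__getitem__)


def olb_schedule(speeds, cloudlet_lengths):
    """Dispatch each cloudlet to the earliest-available VM.

    The execution time on the chosen VM is ignored when selecting it and is
    only used afterwards to update that VM's ready time.
    """
    ready = [0.0] * len(speeds)
    plan = []
    for length in cloudlet_lengths:
        i = olb_select(ready)
        ready[i] += length / speeds[i]
        plan.append(i)
    return plan, ready


if __name__ == "__main__":
    plan, ready = olb_schedule([1000.0, 500.0], [1000.0, 1000.0, 1000.0])
    print(plan, ready)  # [0, 1, 0] [2.0, 2.0]
```

Unlike the Minimum Completion Time sketch, the selection key here contains no execution-time term, so a long job can be sent to a slow VM simply because that VM is idle.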

Fig. 6

VI. CONCLUSION

With the rapid development of versatile cloud services, many new challenges have emerged. One of the most important problems is how to securely delete the outsourced data stored on cloud servers. Cloud computing aims to share data, calculations and services transparently among the users of a massive grid, and it offers tremendous opportunities to solve large-scale logical problems.

REFERENCES

[1] Chun-Wei Tsai, Wei-Cheng Huang, Meng-Hsiu Chiang, Ming-Chao Chiang, and Chu-Sing Yang, "A Hyper-Heuristic Scheduling Algorithm for Cloud," IEEE Transactions on Cloud Computing, vol. 2, no. 2, April-June 2014.

[2] Luca Ferretti, Fabio Pierazzi, Michele Colajanni, and Mirco Marchetti, "Scalable Architecture for Multi-User Encrypted SQL Operations on Cloud Database Services," IEEE Transactions on Cloud Computing, vol. 2, no. 4, October-December 2014.

[3] Tingting Wang, Zhaobin Liu, Yi Chen, Yujie Xu, "Load Balancing Task Scheduling based on Genetic Algorithm in Cloud Computing," 2014 IEEE 12th International Conference on Dependable, Autonomic and Secure Computing.

[4] Jianhua Gu, Jinhua Hu, "A New Resource Scheduling Strategy Based on Genetic Algorithm in Cloud Computing Environment," IEEE, vol. 7, no. 1, January 2012.

[5] Kun Li, Gaochao Xu, Guangyu Zhao, Yushuang Dong, Dan Wang, "Cloud Task scheduling based on Load Balancing Ant Colony Optimization," 2011 Sixth Annual ChinaGrid Conference, IEEE, 2011.

[6] Xu Wang, Beizhan Wang, Jing Huang, "Cloud computing and its key techniques," IEEE, 2011, pp. 404-410.

[7] Praveen K. Gupta, Nitin Rakesh, "Different Job Scheduling Methodologies for Web Application and Web Server in a Cloud Computing Environment," IEEE, 2010.

[8] Karthik Kumar, Jing Feng, Yamini Nimmagadda, and Yung-Hsiang Lu, "Resource Allocation for Real-Time Tasks using Cloud Computing," Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907, IEEE, 2011, pp. 1-7.
