Dynamic Lightweight Job Scheduling Using Job Grouping at Cloud Datacenter


Jalpa Patel, M.E., Computer Engineering, A.C.E.T, Gujarat, India

Jaimin Dave, Asst. Prof., IT Department, A.C.E.T, Gujarat, India

Richa Sinha, Asst. Prof., IT Department, K.I.R.C, Gujarat, India


Abstract: Cloud computing is a highly scalable distributed computing platform in which computing resources are offered 'as a service' by leveraging virtualization. In a computational cloud, the main emphasis is on resource management and job scheduling. The main goal of scheduling is to minimize the processing time of jobs and maximize the utilization of resources. Considerable research has been done on the job scheduling problem in the cloud, but further analysis and research are still needed to improve the performance of scheduling algorithms in a computational cloud. In this paper, an efficient job-grouping-based approach is proposed for lightweight job scheduling in a computational cloud.

In our scheduling algorithm, jobs are scheduled based on the computational and communication capabilities of the resources. Independent lightweight jobs are grouped together based on the characteristics of the chosen resource, to maximize resource utilization and minimize processing time. Hence, this paper focuses specifically on improving computational cloud performance.

Keywords: Cloud Computing, Job Grouping, Scheduling, Algorithm.

I. INTRODUCTION

According to Buyya et al. [12], a cloud is a type of parallel and distributed system consisting of a collection of interconnected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources based on service-level agreements. Cloud computing is the use of computing resources that are delivered as a service over a network. With the fast development of cloud computing technologies, the number of cloud services has increased rapidly. Using virtualization, a cloud server fulfills the needs of cloud users by utilizing its available hardware, resources, operating systems, software, databases, data pools, etc. Scheduling theory for cloud computing is receiving growing attention as the popularity of the cloud increases day by day. Several applications consist of a large number of lightweight jobs. Processing these applications involves high overhead time and cost, both in transmitting jobs to and from cloud resources and in processing them at the cloud resources. Therefore, there is a need for an efficient dynamic scheduling system. Ineffective utilization of resources leads to reduced efficiency and increased running cost of the jobs.

To overcome the problem of high overhead time and cost in job transmission to and from cloud resources and in job processing at the cloud resources, there is a need for an efficient job-grouping-based scheduling system that dynamically assembles the individual fine-grained jobs of an application into groups of jobs and sends these coarse-grained jobs to the cloud resources. This dynamic grouping should be based on the processing requirements of each application, the availability of cloud resources and their processing capability.

This paper is organized as follows. Section II discusses related work, Section III presents the proposed model, Section IV shows the experimental evaluation, and Section V gives the conclusion and future work, followed by the references.

II. RELATED WORK

This section summarizes related approaches and the problems associated with them.

In "A Dynamic Job Grouping-Based Scheduling for Deploying Applications with Fine-Grained Tasks on Global Grids" [3], the authors propose an algorithm for applications with lightweight jobs. They try to reduce the cost and the communication time between jobs and resources. However, the algorithm does not take dynamic resource characteristics into account and does not pay attention to the network bandwidth of the resources.

In "A Secure Resource and Job Scheduling Model with Job Grouping Strategy in Grid Computing" [5], the authors add security in the form of user authentication, by registering users as well as available resources. Job grouping starts after registration. Jobs are grouped based on the capacity of the resources, but resources are selected in FCFS order; there is no priority for selecting resources.

In "A Time-minimization Dynamic Job Grouping-based Scheduling in Grid Computing" [7], the authors extend the concept of grouping-based job scheduling. They change the grouping strategy, but the scheduling strategy does not ensure that a resource has sufficient bandwidth to send the grouped jobs within the required time, and resources are again selected in FCFS order without any priority.

In "Grouping-Based Job Scheduling Model in Grid Computing" [8], the authors group jobs based on more than one parameter: the jobs are first sorted and then grouped. That paper mainly focuses on lightweight job scheduling and on how jobs are grouped and allocated to resources in a dynamic environment, but resources are selected in FCFS order; there is no priority for selecting resources.

"Improved Cost-Based Algorithm for Task Scheduling in Cloud Computing" [10] schedules task groups on a cloud computing platform where resources have different costs and computation performance. Because of job grouping, sending coarse-grained jobs to the resources improves the computation-to-communication ratio. For this purpose, an algorithm based on both resource cost and computation performance, with user task grouping, is proposed, but the scheduling strategy does not ensure that a resource has sufficient bandwidth to send the grouped jobs within the required time.

III. PROPOSED SCHEDULING MODEL

A. Job Grouping Scheduling Model

There are four basic elements in cloud scheduling: the user, the job scheduler, the Cloud Information Service (CIS) and the resources.

User jobs are submitted to the cloud scheduler, which schedules them onto the resources with the objective of minimizing processing time and utilizing the resources effectively. The scheduling model is shown in Fig. 1.

Fig. 1. Job Grouping Scheduling Model

The job scheduler is a service that resides in the user machine. When the user creates a list of jobs in the user machine, these jobs are sent to the job scheduler for scheduling. The job scheduler obtains information about available resources from the Cloud Information Service (CIS). Based on this information, the job scheduling algorithm groups the jobs and then selects a resource for each group. When all the jobs have been put into groups with selected resources, the dispatcher sends the grouped jobs to their corresponding resources for computation.

The Cloud Information Service (CIS) provides information about all the registered resources in a cloud. This service keeps track of the characteristics of all resources in the cloud, such as operating system, system architecture, processing capability, network bandwidth and processing cost. It also provides users with information about resource availability.

The Data Center Broker collects information from the CIS. It assembles the resource availability and processing capability into the resource information table. It also gathers the network bandwidth and processing cost of each listed resource provided by the CIS. The Data Center Broker is used by the Global Scheduler to gather the information needed to perform job grouping.

The Global Scheduler is responsible for grouping jobs based on the information collected by the Data Center Broker from the CIS. In the job grouping process, user-submitted jobs are collected by the scheduler and grouped based on the characteristics of the selected available resource. The process is repeated until all the jobs have been grouped for their corresponding resources.

The Local Scheduler acts as a sender that forwards the grouped jobs to their respective resources according to the schedule made by the Global Scheduler. It also collects the results of the completed jobs from the resources.

B. Architecture of Job Scheduler

The architecture of the job scheduler system is shown in Fig. 2. The system accepts jobs from cloud users, each specified by its JOB_ID, JOB_LENGTH (in Million Instructions (MI)) and JOB_MS (in Mb), together with the total number of jobs submitted by the user.

Fig. 2. Architecture of Job Scheduler

After gathering the details of the user jobs, the scheduler collects information about all available computational cloud resources, specified by their RES_ID, RES_MIPS (computational power of the resource in Million Instructions per Second), RES_BW (in Mb/sec) and RES_COST (in cost/sec). The scheduler then selects a resource and multiplies its MIPS by the given granularity time [3], which is the time within which a job is processed at the resource. The result is the total number of Million Instructions (MI) that the resource can process within that granularity time.

The system then selects jobs and groups them based on the resulting total MI and the bandwidth of the resource. New IDs are assigned to the grouped jobs, and the scheduler submits the job groups to their respective resources for computation. After a job group has been executed, the results go back to the corresponding users and the resource becomes available to the scheduler again.
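As a small illustration of the capacity calculation described above, the following sketch multiplies each resource's MIPS rating by the granularity time. The MIPS values are those listed in Table 2 and the granularity time of 5 is the one used in the experiments; the function name `capacity_mi` is only an illustrative choice and does not come from the paper.

```python
# Resource speeds in MIPS, as listed in Table 2 of this paper.
RES_MIPS = {"R1": 160, "R2": 280, "R3": 420, "R4": 200, "R5": 210}


def capacity_mi(mips, granularity_time):
    """Total MI a resource can process within the granularity time."""
    return mips * granularity_time


# Capacity of every resource for a granularity time of 5 time units.
capacities = {res: capacity_mi(mips, 5) for res, mips in RES_MIPS.items()}
print(capacities)
# {'R1': 800, 'R2': 1400, 'R3': 2100, 'R4': 1000, 'R5': 1050}
```

These capacities are the upper bounds that a job group assigned to each resource may reach under the first grouping condition.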

C. Algorithm Description

The terms used throughout this algorithm and their definitions are listed in Fig. 3.

Fig. 3. List of Terms and Their Definition

This section presents a job-grouping-based lightweight job scheduling algorithm. The algorithm contains two phases:

(1) Create the job list and the available resource list

(2) Job grouping and scheduling

Job, resource and group information are represented as follows:

JOBi = <JOB_IDi, JOB_MIi, JOB_MSi>

RESj = <RES_IDj, RES_MIPSj, RES_BWj, RES_COSTj>

G_JOBk = <G_IDk, G_MIk, G_MSk>

Jobs are grouped based on the processing capability and network bandwidth of the available resources. First, the user-submitted jobs are collected into a job list; the scheduler then collects the characteristics of the available resources.

The scheduler selects a resource for scheduling jobs based on the following conditions:

$$G\_MI_j + JOB\_MI_i < RES\_MIPS_j \times Granularity\_Time \qquad (1)$$

$$\frac{G\_MS_j + JOB\_MS_i}{RES\_BW_j} < Granularity\_Time \qquad (2)$$

Equation (1) specifies that the processing requirement of the grouped job should not exceed the processing capability of the resource within the specified granularity time; equation (2) specifies that the total transfer time of the grouped job should not exceed the granularity time, that is, the time allotted for processing the group.

These two conditions are the main constraints of the job grouping strategy for achieving minimum job processing time and high resource utilization in a computational cloud system.
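A worked example may make the two conditions concrete. It assumes resource R3 from Table 2 (420 MIPS), a granularity time of 5, and a bandwidth of 100 Mb/s. The bandwidth is a hypothetical value, since the paper does not list bandwidths. Suppose the current group on R3 holds 1800 MI and 240 Mb, and the candidate job has 150 MI and 20 Mb (also illustrative values):

$$1800 + 150 = 1950 < 420 \times 5 = 2100 \qquad \text{condition (1) holds}$$

$$\frac{240 + 20}{100} = 2.6 < 5 \qquad \text{condition (2) holds}$$

The job is therefore added to the group. A further 150 MI job would bring the group to exactly 2100 MI, which is not strictly less than 2100, so under the strict inequality of condition (1) it would be left for another group.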

After collecting the user jobs, the scheduler creates a job list and gathers the available resource information using the following algorithm:

Algorithm 1 List of Jobs and the Available Resources

1. Submit user jobs to cloud

2. Create a list JOB_LIST

3. Sort the JOB_LIST in descending order with respect to MI

4. Create a list RES_LIST

5. Collect all available resource characteristics from CIS and add them to RES_LIST

6. Sort the RES_LIST in descending order with respect to MIPS
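The two sorting steps of Algorithm 1 can be sketched as follows. This is only an illustration: the job tuples are made-up values, the resource speeds follow Table 2, and plain Python lists stand in for the CIS query, which is not modeled here.

```python
# Hypothetical jobs as (JOB_ID, JOB_MI, JOB_MS); the values are illustrative only.
jobs = [(1, 140, 22), (2, 165, 26), (3, 150, 20), (4, 145, 24)]
# Resources as (RES_ID, RES_MIPS), with the speeds listed in Table 2.
resources = [("R1", 160), ("R2", 280), ("R3", 420), ("R4", 200), ("R5", 210)]

# Steps 2-3: build JOB_LIST and sort it by MI, largest job first.
job_list = sorted(jobs, key=lambda job: job[1], reverse=True)
# Steps 4-6: build RES_LIST and sort it by MIPS, fastest resource first.
res_list = sorted(resources, key=lambda res: res[1], reverse=True)

print([job[0] for job in job_list])  # [2, 3, 4, 1]
print([res[0] for res in res_list])  # ['R3', 'R2', 'R5', 'R4', 'R1']
```

Sorting both lists in descending order means that the largest jobs are offered to the fastest resource first.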

After the jobs and resources have been listed, the following algorithm is used for job grouping and scheduling:

Algorithm 2 Job Grouping and Scheduling

1. Receive job list, JOB_LIST //User job list waiting to be scheduled
2. Receive resource list, RES_LIST //List of available cloud resources
3. for i := 1 to N do
4.   for j := 1 to M do
5.     G_MI := 0; //Total job length of grouped job
6.     G_MS := 0; //Total job memory size of grouped job in Mb
7.     RES_MIj := RES_MIPSj * Granularity_Size;
8.     while G_MI < RES_MIj & G_MS / RES_BWj <= Granularity_Size & i < N do
9.       G_MI := G_MI + JOB_MIi;
10.      G_MS := G_MS + JOB_MSi;
11.      i++;
12.    endwhile
13.    i--;
14.    if G_MI > RES_MIj or G_MS / RES_BWj >= Granularity_Size then
15.      G_MI := G_MI - JOB_MIi;
16.      G_MS := G_MS - JOB_MSi;
17.      i--;
18.    endif
19.    while G_MI < RES_MIj & G_MS / RES_BWj <= Granularity_Size & i < N do //Add jobs from the end of JOB_LIST
20.      G_MI := G_MI + JOB_MIN;
21.      G_MS := G_MS + JOB_MSN;
22.      N--;
23.    endwhile
24.    N++;
25.    if G_MI > RES_MIj or G_MS / RES_BWj >= Granularity_Size then
26.      G_MI := G_MI - JOB_MIN;
27.      G_MS := G_MS - JOB_MSN;
28.      N--;
29.    endif
30.    Create a new job with total MI equal to G_MI and total MS equal to G_MS;
31.    Assign the newly created G_JOBi to target RES_LISTj for computation;
32.    Receive computed G_JOBi from RES_LISTj;
33.    i++;
34.  endfor
35. endfor
36. end

Terms used in the algorithms (Fig. 3):

MI : Million Instructions or processing requirements of a user job
MIPS : Million Instructions per Second or processing capabilities of resource
MS : Memory size of user job
N : Total number of user jobs
M : Total number of available resources
JOB_LIST : List of user jobs submitted to the broker
RES_LIST : List of available cloud resources
JOB_IDi : Job ID of job i
JOB_MIi : Job length of job i (in MI)
JOB_MSi : Input file size of job i (in Mb)
RES_IDj : Resource ID of resource j
RES_MIPSj : Processing capability of resource j (in MIPS)
RES_BWj : Network bandwidth of resource j (in Mb/s)
RES_COSTj : Processing cost of resource j (in cost/s)
G_IDk : Assigned ID of grouped job k
G_MIk : Total job length of grouped job k
G_MSk : Total input file size of grouped job k

Algorithm 2 works as follows. Once the scheduler has gathered the characteristics of the available cloud resources and jobs using Algorithm 1, it selects a job i from JOB_LIST and an available resource j from RES_LIST. If job i satisfies conditions (1) and (2), the scheduler adds job i to job group j; otherwise, it starts adding jobs from the end of the job list, again checking conditions (1) and (2). This process continues until JOB_LIST is empty. The scheduler then sends the job groups to their respective resources for computation. The cloud resources process the received job groups and send the computed groups back to the scheduler. The scheduler then gathers the computed job groups and splits the output before sending it to the users.
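The grouping procedure described above can be sketched in Python as follows. This is a simplified reading, not the authors' CloudSim implementation: it treats conditions (1) and (2) as "must not exceed" (using <=), reuses the resources in further rounds once every resource has received a group, and assumes a bandwidth of 100 Mb/s for every resource because the paper does not list bandwidths. The class and function names are illustrative.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    job_id: int
    mi: float  # JOB_MI: job length in million instructions
    ms: float  # JOB_MS: input file size in Mb


@dataclass
class Resource:
    res_id: str
    mips: float  # RES_MIPS
    bw: float    # RES_BW in Mb/s


@dataclass
class Group:
    res_id: str
    jobs: list = field(default_factory=list)
    mi: float = 0.0  # G_MI
    ms: float = 0.0  # G_MS

    def add(self, job):
        self.jobs.append(job)
        self.mi += job.mi
        self.ms += job.ms


def fits(group, job, res, granularity):
    """Conditions (1) and (2), read here as 'must not exceed' (<=)."""
    return (group.mi + job.mi <= res.mips * granularity
            and (group.ms + job.ms) / res.bw <= granularity)


def group_jobs(jobs, resources, granularity):
    pending = deque(sorted(jobs, key=lambda j: j.mi, reverse=True))
    ordered = sorted(resources, key=lambda r: r.mips, reverse=True)
    groups = []
    while pending:
        made_group = False
        for res in ordered:
            group = Group(res.res_id)
            # Largest jobs first, taken from the front of the sorted list.
            while pending and fits(group, pending[0], res, granularity):
                group.add(pending.popleft())
            # Then top up with the smallest jobs from the end of the list.
            while pending and fits(group, pending[-1], res, granularity):
                group.add(pending.pop())
            if group.jobs:
                groups.append(group)
                made_group = True
        if not made_group:
            raise ValueError("remaining jobs fit no resource at this granularity")
    return groups


# Resource speeds follow Table 2; the 100 Mb/s bandwidth is a made-up value.
resources = [Resource(r, m, 100.0) for r, m in
             [("R1", 160), ("R2", 280), ("R3", 420), ("R4", 200), ("R5", 210)]]
jobs = [Job(i, 150.0, 20.0) for i in range(50)]
groups = group_jobs(jobs, resources, granularity=5)
for g in groups:
    print(g.res_id, len(g.jobs), g.mi)
```

A double-ended queue is used so that jobs can be taken cheaply from both the front (largest) and the end (smallest) of the sorted job list, which mirrors the two while loops of Algorithm 2.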

D. Flowchart

Fig. 4 describes the flow of the proposed method. The user creates a list of jobs in the user machine; these jobs are sent to the job scheduler for scheduling. The job scheduler obtains information about available resources from the Cloud Information Service. Based on this information, the job scheduling algorithm groups the jobs and then selects an appropriate resource for computing each grouped job. Two conditions are checked when grouping jobs: the first checks that the processing requirement of the group does not exceed the processing capability of the resource, and the second checks that the transfer time of the group does not exceed the granularity time.

The scheduler forms grouped jobs based on these two conditions and forwards each grouped job to its resource for computation. It also collects the results of the completed jobs from the resources.

The algorithm continues until all jobs and all resources have been checked.

IV. EXPERIMENTAL EVALUATION

A. Environment Setup

Table 1. Environment Setup for Implementation

Entity                  Tool/Software
Operating System        Windows 7 Professional (32-bit)
Simulation Engine       CloudSim 3.0.3
Front-end IDE           NetBeans 7.3.1
Programming Language    Java


Fig. 4. Dynamic Light Weight Job Scheduling Strategy Flowchart

B. System Model

For experimental purposes, we assume that the cloud consists of five resources. Each resource contains one computing node (machine), and each computing node contains one Processing Element (PE). The processors of the computing nodes in different resources have different processing power (in MIPS).

Table 2. Cloud resources setup for Implementation

Resource MIPS

R1 160

R2 280

R3 420

R4 200

R5 210

C. Application Model

Jobs submitted to the cloud are independent tasks with no required order of execution. The computational requirement (job length) of each job is expressed in Million Instructions (MI). In CloudSim, jobs are created and their requirements are defined through Cloudlet objects. A Cloudlet is a package that contains all the information related to the job and its execution management details, such as the job length (in MI), the size of the input files, and the job originator (user information). The characteristics of the Cloudlets are given in Table 3.

Table 3. Cloudlet Characteristics

Cloudlet Characteristic        Value
Average length of Cloudlets    150 MI
Memory size of Cloudlets       20 to 26 MB

D. Comparisons

To evaluate the performance of the proposed algorithm, a set of experiments was conducted to measure total processing time and resource utilization. We compared the results of our proposed Dynamic Light Weight Job Scheduling using Job Grouping algorithm (DLWJG) with the Secure Resource and Job Scheduling Model with job grouping (SRJM) [5], First Come First Serve (FCFS) and Round Robin (RR). In the simulation, we performed scheduling experiments with different numbers of jobs: the number of jobs was varied from 10 to 50, and the granularity time for the job grouping activity was set to 5 time units.

• Processing Time

Among the compared algorithms, DLWJG gives the lowest total processing time for every number of cloudlets tested (Table 4).

Table 4. Comparison of Total Processing Time

Number of    DLWJG with      SRJM [5]      Round Robin    Without job
Cloudlets    job grouping                                 grouping (FCFS)
             Processing      Processing    Processing     Processing
             Time            Time          Time           Time
10           3.57            5.77          11.85          22.71
15           5.56            8.33          25.53          48.41
20           8.33            11.56         46.96          80.25
25           11.37           22.47         70.04          80.25
50           33.43           75.09         292.46         464.76

Fig. 5. Processing time Vs Cloudlets Comparison

• Resource Utilization

Fig. 6 shows that the resource utilization of Dynamic Light Weight Job Scheduling using Job Grouping (DLWJG) is higher than that of all the other algorithms.

Table 5. Comparison of Resource Utilization

Number of    DLWJG with      SRJM [5]       Round Robin    Without job
Cloudlets    job grouping                                  grouping (FCFS)
             Utilization     Utilization    Utilization    Utilization
10           71.43           38.49          13.21          13.93
15           55.65           41.4           13.2           14.29
20           83.33           46.29          13.4           13.93
25           75.79           44.35          13.2           14.14
50           91.59           44.49          13.21          13.93

Fig. 6. Utilization Vs Cloudlets Comparison
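The paper does not state how processing time and utilization are computed. As a rough consistency check, and only under the assumption that all 10 cloudlets have exactly the average length of 150 MI and are grouped onto the fastest resource R3 (420 MIPS) with a granularity time of 5, the first row of Tables 4 and 5 for DLWJG can be reproduced as follows:

$$\text{processing time} \approx \frac{10 \times 150\ \text{MI}}{420\ \text{MIPS}} = \frac{1500}{420} \approx 3.57$$

$$\text{utilization} \approx \frac{1500\ \text{MI}}{420\ \text{MIPS} \times 5} = \frac{1500}{2100} \approx 71.43\%$$

This suggests that utilization may be the grouped load divided by the capacity of the resource within the granularity time, expressed as a percentage. The other rows depend on the individual job lengths, which the paper does not report, so they cannot be reproduced this way.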

• Implementation with different granularity times

The simulation was run with different granularity times to analyze the total computation time taken to complete the execution of 50 cloudlets.

Table 6 and Fig. 7 show the results of the simulation carried out with different granularity times.

Table 6. Results of the proposed method with different granularity times

Resource    Granularity     Granularity      Granularity      Granularity
            time = 5 sec    time = 10 sec    time = 15 sec    time = 20 sec
            Processing      Processing       Processing       Processing
            Load (MI)       Load (MI)        Load (MI)        Load (MI)
R1          800             -                -                -
R2          1400            2750             1250             -
R3          2100            4100             6250             7500
R4          900             -                -                -
R5          1050            -                -                -

Fig. 7. Resource Vs Processing Load (MI) for proposed method

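From Fig. 7, it is clear that increasing the granularity size results in poor resource utilization: with a granularity time of 20 sec, the entire load of 7500 MI is placed on R3 while the other four resources receive no jobs (Table 6). Therefore, during the job grouping activity, the granularity size should be determined based on the number of jobs, their computational requirements and the computational capabilities of the resources.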
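The effect of the granularity time on load concentration can be estimated with a short capacity-only sketch. It assumes a total load of 50 cloudlets at the average length of 150 MI (7500 MI), uses the MIPS values of Table 2, and ignores bandwidth and individual job lengths, so it is not expected to reproduce Table 6 exactly. The function name `resources_needed` is an illustrative choice.

```python
# Resource speeds in MIPS, as listed in Table 2.
RES_MIPS = {"R1": 160, "R2": 280, "R3": 420, "R4": 200, "R5": 210}
TOTAL_LOAD_MI = 50 * 150  # 50 cloudlets at the average length of 150 MI


def resources_needed(total_mi, granularity_time):
    """Count how many of the fastest resources absorb total_mi in one pass.

    Returns None when all resources together cannot hold the load
    within a single granularity interval.
    """
    remaining = total_mi
    speeds = sorted(RES_MIPS.values(), reverse=True)
    for count, mips in enumerate(speeds, start=1):
        remaining -= mips * granularity_time
        if remaining <= 0:
            return count
    return None


for g in (5, 10, 15, 20):
    print(g, resources_needed(TOTAL_LOAD_MI, g))
```

Under these assumptions, a granularity time of 20 lets the fastest resource absorb the whole load alone, and a granularity time of 15 needs two resources. This matches the trend in Table 6, where fewer resources receive jobs as the granularity time grows.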

V. CONCLUSION AND FUTURE WORK

In this paper we have discussed the problem of job scheduling in a computational cloud, where a user submits an application consisting of a large number of lightweight jobs, and we have proposed a solution to that problem: Dynamic Light Weight Job Scheduling using Job Grouping (DLWJG). We have compared DLWJG with First Come First Serve (FCFS, without job grouping), Round Robin, and the Secure Resource and Job Scheduling Model with job grouping strategy (SRJM) [5]. We conclude that DLWJG gives better computation time and utilization than FCFS, Round Robin and SRJM [5].

In the future, this work can be extended to take into account further factors such as the current load of the resource, jobs with deadlines, network delay, and QoS (Quality of Service) requirements, in order to increase the performance of the cloud system.

REFERENCES

[1] P. Mell and T. Grance, "The NIST Definition of Cloud Computing (Draft)", National Institute of Standards and Technology, 53:7, 2010.

[2] I. Foster, Y. Zhao, I. Raicu, and S. Lu, "Cloud Computing and Grid Computing 360-Degree Compared", in Grid Computing Environments Workshop, 2008, pp. 1-10.

[3] Nithiapidary Muthuvelu, Junyang Liu, Nay Lin Soe, Srikumar Venugopal, Anthony Sulistio and Rajkumar Buyya, "A Dynamic Job Grouping-Based Scheduling for Deploying Applications with Fine-Grained Tasks on Global Grids", Australasian Workshop on Grid Computing and e-Research (AusGrid2005).

[4] Cloud Computing Basics, http://south.cattelecom.com/rtso/Technologies/CloudComputing/0071626948_chap01.pdf

[5] Raksha Sharma et al., "A Secure Resource and Job Scheduling Model with Job Grouping Strategy in Grid Computing", in 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT), Vol. 4, IEEE, 2010.

[6] The Future of Cloud Computing, http://cordis.europa.eu/fp7/ict/ssai/docs/cloud-report-final.pdf

[7] Manoj Kumar Mishra, Prithviraj Mohanty, G. B. Mund, "A Time-minimization Dynamic Job Grouping-based Scheduling in Grid Computing", International Journal of Computer Applications (0975-8887), Volume 40, No. 16, February 2012.

[8] Vishnu Kant Soni, Raksha Sharma, Manoj Kumar Mishra, "Grouping-Based Job Scheduling Model in Grid Computing", World Academy of Science, Engineering and Technology, 2010.

[9] Cloud computing layers, https://developers.google.com/appengine/training/intro/whatiscc

[10] S. Selvarani, G. Sudha Sadhasivam, "Improved Cost-Based Algorithm for Task Scheduling in Cloud Computing", IEEE, 2010.

[11] Pinal Salot, "A Survey of Various Scheduling Algorithm in Cloud Computing Environment", IJRET, Volume 2, Issue 2, Feb 2013.

[12] Rajkumar Buyya, Rajiv Ranjan, Rodrigo N. Calheiros, "Modeling and Simulation of Scalable Cloud Computing Environments and the CloudSim Toolkit: Challenges and Opportunities", in The 2009 International Conference on High Performance Computing and Simulation (HPCS 2009), pp. 1-11.

[13] Zongqin Fan, Hong Shen, Yanbo Wu, and Yidong Li, "Simulated-Annealing Load Balancing for Resource Allocation in Cloud Environments".

[14] Rodrigo N. Calheiros, Rajiv Ranjan, Anton Beloglazov, César A. F. De Rose, and Rajkumar Buyya, "CloudSim: A Toolkit for Modeling and Simulation of Cloud Computing Environments and Evaluation of Resource Provisioning Algorithms", Software: Practice and Experience 41, no. 1 (2011): 23-50.