Copyright © 2011-15. Vandana Publications. All Rights Reserved.
Volume-5, Issue-2, April-2015 International Journal of Engineering and Management Research
Page Number: 121-125
Fictional Simulation
Shehin Shams P1, Suchithra M B2, Swathy V S3, Vineesha K V4, Mrs Nitha K P5
1,2,3,4 Department of CSE, Vidya Academy of Science and Technology, Thrissur, INDIA
5 Assistant Professor, Department of CSE, Vidya Academy of Science and Technology, Thrissur, INDIA
ABSTRACT
Grid technologies have progressed towards a service-oriented paradigm that enables a new way of service provisioning based on utility computing models. One of the most challenging problems in a grid environment is workflow scheduling. Workflow management systems select an appropriate scheduling algorithm, and different scenarios require different algorithms. The selection of a particular scheduling algorithm depends on various factors, such as the parameter to be optimized (cost or time), the quality of service to be provided and the information available about various aspects of the job. In this survey, we investigate existing workflow scheduling algorithms.
I. INTRODUCTION
Grid computing, also called computational grid, was developed by computer scientists in the mid-1990s to provide a computational power grid infrastructure for wide-area parallel and distributed computing, inspired by the electrical power grid’s pervasiveness, ease of use, and reliability. The motivation for computational Grids was initially driven by large-scale, resource (computational and data) intensive scientific applications that require more resources than a single computer (PC, workstation, supercomputer, or cluster) could provide in a single administrative domain.
As a computing infrastructure, a Grid enables the sharing, selection, and aggregation of a wide variety of geographically distributed resources owned by different organizations for solving large-scale resource intensive problems in various fields. In order to build a Grid, the development and deployment of a number of services is required. They include low-level services such as security, information, directory, resource management (resource trading, resource allocation, quality of services) and high-level services for application development, resource management and scheduling (resource discovery, access cost negotiation, resource selection, scheduling strategies, quality of services, and execution management). Among them, the two most challenging aspects of Grid computing are resource management and scheduling. A group of Quality-of-Service (QoS) driven algorithms are presented for the management of resources and scheduling of applications.
There are several types of grids available such as community grids, data grids, utility grids and so on. Table 1 shows some differences between community Grids and utility Grids in terms of availability, Quality of Services (QoS) and pricing. In utility Grids, users can make a reservation with a service provider in advance to ensure the service availability, and users can also negotiate with service providers on service level agreements for required QoS. Compared with utility Grids, service availability and QoS in community Grids may not be guaranteed. However, community Grids provide free access, whereas users need to pay for service access in utility Grids. In general, the service pricing is based on the QoS level and current market supply and demand.
energy efficiency, some focus on load balancing, and some focus on a combination of these parameters. A workflow is usually represented by a directed acyclic graph (DAG). Most recent grid studies use the SimGrid tool as the simulator; it is implemented in C and C++. GridSim, which is written in Java, was used before that. SimGrid was developed because of some of GridSim’s limitations.
TABLE 1. COMMUNITY GRIDS vs. UTILITY GRIDS
II. WORKFLOW SCHEDULING
In workflow scheduling, the sub-tasks of a larger task are allocated resources in such a way that some pre-defined objective criteria are met. There are various problems in bioinformatics, astronomy and business enterprise in which a set of sub-tasks is executed in a particular sequence in order to carry out a larger task. In general, a workflow application requires a series of steps to be executed in a particular order. These steps have parent-child relationships: a parent task must be executed before its child task, and the parent is linked to the child according to a set of rules. A workflow application is generally represented as a Directed Acyclic Graph (DAG) G(V, E), where V is the set of tasks and E is the set of edges representing data dependencies among tasks. A task that does not have any parent task is called an entry task, and a task that does not have any child task is called an exit task.
Figure 1 shows the dependencies among different tasks in a workflow graph G. The parent task 0 is executed before child tasks 1, 2, 3 and 4. The output of a parent node acts as an input to its child node. Task 0 acts as the entry node and task 9 acts as the exit node. Task 9 is executed after the completion of tasks 5, 6, 7 and 8.
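The parent-child structure described for Figure 1 can be sketched in Python as follows. The edges from task 0 to tasks 1-4 and from tasks 5-8 to task 9 follow the description above; the text does not state how the two middle layers are connected, so the edges 1→5, 2→6, 3→7 and 4→8 are assumptions made purely for illustration, and the function names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Workflow DAG G(V, E) stored as {task: set of parent tasks}.
# Edges 0->{1,2,3,4} and {5,6,7,8}->9 follow the text; the middle
# edges (1->5, 2->6, 3->7, 4->8) are assumed for illustration only.
parents = {
    0: set(),
    1: {0}, 2: {0}, 3: {0}, 4: {0},
    5: {1}, 6: {2}, 7: {3}, 8: {4},
    9: {5, 6, 7, 8},
}

def entry_tasks(parents):
    """Tasks that have no parent task."""
    return sorted(t for t, p in parents.items() if not p)

def exit_tasks(parents):
    """Tasks that are not the parent of any other task."""
    has_child = set().union(*parents.values())
    return sorted(t for t in parents if t not in has_child)

def execution_order(parents):
    """One valid order in which every parent precedes its children."""
    return list(TopologicalSorter(parents).static_order())

if __name__ == "__main__":
    print("entry:", entry_tasks(parents))
    print("exit:", exit_tasks(parents))
    print("order:", execution_order(parents))
```

Storing the parents of each task, rather than the children, matches the input expected by `TopologicalSorter` and makes the "parent before child" rule directly checkable.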
In workflow scheduling, the different tasks are allocated resources (e.g. virtual machines). The workflow scheduling decisions are taken by workflow management systems (WfMS), which work as brokers between users and grid service providers (GSPs). Whenever the WfMS accepts a workflow, it contacts a Grid information service such as the Grid Market Directory (GMD) to query the available services for each task and their QoS attributes. Each GSP has to register itself and its services with the GMD, so that it can present and sell its services to users. Then, the WfMS directly contacts the desired GSPs to query the free time slots of the suitable services. Using this information, the WfMS can execute a scheduling algorithm to map each task of a workflow to one of the available services. According to the generated schedule, the WfMS contacts the GSPs to make advance reservations of the selected services. This results in an SLA between the WfMS and the GSP specifying the earliest start time (EST), the latest finish time (LFT), and the price of the selected service. Usually, the SLA contains a penalty clause that applies in case of violation of the service level, in order to enforce service level guarantees.
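As a minimal sketch of the SLA contents listed above (EST, LFT, price and a penalty clause), the record below shows how a reservation could be checked against its agreed window. The class, field and function names are hypothetical, and the flat penalty deducted from the price is an assumed model, not one taken from the surveyed systems.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SLA:
    """Hypothetical SLA record for one reserved service."""
    est: float      # earliest start time (EST)
    lft: float      # latest finish time (LFT)
    price: float    # agreed price of the selected service
    penalty: float  # flat amount deducted on violation (assumed model)

def is_violated(sla: SLA, start: float, finish: float) -> bool:
    """Treat any execution outside the window [EST, LFT] as a violation."""
    return start < sla.est or finish > sla.lft

def amount_due(sla: SLA, start: float, finish: float) -> float:
    """Price owed by the user, reduced by the penalty on violation."""
    if is_violated(sla, start, finish):
        return max(sla.price - sla.penalty, 0.0)
    return sla.price

if __name__ == "__main__":
    sla = SLA(est=10.0, lft=50.0, price=100.0, penalty=30.0)
    print(amount_due(sla, start=10.0, finish=45.0))  # finished inside the window
    print(amount_due(sla, start=12.0, finish=55.0))  # finished after the LFT
```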
Fig. 1. A workflow represented in the form of a graph
III. SURVEY OF WORKFLOW SCHEDULING ALGORITHMS FOR GRID COMPUTING
Many heuristics [1] have been developed to schedule inter-dependent tasks in homogeneous and dedicated cluster environments. However, there are new challenges for scheduling workflow applications in a Grid environment, such as:
• Resources are shared on Grids and many users compete for resources.
• Resources are not under the control of the scheduler.
• Resources are heterogeneous and may not all perform identically for any given task.
• Many workflow applications are data-intensive and large data sets are required to be transferred between multiple sites.
Therefore, Grid workflow scheduling is required to consider non-dedicated and heterogeneous execution environments. It also needs to address the issue of large data transmission across various data communication links. The input of workflow scheduling algorithms is normally an abstract workflow model which defines workflow tasks without specifying the physical location of resources on which the tasks are executed.
There are two types of abstract workflow model, deterministic and non-deterministic. In a deterministic model, the dependencies of tasks and I/O data are known in advance, whereas in a non-deterministic model, they are only known at runtime.
The scheduling algorithms are used by the WfMS to find an optimal mapping of workflow tasks to grid resources (virtual machines). Users define their objectives in an SLA (Service Level Agreement) document, which is written between a grid user and a grid service provider. The user may require multiple objectives to be satisfied, such as cost optimization, makespan optimization, reliability, deadline constraints, budget constraints etc., and it is the role of the scheduling algorithm to find the optimal schedule which satisfies the user’s objectives.

| | Community Grids | Utility Grids |
|---|---|---|
| Availability | Best effort | Advanced reservation |
| QoS | Best effort | Contract/SLA |
| Pricing | Not considered or free access | Usage, QoS level, Market |
Generally, there are two categories of scheduling algorithm: static scheduling and dynamic scheduling. In static scheduling, tasks arrive simultaneously and the schedule of available resources is updated after each task is scheduled. In dynamic scheduling, the set of tasks and machines, and the allocation between them, is not fixed in advance. The dynamic strategy is applied in two fashions: on-line mode heuristic scheduling and batch mode heuristic scheduling. In on-line mode, tasks are scheduled as soon as they arrive in the system. In batch mode, tasks are queued and collected into a set when they arrive in the system, and scheduling starts after a fixed period of time. Viewed another way, the two major types of workflow scheduling are best-effort based and QoS constraint based scheduling. Best-effort based scheduling attempts to minimize the execution time, ignoring other factors such as the monetary cost of accessing resources and various users’ QoS satisfaction levels. QoS constraint based scheduling, on the other hand, attempts to optimize performance under the most important QoS constraints, for example time minimization under budget constraints or cost minimization under deadline constraints.
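To make the batch mode concrete, the following is a minimal sketch of the classic Min-Min heuristic applied to one collected batch. It is the plain best-effort Min-Min, not the QoS-guided variant of [2], and it assumes independent tasks whose expected execution times on every machine are known in advance; the function name, the `etc` matrix and its values are hypothetical.

```python
def min_min(etc):
    """Classic Min-Min heuristic for one batch of independent tasks.

    etc[t][m] is the expected time to compute task t on machine m
    (hypothetical input). Returns (schedule, makespan), where schedule
    maps each task index to (machine index, completion time).
    """
    n_machines = len(etc[0])
    ready = [0.0] * n_machines           # time at which each machine is free
    unscheduled = set(range(len(etc)))
    schedule = {}
    while unscheduled:
        # Min-Min picks the task whose minimum completion time is smallest,
        # i.e. the smallest completion time over all (task, machine) pairs.
        best = None                      # (completion time, task, machine)
        for t in sorted(unscheduled):
            for m in range(n_machines):
                candidate = (ready[m] + etc[t][m], t, m)
                if best is None or candidate < best:
                    best = candidate
        finish, t, m = best
        schedule[t] = (m, finish)
        ready[m] = finish                # the machine is busy until then
        unscheduled.remove(t)
    return schedule, max(ready)

if __name__ == "__main__":
    # Three tasks, two machines; the values are illustrative only.
    etc = [[4, 6], [3, 5], [8, 2]]
    schedule, makespan = min_min(etc)
    print(schedule, makespan)
```

An on-line mode scheduler would instead call an assignment step once per arriving task, without waiting for the batch to fill.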
Fig. 2. A taxonomy of Grid workflow scheduling algorithms

TABLE 2: A BRIEF DESCRIPTION AND COMPARISON AMONG VARIOUS WORKFLOW SCHEDULING ALGORITHMS

| Scheduling Algorithm | Scheduling Type | Scheduling Parameters | Scheduling Factors | Finding | Environment | Tools |
|---|---|---|---|---|---|---|
| QoS Guided Min-Min Heuristic [2] | Batch mode | Quality of service, makespan | Bandwidth of tasks | 1. Lower makespan than Min-Min. 2. Uses only the bandwidth parameter for QoS. | Grid | GridSim |
| QoS Priority Grouping Algorithm [3] | Batch mode | Acceptance rate, completion time | Grouped tasks | 1. Deadline and acceptance rate of the tasks. 2. Makespan. | Grid | GridSim |
| Towards Improving QoS-Guided Scheduling [4] | Batch mode | Makespan | Grouped tasks [jobs] | 1. Improves makespan to achieve better performance. 2. Reduces the resource need by rescheduling. | Grid | GridSim |
| QoS based predictive Max-min, Min-min switcher [5] | Batch mode | Makespan | Heuristic | Better performance with QoS | Grid | GridSim |
| RASA [6] | Batch mode | Makespan | Grouped tasks | Used to reduce the makespan | Grid | GridSim |
| HEFT workflow scheduling algorithm [7] | List scheduling, dependency mode | Makespan | Highest upward rank | Reduces makespan in a DAG | Grid | GridSim |
| Cost based scheduling on utility grids [8] | Budget constrained | Cost | Task scheduling | Reschedules the unexpected tasks | Grid | GridSim |
| Task duplication based scheduling Algorithm for Network of Heterogeneous systems (TANH) [9] | Duplication based heuristic mode | Makespan | DAG scheduling | Reduced makespan | Grid | GridSim |
| Workflow with budget constraints [10] | Budget constrained | Makespan, budget | DAG scheduling | Minimizes the execution time and the makespan | Grid | GridSim |
| Ant Colony Optimization based workflow scheduling [11] | Meta-heuristic | Resource utilization, time | QoS | Optimizes the service flow scheduling | Grid | GridSim |
| Selective rescheduling policy [12] | Static heuristic | Cost and makespan | Minimal spare time and the slack | Selectively reschedule | | |
| Predict Earliest Finish Time (PEFT) [13] | List scheduling | Time | Scheduling length ratio, efficiency, pair-wise comparison of the number of occurrences of better solutions, and slack | Reduced scheduling time | Grid | GridSim |
| Genetic algorithm [14] | Meta-heuristic | Deadline, budget | QoS | Reduced time and cost | Grid | GridSim |
| List scheduling [15] | Heuristic | Makespan, load balance | DAG scheduling | Optimized makespan and load balance | Grid | GridSim |
| Genetic algorithm [16] | Meta-heuristic | Budget constrained | QoS | Minimizes the execution time while meeting a specified user budget | Grid | GridSim |
| DCP (Dynamic Critical Path) [17] | Heuristic | Resource availability | Priority of tasks | Better performance where resource availability changes frequently | Grid | GridSim |
| Improved Critical Path using Descendant Prediction (ICPDP) algorithm [18] | Hybrid heuristic | Makespan and load balance | Available resources | Minimizes makespan and improves the utilization of resources | Grid | GridSim |
| Particle Swarm Optimization [19] | Meta-heuristic | Makespan, cost and reliability | Grouped tasks | Minimizes execution time and cost, and maximizes reliability | Grid | GridSim |
| Genetic algorithm [20] | Meta-heuristic | Makespan and cost optimization | QoS | Multi-Objective Differential Evolution (MODE) that optimizes both cost and makespan for workflow applications | Grid | GridSim |
| Particle Swarm Optimization: Rotary Hybrid Discrete Particle Swarm Optimization (RHDPSO) algorithm [21] | Meta-heuristic | Makespan, cost and load balance | Grouped tasks | Optimizes the makespan and cost and performs load balancing when scheduling workflow applications | Grid | GridSim |
| Novel DBC (Deadline and Budget Constrained) [22] | Heuristic | Deadline and budget constrained | Task scheduling | Novel DCP is compared with DCP. The experiment results show that the workflow completion ratios of Novel DCP are higher than those of DCP. | Grid | GridSim |
| HGreen algorithm [23] | Heuristic | Energy efficient | Schedules the heavier tasks on maximum green resources | The simulation results have shown that the H-Green algorithm reduces the power consumption in global grids | Grid | GridSim |
| List scheduling algorithm [24] | Heuristic | Makespan, economic cost, energy consumption, reliability | DAG scheduling | Outperforms bi-criteria heuristic and bi-criteria genetic algorithms | Grid | GridSim |
| PCP (Partial Critical Path) [25] | Heuristic | Deadline constraint, cost minimization | QoS | PCP algorithm minimizes the execution time while meeting the user defined deadline | Grid | GridSim |
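
The table lists "highest upward rank" as the factor that drives HEFT. A minimal sketch of that ranking is given below, assuming the commonly used definition in which the upward rank of a task is its average computation cost plus the largest value of (average communication cost + upward rank) over its children. The function names, the four-task graph and all cost values are hypothetical.

```python
from functools import lru_cache

def upward_ranks(children, w, c):
    """Upward rank of every task in a DAG.

    children[i] lists the child tasks of i, w[i] is the average
    computation cost of i, and c[(i, j)] is the average communication
    cost on edge i -> j. All inputs here are hypothetical.
    """
    @lru_cache(maxsize=None)
    def rank(i):
        # An exit task has no children, so its rank is just w[i].
        return w[i] + max((c[(i, j)] + rank(j) for j in children[i]), default=0)
    return {i: rank(i) for i in children}

def heft_priority_order(children, w, c):
    """Tasks sorted by decreasing upward rank, the order HEFT schedules in."""
    ranks = upward_ranks(children, w, c)
    return sorted(ranks, key=lambda i: -ranks[i])

if __name__ == "__main__":
    children = {0: [1, 2], 1: [3], 2: [3], 3: []}
    w = {0: 5, 1: 4, 2: 6, 3: 3}
    c = {(0, 1): 2, (0, 2): 1, (1, 3): 3, (2, 3): 2}
    print(upward_ranks(children, w, c))
    print(heft_priority_order(children, w, c))
```

Because a parent's rank always exceeds the rank of each of its children when costs are positive, sorting by decreasing rank yields an order that respects the dependencies.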
Following is a brief description of these signs:
✓ : Tick sign means that work has already been done in that area and there is a workflow scheduling algorithm for solving that type of problem.
? : Question mark sign means that there is a need to explore workflow scheduling algorithms for that particular domain, focusing on different aspects like cost optimization, deadline constraints, budget constraints, reliability, load balance, availability and energy efficiency.
IV. CONCLUSIONS
In this paper, we surveyed various existing workflow scheduling algorithms and tabulated them on the basis of the nature of the scheduling algorithm, the type of algorithm, the objective criteria and the environment to which the workflow scheduling algorithm was applied. From the literature reviewed, it is clear that a lot of work has already been done in the area of workflow scheduling, but there are still many areas which require further attention. For example, there is a need to explore energy-efficient genetic algorithms for workflow applications, whereas cost and deadline constraints have already been addressed using genetic algorithms.
REFERENCES
[1] Y. K. Kwok and I. Ahmad, “Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors”, ACM Computing Surveys, 31(4):406-471, Dec. 1999.
[2] XiaoShan He, Xianhe Sun and Gregor von Laszewski, “QoS guided Min-Min heuristic for grid task scheduling”, Journal of Computer Science and Technology, 18(4), pp. 442-451, 2003.
[3] F. Dong, J. Luo, L. Gao and L. Ge, “A Grid Task Scheduling Algorithm Based on QoS Priority Grouping”, in Proceedings of the Fifth International Conference on Grid and Cooperative Computing (GCC’06), IEEE, 2006.
[4] Ching-Hsien Hsu, J. Zhan, Wai-Chi Fang, et al., “Towards improving QoS-guided scheduling in grid”, Third ChinaGrid Annual Conference (CHINAGRID), Dunhuang, Gansu, China, p. 89-9, 2008.
[5] M. Singh and P. K. Suri, “QPSMax-Min<>Min-Min: A QoS Based Predictive Max-Min, Min-Min Switcher Algorithm for Job Scheduling in a Grid”, International Technology Journal, 7(8), pp. 1176-1181, 2008.
[6] Saeed Parsa and Reza Entezari-Maleki, “RASA: A New Task Scheduling Algorithm in Grid Environment”, World Applied Sciences Journal, 7 (Special Issue of Computer & IT), pp. 152-160, 2009.
[7] M. Wieczorek, R. Prodan and T. Fahringer, “Scheduling of scientific workflows in the ASKALON grid environment”, SIGMOD Rec., 34(3), pp. 56-62, 2005.
[8] J. Yu, R. Buyya and C. K. Tham, “Cost-based scheduling of scientific workflow applications on utility grids”, First Int’l Conference on e-Science and Grid Computing, Melbourne, Australia, pp. 140-147, 2005.
[9] R. Bajaj and D. P. Agrawal, “Improving Scheduling of Tasks in a Heterogeneous Environment”, IEEE Trans. Parallel and Distributed Systems, vol. 15, no. 2, pp. 107-118, Feb. 2004.
[10] R. Sakellariou, H. Zhao, E. Tsiakkouri and M. D. Dikaiakos, “Scheduling workflows with budget constraints”, in Integrated Research in GRID Computing, S. Gorlatch and M. Danelutto, Eds., Springer-Verlag, pp. 189-202, 2007.
[11] W. N. Chen and J. Zhang, “An Ant Colony Optimization Approach to Grid Workflow Scheduling Problem with Various QoS Requirements”, IEEE Trans. Systems, Man, and Cybernetics, vol. 39, no. 1, pp. 29-43, Jan. 2009.
[12] R. Sakellariou and H. Zhao, “A Low-Cost Rescheduling Policy for Efficient Mapping of Workflows on Grid Systems”, Scientific Programming, vol. 12, pp. 253-262, Dec. 2004.
[13] Hamid Arabnejad and Jorge G. Barbosa, “List Scheduling Algorithm for Heterogeneous Systems by an Optimistic Cost Table”, IEEE, 2008.
[14] J. Yu and R. Buyya, “Scheduling Scientific Workflow Applications with Deadline and Budget Constraints Using Genetic Algorithms”, Scientific Programming, vol. 14, nos. 3/4, pp. 217-230, 2006.
[15] A. Mandal, K. Kennedy, C. Koelbel, G. Marin, J. Crummey and B. Liu, “Scheduling Strategies for Mapping Application Workflows onto the Grid”, High Performance Distributed Computing, 14th IEEE International Conference, 2005.
[16] Jia Yu and Raj Kumar Buyya, “A Budget Constrained Scheduling of Workflow Applications on Utility Grids using Genetic Algorithms”, Workflows in Support of Large-Scale Science, IEEE Conference, pp. 1-10, 2006.
[17] M. Rahman, S. Venugopal and R. Buyya, “A Dynamic Critical Path Algorithm for Scheduling Scientific Workflow Applications on Global Grids”, E-Science and Grid Computing, IEEE International Conference, pp. 35-42, 2007.
[18] Bogdan Simion, Catalin Leordeanu, Florin Pop and Valentin Cristea, “A Hybrid Algorithm for Scheduling Workflow Applications in Grid Environments”, OTM Confederated International Conferences, pp. 1331-1348, 2007.
[19] Fei Tao, Dongming Zhao, Yefa Hu and Zude Zhou, “Resource Service Composition and Its Optimal Selection Based on Particle Swarm Optimization in Manufacturing Grid System”, Industrial Informatics, IEEE Transactions, pp. 315-327, 2008.
[21] Qian Tao, Hui You Chang, Yang Yi, Chunqin Gu and Yang Yu, “QoS Constrained Grid Workflow Scheduling Optimization Based on a Novel PSO Algorithm”, Grid and Cooperative Computing, 8th IEEE International Conference, pp. 153-159, 2009.
[22] Yong Wang, R. M. Bhati and M. A. Bauer, “A Novel Deadline and Budget Constrained Scheduling Heuristic for Computation Grids”, Journal of Central South University of Technology, vol. 18, issue 2, pp. 465-472, 2011.
[23] R. Bajaj, “Workflow Scheduling Algorithm for Optimizing Energy Efficient Grid Resources Usage”, Dependable, Autonomic and Secure Computing, 9th IEEE International Conference, pp. 642-649.
[24] H. M. Fard, R. Prodan, J. J. D. Barrionuevo and T. Fahringer, “A Multi-Objective Approach for Workflow Scheduling in Heterogeneous Environment”, Cluster, Cloud and Grid Computing, 12th IEEE International Conference, pp. 300-309, 2012.