2.2 Scheduling on Distributed Computing Environments
2.2.1 Scheduling of Bag-of-tasks Applications
There is a lot of work that is concerned with the scheduling of the bag-of-tasks applica- tions on the Grid. Similarity to our work lies in that most of these algorithms assume the irregularity in task sizes of an application, and they rely on the exact information about these sizes. The main difference between the bag-of-tasks applications and the
applications we consider in this thesis (see Section 4.2) is that the individual tasks in bag-of-tasks application are assumed to be sequential, whereas we also consider appli- cations with nested parallelism. Another important difference is that all of the tasks of bag-of-tasks application are available at the beginning of the application execution, so the static scheduling of these tasks may be perfectly viable. In the applications we consider, on the other hand, only the main application task is available initially, and the additional tasks are created dynamically during the application execution.
Armstrong et al. [AHK98] propose several simple algorithms that map tasks of bag- of-tasks application to Grid resources, such as Opportunistic Load Balancing, which maps the first available task to the first available PE and Limited-Best Assignment, which assigns each task to the PE on which it has least-expected run time. Mahesh- waran et al. [MAS+99] propose several more elaborate heuristics for the same problem:
• Min-max heuristic computes, for each task, its minimal execution time on all PEs. After that, the smallest task is placed on the PE which will execute it fastest, and the same procedure is repeated for the rest of the tasks.
• Max-min heuristic differs to Min-max in that, after the calculation of minimal execution time of each task, thelargest task is placed on the PE that will execute it fastest.
• Sufferage heuristics computes what task would suffer the most (in terms of increased execution time) if it is not placed on the PE that will execute it fastest. After this task is found, it is placed on the PE that will execute it the fastest. The procedure is then repeated for the rest of the tasks
Casanova et al. [CLZB00] propose the XSufferage heuristics, which is the improve- ment of Sufferage that takes into account the fact that PEs are grouped into clusters. XSufferage calculates what task would suffer the most if it is not placed on its most preferable cluster (rather than the most preferable PE), and then places this task on the PE from its most preferable cluster that will execute it fastest.
All of the heuristics described above are relatively expensive in terms of computa- tion time they require, and therefore they work well only for applications consisting of a relatively small number of coarse-grained tasks. In order to address the problem of ap- plications comprising a large number of fine-grained tasks, Muthuvelu et al. [MLS+05] propose the heuristics that groups the fine-grained tasks into groups that are placed on the same PE. It takes into account total number of tasks, task sizes, total number of processors and the computing capabilities of processors in order to decide on the
2.2. SCHEDULING ON DISTRIBUTED COMPUTING ENVIRONMENTS 29
grouping of tasks that will accomplish the minimal overall application execution time and maximal utilisation of Grid resources.
Besides static methods for scheduling such applications, there are also various methods that do dynamic scheduling, where not all of the tasks are assigned to all PEs initially. Gonz´alez-V´elez and Cole [GVC10] propose the statistical method for scheduling of divisible workloads, implemented in a task-farm algorithmic skeleton. Divisible workload is essentially a group of totally independent tasks, which conforms to the model of bag-of-tasks applications. The method proposed in their work, after an initial placement of a certain portion of application tasks to available PEs, dynamically, allocates the groups of tasks to PEs based on the performance of these PEs, and the variability of this performance.
A lot of research from mid 80s and the beginning of 90s dealt with the problem of scheduling the set of independent tasks to PEs, whereas assignment of tasks to PEs is done in packets. Packet can contain more than one task, and the main question was how many tasks to assign to a PE in one packet. We can see that this exactly corresponds to scheduling of bag-of-tasks applications. However, this research assumes the uniform communication latencies and computing capabilities of PEs in the com- puting environment. Static chunking (Hummel et al. [HSF92]) assigns all tasks at the beginning of the program execution. That is, if the initial number of tasks is n, andp
is a number of PEs in a system, then n/p tasks are assigned to each. Self-Scheduling (Hummel et al. [HSF92]) assigns one task to each PE as soon as it gets idle (similar to work-stealing). Fixed-Size Chunking (Kruskal and Weiss [KW85]) always assigns tasks in chunks of equal size (where by ’size’, it is meant the number of tasks in a chunk), and the optimal chunk size was determined to be (
√
2nh δp√lnp)
2/3, where h is the overhead in transferring a packet to the destination PE,δ is the standard deviation of task sizes, n is the number of tasks and p is the number of PEs. A lot of other, more complicated and computationally more expensive algorithms have also been proposed (e.g. Bold algorithm in Hagerup [Hag97]).