An application consisting of a group of tasks can be represented by a node- and edge-weighted directed acyclic graph (DAG), in which the vertices represent the computations and the directed edges represent the data dependencies as well as the communication times between the vertices. DAGs have been shown to be expressive for a large number and a wide variety of applications. Task scheduling is one of the most thought-provoking NP-hard problems in the general case, and polynomial-time algorithms are known only for a few restricted cases. Hence, it is a challenge on heterogeneous computing systems to develop task scheduling algorithms that assign the tasks of an application to processors so as to minimize the makespan without violating precedence constraints. Many researchers have therefore proposed a variety of approaches to the DAG task scheduling problem. These methods are basically classified into two major categories: dynamic scheduling and static scheduling. In dynamic scheduling, information such as task relations, execution times, and communication times is not known in advance, and the scheduler has to make decisions in real time. In static scheduling, all information about the tasks is known beforehand. Static scheduling algorithms, which use different techniques to find a near-optimal solution, are in turn classified into two major groups: heuristic scheduling and meta-heuristic scheduling.
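Such a weighted task graph can be sketched in a few lines of code. The task names, computation costs, and communication costs below are purely illustrative, not taken from any benchmark; the topological ordering (Kahn's algorithm) is valid only because the graph is acyclic.

```python
# computation cost of each task (node weight)
comp = {"t1": 14, "t2": 13, "t3": 11, "t4": 13}

# comm[(u, v)] = data-transfer time if u and v run on different processors
comm = {("t1", "t2"): 18, ("t1", "t3"): 12, ("t2", "t4"): 19, ("t3", "t4"): 27}

def successors(task):
    return [v for (u, v) in comm if u == task]

def predecessors(task):
    return [u for (u, v) in comm if v == task]

def topo_order(tasks):
    # Kahn's algorithm: repeatedly emit a task whose predecessors are all done
    indeg = {t: len(predecessors(t)) for t in tasks}
    ready = [t for t in tasks if indeg[t] == 0]
    order = []
    while ready:
        t = ready.pop()
        order.append(t)
        for s in successors(t):
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    return order

print(topo_order(list(comp)))
```

Any scheduler that respects precedence constraints processes tasks in some such topological order.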
In the first phase, tasks are assigned priorities and added to a waiting list in order of decreasing priority. In the second phase, whenever a processor becomes available, the highest-priority task is selected from the list and assigned to the most suitable processor. Searching is thereby narrowed down to a small portion of the solution space by greedy heuristics, which yields a good makespan at low cost. A drawback is that this approach does not give consistent results on heterogeneous computing systems.
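The two-phase pattern described above can be sketched as follows. The priorities and per-processor costs are invented numbers, and precedence constraints are omitted to keep the sketch short; phase one sorts by priority, phase two greedily picks the processor that finishes each task earliest.

```python
# cost[t][p] = execution time of task t on processor p (heterogeneous)
cost = {"a": [6, 9], "b": [4, 3], "c": [8, 5]}
priority = {"a": 10, "b": 7, "c": 9}          # higher priority runs first

def list_schedule(cost, priority):
    avail = [0.0, 0.0]                        # time at which each processor frees up
    placement = {}
    for t in sorted(cost, key=priority.get, reverse=True):
        # phase 2: pick the processor minimising this task's finish time
        finish = [avail[p] + cost[t][p] for p in range(len(avail))]
        p = finish.index(min(finish))
        placement[t] = p
        avail[p] = finish[p]
    return placement, max(avail)              # task-to-processor map and makespan

placement, ms = list_schedule(cost, priority)
print(placement, ms)
```

HEFT and its relatives follow this same skeleton, differing mainly in how the priority is computed and how finish times account for communication.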
Over the past few decades, research efforts have mainly focused on task scheduling algorithms for homogeneous and heterogeneous systems, chiefly with the objective of reducing the overall execution time of the tasks. Topcuoglu et al.  presented the HEFT and CPOP scheduling algorithms for heterogeneous processors. Luiz et al.  developed the Lookahead-HEFT algorithm, which looks ahead in the schedule when making scheduling decisions. Eswari and Nickolas  proposed the PHTS algorithm to efficiently schedule tasks on heterogeneous distributed computing systems. Rajak and Ranjit  presented a queue-based scheduling algorithm called TSB to schedule tasks on homogeneous parallel multiprocessor systems. Ahmed, Munir, and Nisar  developed a genetic algorithm called PEGA that has lower time complexity than the standard genetic algorithm (SGA). Tang, Li, Li, and Liao  presented a list-scheduling algorithm called HEFD for heterogeneous computing systems. Nasri and Nafti  developed a new DAG scheduling algorithm for heterogeneous systems that provides better performance than several well-known existing scheduling algorithms.
Hamid Arabnejad and Jorge G. Barbosa  proposed a novel approach called Predict Earliest Finish Time (PEFT) for heterogeneous computing systems. The algorithm keeps the same time complexity, O(v²·p) for v tasks and p processors, while introducing a look-ahead feature based on the computation of an Optimistic Cost Table (OCT). The computed value is optimistic because processor availability is not considered in the computation. The algorithm relies solely on the OCT, which is used both to rank tasks and for processor selection. PEFT outperforms other list-based algorithms for heterogeneous systems in terms of schedule length ratio, efficiency, and frequency of best results.
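A minimal sketch of the OCT recursion described above, on an invented three-task graph: for an exit task the OCT is zero, and otherwise OCT(t, p) looks one step ahead at the cheapest continuation of each successor, paying communication only when the successor runs on a different processor.

```python
w = {"t1": [5, 7], "t2": [6, 4], "t3": [3, 3]}   # w[t][p]: cost of t on processor p
succ = {"t1": ["t2", "t3"], "t2": [], "t3": []}
comm = {("t1", "t2"): 9, ("t1", "t3"): 4}        # average communication costs

def oct_value(t, p, memo={}):
    if (t, p) in memo:
        return memo[(t, p)]
    if not succ[t]:                               # exit task: optimistic cost is zero
        return 0
    # worst successor, each taken on its cheapest processor
    val = max(
        min(oct_value(s, q) + w[s][q] + (comm[(t, s)] if q != p else 0)
            for q in range(len(w[s])))
        for s in succ[t])
    memo[(t, p)] = val
    return val

# rank used for task ordering: average OCT over all processors
rank = {t: sum(oct_value(t, p) for p in range(2)) / 2 for t in w}
print(rank)
```

Because processor availability never enters the recursion, the values are an optimistic lower-bound style estimate, exactly as the text notes.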
The efficient scheduling of independent computational jobs in a heterogeneous computing (HC) environment such as a computational grid is clearly important if good use is to be made of a valuable resource. Scheduling algorithms can serve several different requirements in such a system. The first, and most common, is planning an efficient schedule for a set of jobs to be run at some time in the future, and working out a priori whether sufficient time or computational resources are available to complete the run. Static scheduling may also be useful for analysing heterogeneous computing systems, for example to work out the effect that losing (or gaining) a particular piece of hardware, or some sub-network of a grid, will have. Static scheduling techniques can also be used to evaluate the performance of a dynamic scheduling system after it has run, to check how effectively the system is using the resources available. The Ant Colony Optimization (ACO) meta-heuristic was first described as a technique to solve the traveling salesman problem, and was inspired by the ability of real ant colonies to efficiently organize the foraging behavior of the colony using external chemical pheromone trails as a means of communication. ACO algorithms have since been widely employed on many other combinatorial optimization problems, including several domains related to the problem at hand, such as bin packing and job-shop scheduling, but ACO had not previously been applied to finding good job schedules in an HC environment. Although various classifications of task scheduling strategies exist, depending on the type of the grid, the scheduler organization and the task scheduling strategy have to be chosen such that the resources in the grid are effectively utilized. Along with the scheduling organization, state estimation is also necessary in a dynamic grid.
To achieve maximum resource utilization and high job throughput, rescheduling jobs on different resources is also sometimes required.
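A toy ACO loop for mapping independent jobs to heterogeneous machines, in the spirit of the approach described above. The expected-time-to-compute (ETC) values, parameters, and pheromone-deposit rule are all illustrative choices, not those of any specific published algorithm.

```python
import random

etc = [[4, 8], [6, 3], [5, 5]]        # etc[j][m]: expected time of job j on machine m
tau = [[1.0, 1.0] for _ in etc]       # pheromone per (job, machine) pair

def makespan(assign):
    loads = [0.0, 0.0]
    for j, m in enumerate(assign):
        loads[m] += etc[j][m]
    return max(loads)

def build_assignment(rng):
    # each ant picks machines with probability proportional to pheromone / ETC
    assign = []
    for j in range(len(etc)):
        weights = [tau[j][m] / etc[j][m] for m in range(2)]
        assign.append(rng.choices([0, 1], weights=weights)[0])
    return assign

def aco(iters=50, ants=5, rho=0.1, seed=1):
    rng = random.Random(seed)
    best, best_ms = None, float("inf")
    for _ in range(iters):
        for _ in range(ants):
            a = build_assignment(rng)
            ms = makespan(a)
            if ms < best_ms:
                best, best_ms = a, ms
        for j in range(len(etc)):     # evaporate, then reinforce the best-so-far trail
            for m in range(2):
                tau[j][m] *= (1 - rho)
            tau[j][best[j]] += 1.0 / best_ms
    return best, best_ms

print(aco())
```

The pheromone trail plays the role of the ants' chemical communication: good (job, machine) pairings accumulate trail and are sampled more often in later iterations.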
Cloud computing is conventionally used to provide infrastructure, platforms, software, and data as a service; it offers three service models: SaaS, PaaS, and IaaS. It uses central remote servers and the Internet to maintain data and applications. Many virtual machines (VMs) run concurrently in the cloud, and when a machine becomes overloaded, cloud computing dynamically transfers its load to a number of other virtual machines. Migration is the process of transferring a virtual machine from one physical machine to another in order to balance the load. Clouds provide services over both public and private networks.
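A toy illustration of the load-driven migration idea just described: when a host's total utilisation crosses a threshold, its heaviest VM is moved to the least-loaded other host. The hosts, load figures, and threshold are invented for illustration; real hypervisors apply far more elaborate policies.

```python
hosts = {"h1": {"vm1": 40, "vm2": 45}, "h2": {"vm3": 20}}  # VM -> CPU load %

def migrate_if_overloaded(hosts, threshold=80):
    for name, vms in hosts.items():
        if sum(vms.values()) > threshold:
            # pick the heaviest VM and the least-loaded target host
            vm = max(vms, key=vms.get)
            target = min((h for h in hosts if h != name),
                         key=lambda h: sum(hosts[h].values()))
            hosts[target][vm] = vms.pop(vm)
            return vm, name, target            # what moved, from where, to where
    return None

print(migrate_if_overloaded(hosts))
```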
Static TS has proven to be NP-complete, even in the homogeneous case. Therefore, research efforts in this field have mainly focused on obtaining low-complexity heuristics that produce good schedules, which is the topic of this paper. Although this problem has been studied extensively in the past, first, all the related State-of-the-Art (SotA) algorithms assume that the computation costs in the DAG are available a priori, ignoring the fact that the time needed to run/simulate all these tasks is orders of magnitude higher than the time needed to find a good-quality schedule; this is because the number of simulations/runs required is very large, especially for heterogeneous systems, where different execution times occur on different processors. Second, SotA TS heuristics consider application tasks as single-threaded implementations only, whereas in practice application tasks are normally split into multiple threads.
ABSTRACT: Scheduling is a process that improves the performance of parallel and distributed systems. Multiprocessors are powerful computing platforms for real-time applications, and their high performance rests on parallel and distributed processing. Scheduling tasks on homogeneous and heterogeneous multiprocessor systems is an important problem in computing because it is NP-hard. The execution times of the individual tasks in a network are specified in a vector and refer only to computation time. The main aim of scheduling is to increase performance by reducing the processing time of the program. In this process a rank is computed for every node, starting from the exit node. An advanced task scheduling scheme for heterogeneous multiprocessors using the P-HEFT algorithm is proposed to minimize execution time and to increase processor utilization and load balancing for larger numbers of tasks.
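The rank computation the abstract refers to, starting from the exit node, can be sketched as the standard upward rank: a task's average cost plus the most expensive path to an exit. The weights below are illustrative, not from the paper.

```python
w_avg = {"t1": 10, "t2": 8, "t3": 8, "t4": 6}      # mean cost across processors
succ = {"t1": ["t2", "t3"], "t2": ["t4"], "t3": ["t4"], "t4": []}
comm = {("t1", "t2"): 5, ("t1", "t3"): 7, ("t2", "t4"): 4, ("t3", "t4"): 2}

def rank_u(t, memo={}):
    # exit node gets just its own cost (the max over no successors is 0)
    if t not in memo:
        memo[t] = w_avg[t] + max(
            (comm[(t, s)] + rank_u(s) for s in succ[t]), default=0)
    return memo[t]

ranks = {t: rank_u(t) for t in w_avg}
print(sorted(ranks, key=ranks.get, reverse=True))  # scheduling priority order
```

Sorting tasks by decreasing upward rank always yields a valid topological order, which is why list schedulers in the HEFT family use it as the task priority.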
This paper presented a new algorithm called NHEFT for scheduling task graphs onto a system of heterogeneous processors. Experimental work shows that NHEFT significantly outperforms other algorithms such as MH, DLS, LMT, and HEFT. Because of its robust performance and low running time, the NHEFT algorithm is a viable solution for the DAG scheduling problem on heterogeneous systems. In future work, NHEFT can be extended to reschedule tasks in response to changes in processor and network loads. Although the given algorithm assumes a fully connected network, it can also be extended to arbitrarily connected networks.
Diverse portions of an application task often require different types of computation. In general, it is impossible for a single machine architecture with its associated compiler, operating system, and programming tools to satisfy all the computational requirements of such an application equally well. Recent developments in high-speed digital communication have made it possible to connect a distributed suite of different high-performance machines to provide a powerful computing platform called a Heterogeneous Distributed Computing System (HeDCS). This platform is used to execute computationally intensive applications that have diverse computing requirements. However, the performance of parallel applications on such systems depends heavily on how the application tasks are scheduled onto these machines [1, 2]. Task scheduling [3, 4] is of vital importance in a HeDCS, since a poor task-scheduling algorithm can undo any potential gains from the parallelism present in the application. In general, the objective of task scheduling is to minimize the completion time of a parallel application by properly mapping its tasks to the processors. There are typically two categories of scheduling models: static and dynamic scheduling. In the static case, all information regarding the application and the computing resources, such as execution times, communication costs, data dependencies, and synchronization requirements, is assumed to be available a priori, and scheduling is performed before the actual execution of the application [5, 6, 7]. On the other hand,
For unrelated machines, where the infrastructure is heterogeneous, it has been argued that congestion does not take place, since the load of a given task differs across machines. This model does not hold in a shared-infrastructure setting, where a job is completed if and only if all of its tasks have been completed. On this basis, unrelated machines cannot be treated any differently from related machines. A shared resource infrastructure breaks down into its component resources, which must work together to complete a set of tasks for a user. In short, there is an intrinsic commonality between related and unrelated resources in a shared environment: each resource can contribute to the overall completion of a job. Research has also been done on truthful algorithms as they pertain to scheduling tasks [44, 45]. In shared environments, where there is little monetary incentive to be truthful about requirements, these algorithms do not work. In task-based systems it is also very difficult to calculate exact resource requirements, as the tasks are each short in duration.
In this paper, a new Highest Communicated Path of Task (HCPT) algorithm is presented for heterogeneous distributed computing systems (HDCS). The algorithm assigns a priority to each task based on its rank value. According to the simulation results, the HCPT algorithm outperforms the ECTS, PHTS, and CPOP algorithms in terms of schedule length, speedup, and efficiency, with improvement ratios of 16.5%, 15.85%, and 16.4%, respectively. As future work, the HCPT algorithm can be tested on real applications and its efficiency further improved. Task duplication can also be added to increase the efficiency of the algorithm, and HCPT can be extended to apply to directed cyclic graphs.
Static load balancing policies are generally based on information about the average behavior of the system; transfer decisions are independent of the actual current system state. Static load balancing schemes use a priori knowledge of the applications and statistical information about the system. In static load balancing, the performance of the processors is determined at the beginning of execution; the master processor then assigns the workload according to that performance. The slave processors compute their allocated work and submit their results to the master. A task is always executed on the processor to which it is assigned; that is, static load balancing methods are non-preemptive. The goal of static load balancing is to reduce the execution time while minimizing communication delays.
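The master's one-time, performance-proportional split described above can be sketched as follows. The slave speeds and workload size are made-up numbers; the point is that the division happens once, before execution, and is never revised.

```python
def static_split(total_units, speeds):
    # proportional share per slave, rounded down; the leftover units go
    # to the fastest slave so that every unit of work is assigned
    total_speed = sum(speeds)
    shares = [total_units * s // total_speed for s in speeds]
    shares[speeds.index(max(speeds))] += total_units - sum(shares)
    return shares

# three slave processors with relative measured speeds 10 : 20 : 50
print(static_split(100, [10, 20, 50]))
```

Because the assignment never reacts to the actual run-time state, a mis-measured speed simply leaves one slave finishing late, which is the classic weakness of static schemes.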
Gan et al.  proposed a genetic simulated annealing based algorithm for task scheduling in a cloud environment. In this algorithm they merged the features of simulated annealing with a genetic algorithm, taking QoS parameters into consideration to efficiently access and allocate resources in the cloud. They use the basic steps of a genetic algorithm (initial population creation, crossover, and mutation) and feed the result of this process to an annealing module to obtain an optimal resource allocation. In cloud computing, two types of scheduling are performed: task scheduling and resource scheduling. Diptangshu et al.  proposed a simulated annealing based algorithm for resource management that uses multiple parameters to obtain an optimal result. The algorithm is mainly designed for multilayer placement in the cloud environment, and it gives better results than the commonly used FCFS algorithm. It yields near-optimal solutions by expressing the constraints as hard and soft, which ensures that the cost is reduced in every iteration.
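A hedged sketch of the annealing step that such hybrids hand the GA's result to: a worse neighbouring solution is accepted with probability exp(-delta/T), and the temperature T cools each iteration. The cost table, neighbour move, and parameters are synthetic, not taken from either cited algorithm.

```python
import math, random

def anneal(cost, neighbour, start, t0=10.0, cooling=0.95, steps=200, seed=3):
    rng = random.Random(seed)
    current, t = start, t0
    for _ in range(steps):
        cand = neighbour(current, rng)
        delta = cost(cand) - cost(current)
        # always accept improvements; accept worsenings with prob exp(-delta/T)
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current = cand
        t *= cooling                      # geometric cooling schedule
    return current

# toy assignment problem: each of 3 tasks goes to machine 0 or 1
table = [[4, 9], [7, 2], [3, 8]]          # table[task][machine] = cost
cost = lambda a: sum(table[i][a[i]] for i in range(len(a)))

def flip(a, rng):
    # neighbour move: flip one task's machine choice
    i = rng.randrange(len(a))
    return a[:i] + [a[i] ^ 1] + a[i + 1:]

best = anneal(cost, flip, [1, 0, 1])
print(best, cost(best))
```

Early on, high T lets the search escape local minima; as T shrinks, the loop degenerates into greedy improvement, which is why a GA's population result is a natural starting point.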
Figure 4 shows the normalized solutions obtained by the parallel and sequential algorithms for bigram word clustering. Time τ indicates the point at which the solution obtained by the parallel version of the program equals the solution obtained by the sequential version. At the beginning of program execution, the parallel version needs some time to initialize the computing system and transfer data. Therefore, before the time τ, which is not known a priori, the solution obtained by the sequential version is better than that obtained by the parallel version. In our experiment we wanted to determine the value of the time τ at which both solutions were close to each other. The value of τ was obtained by averaging 20 independent runs of both the parallel and sequential program versions: 130 s < τ < 160 s.
The genetic algorithm (GA), a rapidly growing area of artificial intelligence, is based on the biological concept of population generation. GAs are inspired by Darwin's theory of evolution, "survival of the fittest." A GA can also be used as a scheduling method in which tasks are assigned to resources according to a schedule that specifies which resource handles which task. In , the genetic algorithm is proposed as a heuristic method for finding an optimized solution in a large solution space. In the first step, a random population of chromosomes is initialized for a given job; the mapping of tasks to machines is saved in each chromosome, and each chromosome has a fitness value (its makespan). After the first population is produced, all chromosomes in the population are evaluated by their fitness value; in this evaluation, a mapping with a smaller makespan is better. In , another scheduling algorithm is introduced based on combined methods for scheduling tasks in the cloud. To reach maximum utilization and efficiency, this algorithm concentrates on tasks and resources simultaneously in order to reach a global optimum. In this method, independent, divisible tasks that require different amounts of computation and memory are scheduled efficiently by a genetic algorithm. The method assumes the cloud system is heterogeneous; in other words, all processing and communication resources are heterogeneous. Therefore, by considering memory limitations and the demand for efficient cloud computation, the proposed method aims to provide a GA-based scheduling method.
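The GA cycle described above (population, fitness, selection, crossover, mutation) can be sketched compactly. A chromosome here is a task-to-machine mapping and fitness is the makespan to be minimised; the ETC values and GA parameters are illustrative, not taken from the cited works.

```python
import random

etc = [[4, 8], [6, 3], [5, 5], [2, 7]]       # etc[task][machine]

def makespan(chrom):
    loads = [0.0, 0.0]
    for t, m in enumerate(chrom):
        loads[m] += etc[t][m]
    return max(loads)

def ga(pop_size=20, gens=40, mut=0.1, seed=7):
    rng = random.Random(seed)
    # initial population: random task-to-machine mappings
    pop = [[rng.randrange(2) for _ in etc] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=makespan)               # smaller makespan = fitter
        survivors = pop[:pop_size // 2]      # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(etc)) # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < mut:           # point mutation: flip one gene
                i = rng.randrange(len(etc))
                child[i] ^= 1
            children.append(child)
        pop = survivors + children
    return min(pop, key=makespan)

best = ga()
print(best, makespan(best))
```

Feeding `best` into an annealing refinement stage would give the GA/SA hybrid discussed earlier in the section.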
of feasible solutions after the user defines the deadline and budget constraints. The main objective is to let the user choose the best solution from multiple possible solutions. In , Sakellariou et al. implement an algorithm to schedule DAGs on heterogeneous machines under budget constraints. The algorithm computes a weight value for each task and machine via two approaches and then uses the weights to assign tasks to machines in consideration of the cost and budget. Using concepts from game theory and sequential cooperative games, Duan et al. provide two algorithms (game-quick and game-cost) to optimize performance and cost in . In addition, they design and implement a novel system model with better controllability and predictability for multi-workflow optimization problems in a grid environment. Chen et al. proposed an ant colony optimization algorithm to schedule large-scale workflows with three QoS parameters in a grid computing scenario in . The algorithm lets users specify two of the constraints and finds an optimized solution for the third while meeting them. These various algorithms are designed for users in a distributed environment, and the cost is usually based on all services used.
Although cloud computing frees the user from the overhead cost of hardware, some cost factors are still involved; these are comparatively very low, since users are charged according to the services they request. For example, if a user requests a task, the cost is charged according to the resources required to accomplish the task, the time of acquisition, the turnaround time, the I/O cost, the cost of resources, etc. Since each task is different from every other, the cost of each individual task must be computed uniquely when it is requested; different tasks result in different cost factors.
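A toy per-task cost model along these lines: each task is charged separately from the resources it consumed, the time it was held, and its I/O. The rates and the sample tasks are invented for illustration and do not reflect any provider's pricing.

```python
rates = {"cpu_hour": 0.05, "gb_hour": 0.01, "io_gb": 0.02}  # hypothetical unit prices

def task_cost(cpu_hours, mem_gb_hours, io_gb):
    # each cost factor is metered and priced independently, then summed
    return (cpu_hours * rates["cpu_hour"]
            + mem_gb_hours * rates["gb_hour"]
            + io_gb * rates["io_gb"])

# two different tasks yield two different charges
print(round(task_cost(10, 4, 50), 2))
print(round(task_cost(2, 1, 5), 2))
```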
Rajiv Ranjan and Rajkumar Buyya address the hosting of Internet-based application services, which have different composition, configuration, and deployment requirements. Their simulation framework has the following novel features: (i) support for modelling and instantiation of large-scale Cloud computing infrastructure, including data centers, on a single physical computing node and Java virtual machine; (ii) a self-contained platform for modelling data centers, service brokers, scheduling, and allocation policies; and (iii) a virtualization engine, which aids in the creation and management of multiple, independent, co-hosted virtualized services on a data center node.