During the concretization of a workflow, one essential step is the mapping of workflow tasks onto physical resources. The main goal is to achieve the minimum possible makespan for the execution of the whole workflow. As such, the characteristics of tasks (e.g. estimated runtime) and data artifacts (e.g. estimated size), as well as the availability and characteristics of physical resources, play a major role in this mapping process. Depending on the availability of resources, the resulting concrete workflow may span multiple sites (domains) of resources. Such multi-site resources may be made available to a specific workflow application through research collaboration among multiple partners or through a national/international computational infrastructure platform (e.g. XSEDE [3]). The key common attributes of such multi-site resources are the heterogeneity and dynamicity of the resources in terms of size and capability, and the lack of a centralized control mechanism.
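To make the notion of makespan under a given mapping concrete, the following is a minimal sketch and not part of the framework itself. It assumes that each site has enough resources to run all of its ready tasks at once (no queueing or contention) and that every dependency crossing a site boundary adds one flat, hypothetical `transfer_delay`. The function name and all parameter names are illustrative.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def makespan(runtime, deps, site, transfer_delay):
    """Finish time of the last task under a fixed task-to-site mapping.

    runtime:        task -> estimated runtime
    deps:           task -> set of predecessor tasks
    site:           task -> name of the site the task is mapped onto
    transfer_delay: flat delay on every cross-site dependency (assumption)
    """
    finish = {}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(deps).static_order():
        ready = 0.0
        for pred in deps.get(task, ()):
            delay = transfer_delay if site[pred] != site[task] else 0.0
            ready = max(ready, finish[pred] + delay)
        finish[task] = ready + runtime[task]
    return max(finish.values())
```

Because the sketch ignores resource contention, it captures only the cost side of spreading a workflow over several sites. A realistic estimate would also model the limited capacity of each site, which is what makes additional sites attractive in the first place.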
We propose a framework that facilitates the decentralized orchestration of workflows mapped onto multi-site resources. Under our decentralized scheme, the orchestration of the whole workflow is achieved collaboratively by local workflow managers deployed at each site. The main advantages of orchestrating a workflow through our framework are improved performance (i.e. makespan reduction) and the preservation of site autonomy. We detail these advantages as we discuss specific workflow mapping and orchestration scenarios in the next section.
3.1.1 Multi-site Workflow Mapping Scenarios
We present three concrete scenarios in which multiple sites of resources can be utilized for the mapping of a workflow. For each scenario, we first discuss the available workflow orchestration alternatives and then the advantages offered by our decentralized orchestration framework.
Workflow mapping scenario 1: In this scenario, the user has access to a multi-site computational infrastructure (e.g. XSEDE [3]). In such an environment, the user can make use of computational and data resources belonging to multiple administrative domains, in accordance with the allocations and policies granted to that user. The real merits of such platforms manifest themselves through the collaborative usage of hardware and software resources belonging to partner sites. To this end, they provide various middleware (e.g. Globus [25]) and software tools (e.g. GridFTP [42], Condor DAGMan [16]) to enable communication and cooperation among partner sites.
Users can map their workflows onto such an environment manually or automatically (e.g. with Pegasus [17, 18]), resulting in a mapping of workflow tasks that spans multiple sites. The user can then utilize an existing centralized workflow orchestration tool (e.g. Condor DAGMan [16]) to control and manage the orchestration of the whole workflow. In such a scenario, the control, monitoring, and management of all the individual tasks in the workflow reside with a single workflow orchestration tool instance located at one of the partner sites. As a result, each interaction between the orchestration tool instance and a remote partner site, made to control and manage a task, incurs additional overheads. The first is the communication delay between partner sites. The second is the communication and processing delay introduced by the various middleware and software layers that each interaction passes through.
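As a rough proxy for these overheads, one can count how many control interactions have to cross a site boundary. The sketch below is illustrative only. It assumes that a central manager exchanges a fixed, hypothetical number of messages (`per_task`, here one dispatch and one completion notice) with every task mapped away from its own site, and that cooperating local managers need one message per dependency that crosses sites. Neither assumption is taken from a specific tool.

```python
def remote_interactions_centralized(site, manager_site, per_task=2):
    """Cross-site control messages when one manager controls every task.

    site:         task -> name of the site the task is mapped onto
    manager_site: site hosting the single orchestration tool instance
    per_task:     messages per remotely controlled task (assumption)
    """
    remote_tasks = sum(1 for s in site.values() if s != manager_site)
    return per_task * remote_tasks


def remote_interactions_decentralized(deps, site):
    """Cross-site messages when each site runs its own local manager.

    Only dependencies whose two endpoints are mapped onto different
    sites require a synchronization message (assumption).
    """
    return sum(
        1
        for task, preds in deps.items()
        for pred in preds
        if site[pred] != site[task]
    )
```

For a chain a, b, c, d, e with b, c, and d mapped onto a remote site, the centralized count is 6 while the decentralized count is 2. The comparison depends on the mapping: a workflow with many cross-site dependencies and few remote tasks can narrow or reverse the gap.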
Our proposal in such a scenario is the decentralized, rather than centralized, orchestration of the workflow. Through our generic decentralization framework, we facilitate the orchestration of a workflow that spans multiple sites in a decentralized manner. The main advantage of decentralized orchestration is improved performance, achieved via significant reductions in the overheads sustained under centralized orchestration. Another advantage is the preservation of site autonomy. By appointing a local workflow manager at each site, responsible for the control and management of the tasks mapped onto its own domain, each site can employ its own set of policies rather than a global set of policies enforced by a central workflow manager. These policies include, among others, task dispatch/monitoring policies, fault-recovery policies, and data management policies.
Workflow mapping scenario 2: In this scenario, the user has access to both local computational resources and a multi-site computational infrastructure (e.g. XSEDE [3]). The user aims to utilize both sets of resources in combination for the execution of a certain workflow. In such a scenario, there is no established middleware layer or set of software tools that integrates both sets of resources seamlessly. Accordingly, there is no available workflow manager tool that can orchestrate a workflow in a centralized manner over both sets of resources.
In such a scenario, our goal is to facilitate the utilization of both sets of resources without having to employ any cumbersome proprietary mechanisms. To this end, we propose orchestrating the workflow in a decentralized manner. Following our generic decentralization framework, which operates at the application layer, a given workflow specification can be transformed into multiple, individually executable workflow specifications. The required communication/synchronization interactions among those peer workflow orchestrations can be provided independently of any other system components, via basic standard mechanisms.
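The transformation described above can be sketched as a partitioning of the task graph by site. The following is a minimal illustration under stated assumptions, not the framework's actual algorithm: the workflow is given as a plain dependency dictionary, the task-to-site mapping is already fixed, and each cross-site dependency is recorded as a pair of hypothetical `notify` and `wait_for` entries that a local manager would realize with whatever basic mechanism is available.

```python
from collections import defaultdict


def partition_by_site(deps, site):
    """Split one workflow into per-site, individually executable parts.

    deps: task -> set of predecessor tasks
    site: task -> name of the site the task is mapped onto

    Returns site -> {
        "tasks":    task -> set of predecessors on the same site,
        "wait_for": (remote_pred, local_task) pairs to wait on,
        "notify":   (local_pred, remote_task) pairs to signal,
    }
    """
    parts = defaultdict(lambda: {"tasks": {}, "wait_for": [], "notify": []})
    for task, s in site.items():
        parts[s]["tasks"][task] = set()
    for task, preds in deps.items():
        for pred in sorted(preds):  # sorted for a deterministic result
            if site[pred] == site[task]:
                # Intra-site dependency stays inside the sub-workflow.
                parts[site[task]]["tasks"][task].add(pred)
            else:
                # Cross-site dependency becomes a synchronization pair.
                parts[site[task]]["wait_for"].append((pred, task))
                parts[site[pred]]["notify"].append((pred, task))
    return dict(parts)
```

Each returned part contains only local tasks and local dependencies, so a local workflow manager could execute it independently, blocking on its `wait_for` entries and signaling its `notify` entries as tasks complete.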
Workflow mapping scenario 3: This scenario is similar to the previous one. The user has access to multiple, independently administered sites of resources. In the previous scenario, one or more sets of resources belong to an existing multi-site infrastructure (e.g. XSEDE [3]) and thus provide the capabilities for centralized orchestration of part of the workflow; these are then combined with one or more additional sets of resources for improved performance or for cost reasons. In this scenario, there are no established mechanisms or tools among any of the sites that would facilitate the centralized orchestration of a workflow.
The solution we propose for this scenario is again the utilization of our decentralized workflow orchestration framework. Following the same approach as explained in Scenario 2, multiple independent sites of resources can be made available for the orchestration of a workflow via basic standard mechanisms.