4.1.2 Resource allocation for MapReduce workflows: a basic approach
Now, consider a more complex application defined as a workflow that consists of N jobs, W = {J1, J2, ..., JN}, with a given completion time goal D. The problem is to estimate a required resource allocation (a number of map and reduce slots) that enables the workflow W to be completed within the (soft) deadline D.
First of all, there are multiple possible resource allocations that could lead to the desired performance goal. We could pick a set of intermediate completion times Di for each job Ji in the set W = {J1, J2, ..., JN} such that D1 + D2 + ... + DN ≤ D, and then determine the number of map and reduce slots required for each job Ji to finish its processing within Di. However, such a solution would be difficult for the scheduler to implement and manage. When each job in a DAG requires a different allocation of map and reduce slots, it is difficult to reserve the required resources and guarantee their timely availability. A simpler and more elegant solution is to determine a specially tailored resource allocation of map and reduce slots $(S_M^W, S_R^W)$ to be allocated to the entire workflow W (i.e., to each of its jobs Ji, 1 ≤ i ≤ N) such that W finishes within the given deadline D. We call this the basic resource allocation approach.
There are a few design choices for determining the required resource allocation for a given MapReduce workflow. These choices are driven by the bound-based performance models designed in Chapter 3.2.2:
• Determine the resource allocation when deadline D is targeted as a lower bound of the workflow completion time. Typically, this leads to the least amount of resources being allocated to the workflow for finishing within deadline D. The lower bound on the completion time corresponds to an "ideal" computation under the allocated resources and is rarely achievable in real environments.
• Determine the resource allocation when deadline D is targeted as an upper bound of the workflow completion time. This leads to a more aggressive resource allocation and might result in a workflow completion time that is much smaller (better) than D, because worst-case scenarios are also rare in production settings.
• Finally, determine the resource allocation when deadline D is targeted as the average between the lower and upper bounds on the workflow completion time. This solution provides a balanced resource allocation that comes closer to achieving the workflow completion time D.
For example, when D is targeted as a lower bound of the workflow completion time, we need to solve the following equation for an appropriate pair $(S_M^W, S_R^W)$ of map and reduce slots:

$$\sum_{1 \le i \le N} T_{J_i}^{low}(S_M^W, S_R^W) = D \qquad (4.3)$$
By using the method of Lagrange multipliers as described in [66], we determine the minimum amount of resources (i.e., a pair of map and reduce slots $(S_M^W, S_R^W)$ that results in the minimum sum of the map and reduce slots) that needs to be allocated to W for completing within the given deadline D.
4.1. Deadline-driven resource allocation on shared Hadoop cluster
The solution when D is targeted as an upper bound or as the average between the lower and upper bounds of the workflow completion time can be found in a similar way.
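As a rough illustration of the Lagrange step, the sketch below assumes that each job's bound has the form a_i/S_M + b_i/S_R + c_i, so that the workflow bound becomes A/S_M + B/S_R + C with A, B and C summed over the jobs. This functional form, the `JobBound` type, and the coefficient values are assumptions made for the example; the actual bounds are those of Chapter 3.2.2 and [66]. Under this assumption, minimizing S_M + S_R subject to A/S_M + B/S_R + C = D yields S_M = sqrt(A)(sqrt(A) + sqrt(B))/(D - C) and S_R = sqrt(B)(sqrt(A) + sqrt(B))/(D - C).

```python
import math
from dataclasses import dataclass


@dataclass
class JobBound:
    """Hypothetical coefficients of one job's bound a/S_M + b/S_R + c."""
    a: float  # work that scales with the number of map slots
    b: float  # work that scales with the number of reduce slots
    c: float  # part of the bound that does not depend on the allocation


def workflow_time(jobs, s_map, s_red):
    """Sum of the per-job bounds for one shared allocation (assumed form)."""
    return sum(j.a / s_map + j.b / s_red + j.c for j in jobs)


def min_slots_for_deadline(jobs, deadline):
    """Real-valued (S_M, S_R) minimizing S_M + S_R with workflow_time == deadline.

    Closed form obtained with a Lagrange multiplier under the assumed
    bound form; it expects A > 0 and B > 0.
    """
    A = sum(j.a for j in jobs)
    B = sum(j.b for j in jobs)
    C = sum(j.c for j in jobs)
    budget = deadline - C
    if budget <= 0:
        raise ValueError("deadline cannot be met with any allocation")
    root_sum = math.sqrt(A) + math.sqrt(B)
    return math.sqrt(A) * root_sum / budget, math.sqrt(B) * root_sum / budget


if __name__ == "__main__":
    # Made-up coefficients for a two-job workflow, not measured values.
    jobs = [JobBound(300.0, 60.0, 10.0), JobBound(100.0, 40.0, 10.0)]
    s_map, s_red = min_slots_for_deadline(jobs, deadline=50.0)
    print(s_map, s_red, workflow_time(jobs, s_map, s_red))
```

In practice the real-valued result would be rounded up to whole slots. If the upper or average bound has the same form with different coefficients, the same function can be reused with those coefficients.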
Evaluating the basic approach in supporting deadline-driven applications
We evaluate the accuracy of the basic approach in estimating the appropriate resource allocation for a MapReduce workflow with a completion time requirement, using the same PigMix benchmark, TPC-H, and proxy query set described in Chapter 3.3.2, as well as the testing workloads. The experiments are performed on the same testbed described in Chapter 3.1.4.
We first evaluate the approach with the PigMix benchmark. In this set of experiments, let T denote the completion time of a program when it is processed with the maximum available cluster resources (i.e., when the entire cluster is used for processing). We set D = 3·T as the completion time goal. Using the Lagrange multipliers approach (described in Chapter 4.1.2), we compute the required resource allocation, i.e., a fraction of cluster resources: a tailored number of map and reduce slots that allows the workflow to be completed within deadline D. As discussed in Chapter 4.1.2, we can compute a resource allocation when D is targeted as either a lower bound, an upper bound, or the average of the lower and upper bounds on the completion time. Figure 4.2 shows the measured workflow completion times based on these three different resource allocations. As in our earlier results, for presentation purposes, we normalize the achieved completion times with respect to deadline D.
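The deadline setting and the normalization used in these experiments amount to simple arithmetic, sketched below. The function names and the numbers in the usage part are hypothetical and chosen only for illustration; they are not measurements from the experiments.

```python
def deadline_from_full_cluster(t_full, factor=3.0):
    """Deadline D = factor * T, where T is the full-cluster completion time."""
    return factor * t_full


def normalized_completion(measured, deadline):
    """Measured completion time relative to the deadline (1.0 = exactly on time)."""
    return measured / deadline


def deviation_pct(measured, deadline):
    """Positive: deadline missed by this percentage; negative: finished early."""
    return (measured - deadline) / deadline * 100.0


if __name__ == "__main__":
    # Made-up values: a program that takes 100 s on the whole cluster.
    d = deadline_from_full_cluster(100.0)
    for measured in (360.0, 300.0, 240.0):
        print(normalized_completion(measured, d), deviation_pct(measured, d))
```

With this convention, a normalized value above 1 means the deadline was missed, and a value below 1 means the program finished early, which is a sign of over-provisioning when the gap is large.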
In most cases, the resource allocation that targets D as a lower bound is insufficient for meeting the targeted deadline (e.g., the L17 program misses the deadline by more than 20%). However, when we compute the resource allocation based on D as an upper bound, we are always able to meet the required deadline, but in most cases we over-provision resources, e.g., L16 and L17 finish more than 20% earlier than the given deadline. The resource allocations based on the average between the lower and upper bounds result in the completion times closest to the targeted deadlines.

[Plot: normalized program completion time for PigMix programs L1 to L17 under Tlow-based, Tavg-based, and Tup-based allocations, with the deadline marked.]
Figure 4.2: PigMix executed with the estimated resources: do we meet deadlines?
The basic approach proves to be effective for the MapReduce workflows generated from the PigMix benchmark. However, as most of the queries from PigMix are compiled into sequential MapReduce workflows, it is not clear whether the proposed approach works well for workflows with concurrent jobs. To understand the performance of the approach for such workflows, we perform similar experiments on the two other workloads: the TPC-H and Proxy query sets.
Figure 4.3 presents the results for these two workloads. While each of the three considered resource allocations meets the desired deadline, we observe that the basic approach is inaccurate for programs with concurrent jobs. There is significant resource over-provisioning: the considered workflows finish much earlier (up to 50% earlier) than the targeted deadlines.
[Plots: normalized program completion time under Tlow-based, Tavg-based, and Tup-based allocations, with the deadline marked. (a) TPC-H: Q5, Q8, Q10. (b) Proxy Queries: Q1, Q2, Q3.]
Figure 4.3: TPC-H/Proxy queries with the estimated resources: do we meet deadlines?

In summary, while the basic approach produces good results for workflows with sequential MapReduce jobs, it over-estimates the completion time of workflows with concurrent jobs and therefore leads to over-provisioned resource allocations for them. The inaccuracy also comes from the execution overlaps among the concurrent jobs: as we discussed in Chapter 3.3.3, the pipelined execution of concurrent jobs in workflow W may significantly reduce the program completion time. Therefore, W may need to be assigned a smaller amount of resources for meeting the same deadline D.
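To make the effect of execution overlap concrete, the toy sketch below compares the sum of per-job times, which is what the basic approach adds up, with a simple pipelined schedule. It is an illustration under strong assumptions and not the model of Chapter 3.3.3: each job is a hypothetical (map time, reduce time) pair, the jobs are independent, every stage uses all slots of its type, and a job's reduce stage starts only after its own map stage and the previous job's reduce stage have finished.

```python
def sequential_time(jobs):
    """Completion time when jobs run strictly one after another."""
    return sum(map_t + reduce_t for map_t, reduce_t in jobs)


def pipelined_time(jobs):
    """Completion time when a job's map stage overlaps the previous job's reduce stage."""
    map_free = 0.0      # time at which the map slots become free
    reduce_free = 0.0   # time at which the reduce slots become free
    for map_t, reduce_t in jobs:
        map_free += map_t  # map stages run back to back on the map slots
        # The reduce stage waits for both its own map stage and the reduce slots.
        reduce_free = max(reduce_free, map_free) + reduce_t
    return reduce_free


if __name__ == "__main__":
    # Made-up stage durations for two concurrent jobs.
    jobs = [(10.0, 20.0), (15.0, 5.0)]
    print(sequential_time(jobs), pipelined_time(jobs))
```

Because the pipelined time is never larger than the sequential sum in this toy model, an allocation sized from the sum will tend to finish ahead of the deadline, which matches the over-provisioning observed for workflows with concurrent jobs.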
In the following part, we first present an important observation: the execution order of the concurrent jobs can significantly affect the workflow completion time. Based on this observation, we propose a scheduling algorithm for optimizing the completion time. We then refine the proposed approach for estimating the resource allocation that such optimized MapReduce workflows need to meet their deadlines.