Task and Executor Batch Pair Matching and Ranking

Figure 7.5: Four possible compositions, all with the same MinSum score of 3.

may need to choose between multiple compositions which all contain a “single” matching resource. Figure 7.2 is an example of this, where there are three possible compositions containing resources which can only be in one composition. Selecting any one of these will invalidate the other two. Again, it is beyond the scope of this work to discuss specific strategies for anything but the most basic ranking scenarios.

These different ranking strategies need to be evaluated in the context of real grid computing workloads. To date they have only been developed in theory.

7.6 Task and Executor Batch Pair Matching and Ranking

It is now possible to discuss a small real-world example of M tasks matched to N executors, each specifying characteristics, requirements, and preferences. Listing 7.4 lists the tasks, and Listing 7.5 lists the executors. The “type” modifier on the value is omitted and some common base type for each dimension is assumed. Figure 7.6 illustrates the batch pair matching, with directed dotted lines indicating one-way matches, solid lines indicating pair-wise matches, and numbers indicating the preference order of a resource for the alternative pair-wise matches it can participate in.

<image>300</image>
<exec>reco</exec>
<prefs>
<image>800</image>
<threads>6</threads>
<image>2048</image>
</chars>


Listing 7.4: A set of tasks seeking executors

<executor name="x1">
<chars>
<cost>15</cost>
<mhz>3800</mhz>
<temp>10000</temp>
<storage>500</storage>
</reqs>

Listing 7.5: A set of executors seeking tasks

Figure 7.6: An example of batched pair-wise matching, with 5 tasks and 4 executors. The directed dotted lines indicate one-way matches, solid lines indicate pair-wise matches, and numbers indicate the preference order of a resource for the alternative pair-wise matches it can participate in.

It is clear from this example that the matcher, even with the ranking information provided by preferences, will require its own arbitrary policy for selecting between t₃ and t₅, both of which have only x₄ as their possible match pair. Nonetheless, this illustrates the flexibility and generic nature of the resource descriptions and scheduling properties provided by this framework.
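The contention described above can be sketched in a few lines of Python. This is only an illustration: the resources, dimensions and values are hypothetical (they are not the contents of Listings 7.4 and 7.5), and a requirement is assumed to mean "the partner must offer at least this value" on a dimension, which is a simplification of the GRDL matching rules.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    chars: dict = field(default_factory=dict)  # dimension -> offered value
    reqs: dict = field(default_factory=dict)   # dimension -> minimum value required of the partner


def one_way(a, b):
    """True if a's characteristics satisfy every requirement of b (assumed >= semantics)."""
    return all(a.chars.get(dim, float("-inf")) >= need for dim, need in b.reqs.items())


def pair_matches(tasks, executors):
    """Pair-wise matches: each side satisfies the other's requirements."""
    return [(t.name, x.name) for t in tasks for x in executors
            if one_way(x, t) and one_way(t, x)]


def contended(pairs):
    """Executors that are the only candidate of more than one task."""
    candidates = defaultdict(list)
    for task, executor in pairs:
        candidates[task].append(executor)
    sole = defaultdict(list)
    for task, execs in candidates.items():
        if len(execs) == 1:
            sole[execs[0]].append(task)
    return {x: ts for x, ts in sole.items() if len(ts) > 1}


# Hypothetical batch: three tasks, two executors.
tasks = [
    Resource("t1", chars={"priority": 7}, reqs={"mhz": 3000}),
    Resource("t2", chars={"priority": 1}, reqs={"storage": 1000}),
    Resource("t3", chars={"priority": 2}, reqs={"storage": 2000}),
]
executors = [
    Resource("x1", chars={"mhz": 3800, "storage": 500}, reqs={"priority": 5}),
    Resource("x2", chars={"mhz": 2000, "storage": 2048}),
]

pairs = pair_matches(tasks, executors)
conflicts = contended(pairs)
```

Here `t2` and `t3` both have `x2` as their only pair-wise partner, so `conflicts` names exactly the case in which the matcher has to fall back on a policy of its own.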


7.7 Summary

This chapter has discussed one mechanism for sub-selection of compositions from a set based on a preferences model which extends the characteristics and requirements models presented in earlier chapters. It identifies the difficulties of doing this in a general way and suggests that custom heuristics may often be used in practice in a matcher in order to perform composition sub-selection and ranking. A ranking algorithm is developed, as are two variations which accommodate multilateral ranking (that is, merging preferences from all sides of a match). The motivation has been to provide a REST mechanism by which candidate compositions can be gathered from multiple sources, and then evaluated collectively to select the most appropriate. An example of this may be a task allocation Agent which fetches executor descriptions from the local system, the departmental computing cluster, the university computing cluster, and then from a selection of authorised external clusters. While all task resources managed by the Agent may match to all the executors, the preferences will provide a mechanism to identify the preferred assignment.

Having completed the description of the core features of GRDL, those being the models for characteristics, requirements, and preferences, it is now possible to consider some of the interesting features this model enables. This will be the topic of the following chapter.

Applications of the Grid Resource Description Language

This chapter discusses various applications of GRDL and possible extensions to the resource description facilities described in earlier chapters. Composition contracts and resource templates are presented, which facilitate scheduling and task management. GRDL validation based on XML Schema and DTDs is presented, as well as the mechanism for extensions to GRDL.

One of the objectives of GRDL is to provide the basis for a RESTful model of grid resources, allowing all resources within the grid infrastructure to make use of a common model for describing their state (i.e. representation). Another objective is to provide an efficient and effective mechanism for resource composition, specifically for allocating tasks to executors. These are the two main areas of discussion in this chapter; however, other important topics around realising a RESTful grid are also discussed.

The system of resource templates, described in this chapter, provides the foundation for pooling of similar resources, based on a common template. This interacts with the idea of partial matching and partial templates, and provides a significant reduction in complexity of the otherwise NP-complete matching problem, given that the degree of heterogeneity in computational grids, from a scheduling perspective, is small and “clumpy”. Tasks, in particular, follow this pattern, with groups of tasks all looking similar to each other, but inter-group variation being enormous. This results from users or VOs submitting numerous similar tasks, while different user groups have distinctly different usage patterns. On the executor side, ClassAd-style resource descriptions for the same physical hardware, which are generated each time the resource enters the “available” state, will largely appear the same. For example, over time there may be small variations in available storage space, in the number of available nodes on a cluster or, in systems with advanced reservation, in the size of the execution time slot; other properties will likely remain fixed.

The commonality of resources is captured by templates such that resource representations are then registered in those pools which are “headed” by a compatible template. Composition can then be carried out via pool templates, rather than with each individual resource. As experience with DIRAC for LHCb established, there are many cases where the number of template pools M is much smaller than the number of resources N which need to be composed (M ≪ N), thus making the matching task computationally feasible.
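The effect of pooling on the number of match tests can be illustrated with a small Python sketch. It makes several simplifying assumptions that are not taken from the text: a template is taken to be the exact description shared by all members of a pool (no partial templates), only the executor-to-task direction is tested, a requirement means "at least this value", and all names and values are hypothetical.

```python
from collections import defaultdict


def build_pools(resources):
    """Pool resources with identical descriptions; the shared description heads the pool."""
    pools = defaultdict(list)
    for name, props in resources.items():
        pools[tuple(sorted(props.items()))].append(name)
    return pools


def template_satisfies(chars, reqs):
    """Executor template characteristics satisfy task template requirements (assumed >= semantics)."""
    offered = dict(chars)
    return all(offered.get(dim, float("-inf")) >= need for dim, need in reqs)


# Hypothetical "clumpy" workload: six tasks in two groups, four executors in two groups.
tasks = {f"reco{i}": {"mhz": 2000, "storage": 300} for i in range(4)}
tasks.update({f"sim{i}": {"mhz": 3000, "storage": 800} for i in range(2)})
executors = {f"nodeA{i}": {"mhz": 3800, "storage": 500} for i in range(3)}
executors["nodeB0"] = {"mhz": 2400, "storage": 2048}

task_pools = build_pools(tasks)
exec_pools = build_pools(executors)

tests = 0
pool_matches = []
for task_template, task_members in task_pools.items():
    for exec_template, exec_members in exec_pools.items():
        tests += 1  # one test per pair of templates, not per pair of resources
        if template_satisfies(exec_template, task_template):
            pool_matches.append((task_members, exec_members))

naive_tests = len(tasks) * len(executors)
```

With these values the template comparison needs 4 tests where a resource-by-resource comparison would need 24, and every member of a matched task pool is known to be compatible with every member of the matched executor pool.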

8.1 Composition Profiles and Contracts

This section discusses a mechanism by which information concerning a particular composition of resources can be captured in a standard manner and utilised at later stages in the life cycle of the composition. In a batch system such a level of detail can be entirely implementation specific, as a single task manager will retain “live” stateful information concerning active tasks and the policies which led to a particular utilisation of a resource. Within a grid environment, however, resource compositions transition between many independent systems through their life cycle, so that information is not necessarily available to, or discernible by, Services acting on the composition “down-stream” of the stage at which the composition decision was made. This is also true from the perspective that matchers have liberty to interpret dimensions, types, values, preferences, and transformations in their own ways. Users and system administrators often wonder why a particular task has not been scheduled to a particular computing resource, particularly when that resource has free “slots”. It is also a specific criticism of Condor that it is often unclear why a particular composition (match) has or has not been made, although this also stems from the complexity of the “Rank” expressions used within ClassAds (and the recent addition of the “analyzer” mode to Condor now provides more details of the matching decision). To address this in the RESTful model, the results of an attempted composition can be reported using profiles.

This provides a richer level of detail concerning the composition than is given by a boolean result of success or failure for a given composition attempt. Profiles are formed from tuples of characteristic and resource sets from the resources involved in a composition. This requires introducing two new sets: $Chars_{match}$ and $Reqs_{unsatisfied}$. The first, $Chars_{match}$, contains those characteristics from $R_a$ which match requirements in $R_b$, as defined in Equation 8.1. This set of characteristics is possibly only a partial match for the requirements of $R_b$, and can be compared to Equation 6.11 which, by contrast, describes a complete resource match. The second, $Reqs_{unsatisfied}$, contains those requirements in $R_b$ which are unsatisfied by $R_a$ – that is, $R_a$ contains no characteristics which satisfy the subset of $R_b$’s requirements in $Reqs_{unsatisfied}$. This is defined in Equation 8.2. Described in this way, a resource match of $R_a$ to $R_b$ is successful if and only if $Reqs_{unsatisfied} = \emptyset$ (that is, there are no unsatisfied requirements so the set is empty). These relations are also described in Haskell in Appendix C.12.

$$Chars_{match} \triangleq \{\, c_i \mid c_i \in R_a.chars,\ r_j \in R_b.reqs,\ c_i \subseteq r_j \,\} \tag{8.1}$$

$$Reqs_{unsatisfied} \triangleq \{\, r_j \mid r_j \in R_b.reqs,\ \forall c_i \in R_a.chars,\ c_i \nsubseteq r_j \,\} \tag{8.2}$$

When considering a matcher implementation, the resource matching operation is likely to terminate when the first unsatisfied requirement is found, therefore the $Chars_{match}$ and $Reqs_{unsatisfied}$ sets may not be complete. From a profiling and performance perspective, it is desirable to identify the requirements most likely to be unsatisfied and to test for them first, allowing the match to be terminated with the minimum number of resource/characteristic match tests. Clearly in all cases where the resource match is successful all requirements will be evaluated. Similarly, it is valuable to identify those characteristics which are most likely to satisfy requirements and check them first, again minimising the number of resource/characteristic match tests.
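Equations 8.1 and 8.2, and the effect of checking order on an early-terminating matcher, can be sketched in Python as follows. The encoding is an assumption made for illustration only: each characteristic or requirement is modelled as a closed numeric interval on a named dimension, so that "a characteristic satisfies a requirement" becomes interval containment. The `Prop` type, the function names and the example values are hypothetical and are not the Haskell definitions of Appendix C.12.

```python
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class Prop:
    """A characteristic or requirement: a closed interval on one dimension (assumed encoding)."""
    dim: str
    lo: float
    hi: float


def satisfies(c, r):
    """c is contained in r: same dimension and c's interval lies inside r's."""
    return c.dim == r.dim and r.lo <= c.lo and c.hi <= r.hi


def chars_match(chars_a, reqs_b):
    """Equation 8.1: characteristics of R_a that satisfy some requirement of R_b."""
    return [c for c in chars_a if any(satisfies(c, r) for r in reqs_b)]


def reqs_unsatisfied(chars_a, reqs_b):
    """Equation 8.2: requirements of R_b satisfied by no characteristic of R_a."""
    return [r for r in reqs_b if not any(satisfies(c, r) for c in chars_a)]


def match_early_exit(chars_a, reqs_b):
    """Stop at the first unsatisfied requirement; return (success, requirements tested)."""
    for tested, r in enumerate(reqs_b, start=1):
        if not any(satisfies(c, r) for c in chars_a):
            return False, tested
    return True, len(reqs_b)


# Hypothetical executor characteristics (R_a) and task requirements (R_b).
executor_chars = [Prop("mhz", 3800, 3800), Prop("storage", 500, 500)]
task_reqs = [Prop("mhz", 2000, inf), Prop("storage", 800, inf)]

matched = chars_match(executor_chars, task_reqs)
unsatisfied = reqs_unsatisfied(executor_chars, task_reqs)
success = not unsatisfied  # a match succeeds iff no requirement is left unsatisfied

# Testing the requirement most likely to fail first shortens a failing match.
as_listed = match_early_exit(executor_chars, task_reqs)
storage_first = match_early_exit(executor_chars, list(reversed(task_reqs)))
```

In this example the storage requirement is the one that fails, so testing it first ends the match after one requirement rather than two. The early-exit variant never builds the complete sets, which is the incompleteness noted above.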

With these two new sets, in addition to standard resource characteristic and requirement sets, it is possible to define three types of profiles:

$P_{minimal}$ (minimal profile): Characteristics which satisfied requirements in a successful composition;

$P_{full}$ (full profile): Characteristics and requirements of all resources participating in the composition;

$P_{failure}$ (failure profile): Requirements which were unsatisfied in the composition.

Profiles only make sense when discussing compositions which are going to be or have already been realised, therefore preferences are not part of profiles.

The minimal profile $P_{minimal}$ is valuable when the composition is realised and utilised, informing resources what specific quantities of interest have been agreed, acting as a service contract. It only contains the matching characteristic sets $Chars_{match}$ for the resources in a successful composition. It is defined for a pair match in Equation 8.3.

$$P_{minimal} \triangleq (Chars_{match_a}, Chars_{match_b}) \mid match(R_a, R_b) = true \tag{8.3}$$

Full profiles provide a record of the state of resources at the time of composition. This is valuable if the resources need to change their properties, and allows the effect of the change on the validity of the composition to be checked. The full profile may also supply information regarding the resources in the composition to stages which follow resource matching and scheduling. It consists of sub-tuples of characteristics and requirements for each resource participating in the composition. It is defined in Equation 8.4 for a pair composition. In both these cases the match() function is a placeholder for the actual matching operation.

$$P_{full} \triangleq ((Chars_a, Reqs_a), (Chars_b, Reqs_b)) \mid match(R_a, R_b) = true \tag{8.4}$$

Because a matcher has the freedom to transform and test requirements and characteristics in different ways, it is possible that two different matchers may form different compositions from a set of resources, or may utilise different characteristics in satisfying the requirements of compositions. For this reason it may be desirable to produce both a minimal profile, which captures the exact characteristics a particular matcher selected to satisfy requirements, and a full profile which records the details of the resources in the composition. This is to say that a full profile does not imply exactly the same minimal profile for a composition in all cases – the minimal profile is dependent upon the behaviour of the matcher.

Failure profiles, $P_{failure}$, provide information regarding which requirements are not being satisfied, thus explaining the failure of a resource to form a particular composition. Failure profiles can also be used to optimise the requirement checking order. They are defined by Equation 8.5 for pair composition.

$$P_{failure} \triangleq (Reqs_{unsatisfied_a}, Reqs_{unsatisfied_b}) \mid match(R_a, R_b) = false \tag{8.5}$$
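The three profile types of Equations 8.3 to 8.5 can be sketched in Python for a pair composition. As before, this is an illustration under stated assumptions rather than the matcher described in this work: properties are modelled as closed intervals, the subscripts a and b are read as "belonging to $R_a$" and "belonging to $R_b$", and every name and value below is hypothetical. The last function shows one possible way of deriving a requirement checking order from collected failure profiles.

```python
from collections import Counter
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class Prop:
    """A characteristic or requirement as a closed interval on one dimension (assumed encoding)."""
    dim: str
    lo: float
    hi: float


@dataclass(frozen=True)
class Resource:
    name: str
    chars: tuple
    reqs: tuple


def satisfies(c, r):
    return c.dim == r.dim and r.lo <= c.lo and c.hi <= r.hi


def chars_match(a, b):
    """Characteristics of a which satisfy some requirement of b."""
    return tuple(c for c in a.chars if any(satisfies(c, r) for r in b.reqs))


def reqs_unsatisfied(a, b):
    """Requirements of b which no characteristic of a satisfies."""
    return tuple(r for r in b.reqs if not any(satisfies(c, r) for c in a.chars))


def profile(a, b):
    """Return (kind, profile) for an attempted pair composition of a and b."""
    unsat_a = reqs_unsatisfied(b, a)  # a's requirements that b fails to satisfy
    unsat_b = reqs_unsatisfied(a, b)  # b's requirements that a fails to satisfy
    if unsat_a or unsat_b:
        return "failure", (unsat_a, unsat_b)              # cf. Equation 8.5
    minimal = (chars_match(a, b), chars_match(b, a))      # cf. Equation 8.3
    full = ((a.chars, a.reqs), (b.chars, b.reqs))         # cf. Equation 8.4
    return "success", {"minimal": minimal, "full": full}


def failure_order(failure_profiles):
    """Dimensions ordered by how often they appear unsatisfied, most frequent first."""
    counts = Counter(r.dim for unsat_a, unsat_b in failure_profiles
                     for r in unsat_a + unsat_b)
    return [dim for dim, _ in counts.most_common()]


# Hypothetical task and executors.
task = Resource("t1", chars=(Prop("priority", 7, 7),),
                reqs=(Prop("mhz", 2000, inf), Prop("storage", 800, inf)))
x1 = Resource("x1", chars=(Prop("mhz", 3800, 3800), Prop("storage", 500, 500)),
              reqs=(Prop("priority", 5, inf),))
x2 = Resource("x2", chars=(Prop("mhz", 2400, 2400), Prop("storage", 2048, 2048)),
              reqs=(Prop("priority", 5, inf),))

kind1, failed = profile(task, x1)
kind2, agreed = profile(task, x2)
check_first = failure_order([failed])
```

The failure profile for `t1` and `x1` records the storage requirement as the reason the composition was not formed, which is the kind of answer a user asking "why was my task not scheduled there?" needs. The minimal profile for `t1` and `x2` lists only the characteristics each side relied on.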

8.2 Matching Transitivity and Templates for

In document A REST Model for High Throughput Scheduling in Computational Grids (Page 169-178)