Generic Grid Computing Tools for Resource and Project Management

Erik Elmroth

Dept. of Computing Science & HPC2N

Umeå University, Sweden

Overall objectives

• Short term:
– Generic infrastructure components for resource & project management
– Interoperable, standards-based
• Long term:
– Grid-enabled & Grid-enabling tools for scientific computing

Grid Projects Overview

Generic Grid Computing Research – Multiproject

• Job submission and resource brokering
– Standards-based, cross-middleware (ARC, LCG2, GT4)
• SweGrid Accounting System (SGAS) (with KTH, Sthlm)
– Included in Globus Toolkit 4
• Grid-wide fairshare scheduling
– Hierarchical three-party QoS support (user, resource-owner, VO-authority)
• Grid interface-generation for numerical software libraries
– SLICOT-interfaces for NetSolve and web-portals
• High-level data re-replication systems (new)

Resource & project portal for SNIC

– HPC2N (coordinator), NSC, PDC
– Portal interface and functionality
– SNIC-wide database & security solutions

An Interoperable, Standards-based Grid Resource Broker and Job Submission Service

Contributions - Summary

• Web Service (GT4) based job submission service (JSS) and Grid resource broker
– Decentralized broker not assuming global control
• Based on existing and emerging Grid standards
– JSDL, GLUE, WSAG, WSRF
– Exchangeable modules
– Replaceable resource selection algorithms
• Interoperable with multiple Grid middlewares
• Supports
– advance reservations
– benchmark-based estimation of job duration

JSS Architecture Overview

[Figure: JSS architecture diagram with numbered interaction steps. Labels: GT4 client, ARC client, LCG2 client, Job Submission Module, GT4 resource, ARC resource, LCG2 resource.]

Middleware Integration Points (cont.)

[Figure: middleware integration points, numbered 1 to 3.]

Resource Selection Algorithms

• Earliest job completion = shortest Total Time to Delivery (TTD)

TTD part: How to predict?
– File stage in: network bandwidth / user estimation
– Wait for resource access: adv. reservation / load prediction
– Application execution: benchmarks / user estimation
– File stage out: network bandwidth / user estimation

• Earliest possible job start
– File stage in and wait for access (same predictions as above)
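The two selection criteria above can be sketched as follows. This is a minimal illustration, not the JSS implementation: the `Resource` fields, the helper names, and the choice to add the four TTD parts sequentially are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical per-resource inputs to the estimates (not a JSS type)."""
    name: str
    bandwidth_mb_s: float    # network bandwidth between client and resource
    wait_s: float            # predicted wait: reservation start or load prediction
    benchmark_speed: float   # benchmark result relative to a reference machine


def stage_time(size_mb: float, bandwidth_mb_s: float) -> float:
    """Transfer-time estimate from file size and bandwidth."""
    return size_mb / bandwidth_mb_s


def ttd(r: Resource, input_mb: float, output_mb: float,
        reference_runtime_s: float) -> float:
    """Total Time to Delivery: stage in + wait + execution + stage out."""
    stage_in = stage_time(input_mb, r.bandwidth_mb_s)
    execution = reference_runtime_s / r.benchmark_speed
    stage_out = stage_time(output_mb, r.bandwidth_mb_s)
    return stage_in + r.wait_s + execution + stage_out


def earliest_start(r: Resource, input_mb: float) -> float:
    """Earliest possible job start: stage in + wait for access."""
    return stage_time(input_mb, r.bandwidth_mb_s) + r.wait_s


resources = [
    Resource("fast-but-busy", bandwidth_mb_s=50.0, wait_s=3600.0, benchmark_speed=2.0),
    Resource("slow-but-idle", bandwidth_mb_s=10.0, wait_s=0.0, benchmark_speed=1.0),
]
job = dict(input_mb=500.0, output_mb=100.0, reference_runtime_s=7200.0)

# The two criteria can pick different resources for the same job.
best_completion = min(resources, key=lambda r: ttd(r, **job))
best_start = min(resources, key=lambda r: earliest_start(r, job["input_mb"]))
```

With these made-up numbers, the busy but fast resource wins on completion time, while the idle resource wins on start time.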

Performance Evaluation

• Response time, including all overhead: brokering, interaction with information services and resources
– Five runs of 200 jobs each
– One client submitting one job at a time
– Observed response time of 1.3 seconds per job
• Throughput: 40 jobs/minute (multiple clients via single JSS)

Without advance reservations:

With advance reservations:

Current and Future Work

• Integration with additional middlewares
• Extended performance evaluation
– Performance evaluation of JSS against different middlewares (ARC, GT4, LCG2)
• Add coallocation support
– Reuse main framework

Enforcing resource allocations with the SweGrid Accounting System (SGAS)

joint work with Peter Gardfjäll, UmU; Lennart Johnsson, KTH; Olle Mulmo, KTH; Thomas Sandholm, KTH

SweGrid Accounting System (SGAS)

• Decentralized resource allocation enforcement system
• SGAS performs soft real-time enforcement of allocations
– Real-time enforcement: Resources can, at the time of job submission, deny access if project quota has been used up
– Soft: enforcement is subject to local resource policies (strict enforcement not always appropriate)
• Initially addressed allocation enforcement in SweGrid
– Not restricted to SweGrid use
• Developed with an emphasis on easy integration into different Grid middleware
– Single-point-of-integration
– In SweGrid: deployed on top of NorduGrid middleware
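The difference between strict and soft enforcement can be illustrated with a small admission check. This is a sketch under assumptions: `LocalPolicy` and `admit_job` are hypothetical names, not part of the SGAS API, and a real soft policy might lower the job's priority rather than simply admitting it.

```python
from enum import Enum


class LocalPolicy(Enum):
    """Hypothetical local resource policies (names are illustrative)."""
    STRICT = "strict"  # deny jobs once the project quota is used up
    SOFT = "soft"      # still admit them, at the resource owner's discretion


def admit_job(quota_hours: float, used_hours: float, policy: LocalPolicy) -> bool:
    """Decide at job-submission time whether the resource accepts the job."""
    if used_hours < quota_hours:
        return True                      # quota left: always admit
    return policy is LocalPolicy.SOFT    # quota exhausted: local policy decides


within_quota = admit_job(100.0, 40.0, LocalPolicy.STRICT)
over_quota_strict = admit_job(100.0, 100.0, LocalPolicy.STRICT)
over_quota_soft = admit_job(100.0, 100.0, LocalPolicy.SOFT)
```

The check runs when the job is submitted, which is what makes the enforcement real-time; the policy argument is what makes it soft.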

Component interactions

1. Contact resource
2. Authenticate/authorize (delegate credentials)
3. Submit job request
4. JARM intercepts request
5. Make account reservation
6. Run job
7. Collect usage info
8. Charge project account and log usage info
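Steps 4 to 8 above follow a reserve-then-charge pattern, which can be sketched as below. `ProjectAccount` and `handle_job` are hypothetical stand-ins written for this illustration; they do not reproduce the SGAS bank or JARM interfaces, and authentication (steps 1 to 3) is left out.

```python
class ProjectAccount:
    """Hypothetical stand-in for a project's account in the accounting bank."""

    def __init__(self, allocation: float):
        self.balance = allocation  # unspent allocation, e.g. CPU hours
        self.reserved = 0.0        # holds placed for jobs that are running
        self.log = []              # usage records as (job_id, used) pairs

    def reserve(self, amount: float) -> bool:
        """Step 5: place a hold; fail if the remaining quota is too small."""
        if amount > self.balance - self.reserved:
            return False
        self.reserved += amount
        return True

    def charge(self, reserved: float, used: float, job_id: str) -> None:
        """Step 8: release the hold, charge actual usage, log the record."""
        self.reserved -= reserved
        self.balance -= used
        self.log.append((job_id, used))


def handle_job(account: ProjectAccount, job_id: str, requested: float, run) -> bool:
    """Steps 4-8 as seen from the component that intercepts the request."""
    if not account.reserve(requested):       # steps 4-5: intercept and reserve
        return False                         # denied before the job runs
    used = run()                             # steps 6-7: run job, collect usage
    account.charge(requested, used, job_id)  # step 8: charge and log
    return True


account = ProjectAccount(allocation=10.0)
first = handle_job(account, "job-1", requested=8.0, run=lambda: 6.5)
second = handle_job(account, "job-2", requested=5.0, run=lambda: 5.0)
```

Reserving before the job runs keeps concurrent jobs from overdrawing the account; charging actual usage afterwards returns any unused part of the hold.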

Project information

• Please visit us at http://www.sgas.se

– SGAS download (version 2.0 available)

– Documentation

– Publications

• Mailing list: [email protected]

• Globus Toolkit contribution

A Decentralized System for Grid-wide Fairshare Scheduling

joint work with Peter Gardfjäll, UmU

Fairshare scheduling

• (Logical) division of resource capacity

– Users granted target shares

– Entitled portion of delivered utilization

• Scheduler adjusts job priority according to job owners' past usage

– job priority := f(target share, job submitter's historical usage)

– History decay to increase impact of recent usage

• Goal: fairness over time

• We apply fairshare scheduling on a Grid-wide scale

– Share policies that (logically) divide aggregate Grid capacity

– Locally (on a resource) & globally (Grid-wide)
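One possible shape for the priority function f and the history decay is sketched below. The slides do not specify f, so the exponential half-life decay and the "target share minus decayed usage share" formula are assumptions chosen for illustration, as are all function and parameter names.

```python
import math


def decayed_usage(records, now, half_life_s):
    """Sum usage with exponential decay, so recent usage weighs more.

    records: iterable of (timestamp_s, cpu_seconds) pairs.
    """
    rate = math.log(2) / half_life_s
    return sum(cpu * math.exp(-rate * (now - t)) for t, cpu in records)


def fairshare_priorities(target_shares, usage_records, now, half_life_s):
    """Return {user: target share - decayed usage share}; higher runs first."""
    usage = {u: decayed_usage(usage_records.get(u, []), now, half_life_s)
             for u in target_shares}
    total = sum(usage.values())
    return {u: target_shares[u] - (usage[u] / total if total else 0.0)
            for u in target_shares}


targets = {"alice": 0.5, "bob": 0.5}
records = {
    "alice": [(0, 1000.0)],      # older usage, one half-life ago
    "bob": [(86400, 1000.0)],    # the same amount of usage, but just now
}
prio = fairshare_priorities(targets, records, now=86400, half_life_s=86400)
```

Both users consumed the same amount, but the decay halves the weight of the older usage, so the user whose usage is older gets the higher priority.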

Resource allocation model – share policies

[Figure labels: resource owner; VO allocation authority; grant Grid-wide share; grant local share; subdivide share; consume share; VO user group; FSGrid (FairShareGrid system).]

• Establish and enforce share policies

– VO users are granted shares of aggregate Grid capacity

– Coordinates utilization across the Grid

[Figure annotations: control degree of contribution; QoS guarantees; control usage within group; coordinate VO utilization.]

Share policy illustration

Resource 1
– SweGrid (40%)
– NorduGrid (20%)
– Local users (40%)

SweGrid
– Physics project (30%)
– Biology project (20%)
– Chemistry project (50%)

Group 1 (50%), Group 2 (50%)

NorduGrid

Local scope

• Share policy enforcement
– Carried out locally by steering utilization towards target shares
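The percentages in the illustration are relative to the parent entry, so a leaf's share of the whole resource is the product of the shares along its path. The sketch below works this out for the part of the tree whose structure is clear from the slide; the nested-dictionary representation is an assumption, and Group 1 and Group 2 are left out because their parent entry is not recoverable here.

```python
# Policy tree as {name: (share of parent, children)}; representation is illustrative.
policy = {
    "SweGrid": (0.40, {
        "Physics project": (0.30, {}),
        "Biology project": (0.20, {}),
        "Chemistry project": (0.50, {}),
    }),
    "NorduGrid": (0.20, {}),
    "Local users": (0.40, {}),
}


def effective_shares(tree, parent_share=1.0, prefix=""):
    """Flatten a policy tree into {path: fraction of the whole resource}."""
    result = {}
    for name, (share, children) in tree.items():
        path = f"{prefix}/{name}" if prefix else name
        absolute = parent_share * share  # share relative to the root
        if children:
            result.update(effective_shares(children, absolute, path))
        else:
            result[path] = absolute
    return result


shares = effective_shares(policy)
```

For example, the Physics project is entitled to 30% of SweGrid's 40%, which is 12% of Resource 1.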

Framework components

[Figure: framework components. Labels: workload manager, scheduler, priority calculator, policy engine, resource, fairshare factor callout, local policy, local usage DB, runtime policy tree, VO-A usage data, VO-A policy provider, policy reference, job.]

Simulated Grid

• GridSim: discrete-event Grid simulation toolkit
• SweGrid-like environment (6 x 100 CPUs)
• Each resource has a cluster scheduler
– Space-shared (one job per processor)
– Non-preemptive
– Callout to determine FS priority factor for each job
– Global view on utilization data refreshed once/min
• Workload
– Each user runs a stream of single-CPU, batch jobs
• Contention for resources

1. Correctness: VO-B usage

[Figure: VO-B projects' utilization. Aggregated utilization (%) versus time (s) for P-B1 and P-B2.]

Policy tree:
– Resource: VO-A 30%, VO-B 70%
– VO-A: P-A1 50%, P-A2 30%, P-A3 20%
– VO-B: P-B1 60%, P-B2 40%
– P-B1: U-B11 55%, U-B12 30%, U-B13 15%

1. Correctness: P-B1 usage

[Figure: aggregated utilization (%) for U-B11, U-B12, and U-B13.]

3. Imbalanced workload

P-A2 and P-A3 only submit jobs to half of the resources

[Figure: two panels, "Only local usage data" and "Grid-wide usage data", each showing aggregated utilization (%) versus time (s) for P-A1, P-A2, and P-A3.]

Conclusion:
– Grid-wide usage data important for global share enforcement

4. Subgroup isolation

[Figure: two panels of aggregated utilization (%) versus time (s), one for P-B1 and P-B2 and one for U-B11, U-B12, and U-B13. Panel titles: Sibling shares; Parent shares.]

• Conclusion
– Performs subgroup isolation
– Idle share made available to (and only to) active sibling entries
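The redistribution of an idle share among siblings can be illustrated with the users under P-B1 from the experiment's policy tree. The proportional rescaling below is an assumption for the sketch; the slides state only that the idle share goes to active siblings and to no one else. The function name is hypothetical.

```python
def active_shares(sibling_shares, active):
    """Rescale target shares over the siblings that currently have work.

    Only entries under the same parent are passed in, so an idle entry's
    share never leaks outside its own group.
    """
    total = sum(share for name, share in sibling_shares.items() if name in active)
    if total == 0:
        return {}
    return {name: share / total
            for name, share in sibling_shares.items() if name in active}


users_in_p_b1 = {"U-B11": 0.55, "U-B12": 0.30, "U-B13": 0.15}

# U-B13 is idle: its 15% is split between U-B11 and U-B12 only.
rescaled = active_shares(users_in_p_b1, active={"U-B11", "U-B12"})
```

Because the rescaling happens inside P-B1, the parent's own share relative to P-B2 is unchanged, which is the isolation property named in the conclusion.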

Resource and project portal

joint work with Mats Nylén, Roger Oscarsson, UmU (additional parts jointly with PDC and NSC)

Grid Portal Development

Common easy-to-use interface to a diverse set of heterogeneous systems (Grids or specific computers)

Features (on-going work):

• Access a general Grid or individual resources
• Single sign-on
• Submit Grid/batch jobs
• Monitor/delete jobs
• Integrated information services
• View output
• Use system commands
• File transfer
• Archive/retrieve data
• Manage accounts

Recent Grid Computing Publications (2005)

• E. Elmroth, M. Nylén, and R. Oscarsson. A User-Centric Cluster and Grid Computing Portal. International Journal of Computational Science and Engineering, 2005 (accepted).
• E. Elmroth and J. Tordsson. An Interoperable Standards-based Grid Resource Broker and Job Submission Service. e-Science 2005, First IEEE Conference on e-Science and Grid Computing, IEEE Computer Society Press, USA, 2005, pp. 212-220.
• E. Elmroth and P. Gardfjäll. Design and Evaluation of a Decentralized System for Grid-wide Fairshare Scheduling. e-Science 2005, First IEEE Conference on e-Science and Grid Computing, IEEE Computer Society Press, USA, 2005, pp. 221-229.
• E. Elmroth, P. Gardfjäll, and J. Tordsson. An Advanced Grid Computing Course for Application and Infrastructure Developers. CCGrid05, IEEE Computer Society Press, USA, 2005, pp. 43-50.
• E. Elmroth and R. Skelander. Semi-automatic generation of Grid computing interfaces for numerical software libraries. State-of-the-art in Scientific Computing. Springer-Verlag, Lecture Notes in Computer Science, Vol. 3732, 2005, pp. 404-412.
• E. Elmroth, P. Gardfjäll, O. Mulmo, and T. Sandholm. An OGSA-based Bank Service for Grid Accounting Systems. State-of-the-art in Scientific Computing. Springer-Verlag, Lecture Notes in Computer Science, Vol. 3732, 2005, pp. 1051-1060.
• E. Elmroth and J. Tordsson. A Grid Resource Broker Supporting Advance Reservations and Benchmark-based Resource Selection. State-of-the-art in Scientific Computing. Springer-Verlag, Lecture Notes in Computer Science, Vol. 3732, 2005, pp. 1061-1070.
• T. Sandholm, P. Gardfjäll, E. Elmroth, L. Johnsson, and O. Mulmo. A Service-Oriented Approach to Enforce Grid Resource Allocations. (Submitted for journal publication.)