Generic Grid Computing Tools
for Resource and Project Management
Erik Elmroth
Dept. of Computing Science & HPC2N
Umeå University, Sweden
Overall objectives
• Short term:
– Generic infrastructure components for resource & project management
– Interoperable, standards-based
• Long term:
– Grid-enabled & Grid-enabling tools for scientific computing
Grid Projects Overview
Generic Grid Computing Research – Multiproject
• Job submission and resource brokering
– Standards-based, cross-middleware (ARC, LCG2, GT4)
• SweGrid Accounting System (SGAS) (with KTH, Stockholm)
– Included in Globus Toolkit 4
• Grid-wide fairshare scheduling
– Hierarchical three-party QoS support (user, resource-owner, VO-authority)
• Grid interface-generation for numerical software libraries
– SLICOT-interfaces for NetSolve and web-portals
• High-level data re-replication systems (new)
Resource & project portal for SNIC
– HPC2N (coordinator), NSC, PDC
– Portal interface and functionality
– SNIC-wide database & security solutions
An Interoperable, Standards-based
Grid Resource Broker and Job
Submission Service
Contributions - Summary
• Web Service (GT4) based job submission service (JSS) and Grid resource broker
– Decentralized broker not assuming global control
• Based on existing and emerging Grid standards
– JSDL, GLUE, WSAG, WSRF
– Exchangeable modules
– Replaceable resource selection algorithms
• Interoperable with multiple Grid middlewares
• Supports
– advance reservations
– benchmark-based estimation of job duration
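JSS Architecture Overview
[Figure: architecture diagram with the labels GT4 client, ARC client, LCG2 client, Job Submission Module, GT4 resource, ARC resource and LCG2 resource, plus numbered interaction steps.]
Middleware Integration Points (cont.)
[Figure: numbered middleware integration points; labels not recoverable.]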
Resource Selection Algorithms
• Earliest job completion = shortest Total Time to Delivery (TTD)
TTD part: how to predict?
– File stage in: network bandwidth / user estimation
– Wait for resource access: adv. reservation / load prediction
– Application execution: benchmarks / user estimation
– File stage out: network bandwidth / user estimation
• Earliest possible job start
– File stage in and wait for access (same predictions as above)
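The two selection criteria above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the broker's code: the `Resource` and `Job` fields, the bandwidth-based transfer estimate, the benchmark scaling rule, and the choice to add stage-in and wait time sequentially are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical per-resource predictions; field names are illustrative."""
    name: str
    bandwidth_mb_s: float   # predicted network bandwidth to the resource
    wait_s: float           # predicted wait (advance reservation or load prediction)
    benchmark_score: float  # relative speed; 1.0 = reference machine


@dataclass
class Job:
    input_mb: float
    output_mb: float
    reference_runtime_s: float  # runtime on the reference machine (user estimate)


def start_time(job: Job, r: Resource) -> float:
    """Earliest possible job start: file stage in plus wait for access."""
    return job.input_mb / r.bandwidth_mb_s + r.wait_s


def total_time_to_delivery(job: Job, r: Resource) -> float:
    """TTD: stage in + wait + execution + stage out."""
    execution = job.reference_runtime_s / r.benchmark_score
    stage_out = job.output_mb / r.bandwidth_mb_s
    return start_time(job, r) + execution + stage_out


def select(job, resources, objective=total_time_to_delivery):
    """Pick the resource that minimizes the chosen objective."""
    return min(resources, key=lambda r: objective(job, r))


resources = [
    Resource("fast-but-busy", bandwidth_mb_s=10.0, wait_s=3600.0, benchmark_score=2.0),
    Resource("slow-but-idle", bandwidth_mb_s=10.0, wait_s=0.0, benchmark_score=1.0),
]
job = Job(input_mb=100.0, output_mb=100.0, reference_runtime_s=36000.0)

by_ttd = select(job, resources)
by_start = select(job, resources, objective=start_time)
```

For this long-running job the two criteria disagree: the TTD objective accepts the one-hour wait on the faster machine, while the earliest-start objective picks the idle one. Passing the objective as a parameter mirrors the "replaceable resource selection algorithms" idea.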
Performance Evaluation
• Response time, including all overhead: brokering, interaction with information services and resources
– Five runs of 200 jobs each
– One client submitting one job at a time
– Observed response time of 1.3 seconds per job
• Throughput: 40 jobs/minute (multiple clients via single JSS)
Without advance reservations:
With advance reservations:
Current and Future Work
• Integration with additional middlewares
• Extended performance evaluation
– Performance evaluation of JSS against different middlewares (ARC, GT4, LCG2)
• Add coallocation support
– Reuse main framework
Enforcing resource allocations with the
SweGrid Accounting System (SGAS)
joint work with Peter Gardfjäll (UmU), Lennart Johnsson (KTH), Olle Mulmo (KTH), Thomas Sandholm (KTH)
SweGrid Accounting System (SGAS)
• Decentralized resource allocation enforcement system
• SGAS performs soft real-time enforcement of allocations
– Real-time enforcement: Resources can, at the time of job submission, deny access if project quota has been used up
– Soft: enforcement is subject to local resource policies (strict enforcement not always appropriate)
• Initially addressed allocation enforcement in SweGrid
– Not restricted to SweGrid use
• Developed with an emphasis on easy integration into different Grid middleware
– Single-point-of-integration
– In SweGrid: deployed on top of NorduGrid middleware
Component interactions
1. Contact resource
2. Authenticate/authorize (delegate credentials)
3. Submit job request
4. JARM intercepts request
5. Make account reservation
6. Run job
7. Collect usage info
8. Charge project account and log usage info
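Steps 4 to 8 above, together with the soft-enforcement idea, might look roughly like the toy model below. It is only a sketch: `Bank`, `submit` and the `strict` flag are hypothetical names invented for the illustration and do not reflect the SGAS interfaces.

```python
class QuotaExceeded(Exception):
    """Raised when a strict local policy denies a job."""


class Bank:
    """Toy in-memory project account; a stand-in, not the SGAS bank service."""

    def __init__(self, quota_hours: float):
        self.quota = quota_hours
        self.charged = 0.0
        self.reserved = 0.0
        self.log = []

    def reserve(self, hours: float) -> bool:
        """Step 5: hold part of the quota; False if the hold does not fit."""
        if self.charged + self.reserved + hours > self.quota:
            return False
        self.reserved += hours
        return True

    def charge(self, held_hours: float, used_hours: float) -> None:
        """Step 8: release the hold, charge actual usage, log it."""
        self.reserved -= held_hours
        self.charged += used_hours
        self.log.append(used_hours)


def submit(bank: Bank, requested_hours: float, run_job, strict: bool = True) -> float:
    """Steps 4-8 as seen by the interceptor on the resource.

    `strict` models the local resource policy: a strict resource denies the
    job when the quota is used up, a lenient one runs it anyway.
    """
    held = bank.reserve(requested_hours)
    if not held and strict:
        raise QuotaExceeded("project quota used up")
    used_hours = run_job()  # steps 6-7: run the job and collect usage
    bank.charge(requested_hours if held else 0.0, used_hours)
    return used_hours


bank = Bank(quota_hours=10.0)
submit(bank, 8.0, run_job=lambda: 6.0)       # fits: 6 hours charged
try:
    submit(bank, 8.0, run_job=lambda: 5.0)   # strict policy: denied
    denied = False
except QuotaExceeded:
    denied = True
submit(bank, 8.0, run_job=lambda: 5.0, strict=False)  # lenient policy: runs anyway
```

The reservation is made for the requested amount but the charge uses the measured usage, so an overestimated request does not cost the project more than it consumed.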
Project information
• Please visit us at http://www.sgas.se
– SGAS download (version 2.0 available)
– Documentation
– Publications
• Mailing list: [email protected]
• Globus Toolkit contribution
A Decentralized System for
Grid-wide Fairshare Scheduling
joint work with Peter Gardfjäll, UmU
Fairshare scheduling
• (Logical) division of resource capacity
– Users granted target shares
– Entitled portion of delivered utilization
• Scheduler adjusts job prio according to job owners' past usage
– job prio := f(target share, job submitter historical usage)
– History decay to increase impact of recent usage
• Goal: fairness over time
• We apply fairshare scheduling on a Grid-wide scale
– Share policies that (logically) divide aggregate Grid capacity
– Locally (on a resource) & globally (Grid-wide)
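The slides leave the priority function f unspecified. One possible shape, assuming exponential decay of past usage and a simple "target share minus usage share" rule (both are assumptions made for this sketch, as are all function and parameter names), is:

```python
def decayed_usage(records, now, half_life_s):
    """Sum (timestamp, cpu_seconds) records, halving their weight every half_life_s."""
    return sum(cpu * 0.5 ** ((now - t) / half_life_s) for t, cpu in records)


def priorities(target_shares, history, now, half_life_s=86400.0):
    """Return target share minus decayed usage share per user (higher runs first)."""
    usage = {
        user: decayed_usage(history.get(user, []), now, half_life_s)
        for user in target_shares
    }
    total = sum(usage.values())
    return {
        user: target_shares[user] - (usage[user] / total if total else 0.0)
        for user in target_shares
    }


# Two users with equal target shares and equal raw usage,
# but alice's usage is one half-life older than bob's.
targets = {"alice": 0.5, "bob": 0.5}
history = {"alice": [(0.0, 1000.0)], "bob": [(86400.0, 1000.0)]}
prio = priorities(targets, history, now=86400.0)
```

Because recent usage weighs more, bob ends up with the lower priority even though both users consumed the same number of CPU seconds, which is the effect of history decay described above.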
Resource allocation model – share policies
[Figure: share-granting diagram with the labels "Resource owner", "VO allocation authority", "grant Grid-wide share", "grant local share", "subdivide share", "consume share", "VO user group" and "FSGrid".]
FSGrid: FairShareGrid system
• Establish and enforce share policies
– VO users are granted shares of aggregate Grid capacity
– Coordinates utilization across the Grid
[Figure annotations: Control degree of contribution; QoS guarantees; Control usage within group; Coordinate VO utilization]
Share policy illustration
[Figure: share policy tree for Resource 1 (local scope)]
Resource 1: SweGrid (40%), NorduGrid (20%), Local users (40%)
SweGrid: Physics project (30%), Biology project (20%), Chemistry project (50%)
NorduGrid: Group 1 (50%), Group 2 (50%)
• Share policy enforcement
– Carried out locally by steering utilization towards target shares
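To see what the nested percentages in the Resource 1 illustration mean in absolute terms, each share can be multiplied down the tree. The sketch below assumes that tree structure as read from the slide and expands only the SweGrid subtree; the `policy` layout and the function name are illustrative.

```python
from fractions import Fraction

# Policy tree as name -> (percent of parent, children); structure assumed from the slide.
policy = {
    "SweGrid": (40, {
        "Physics project": (30, {}),
        "Biology project": (20, {}),
        "Chemistry project": (50, {}),
    }),
    "NorduGrid": (20, {}),
    "Local users": (40, {}),
}


def absolute_shares(tree, parent=Fraction(1), prefix=""):
    """Map each node path to its share of the whole resource."""
    shares = {}
    for name, (percent, children) in tree.items():
        path = prefix + name
        share = parent * Fraction(percent, 100)
        shares[path] = share
        shares.update(absolute_shares(children, share, path + "/"))
    return shares


shares = absolute_shares(policy)
physics = shares["SweGrid/Physics project"]  # 40% of 30% of Resource 1
```

Under these assumptions the Physics project is entitled to 12% of Resource 1, and the three SweGrid projects together add up to SweGrid's 40%.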
Framework components
[Figure: framework component diagram with the labels Workload manager, Scheduler, Priority calculator, Policy engine, Resource, Fairshare factor callout, Local policy, Local usage DB, Runtime policy tree, VO-A usage data, VO-A policy provider, Policy reference and job.]
Simulated Grid
• GridSim: discrete-event Grid simulation toolkit
• SweGrid-like environment (6 x 100 CPUs)
• Each resource has a cluster scheduler
– Space-shared (one job per processor)
– Non-preemptive
– Callout to determine FS priority factor for each job
– Global view on utilization data refreshed once/min
• Workload
– Each user runs a stream of single-CPU, batch jobs
• Contention for resources
1. Correctness
Policy tree used in the simulations:
Resource: VO-A 30%, VO-B 70%
VO-A: P-A1 50%, P-A2 30%, P-A3 20%
VO-B: P-B1 60%, P-B2 40%
P-B1: U-B11 55%, U-B12 30%, U-B13 15%
[Figure: VO-B usage (VO-B projects' utilization). Aggregated utilization (%) vs. time (s) for P-B1 and P-B2.]
1. Correctness
[Figure: P-B1 usage. Aggregated utilization (%) for U-B11, U-B12 and U-B13.]
3. Imbalanced workload
P-A2 and P-A3 only submit jobs to half of the resources
[Figure: two panels, "Only local usage data" and "Grid-wide usage data". Aggregated utilization (%) vs. time (s) for P-A1, P-A2 and P-A3.]
• Conclusion
– Grid-wide usage data important for global share enforcement
4. Subgroup isolation
[Figure: two panels labeled "Sibling shares" and "Parent shares". Aggregated utilization (%) vs. time (s) for P-B1 and P-B2, and for U-B11, U-B12 and U-B13.]
• Conclusion
– Performs subgroup isolation
– Idle share made available to (and only to) active sibling entries
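The subgroup-isolation behaviour can be illustrated with a small calculation. The slides do not say how an idle share is divided among the active siblings; the sketch below assumes it is split in proportion to their target shares, and the function name is hypothetical.

```python
from fractions import Fraction


def effective_shares(sibling_shares, active):
    """Renormalize target shares over the active siblings; idle ones get zero.

    The result always sums to 1 within the sibling group, so the parent's own
    share (and therefore every other group) is unaffected.
    """
    active_total = sum(s for name, s in sibling_shares.items() if name in active)
    return {
        name: (Fraction(s) / active_total
               if name in active and active_total else Fraction(0))
        for name, s in sibling_shares.items()
    }


# Users of project P-B1 with their target shares in percent (from the policy tree).
users = {"U-B11": 55, "U-B12": 30, "U-B13": 15}

# U-B13 is idle: its 15% goes to U-B11 and U-B12 only.
eff = effective_shares(users, active={"U-B11", "U-B12"})
```

With U-B13 idle, U-B11 and U-B12 share the project's capacity 55:30, while P-B1 as a whole still receives exactly its 60% of VO-B.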
Resource and project portal
joint work with Mats Nylén, Roger Oscarsson, UmU (additional parts jointly with PDC and NSC)
Grid Portal Development
Common easy-to-use interface to a diverse set of heterogeneous systems (Grids or specific computers)
Features (on-going work):
• Access a general Grid or individual resources
• Single sign-on
• Submit Grid/batch jobs
• Monitor/delete jobs
• Integrated information services
• View output
• Use system commands
• File transfer
• Archive/retrieve data
• Manage accounts
Recent Grid Computing Publications (2005)
• E. Elmroth, M. Nylén, and R. Oscarsson. A User-Centric Cluster and Grid Computing Portal. International Journal of Computational Science and Engineering, 2005 (accepted).
• E. Elmroth and J. Tordsson. An Interoperable Standards-based Grid Resource Broker and Job Submission Service. e-Science 2005, First IEEE Conference on e-Science and Grid Computing, IEEE Computer Society Press, USA, 2005, pp. 212-220.
• E. Elmroth and P. Gardfjäll. Design and Evaluation of a Decentralized System for Grid-wide Fairshare Scheduling. e-Science 2005, First IEEE Conference on e-Science and Grid Computing, IEEE Computer Society Press, USA, 2005, pp. 221-229.
• E. Elmroth, P. Gardfjäll, and J. Tordsson. An Advanced Grid Computing Course for Application and Infrastructure Developers. CCGrid05, IEEE Computer Society Press, USA, 2005, pp. 43-50.
• E. Elmroth and R. Skelander. Semi-automatic generation of Grid computing interfaces for numerical software libraries. State-of-the-art in Scientific Computing. Springer-Verlag, Lecture Notes in Computer Science, Vol. 3732, 2005, pp. 404-412.
• E. Elmroth, P. Gardfjäll, O. Mulmo, and T. Sandholm. An OGSA-based Bank Service for Grid Accounting Systems. State-of-the-art in Scientific Computing. Springer-Verlag, Lecture Notes in Computer Science, Vol. 3732, 2005, pp. 1051-1060.
• E. Elmroth and J. Tordsson. A Grid Resource Broker Supporting Advance Reservations and Benchmark-based Resource Selection. State-of-the-art in Scientific Computing. Springer-Verlag, Lecture Notes in Computer Science, Vol. 3732, 2005, pp. 1061-1070.
• T. Sandholm, P. Gardfjäll, E. Elmroth, L. Johnsson, and O. Mulmo. A Service-Oriented Approach to Enforce Grid Resource Allocations. (Submitted for journal publication.)