• No results found

Requesting Nodes, Processors, and Tasks in Moab

N/A
N/A
Protected

Academic year: 2021

Share "Requesting Nodes, Processors, and Tasks in Moab"

Copied!
5
0
0

Loading.... (view fulltext now)

Full text

(1)

LAWRENCE N AT I O N A L LABORATORY LIVERMORE

LLNL-MI-401783

Requesting Nodes, Processors,

and Tasks in Moab

D.A Lipari

(2)

This document was prepared as an account of work sponsored by an agency of the United States government. Neither the United States government nor Lawrence Livermore National Security, LLC, nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its

endorsement, recommendation, or favoring by the United States government or Lawrence Livermore National Security, LLC. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States government or Lawrence Livermore National Security, LLC, and shall not be used for advertising or product endorsement purposes.

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

(3)

1

Introduction

This document explains the differences between requesting nodes, tasks, and processors for jobs running on LC Linux clusters. The discussion does not pertain to jobs running on any of the BlueGene clusters. For running jobs on BlueGene, see Running Jobs on BlueGene for a similar discussion.

In the discussion below, the terms “nodes,” “processors,” and “tasks” have special meaning for Moab. Nodes refer to the collection of processing resources associated with one IP address. On each node is a collection of sockets, with each socket containing one or more cores. For certain architectures, each core can run one or multiple hardware threads. The term processor refers to one core, whether or not that core runs multiple threads. A task is an execution of a user’s code, whether or not that code spawns multiple threads. Typically, the SLURM resource manager will launch one or many tasks across one or multiple nodes.

Note that these definitions are slightly different from SLURM’s definition. The following discussion pertains to submitting jobs to Moab using msub. A brief comparison of Moab and SLURM is included at the end of this document.

Submitting Jobs to Moab

Moab has always offered the option for a job to request a specific number of nodes (i.e., msub -l nodes=<n>) on LC parallel clusters. LC policy on parallel clusters is to allocate entire nodes to jobs for their exclusive use, i.e., only one job can be running on any node at any time. In contrast, LC policy constrains jobs running on its capacity clusters (Aztec and Inca) to just one node. Jobs submitted to those machines instead request tasks (i.e., msub -l ttc=<t>), and Moab allocates one processor per task requested. (ttc stands for total task count.) Hence, Moab could schedule 12 one-task jobs per Aztec or Inca node, or one 12-task job, or anywhere in between. As such, Aztec and Inca are the only machines that allow multiple jobs to run on one node.

LC’s various parallel clusters have differing numbers of processors per node. Users can submit jobs to a grid of Linux clusters and specify multiple clusters as potential targets. For example:

msub -l partition=clusterA:clusterB:clusterC

Moab will place the job in a queue and run the job on the first available cluster from the list of candidates.

Past Problem Now Solved

Some parallel jobs launch a specific number of tasks (assuming a one task to one processor relationship for this example), and the number of nodes needed to provide those processors is irrelevant. Hence, users would like to submit jobs to Moab using the following command form

msub -l procs=<p>

and have Moab decide whether to allocate X nodes of cluster A or Y nodes of cluster B to provide those processors. Moab v5.3 was unable to provide this flexibility.

Moab v6.1 offers the msub -l procs=<p> option without the need to specify a node count. If one cluster has 8-processor nodes and a second cluster has 16-processor nodes, Moab will allocate more or less nodes for the job depending on which cluster was selected to run the job.

(4)

2

Recommendations

The following tables provide a summary of how to allocate nodes and processors using Moab’s msub command. The first table provides the simplest command forms, with “Yes” or “No” indicating whether that form is appropriate for a specific cluster. The second table displays accepted (but unnecessarily complicated) command forms.

Recommended Command Forms Parallel Clusters Aztec/Inca

msub -l ttc=<t> No Yes msub -l nodes=<n> Yes No msub -l procs=<p> Yes Yes

Accepted Command Forms Parallel Clusters Aztec/Inca msub -l nodes=1,procs=<p> same as

msub -l procs=<p> same as msub -l ttc=<p> msub -l nodes=<n>,procs=<p> where n > 1 same as

msub -l procs=<p> Invalid msub -l nodes=1,ttc=<t> same as

msub -l nodes=1 same as msub -l ttc=<t> msub -l nodes=<n>,ttc=<t> where n > 1 same as

msub -l nodes=<n> Invalid

Moab and SLURM

Moab’s msub command requests a node/processor resource allocation. After scheduling the resources, Moab dispatches the job request to SLURM, which runs the job across those allocated resources.

There is a strict relationship between the resources Moab allocates and SLURM’s launching of the user’s executables across those resources. SLURM’s srun command is used to launch the job’s tasks. The srun command offers a set of node/task options that mirror those of msub described above.

There are obvious constraints. For example, a job cannot request a one-node allocation from Moab and then use srun to launch tasks across multiple nodes. The table below provides some valid and invalid examples of the required consistencies between msub and srun options on the Aztec and Inca clusters.

#MSUB -l ttc=8 srun -n8 a.out

Correct. Uses 8 processes on 8 processors. #MSUB -l ttc=4

srun -n8 a.out

Incorrect. Uses 8 processes on only 4 processors. #MSUB -l ttc=4

srun -n8 -O a.out

This command will work because the -O option permits SLURM to oversubscribe a node’s processors. However, using more tasks than processors is not recommended and can degrade performance. #MSUB -l ttc=64

srun -n64 a.out

Incorrect if 64 is more than the number of physical processors on a node.

# (no specification for -l ttc)

srun -n2 a.out

Incorrect. The #MSUB -l ttc= option isn’t specified, so the default of 1 is less than required by the -n2 tasks specified with srun.

#MSUB -l ttc=8 srun -n2 a.out

(5)

3

Further Reading

After Moab allocates the node/processor resources to the job, you can use srun to alter the default one-to-one relationship between tasks and processors. Information on the following options is available in the srun man page.

srun --cpus-per-task srun --ntasks-per-node

References

Related documents

یتخانش - شناد ،یراتفر زومآ.. ظن دوخ ،لیصحت م ییوج یم ار تاساسحا و تارکفت ناونع هب ناوت تیلاعف و یلیتتصحت فادها هب یبایتتتسد روظنم هب شوجدوخ یاه درک فیرعت ] 3 [ مظندوخ

This policy option aims to increase competition by incentivizing the entry of MVNOs into the Canadian market. A large barrier of entry for firms is the large amount of

$ The$Sixth$International$Four$Societies$Conference$ $ Hosted$by:$

Abbreviations: ISS-N1, Intronic splicing silencer N1; LDI, long-distance interaction; URC, U-rich cluster; ASO, antisense oligonucleotide; 2′-OMe, an ASO with

PhD evaluation committees: Copenhagen University, Technical University of Denmark, Aarhus University, Chalmers University of Technology (Sweden), Royal Institute of Technology

If neither a major ED nor an ambulance service can easily access this capability for incidents involving one or two critically injured patients then they will struggle to

facilitates the implementation of the client’s informed decision-making. After the initial investment, the Financial Adviser insures that their client understands the