
Support of Parallel Environments

Parallel Environments

In document Sun Grid Engine Manual (Page 145-150)

A Parallel Environment (PE) is a software package designed for concurrent computing in networked environments or on parallel platforms. A variety of such systems have evolved over the past years into viable technology for distributed and parallel processing on various hardware platforms. Two of the most common message-passing environments today are PVM (Parallel Virtual Machine) [1] and MPI (Message Passing Interface) [2]. Public domain as well as hardware vendor provided implementations exist for both.

All these systems have different characteristics and distinct requirements. To handle arbitrary parallel jobs running on top of such systems, Sun Grid Engine provides a flexible and powerful interface that accommodates these varying needs.

Arbitrary PEs can be interfaced by Sun Grid Engine as long as suitable start-up and stop procedures are provided as described in section "The PE Start-up Procedure" on page 150 and in section "Termination of the PE" on page 151, respectively.

Configuring PEs with qmon

The ParallelEnvironmentConfiguration dialogue (see figure 2-53) is opened by clicking with the left mouse button on the PEConfig icon button in the qmon main menu. The PEs already configured are displayed in the PEList selection list on the left side of the screen. The configuration of a PE is displayed in the display region titled Configuration when the PE is selected by clicking on it with the left mouse button in the PEList selection list.

1. PVM, Oak Ridge National Laboratories
2. MPI, The Message Passing Interface Forum

146 Sun Grid Engine • July 2001

A selected PE can be deleted by pressing the Delete button on the right side of the screen. A selected PE can be modified after pressing the Modify button, and new PEs can be added after pressing the Add button. In both cases, the PE definition dialogue shown in figure 2-54 is opened and provides the corresponding means.

The Name input window either displays the name of the selected PE, in the case of a modify operation, or can be used to enter the name of the PE to be declared. The Slots spin box is used to enter the total number of job slots which may be occupied by all PE jobs running concurrently.

The QueueList display region shows the queues which can be used by the PE. Clicking on the small icon button on the right side of the QueueList display region opens a SelectQueues dialogue, as shown in figure 2-55, to modify the PE queue list.

The UserLists display region contains the user access lists (see section “User Access Permissions” on page 122) which are allowed to access the PE, while the Xuser Lists display region lists those access lists to which access is denied. The small icon buttons associated with both display regions bring up Select Access Lists dialogues as shown in figure 2-55. These dialogues are used to modify the content of both access list display regions.

FIGURE 2-53

Chapter 2 Installation and Administration Guide 147

The StartProcArgs and StopProcArgs input windows are provided to enter the precise invocation sequence of the PE start-up and stop procedures (see sections "The PE Start-up Procedure" on page 150 and "Termination of the PE" on page 151, respectively). The first argument usually is the start or stop procedure itself. The remaining parameters are command-line arguments to the procedures. A variety of special identifiers (beginning with a '$' prefix) are available to pass Sun Grid Engine internal run-time information to the procedures. The sge_pe manual page in the Sun Grid Engine Reference Manual contains a list of all available parameters.
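As an illustration of how such an invocation sequence breaks down, the following Python sketch substitutes '$'-prefixed identifiers into an argument string and splits the result into the procedure and its command-line arguments. It is not how Sun Grid Engine performs the substitution internally. The script path, the run-time values, and the use of Python's string.Template are assumptions made for this example, and the identifiers $pe_hostfile and $job_id should be checked against the sge_pe manual page before being relied upon.

```python
import shlex
from string import Template
from typing import Dict, List, Tuple


def expand_proc_args(proc_args: str, runtime_info: Dict[str, str]) -> Tuple[str, List[str]]:
    """Expand '$' identifiers and split into (procedure, arguments).

    Illustration only: identifiers missing from runtime_info are left
    untouched instead of raising an error.
    """
    expanded = Template(proc_args).safe_substitute(runtime_info)
    argv = shlex.split(expanded)
    # The first argument is the start or stop procedure itself,
    # the remaining ones are its command-line arguments.
    return argv[0], argv[1:]


# Hypothetical StartProcArgs value and hypothetical run-time information.
procedure, arguments = expand_proc_args(
    "/hypothetical/path/startpe.sh $pe_hostfile $job_id",
    {"pe_hostfile": "/tmp/example/pe_hostfile", "job_id": "4711"},
)
print(procedure)
print(arguments)
```

With the hypothetical values above, the procedure is /hypothetical/path/startpe.sh and the two arguments are the host file path and the job number.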

The AllocationRule input window defines the number of parallel processes to be allocated on each machine which is used by a PE. Currently, only positive integer numbers and the special value $pe_slots are supported. $pe_slots denotes that all processes which are created have to be located on a single host.
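The effect of the two kinds of allocation rule can be sketched in a few lines of Python. The function below is a hypothetical helper written for this illustration and is not part of Sun Grid Engine. It assumes that a request whose slot count is not a multiple of an integer rule cannot be satisfied, which is an assumption of this sketch and not something the text above states.

```python
from typing import List


def processes_per_host(allocation_rule: str, requested_slots: int) -> List[int]:
    """Return the number of processes placed on each host used by the PE job."""
    if requested_slots < 1:
        raise ValueError("requested_slots must be a positive integer")
    if allocation_rule == "$pe_slots":
        # All processes have to be located on a single host.
        return [requested_slots]
    per_host = int(allocation_rule)  # raises ValueError for non-numeric rules
    if per_host < 1:
        raise ValueError("only positive integers and $pe_slots are supported")
    if requested_slots % per_host != 0:
        # Assumption of this sketch: such a request is treated as unsatisfiable.
        raise ValueError("requested_slots is not a multiple of the allocation rule")
    return [per_host] * (requested_slots // per_host)


print(processes_per_host("$pe_slots", 8))
print(processes_per_host("2", 8))
```

Under these assumptions, a request for 8 slots is placed on one host with $pe_slots and is spread over four hosts with an allocation rule of 2.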

The ControlSlaves toggle button declares whether parallel tasks are generated via Sun Grid Engine (i.e. via cod_execd and cod_shepherd) or whether the corresponding PE performs its own process creation. It is advantageous for Sun Grid Engine to have full control over slave tasks (correct accounting and resource control), but this functionality is only available for PE interfaces specially customized for Sun Grid Engine. Please refer to section "Tight Integration of PEs and Sun Grid Engine" on page 152 for further details.

The Jobisfirsttask toggle button is only meaningful if ControlSlaves has been switched on. It indicates that the job script or one of its child processes acts as one of the parallel tasks of the parallel application (this is usually the case for PVM, for example). If it is switched off, the job script initiates the parallel application but does not participate in it (e.g. in the case of MPI when using “mpirun”).

The modified or newly defined PEs are registered as soon as the Ok button is pressed, or they are discarded if the Cancel button is used instead. In both cases,


FIGURE 2-54 Parallel environment definition dialogue

FIGURE 2-55


Configuring PEs from the Command-line

The following options to the qconf command create and maintain parallel environment interface definitions:

qconf -ap pe_name

add parallel environment. Brings up an editor (default vi or corresponding to the $EDITOR environment variable) with a PE configuration template. The parameter pe_name specifies the name of the PE and is already filled into the corresponding field of the template. The PE is configured by changing the template and saving to disk. See the sge_pe manual page in the Sun Grid Engine Reference Manual for a detailed description of the template entries to be changed.

qconf -dp pe_name

delete parallel environment. Deletes the specified PE.

qconf -mp pe_name

modify parallel environment. Brings up an editor (default vi or corresponding to the $EDITOR environment variable) with the specified PE as configuration template. The PE is modified by changing the template and saving to disk. See the sge_pe manual page in the Sun Grid Engine Reference Manual for a detailed description of the template entries to be changed.

qconf -sp pe_name

show parallel environment. Prints the configuration of the specified PE to standard output.

FIGURE 2-56

qconf -spl

show parallel environment list. Displays a list of the names of all parallel environments currently configured.
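Because qconf -sp prints a PE configuration to standard output, the result can be processed by a script. The Python sketch below assumes that the output consists of one entry per line, with the entry name followed by whitespace and its value. The sample text, its field names and values, and both helper functions are illustrative assumptions and should be checked against the sge_pe manual page and the output of an actual installation.

```python
import subprocess
from typing import Dict


def parse_pe_config(text: str) -> Dict[str, str]:
    """Parse 'name value' lines into a dictionary (assumed output layout)."""
    config: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        parts = line.split(None, 1)
        config[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return config


def show_pe(pe_name: str) -> Dict[str, str]:
    """Run 'qconf -sp pe_name' and parse its output; requires qconf in PATH."""
    result = subprocess.run(
        ["qconf", "-sp", pe_name], capture_output=True, text=True, check=True
    )
    return parse_pe_config(result.stdout)


# Hypothetical sample of what the printed configuration might look like.
SAMPLE_OUTPUT = """\
pe_name            example_pe
slots              16
start_proc_args    /hypothetical/path/startpe.sh $pe_hostfile
allocation_rule    $pe_slots
"""

sample_config = parse_pe_config(SAMPLE_OUTPUT)
print(sample_config["slots"])
```

The parsing step is kept separate from the qconf call so that it can be tried on saved output without access to a running Sun Grid Engine installation.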
