A Parallel Environment (PE) is a software package designed for concurrent computing in networked environments or on parallel platforms. Over the past years, a variety of such systems have evolved into viable technology for distributed and parallel processing on various hardware platforms. Two of the most common message passing environments today are PVM (Parallel Virtual Machine) [1] and MPI (Message Passing Interface) [2]. Both public domain and hardware vendor implementations exist for both tools.
All these systems show different characteristics and have distinct requirements. To handle arbitrary parallel jobs running on top of such systems, Sun Grid Engine provides a flexible and powerful interface that satisfies these various needs.
Arbitrary PEs can be interfaced by Sun Grid Engine as long as suitable start-up and stop procedures are provided, as described in the sections "The PE Start-up Procedure" on page 150 and "Termination of the PE" on page 151, respectively.
Configuring PEs with qmon
The ParallelEnvironmentConfiguration dialogue (see figure 2-53) is opened by clicking with the left mouse button on the PEConfig icon button in the qmon main menu. The PEs already configured are displayed in the PEList selection list on the left side of the screen. The contents of a PE list are displayed in the display region entitled Configuration when the PE is selected by clicking on it with the left mouse button in the PEList selection list.
1. PVM, Oak Ridge National Laboratories
2. MPI, The Message Passing Interface Forum
Sun Grid Engine July 2001
A selected PE list can be deleted by pressing the Delete button on the right side of the screen. A selected PE list can be modified by pressing the Modify button, and a new PE list can be added by pressing the Add button. In both cases, the PE list definition dialogue shown in figure 2-54 is opened and provides the corresponding means.
The Name input window either displays the name of the selected PE list, in the case of a modify operation, or is used to enter the name of the PE list to be declared. The Slots spin box is used to enter the total number of job slots that may be occupied by all PE jobs running concurrently.
The QueueList display region shows the queues which can be used by the PE. Clicking the small icon button on the right side of the QueueList display region opens a SelectQueues dialogue, as shown in figure 2-55, to modify the PE queue list.
The UserLists display region contains the user access lists (see section "User Access Permissions" on page 122) that are allowed to access the PE, while the Xuser Lists display region lists those access lists to which access is denied. The small icon buttons associated with both display regions bring up Select Access Lists dialogues as shown in figure 2-55. These dialogues are used to modify the content of both access list display regions.
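The interplay of the two access list regions can be sketched in a few lines of Python. This is an illustration only, not Sun Grid Engine code: the function name may_use_pe and the sample access lists are hypothetical, and the sketch assumes that an empty allow list admits every user and that a denial takes precedence over a grant. Both assumptions should be checked against the sge_pe manual page.

```python
def may_use_pe(user, acls, user_lists, xuser_lists):
    """Return True if `user` may use the PE under the assumed rules.

    acls        -- dict mapping access list name -> set of user names
    user_lists  -- names of access lists that are granted access
    xuser_lists -- names of access lists that are denied access
    """
    def listed(names):
        return any(user in acls.get(name, set()) for name in names)

    # Assumption: a denial always wins over a grant.
    if listed(xuser_lists):
        return False
    # Assumption: an empty allow list means no restriction.
    if not user_lists:
        return True
    return listed(user_lists)


# Hypothetical access lists, for illustration only.
acls = {"mpi_users": {"alice", "bob"}, "blocked": {"bob"}}
print(may_use_pe("alice", acls, ["mpi_users"], ["blocked"]))  # True
print(may_use_pe("bob", acls, ["mpi_users"], ["blocked"]))    # False
```

In this sketch, bob appears in both an allowed and a denied list and is therefore rejected.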
FIGURE 2-53
Chapter 2 Installation and Administration Guide
The StartProcArgs and StopProcArgs input windows are provided to enter the precise invocation sequence of the PE start-up and stop procedures (see the sections "The PE Start-up Procedure" on page 150 and "Termination of the PE" on page 151, respectively). The first argument is usually the start or stop procedure itself. The remaining parameters are command-line arguments to the procedure. A variety of special identifiers (beginning with a $ prefix) are available to pass Sun Grid Engine internal run-time information to the procedures. The sge_pe manual page in the Sun Grid Engine Reference Manual contains a list of all available parameters.
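As a rough illustration of how $-prefixed identifiers in an invocation sequence could be replaced by run-time values, consider the following Python sketch. It is not the mechanism Sun Grid Engine uses internally. The script path, the identifier names ($pe_hostfile, $job_id) and the sample values are assumptions made for this example; the authoritative list of identifiers is in the sge_pe manual page.

```python
from string import Template


def expand_proc_args(proc_args, runtime_info):
    """Replace $-prefixed identifiers in a procedure invocation line.

    Returns the resulting argument vector. Identifiers that have no
    value in `runtime_info` are left untouched.
    """
    return Template(proc_args).safe_substitute(runtime_info).split()


# Hypothetical invocation line and run-time values, for illustration only.
start_proc_args = "/usr/local/pe/startpe.sh $pe_hostfile $job_id"
runtime_info = {"pe_hostfile": "/tmp/1234/pe_hostfile", "job_id": "1234"}

argv = expand_proc_args(start_proc_args, runtime_info)
print(argv)  # ['/usr/local/pe/startpe.sh', '/tmp/1234/pe_hostfile', '1234']
```

The first element of the resulting list is the procedure itself and the remaining elements are its command-line arguments, matching the description above.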
The AllocationRule input window defines the number of parallel processes to be allocated on each machine that is used by a PE. Currently, only positive integers and the special value $pe_slots are supported. $pe_slots denotes that all processes that are created have to be located on a single host.
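The effect of the two kinds of allocation rule can be sketched as follows. This is a simplified model rather than the scheduler's actual logic: the function name processes_per_host is hypothetical, and the sketch assumes that a request which is not a multiple of an integer rule is rejected.

```python
def processes_per_host(total_processes, allocation_rule):
    """Return a list with the number of processes placed on each host."""
    if allocation_rule == "$pe_slots":
        # All processes have to be located on a single host.
        return [total_processes]
    per_host = int(allocation_rule)
    if per_host <= 0:
        raise ValueError("allocation rule must be a positive integer")
    # Assumption of this sketch: the request must divide evenly.
    if total_processes % per_host != 0:
        raise ValueError("request is not a multiple of the allocation rule")
    return [per_host] * (total_processes // per_host)


print(processes_per_host(8, "4"))          # [4, 4]
print(processes_per_host(8, "$pe_slots"))  # [8]
```

With a rule of 4, a job of eight processes is spread over two hosts, while $pe_slots keeps all eight on one host.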
The ControlSlaves toggle button declares whether parallel tasks are generated via Sun Grid Engine (i.e. via cod_execd and cod_shepherd) or whether the corresponding PE performs its own process creation. It is advantageous if Sun Grid Engine has full control over slave tasks (correct accounting and resource control), but this functionality is only available for PE interfaces specifically customized for Sun Grid Engine. Please refer to section "Tight Integration of PEs and Sun Grid Engine" on page 152 for further details.
The Jobisfirsttask toggle button is only meaningful if ControlSlaves has been switched on. It indicates that the job script or one of its child processes acts as one of the parallel tasks of the parallel application (this is usually the case for PVM, for example). If it is switched off, the job script initiates the parallel application but does not participate in it (e.g. in the case of MPI when using mpirun).
The modified or newly defined PE lists are registered as soon as the Ok button is
pressed, or they are discarded if the Cancel button is used instead. In both cases,
FIGURE 2-54 Parallel environment definition dialogue
FIGURE 2-55
Configuring PEs from the Command-line
The following options to the qconf command create and maintain parallel
environment interface definitions:
qconf -ap pe_name
add parallel environment. Brings up an editor (vi by default, or the editor specified by the $EDITOR environment variable) with a PE configuration template. The parameter pe_name specifies the name of the PE and is already filled into the corresponding field of the template. The PE is configured by changing the template and saving it to disk. See the sge_pe manual page in the Sun Grid Engine Reference Manual for a detailed description of the template entries to be changed.
qconf -dp pe_name
delete parallel environment. Deletes the specified PE.
qconf -mp pe_name
modify parallel environment. Brings up an editor (vi by default, or the editor specified by the $EDITOR environment variable) with the specified PE as configuration template. The PE is modified by changing the template and saving it to disk. See the sge_pe manual page in the Sun Grid Engine Reference Manual for a detailed description of the template entries to be changed.
qconf -sp pe_name
show parallel environment. Prints the configuration of the specified PE to standard output.
qconf -spl
show parallel environment list. Displays a list of the names of all parallel environments currently configured.
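For scripted administration, the qconf options listed above could be wrapped in a small Python helper such as the following sketch. The helper names, the action keywords and the PE name mpi are hypothetical. Only the option letters are taken from the list above. The configured_pes function assumes that qconf is on the search path and that qconf -spl prints one PE name per line, which has not been verified here.

```python
import subprocess

# Options taken from the list above; the action names are hypothetical.
PE_OPTIONS = {
    "add": "-ap",
    "delete": "-dp",
    "modify": "-mp",
    "show": "-sp",
    "list": "-spl",
}


def qconf_pe_command(action, pe_name=None):
    """Build the qconf argument vector for a PE maintenance action."""
    option = PE_OPTIONS[action]
    if action == "list":
        return ["qconf", option]
    if not pe_name:
        raise ValueError("action %r needs a PE name" % action)
    return ["qconf", option, pe_name]


def configured_pes():
    """Return the names of all configured PEs (needs qconf on the PATH)."""
    result = subprocess.run(
        qconf_pe_command("list"), capture_output=True, text=True, check=True
    )
    return result.stdout.split()


print(qconf_pe_command("show", "mpi"))  # ['qconf', '-sp', 'mpi']
```

Because the add and modify options bring up an editor, they are better suited to interactive use than to unattended scripts.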