A.3.1 Initialization and configuration on Midnight

This section describes a procedure for starting a parallel Matlab session that uses the MDCS and spans one or more Midnight compute nodes. The procedure is the culmination of techniques developed for various Matlab projects to leverage high-performance computing (HPC) resources at the Arctic Region Supercomputing Center (ARSC) from 2006 to 2009, and it applies to Matlab version 7.6.0 running on the SUSE Linux operating system. Only the interactive mode of operation on Midnight, which allows a user to execute Matlab commands interactively on the allocated compute nodes, is discussed in detail here. Because the parallel Matlab batch mode on Midnight is initialized with a similar sequence of steps, this focus costs little generality.

It is assumed that the user has connected to a Midnight login node from an xterm window using ssh with X forwarding enabled (i.e., using the command ssh -X). A working X server on the client workstation is not required to use the parallel Matlab interactive mode, but it does allow the use of the Matlab graphical desktop, which includes useful parallel configuration, coding, and debugging tools. The graphical configuration tools for parallel operation are particularly useful, but each user needs them only to complete the initial configuration the first time parallel Matlab is run. If a standard configuration file is provided by ARSC consultants, the configuration step may be skipped entirely.
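As a small illustration of the X forwarding assumption above, the following sketch reports whether the graphical desktop could be displayed in the current session. It relies only on the fact that ssh -X sets the DISPLAY variable on the remote side; the function name matlab_display_mode is hypothetical and is not part of Matlab or of the ARSC scripts.

```shell
#!/bin/sh
# Illustrative sketch only: matlab_display_mode is a hypothetical helper,
# not part of Matlab or of the ARSC-provided scripts.

# Prints "desktop" when an X display is advertised through DISPLAY
# (ssh -X sets this variable on the remote side), otherwise "nodisplay".
matlab_display_mode() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "desktop"
    else
        echo "nodisplay"
    fi
}

echo "Matlab display mode: $(matlab_display_mode)"
```

A "nodisplay" result corresponds to the terminal-only case mentioned above, in which the interactive mode remains usable but the graphical configuration and debugging tools do not.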

The user connects to the Matlab client running on a Midnight compute node from a workstation through a terminal window or through an X session. It is also assumed here that the user has access to a Midnight filesystem, such as WORKDIR or HOME, that is shared across the login and compute nodes and is used by the Matlab workers to store logging or other information. Both the PCT and MDCS components shown in Figure A.2 reside on a set of compute nodes allocated to a specific user through the PBS Pro batch queue system. That is, the PBS Pro batch scheduler allocates a “cluster” of compute nodes to the user, and the user must initialize and set up their own personal Matlab PCT and MDCS environment. This process is automated with shell scripts and, with proper licensing, allows multiple users to simultaneously run their own private “Matlab cluster” on Midnight. Note that typical PCT and MDCS installation examples described by The Mathworks assume that the Matlab cluster may be shared by multiple users. However, security and data integrity requirements at ARSC do not readily permit this sort of use case on HPC-allocated shared resources.

The steps required to establish a parallel Matlab interactive mode session on a cluster of Midnight compute nodes, with the graphical desktop interface enabled and displayed on the user workstation, are now described.

First, the user launches the tunnelx script in the X-forwarded ssh session connected to a Midnight login node. The tunnelx program allows an X client on a compute node to connect to the X server on the user workstation, so that a Matlab instance on a compute node can display the graphical desktop on the user workstation. Next, a script is submitted to the PBS Pro batch queuing system that requests a set of Midnight compute nodes to be allocated for the job and then executes further initialization commands on the manager node (the remaining nodes in the allocation are referred to here as the worker nodes). This batch script performs the tasks described below.

From the shell running on the manager node allocated by PBS, the tunnelx program is launched again and the configuration parameters for the local session are loaded with the shell source command. An environment variable containing the address of the Matlab license server is set on the manager node, and the module command is then used to set other environment variables to the paths containing the Matlab installation files.

Next, the Matlab Distributed Computing Engine (MDCE, part of the MDCS package) is started on the manager node and on each of the worker nodes. Then the Mathworks jobmanager scheduler is started on the manager node, and Matlab workers are started on both the manager and worker nodes. Usually one Matlab worker is started for each processor core in the allocated nodes, up to the number of available Matlab worker licenses. (In an approximate sense, the jobmanager scheduler is to the Matlab workers as the PBS batch queuing system is to the Midnight compute nodes, in that the jobmanager scheduler allocates Matlab workers to a queue of Matlab tasks that have been submitted by one or more users.)
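The startup order and the worker count rule described above can be sketched as a dry run that only prints the planned steps. This is not the ARSC batch script: the function names worker_count and startup_plan are hypothetical, and the sketch assumes a node list file with one line per allocated processor core and the manager node listed first (a layout that the file named by the PBS variable PBS_NODEFILE can have, depending on the resource request). The actual MDCS start commands are deliberately left out.

```shell
#!/bin/sh
# Dry-run sketch only: worker_count and startup_plan are hypothetical names.
# Assumption: NODEFILE holds one host name per allocated processor core,
# with the manager node first.

# worker_count NODEFILE LICENSES
# Prints the smaller of the core count (non-empty lines in NODEFILE)
# and the number of available Matlab worker licenses.
worker_count() {
    cores=$(grep -c . "$1")
    if [ "$cores" -lt "$2" ]; then
        echo "$cores"
    else
        echo "$2"
    fi
}

# startup_plan NODEFILE LICENSES
# Prints the steps in the order given in the text; nothing is started.
startup_plan() {
    nodefile=$1
    n=$(worker_count "$nodefile" "$2")
    manager=$(grep . "$nodefile" | head -n 1)

    # 1. The MDCE service is started once on each distinct node.
    grep . "$nodefile" | sort -u | while read -r host; do
        echo "start mdce on $host"
    done

    # 2. The jobmanager scheduler runs on the manager node only.
    echo "start jobmanager on $manager"

    # 3. One worker per core line, capped at the license count.
    if [ "$n" -gt 0 ]; then
        grep . "$nodefile" | head -n "$n" | while read -r host; do
            echo "start worker on $host"
        done
    fi
}
```

For example, with two nodes of four cores each and six worker licenses, startup_plan lists two MDCE starts, one jobmanager start, and six worker starts, four on the first node and two on the second.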

Finally, an instance of Matlab is started on the manager node that provides the user interface or executes the main user program. The Matlab graphical desktop environment is displayed on the user workstation. If the parallel environment has not been configured yet, then the graphical configuration tool under the Matlab desktop parallel menu may be used to set the default job manager to the Mathworks job manager. Diagnostic tests are also available in the Matlab parallel configuration tool; these validate the operation of the parallel Matlab features and save the configuration file. The configuration step may be skipped the next time a parallel Matlab session is started if the new configuration is set as the default and the associated configuration file is in the Matlab path (or the Matlab launch directory).

At this point the user may, for example, start a matlabpool or pmode session from the Matlab workspace. A matlabpool is a set of Matlab labs¹⁵ (referred to here as workers) reserved by the user to complete concurrent tasks through programming constructs such as the parfor (parallel for) loop or the spmd (single program, multiple data) block. A pmode session is a parallel-mode session in which the user may send commands to each worker and view the results interactively; in particular, pmode sessions allow the interactive use of distributed matrices. The performance and scalability of the parfor loop connected to a matlabpool and of non-trivial distributed matrix computations in a pmode session will be explored through the following case studies.

¹⁵ The Parallel Computing Toolbox glossary defines a lab as follows: “When workers start, they work independently by default. They can then connect to each other and work together as peers, and are then referred to as labs.”
