Chapter 6. Summary and conclusions
A.1 Matlab overview
A.2 Review of Parallel Matlab implementations
A.2.4 Parallel Computing Toolbox and Matlab Distributed Computing Server
The Mathworks, Inc. Parallel Computing Toolbox™ (PCT) provides high-level Matlab and Simulink® programming tools, including language constructs for parallel processing on a multicore or multiprocessor workstation similar to a single compute node of the Midnight cluster. A copy of Matlab or Simulink is required to use the features of the PCT. The Matlab Distributed Computing Server (MDCS) extends the functionality of Matlab with the Parallel Computing Toolbox to multi-workstation clusters or to multiple nodes of Midnight. Figure A.2 illustrates the workflow relationship between the PCT and the MDCS [12]. Just as the Matlab problem-solving environment separates the user interface to matrix computations and manipulations from hardware- and library software-dependent tasks like memory management and data input/output, the PCT separates the interface to task- and data-parallel operations from hardware- and network-dependent tasks [13]. One result of this separation is a dramatic reduction in the programmer time needed to implement a new parallel application or to convert an existing serial application to a parallel one.
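The serial-to-parallel conversion described above can be illustrated with a minimal sketch. It is not taken from the case studies in this appendix: the workload (a set of independent numerical integrations) and all variable names are hypothetical, chosen only to show that the loop body is unchanged when `for` becomes `parfor`.

```matlab
% Hypothetical workload: n independent numerical integrations.
n = 64;                         % number of independent tasks (assumed value)
x = linspace(0, 1, 1e4);        % integration grid shared by every task

% Serial version.
ySerial = zeros(1, n);
for k = 1:n
    ySerial(k) = trapz(x, exp(-k * x.^2));
end

% Parallel version: identical loop body, with 'for' replaced by 'parfor'.
% In PCT releases of this era, workers are started beforehand with
% "matlabpool open 4" and released afterwards with "matlabpool close".
% If no pool is open, such releases run the parfor loop on the client.
yParallel = zeros(1, n);
parfor k = 1:n
    yParallel(k) = trapz(x, exp(-k * x.^2));
end

% The two results should agree, since each iteration is independent.
maxDiff = max(abs(ySerial - yParallel));
```

Here `x` is read by every iteration and `yParallel` is indexed only by the loop variable, so the loop satisfies the independence that `parfor` requires. Nothing in the loop names a worker, a node, or a network connection.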
12 http://www.mathworks.com/support/solutions/en/data/i-3U7V8R/index.html?product=DW
[Figure A.2 image. Legible labels: Desktop System; MATLAB; Simulink, Blocksets, and Other Toolboxes; Parallel Computing Toolbox; Local Workers.]
Figure A.2 Workflow overview of the Parallel Computing Toolbox and Matlab Distributed
Computing Server. As configured on ARSC's Midnight cluster, both the Matlab client with the Parallel
Computing Toolbox (PCT) and the Matlab Distributed Computing Server (MDCS) are started and run on a set of compute nodes allocated to a user by the PBS Pro batch queue system. The image is provided by The Mathworks.
Both the Parallel Computing Toolbox and Distributed Computing Server products require licensing from the Mathworks in addition to the license that grants interactive use of Matlab by one or more users. The cost of the parallel computing license depends on, among other factors, the number of independent Matlab “labs” or “workers” available simultaneously. Each Matlab worker is the software equivalent of a processor core in the sense that it is capable of executing tasks and accessing or storing information independent of the other Matlab workers, yet is controlled by a central managing process. The number of Matlab workers used simultaneously is typically less than the number of processor cores available in the cluster. Just as the cost of a single-user Matlab license is likely to be significant relative to the acquisition cost of the destination workstation, the
cost of fully licensing the Parallel Computing Toolbox and Matlab Distributed Computing Server on a cluster is significant relative to the acquisition cost of the cluster.
For example, an academic license to use MDCS with 32 workers, the number of workers licensed at ARSC, costs $5,500 annually, or about $172 per worker [14]. The academic license cost per worker decreases to about $115 for a 256-worker license, for a total annual cost of $29,500. Note that the license for the MDCS and its respective workers is purchased in addition to the prerequisite Matlab and PCT licenses, both of which are licensed per user. This means that when a user starts a matlabpool (discussed below) containing, for instance, 8 of the 32 available Matlab workers at ARSC, the remaining 24 worker licenses are inaccessible to other users if, as is the case at ARSC, only one single-user PCT license is available.
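The per-worker figures quoted above follow directly from the stated license totals. As a worked check that uses only the numbers in the text:

```latex
\[
\frac{\$5{,}500}{32\ \text{workers}} \approx \$171.88 \text{ per worker},
\qquad
\frac{\$29{,}500}{256\ \text{workers}} \approx \$115.23 \text{ per worker}.
\]
\[
\text{Workers in use in the matlabpool example: } \frac{8}{32} = 25\%,
\qquad
\text{workers left idle: } \frac{24}{32} = 75\%.
\]
```

The larger license therefore costs roughly one third less per worker, while a single 8-worker matlabpool under one single-user PCT license leaves three quarters of the 32 licensed workers unused.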
A further note on Mathworks licensing is appropriate. A Matlab user at ARSC is subject to Matlab licenses purchased by one or more of the following organizations: the Arctic Region
Supercomputing Center, the University of Alaska, and the Department of Defense (DoD). While the set of ARSC users is broad and diverse in terms of organizational affiliation, the group of users authorized according to affiliation by each license from The Mathworks may be correspondingly narrow. For example, an ARSC user may be a student or staff member of the University of Alaska Fairbanks (or both), a student of the University of Alaska Anchorage, a visiting student from an outside university who is part of a summer research internship program, or a remote DoD user. At the time of writing in early January 2010, the University of Alaska Matlab site license used on the local ARSC workstations is plentiful but allows only University of Alaska Fairbanks users, while the multi-center Matlab license purchased by the DoD and used on Midnight is scarce but allows any ARSC user to launch Matlab. Licensing for various combinations of the dozens of available Matlab toolboxes is similarly difficult for the user, since each license includes distinct but intersecting sets of toolboxes.
One additional and significant complication is that the license negotiated and purchased by the University of Alaska, the interpretation of the license, and attempts to remain in compliance with the license change annually, usually sometime between January and February. From the
perspective of an ARSC user of parallel Matlab, this means that the Matlab resource incorporated in
current projects must be assumed to expire at the end of each year, and that the project
implementation may need to be adjusted, perhaps significantly, in order to remain compatible with the new licensing scheme. On Midnight during 2009, for example, the Matlab Distributed Computing Server was authorized by a license purchased by ARSC, but the Matlab and Parallel Computing Toolbox products were authorized by a license purchased by the Department of Defense. In 2008, however, Matlab and the Parallel Computing Toolbox were authorized by the University of Alaska license, and they will be again in 2010. Each January is the period in which this author discovers so-called license instabilities in existing code.
A.3 Multi-node parallel Matlab on ARSC’s Midnight via the PCT and MDCS
Matlab, the Parallel Computing Toolbox, and the Matlab Distributed Computing Server are installed on the Midnight cluster at the Arctic Region Supercomputing Center. This section describes in detail the parallel Matlab initialization procedure unique to Midnight, and then presents two case studies that explore the performance and scalability of certain parallel tasks.