High Productivity Computing on Windows HPC
by Jerald Cheong,
(Windows HPC Technical Specialist, Microsoft Singapore)
Overview
Microsoft® Windows® HPC Server 2008 (HPCS), the next generation of high performance computing (HPC), provides enterprise-class tools, performance, and scalability for a highly productive HPC environment. HPCS provides a complete and integrated cluster environment including the operating system, a Job Scheduler, Message Passing Interface v2 (MPI2) support, and cluster management and monitoring components. Built on Windows Server® 2008 64-bit technology, HPCS can efficiently scale to thousands of processing cores and includes a management console that helps proactively monitor and maintain system health and stability. Job scheduling interoperability and flexibility enable integration between Windows and Linux-based HPC platforms, and support batch and service-oriented architecture (SOA) workloads. Enhanced productivity, scalable performance and ease of use are some of the features that make Windows HPC Server 2008 best-of-breed for Windows environments.
Users whose work demands HPC solutions employ applications that execute complex computations and produce elaborate data output. Microsoft has worked with independent software vendors (ISVs) to port applications to Windows HPC Server 2008 that serve several markets, including manufacturing, life sciences, geological sciences, government, higher education, and financial services.
Supported ISVs
Applications for Windows HPC Server
By now, you will have realised that we have put a lot of focus on usability and integration, and we have worked with various ISVs to ensure that their software works well with Windows HPC Server. We have compiled a list of ISVs whose software is certified to work on Windows HPC Server 2008. However, if your software is not on that list, this does not mean that you cannot start leveraging Windows HPC in what you do. In fact, if you have been doing some form of computationally intensive work, there is a very good chance that you can take advantage of Windows HPC Server 2008.
Some of the more popular software packages include ANSYS’s FLUENT and Mechanical APDL products, CD-Adapco’s Star-CCM+, Dassault Systèmes Simulia’s Abaqus FEA and Mathworks Inc’s MATLAB. For a more comprehensive list, you might want to pop over to
At the Crossroads
Windows HPC Server 2008 R2 Beta
As I was writing this article, our HPC team had just released the Release Candidate (RC) of Windows HPC Server 2008 R2, the next version of Windows HPC Server 2008. With R2, we bring even more exciting features in the form of Excel integration. You will notice that Microsoft has recently released a whole slew of products, including Office 2010 and Visual Studio 2010, and if you dig slightly deeper, you will also realise that there is tighter integration between Office 2010, Visual Studio 2010 and Windows HPC Server 2008 R2.
HPC and Excel
With Windows HPC Server 2008 R2, the HPC team presents two additional solutions for accelerating Excel on HPC: Excel Runner and UDF Off-loading. Both solutions allow existing work that has been done in Excel to be quickly and easily accelerated on Windows HPC.
The SOA Model
You might be wondering, why did I even mention this? After all, when Windows HPC Server 2008 was released, Microsoft had already presented a solution to “accelerate Excel”. That solution was to implement the calculations in an SOA model. In the SOA model, short-running calculations are encapsulated into a Web Service running in a stripped-down, minuscule implementation of a Web Service container on the compute nodes. Calculations are implemented as a Web Service such that the queries are embedded in a Web Service request and the results are returned as a Web Service response.
On the surface, this looks like an extremely promising solution. However, if we dig a little deeper, we will realise that we are looking at a few road-blocks. I will list them here:
1. Quants/Financial Engineers are not programmers, they are mathematicians.
2. Programming paradigm.
3. Existing code base.
I’ll try and go into the specific details now.
Quants/Financial Engineers are not programmers, they are mathematicians.
This is the truth of the matter. In fact, I did attempt to get my hands dirty by doing some SOA programming. Quite frankly, it was not pleasant. If we examine the model closely, this is what we are getting ourselves into.
Firstly, we are looking at Visual Studio Tools for Office (VSTO), and anyone who has touched VSTO will attest that it is a beast to handle. That is fine if you are a programmer hired by an organisation to create Office-specific extensions and add-ons, and usually in such cases the interfaces and requirements are specified very clearly! However, when we delve into the realm of financial engineering, things look very different. In all of my working experience with quants and financial engineers, whatever they create is highly secretive. In fact, not many people know what goes into the formulas, and the knowledge is held by a select few. In most cases, such intellectual property is confined within the group, and the institution’s expectation is that the quants/financial engineers will write the programs themselves. So, getting them to learn VSTO will be a huge hurdle. Chances are they will also have to learn another programming language, which is probably C#.
So you would not expect them to specify interfaces and so on. On the contrary, you would be expected to teach them how to handle VSTO! On top of VSTO, they would also have to handle the change in programming paradigm, which I will go into next.
Programming Paradigm
The SOA model presents a vastly different programming paradigm from what most quants and financial engineers are used to. In setting up the SOA model, they are required to create custom controls embedded into spreadsheets. They need to wrap function calls from within the spreadsheet, collect all the various variables, and then pass them off to the transport layer. They need to create a basic transport layer to pass off the collected information to the Web Service which they would have to create as well. When all that is done, they would then need to collect the results back and feed them back into the spreadsheet.
As if that were not enough, there is also a need to debug. To debug this, they will need a small test setup and, quite frankly, that may be quite challenging to obtain in the context of a financial institution. I can understand why there is strong resistance on their part to move to this model.
Existing Code Base
Now that we know that the people in the financial institutions do not do VSTO, we are led back to the start: VBA in Excel. Why? VBA is an extremely easy platform to get started with for customising Excel behaviour, and we can password-protect VBA code within Excel to keep it from prying eyes. Thus, VBA naturally forms the starting point where quants and financial engineers come up with their magical financial products. However, embedding VBA in Excel can cause the spreadsheet to grow significantly in size. Furthermore, it is not very portable, in the sense that it is locked into a single Excel file, and if you were to try to bring it out, someone else would see the source code.
The natural evolution would be to move to User Defined Functions (UDFs). So why are UDFs attractive? UDFs are compiled code, written in C/C++. It is generally accepted that compiled C/C++ code is among the fastest options available, short of programming in assembly.
Furthermore, there are lots of reference materials and sample code for UDFs, which make them easy to start on. Lastly, UDFs are compiled into XLLs, which are Excel-specific DLLs, making them small, highly portable and easily obfuscated. Linking to the UDFs just requires a user to point Excel to the location of the XLL and, straightaway, they can start using the functions encapsulated within it without knowing how the functions are implemented. By moving to UDFs, they gain speed and portability, and they get to keep the knowledge of how the financial products are created. Together, VBA and UDFs present an extremely attractive value proposition, and the evidence clearly shows it: most financial institutions that I have worked with have a good mix of implementations in VBA and UDFs. Moving away from VBA and UDFs implies a whole lot of re-engineering that probably does not make commercial sense.
Excel Runner and UDF Off-loading
Now that we know the challenges that the SOA model presents, how do Excel Runner and UDF Off-loading help? We have also seen that most of a financial institution’s work is already deeply entrenched in VBAs and UDFs. Well, Excel Runner and UDF Off-loading are particularly exciting because, in both cases, they make it easy for users to move to an HPC environment.
In the case of Excel Runner, the solution allows users to take the entire spreadsheet and run it in parallel across an HPC cluster. We have seen that, in a lot of cases, users have VBA embedded within their spreadsheets. Since we cannot, or rather do not want to, de-couple the embedded VBA, we will likely want to run the spreadsheet in its entirety in parallel, and Excel Runner is just the perfect solution for this!
How about UDF Off-loading you might ask? Well, with UDF Off-loading, we just need to make one change to the source code and almost immediately, with a bit of setup, we can start running those UDFs on a cluster. So how do we do it?
Updating Your XLL for Cluster Calculation
Rebuilding the XLL
We will have to rebuild the XLL so it supports calculation on the HPC cluster. There are three changes to make to any XLL to support cluster-enabled UDFs.
1. Update the XLL for the Excel 2010 SDK
If an XLL was built for an earlier version of Excel, you will need to update the SDK files to match the latest version. There are three important files: XLCALL.h, XLCALL.cpp, and XLCALL32.lib (the header, source file, and static library, respectively). The Excel 2010 SDK includes updated versions of these files.
In Visual Studio, you will need to remove the files XLCALL.h and XLCALL.cpp from the XLL project. Add the new versions of these files (included in the download files in the Excel2010 directory) to the project. Update the project properties to set the include directory to the Excel2010 directory; in the project properties dialog, update the value in Configuration Properties > C/C++ > General > Additional Include Directories to point to the Excel2010 directory.
This XLL does not use the library XLCALL32.lib. If you are updating an XLL that links against XLCALL32.lib, change the linker settings to point to the updated version of this file. The Excel 2010 SDK includes versions of XLCALL32.lib built for both 32-bit and 64-bit architectures.
If you are updating an existing XLL library, you will need to link against XLCALL32.lib if you include any calls to the functions Excel4 or Excel4v. Beginning with Excel 2007, Microsoft included support for the new XLOPER12 type as a replacement for the XLOPER type. The Excel 2007 SDK also included the functions Excel12 and Excel12v, which replaced the older Excel4 and Excel4v functions.
If you don’t need to support versions of Excel prior to Excel 2007, you can build XLLs entirely without any calls to Excel4 or Excel4v. However, if you are updating an existing library, including these functions won’t affect your ability to build cluster-enabled UDFs. Any function you mark as cluster-enabled must not use Excel4 or Excel4v; it should only use XLOPER12s and the Excel12 or Excel12v functions. However, other functions in the same library can use the older functions and older value type.
The function we’ll update in this example only uses the XLOPER12 type, so that’s not an issue here.
2. Update the function registration signature
If you’re familiar with XLLs and UDFs, you’ll know that any function added to Excel must be registered. Registration is typically handled in the xlAutoOpen or xlAutoRegister callback functions. When a function is registered, you pass in a string representing the argument types and return value. In this example, the function registration is handled in xlAutoOpen.
For example, a function which returns an XLOPER12 type and takes two integer and two floating point inputs is registered with the type signature "QJJBB". That means it returns an XLOPER12 type (Q), and takes two Integer (J) and two Double (B) parameters.
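To make the type signature concrete, here is a minimal, self-contained C++ sketch that decodes a string such as "QJJBB" into readable names. It covers only the three codes mentioned above; the real SDK defines many more. The names describeTypeCode, Signature and parseTypeText are hypothetical helpers for illustration and are not part of the Excel SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: names the three type codes discussed in this article.
// Any other code is reported as "unknown" rather than guessed at.
std::string describeTypeCode(char code) {
    switch (code) {
        case 'Q': return "XLOPER12";
        case 'J': return "Integer";
        case 'B': return "Double";
        default:  return "unknown";
    }
}

// Hypothetical container for a decoded signature.
struct Signature {
    std::string returnType;
    std::vector<std::string> arguments;
};

// The first character is the return type; each following character is one argument.
Signature parseTypeText(const std::string& typeText) {
    assert(!typeText.empty() && "a signature needs at least a return type");
    Signature sig;
    sig.returnType = describeTypeCode(typeText[0]);
    for (std::size_t i = 1; i < typeText.size(); ++i) {
        sig.arguments.push_back(describeTypeCode(typeText[i]));
    }
    return sig;
}
```

Calling parseTypeText("QJJBB") gives a return type of XLOPER12 and four arguments: two Integer followed by two Double.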
To mark a function as cluster-safe, the only change to the function registration signature is adding an ampersand (“&”) to the function registration. Change the function registration string to match the following:
" YourFunctionName", " QJJBB&", " YourFunctionName",
so that the function registration string ends with an ampersand. That’s the only change that’s required to the code to mark the function as cluster-enabled.
However, it’s this simple only because we know that the function is safe for running on the cluster. It is the responsibility of the programmer to ensure that the function is safe for running on the cluster.
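If you maintain many registrations, the ampersand change can be applied mechanically. The following is a minimal sketch, assuming the type signature is held in a plain std::string with no length-prefix padding and that the marker is the last character, as described above. markClusterSafe and isClusterSafe are hypothetical helper names, not SDK functions, and the sketch does nothing to verify that the function itself is actually safe to run on the cluster.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns a copy of the type signature with the
// cluster marker '&' appended, unless it is already the last character.
std::string markClusterSafe(const std::string& typeText) {
    assert(!typeText.empty() && "a signature needs at least a return type");
    if (typeText.back() == '&') {
        return typeText;  // already marked, leave unchanged
    }
    return typeText + '&';
}

// Hypothetical helper: reports whether a signature carries the marker.
bool isClusterSafe(const std::string& typeText) {
    return !typeText.empty() && typeText.back() == '&';
}
```

For example, markClusterSafe("QJJBB") yields "QJJBB&", and applying it a second time leaves the string unchanged.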
3. Rebuild the XLL
Rebuild the XLL with the changes you just made. If it compiles successfully, you’re ready to run the UDF on the cluster. If you receive any compilation errors, double-check the changes from the last section. In particular, make sure that the path to the new XLCALL.h file is correct.
Test the updated XLL on the desktop
Before running on the cluster, load up the XLL into your existing workbook implementation and make sure that the XLL is functioning correctly. If you don’t enable cluster calculation, the UDF will run on the desktop as before. Change your parameters and check that the calculations update as expected.
Deploy the XLL to the cluster
Now that the Add-in is updated, you can deploy it to the cluster. The XLL library must be installed on each compute node (or each compute node you will use to run Excel UDF calculations).
You have a few options in deploying the XLL to the cluster compute nodes. The suggested directories for XLLs are
C:\Program Files\Microsoft HPC Pack 2008 R2\bin\Excel\XLL64 for 64-bit XLLs, and
C:\Program Files\Microsoft HPC Pack 2008 R2\bin\Excel\XLL32 for 32-bit XLLs.
However, when a compute node attempts to load an XLL library, it will search the PATH, including your user home directory. If you have Administrator privileges on the cluster, use the suggested directories above. If you don’t have Administrator privileges, you can use your home directory (e.g. C:\Users\MyUserName).
The easiest way to deploy the XLL to the cluster compute nodes is to use clusrun, an application provided with the HPC Client Utilities (we can run clusrun from HPC Powershell). You will need a share directory that is visible to the compute nodes; a share directory on the cluster head node, for example, will do.
If you can create a share directory on the cluster head node, you can use the following steps to deploy the XLL to the cluster compute nodes:
1. Create a share directory on the head node. Name this directory “HPCShare” or something similar.
2. Copy the XLL library from your project build directory to the share directory. Open an HPC Powershell window (from the Windows Start menu, Programs > Microsoft HPC Pack 2008 R2 > HPC Powershell) and navigate to the project build directory. Run the command
> copy YourClusterUDF.xll \\Path\to\share\directory
3. Use clusrun in HPC Powershell to copy the file from the share directory to the target directory on the cluster compute nodes. In this example, we’ll use the default directory for 64-bit XLLs; adjust the path to use the directory appropriate for your installation.
>clusrun /scheduler:HeadNodeName /all copy
\\Path\to\share\directory\YourClusterUDF.xll 'C:\Program Files\Microsoft HPC Pack 2008 R2\bin\Excel\XLL64'
(enter the command on one line). The clusrun command tells the cluster head node (identified by the /scheduler parameter) to run the command on all nodes (with the /all parameter). It will copy the file from the share directory to the target directory on all cluster nodes.
Run the UDF on the HPC cluster
Now that the UDF is deployed, you’re ready to run it on the cluster. Go back to Excel and your spreadsheet.
Click the File tab in the ribbon and click Options on the left. In the Excel options dialog, click
Scroll down to the Formulas section:
Check the box marked Allow user-defined XLL functions to run on a compute cluster. In the Cluster type drop-down box, select the entry for x64 or x32, matching your version of Excel.
If you don’t see any settings for Cluster type, or the drop-down box is disabled, make sure you have installed the HPC Server 2008 R2 client utilities. Run the HPC Server 2008 R2 installer on your desktop, and install the client utilities.
Next click the Options button to open the cluster options dialog:
In the options dialog, change the setting for Cluster head node name to match your installation. Check the box Show status window during calculation. Leave the rest of the settings at their default values, and click Apply.
Now that you have enabled cluster calculation, any time Excel finds a function marked cluster-enabled (with an ampersand in the function signature, as you saw in the last section), it will contact the cluster head node and send a calculation request.
Make a change to the spreadsheet and you’ll see the cluster calculation run. When the calculation executes, you’ll see a dialogue box with calculation status:
If you see any errors during the calculation, click the Show Errors button (not pictured here) for more information. Some errors are not fatal; if you see errors of the type “Session Terminated”, you can safely ignore them. The “Session Terminated” error means that the cluster session was closed unexpectedly – in that event, Excel will automatically open a new session and re-submit the calculation requests as necessary.
The most common problem at this point will be an error of the type “UDF”; this usually means that the UDF was not deployed correctly. If you receive an error of type “UDF”, check that the XLL is correctly installed on the cluster compute nodes. Double-check your settings against the description in the previous section. If possible, open a remote desktop connection to one of your cluster compute nodes and ensure that the file is installed in the correct directory. If necessary, re-deploy the file and try again.
If you receive any other errors, check the event log for more information. In the event viewer, you’ll find these events logged under Applications and Services Logs > Microsoft > HPC > Excel. There are additional categories for administrative, debug and operational logs.
We have seen how easy it is to cluster-enable UDFs. There are, however, some points that we need to take into consideration.
Some Considerations
Cluster-enabled UDFs are only supported in Excel 2010
In Excel 2010, the user can elect to run calculations on the desktop or on the cluster using the configuration option. However when a cluster-enabled UDF is registered, it includes the ampersand (“&”) in the parameter declaration. Earlier versions of Excel will not recognise this character and as a result they will not register the function. That means that a function which is marked as cluster-enabled will not work in Excel 2007 or any previous version.
If you need to support both Excel 2010 and previous versions of Excel, consider building multiple versions of the XLL library: one with the cluster-enabled function for Excel 2010, and one without the ampersand in the declaration to support Excel 2007 (or earlier versions).
Another option is to build multiple functions in your XLL, one with cluster support and one without. Users can then decide which function to use based on their environment.
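The second option can be pictured as a small registration table with one entry per worksheet function. The sketch below is illustrative only: Registration, allRegistrations, registrationsFor and the name YourFunctionNameCluster are hypothetical, and the code merely models which entries a given Excel version would be expected to accept, based on the behaviour described above. It relies on Excel 2010 reporting major version 14 (Excel 2007 is 12); how an add-in obtains the host version is outside this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Excel 2010 is major version 14; Excel 2007 is 12.
constexpr int kExcel2010MajorVersion = 14;

// Hypothetical record mirroring the three strings in the registration
// example: exported procedure, type signature, worksheet function name.
struct Registration {
    std::string procedure;
    std::string typeText;
    std::string functionText;
};

// One plain entry and one cluster-enabled entry (names are placeholders).
std::vector<Registration> allRegistrations() {
    return {
        {"YourFunctionName",        "QJJBB",  "YourFunctionName"},
        {"YourFunctionNameCluster", "QJJBB&", "YourFunctionNameCluster"},
    };
}

// Models the behaviour described above: hosts older than Excel 2010 do not
// recognise '&', so cluster-enabled entries are left out for them.
std::vector<Registration> registrationsFor(int excelMajorVersion) {
    assert(excelMajorVersion > 0 && "version numbers are positive");
    const bool hostKnowsClusterMarker = excelMajorVersion >= kExcel2010MajorVersion;
    std::vector<Registration> accepted;
    for (const Registration& r : allRegistrations()) {
        const bool clusterEnabled = !r.typeText.empty() && r.typeText.back() == '&';
        if (clusterEnabled && !hostKnowsClusterMarker) {
            continue;
        }
        accepted.push_back(r);
    }
    return accepted;
}
```

With this layout, a workbook opened in Excel 2007 would see only YourFunctionName, while Excel 2010 users could pick either function.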
You must rebuild the XLL to create cluster-enabled UDFs
There is no way to “automatically” treat existing UDFs as cluster-enabled UDFs. To run on the cluster, the UDF must include the ampersand symbol in its function registration. That means you must have access to the source code and the ability to recompile the UDF in order to take advantage of cluster calculation.
Because there are significant limitations and restrictions on cluster-enabled UDFs, it’s important that code be specifically compiled to support execution on the cluster. In addition to changing the function signature, rebuilding the XLL gives you the opportunity to verify that you are not using any unsupported data types or making any calls to the Excel API which are not supported on the cluster.
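The article does not spell out every restriction, so the sketch below rests on an assumption: that a function which depends only on its arguments, keeps no global state, and makes no calls back into Excel is a reasonable candidate for cluster execution. The function futureValue and its parameters are hypothetical; its shape (two integers and two doubles) mirrors the "QJJBB" signature, but the XLOPER12 wrapping a real XLL needs is left out so that the example compiles without the SDK.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical calculation kernel: compound growth of a principal.
// It reads only its arguments and touches no shared state, so running it
// on another machine gives the same answer as running it on the desktop.
double futureValue(int periodsPerYear, int years, double annualRate, double principal) {
    assert(periodsPerYear > 0 && years >= 0);
    const double ratePerPeriod = annualRate / periodsPerYear;
    const int periods = periodsPerYear * years;
    return principal * std::pow(1.0 + ratePerPeriod, periods);
}
```

Keeping the calculation in a kernel like this, separate from the Excel glue code, also makes it easier to test on the desktop before deploying to the cluster.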
In Conclusion
In short, Windows HPC Server 2008 R2 coupled with Excel 2010 allows users to make their spreadsheets and UDFs cluster-ready easily and quickly. This lowers the barrier for Excel users to start on HPC. That being said, it must be noted that all these capabilities are only available in Windows HPC Server 2008 R2 and Excel 2010. These are not yet available in NUS, but will be shortly; Windows HPC Server 2008 R2 is slated to be released sometime in August 2010. If there are concerns that some existing work cannot be moved, then additional effort may be needed to provide backwards compatibility, such as implementing multiple function calls with similar implementations.
Furthermore, since cluster-ready UDFs can be tested locally, it is easier for developers to debug and test their code on local machines rather than investing in a testing infrastructure that might not be feasible.
I hope this short introduction has given you a good insight into the capabilities presented by Windows HPC Server 2008 R2 and the type of integration it has for Excel 2010. Feel free to get in touch if there are any queries.
Credits
The author would like to thank Duncan Werner of Structured Data LLC for his guidance on cluster enabling UDFs. The link to his document can be found here.
Disclaimer:
The views expressed in this document are entirely the author’s own and are NOT endorsed by his employer, organisation, co-citizens, family, friends or co-workers. Moreover, these views can change from time to time as he learns and gains experience. The author does not assume responsibility for comments made in this document.