
Concluding Remarks

In document Parallel Markov Chain Monte Carlo (Page 169-173)

Chapter 6 Conclusions and Future Work

6.5 Concluding Remarks

This work has sought to bring together the disciplines of statistics, image processing and high performance systems. A number of parallelisation strategies have been devised and tested, most of which will complement existing methods targeted at improving the rate of simulation convergence. Although the methods that are applicable and the performance improvements that can be obtained will vary depending on the particulars of the specific application and implementation, the estimates for the best possible reductions in runtime, along with the real-world examples obtained (using a variety of hardware architectures and representing a number of potential MCMC application characteristics), provide a good indication of whether a particular application will benefit. In some cases a fair estimate of the improvements can be obtained.

It is hoped that pMCMC lowers the barrier to entry for would-be MCMC implementers and facilitates more research into MCMC methods and applications. The intent is that the parallelisation options provided with pMCMC will allow the practical implementation of more end-user MCMC applications by making previously infeasible or uneconomical methods feasible, and will improve the efficiency of MCMC applications that already exist.

Appendix A

The pMCMC Framework

Though not the main focus of this thesis, in the process of implementing and evaluating the aforementioned parallelisation methods a framework for the rapid development of MCMC applications was developed: pMCMC. An example implementation using this framework is given in appendix B and the runtime usage is shown in appendix C. The purpose of this framework is to separate the task of constructing and tuning an MCMC simulation for a specific application from the task of implementing the parallelisation methods presented in this thesis. As a side-effect, the implementation of many tedious and/or error-prone aspects of an MCMC application has also been automated, including but not limited to user interaction (via XML and a variety of frontends), sanity checking, the Metropolis-Hastings kernel and generic aspects of the MCMC algorithm.

In section A.1 we briefly describe the MCMC method and its uses. Section A.2 gives an overview of how the pMCMC framework is used to create an MCMC application. Section A.3 describes some of the internal design structure and the decisions that were made. Whilst for the most part users of pMCMC will not interact with these components, a commentary on them may be of interest to those considering extending pMCMC or writing their own MCMC program from scratch. In section A.4 we show how the applications generated using pMCMC are used, and in section A.5 we look at the overhead involved in the use of this framework.

A.1 Introduction

In its most basic form an MCMC algorithm is simple to implement, as demonstrated by the following pseudocode:

    do {
        ProposedMove p = makeProposal();
        double mh_value = metropolisHastings(p);
        if (random() < mh_value)
            apply(p);
        else
            abort(p);
    } while (!done);

Transitioning from the seemingly straightforward sequential implementation to one or more of the parallel implementations described in chapters 3 to 5 can be a daunting task for those not accustomed to parallel programming. Extensive rewrites may be necessary if the transition is not planned for from the outset. For instance, speculative moves requires that move proposals be created and evaluated without any changes to the chain’s state, which rules out applying a proposed change and then rolling it back should the proposal be rejected (a viable sequential implementation that has been encountered). Similarly, speculative chains requires secondary (speculative) states to exist and to be developed by MCMC iterations before potentially being merged with the primary state, and both speculative chains and periodic parallelisation require proposed moves to be drawn from a subset of the possible move types depending on the phase of the simulation (separating Ms and Mf, for instance).

The programming knowledge and experience required for the technical implementation of parallel processing (pthreads [48], MPI [24, 30] and safe parallel programming practices) will not necessarily be possessed by the initial developers of an MCMC method (whose experience will be focused on statistical algorithms and image processing). Additionally, if one is developing a number of MCMC applications there is substantial repetition of effort in writing tedious boilerplate code (e.g. for selecting between move types with the correct probabilities and generating suitable proposed moves for the chosen move type, based on probabilities and other parameters specified by the user in some fashion). The pMCMC framework was created to address these issues and to provide a convenient testbed for the rapid testing of the parallelisation methods developed for this thesis.

Writing the code for performance monitoring, input/output of data and simulation properties, and multithreading serves as a further distraction from the MCMC implementer’s primary focus: the simulation’s model, the possible model transitions, and the efficient calculation of the posterior probability. To combat these problems and to make the creation of parallel MCMC applications more accessible to theorists, pMCMC has been developed. Through a combination of a library of source files, templating and automatic code generation, a specific MCMC application can be plugged into a generic parallel MCMC kernel in a manner that allows the programmer to focus purely on the application-specific components of the MCMC application.

Functionality automatically provided by the pMCMC framework includes:

• The implementation of the Metropolis-Hastings transition kernel.

• The speculative moves and speculative chains methods, implemented using pthreads, and the periodic parallelisation mechanism, implemented using MPI.

• Multiple executables for different situations: for testing, for SMP machine execution, and for MPI execution.

• Use of XML files for configuring all the MCMC simulation variables (‘prior’ values, move proposal probabilities etc).

• Automatic generation of much ‘boilerplate’ and housekeeping code (for instance the random selection of a type of move to execute based upon the proposal probabilities provided via an XML job file, and the tracking of simulation statistics such as the average acceptance probability for each type of move).

• Recording of simulation metrics (timing of individual steps and the program as a whole, actual move acceptance rate).

• Optional XML logs of the MCMC simulation’s setup and statistics gathered during the simulation’s execution.

• Programmatic interface for integration with your own frontend (an optional OpenGL display is available).

On a Q6600 the pMCMC framework is capable of performing up to 3.2 million iterations per second in sequential mode. Parallel processing performance is highly dependent on the specific characteristics of the application, but in practical tests just one of the available parallelisation methods (speculative moves) allowed for up to a 40% reduction in runtime simply by using a dual-core or dual-processor system, with no additional coding required compared to that for simple sequential execution.
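For readers more used to speedup figures, the quoted best-case numbers can be converted as follows. The symbols are introduced here for convenience only: T_seq and T_par are the sequential and parallel runtimes, S is the speedup, E is the parallel efficiency on two cores, and t_iter is the time per sequential iteration.

```latex
\[
T_{\mathrm{par}} = (1 - 0.40)\,T_{\mathrm{seq}} = 0.60\,T_{\mathrm{seq}},
\qquad
S = \frac{T_{\mathrm{seq}}}{T_{\mathrm{par}}} = \frac{1}{0.60} \approx 1.67,
\qquad
E = \frac{S}{2} \approx 0.83
\]
\[
t_{\mathrm{iter}} = \frac{1}{3.2 \times 10^{6}\ \mathrm{s^{-1}}} \approx 0.31\ \mu\mathrm{s}
\]
```

For comparison, an ideal two-way parallelisation would halve the runtime, a 50% reduction.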
