Parallel processing of the Monte Carlo computer programme

Given the small area of the detector (200 µm diameter), most light emitted from the source fibre would not find its way to the detector. Hence any simulation of the problem would be very slow. For this reason, the programme was written so that it could be run simultaneously on a number of computers of various types (PCs, workstations etc.) that all had access to one computer network. Running the simulation on a number of computers in parallel considerably reduced the time needed to perform it.

This method was developed as a result of problems encountered with an earlier attempt to use the Monte Carlo technique to simulate laser Doppler blood flow measurements, where thirteen different Sun workstations were used [143]. Unfortunately, with many computers writing to many different data files, it became difficult to monitor the progress of the calculation. The random nature of the Monte Carlo method meant that each computer could be writing data to its files at any time. It was therefore impossible to safely add the data files from many computers for analysis without first stopping the processes, summing the data, then restarting the computation, which was very tedious and error-prone. A brief description of the parallel implementation of the programme developed to overcome this problem follows.

The description of the problem to be solved was held in one small ASCII file. This contained the measurement geometry, tissue optical properties (µa, µs and g), position of source and detector, fibre characteristics etc. The Monte Carlo simulation software read this from disk, performed the simulation, then wrote the photon history results, consisting of intensity, position, pathlength and angle with respect to the detection fibre, into a four-dimensional array in memory. After a minimum of 1000 photons had been launched, the results were written to a file on a disk accessible to all computers on the network. One copy of the programme ran on each computer available, usually at reduced priority when being executed on Unix-based machines that implement scheduling priorities.

Only one results file was stored on the computer network. Each computer performed the Monte Carlo calculations and wrote the photon history results into this one results file. Since two computers editing the one file could easily result in file corruption, the time of day at which any computer could update the data file was fixed. The first computer to execute the programme wrote a file named 0 into the directory where the data was stored. The file did not need to contain any data at all, although when the computer was running under the Unix operating system, the machine’s name and the process identification number (PID) were written into the file in ASCII to enable the user to determine on which machines the programme had been started. The second execution of the programme, on finding the file named 0, wrote a file named 1. A third execution of the programme wrote a file named 2 and so on, up to the 24th execution, which wrote a file named 23.

The source code was written such that the programme that wrote file 0 could only store data to disk between 00:00 and 00:55 (until this time the programme continued to run and accumulate data in the local machine’s memory). The second programme to run (which wrote the file 1) could only store data between 01:00 and 01:55, the third between 02:00 and 02:55 etc. This ensured that each execution of the programme could only edit the data file at a time when no other programme was doing so. Obviously, this limited the number of separate executions of the programme to 24, but this could be extended if necessary by using time steps smaller than one hour. Conversely, if the number of computers available was smaller (say 6), it would probably be preferable to write the data once every six hours rather than once per day, as this would reduce the time that could elapse without the data file being updated. For safety, no programme could edit the data file in the last 5 minutes of any hour, since time must be allowed for writing the file. If for any reason (such as hardware failure) a process was unable to write the data to disk in the 55 minutes allocated for the task, the data was retained (and continued to accumulate) in the computer’s random access memory (RAM) until the next day, when another attempt would be made to write the data to disk.
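The slot scheme described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the original source code: the function names `claim_slot` and `may_write` are hypothetical, and the use of exclusive file creation (`O_EXCL`) to avoid two processes claiming the same number is an assumption that the original description does not mention. Unlike the original, the sketch writes the host name and PID on every platform.

```python
import datetime
import os
import socket


def claim_slot(directory, max_slots=24):
    """Create the lowest-numbered marker file (0, 1, 2, ...) not yet present.

    Returns the slot number, which also fixes the hour of the day in which
    this process may update the shared results file.
    """
    for slot in range(max_slots):
        path = os.path.join(directory, str(slot))
        try:
            # Assumption: exclusive creation fails if the marker already exists.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue  # slot already taken by an earlier execution
        with os.fdopen(fd, "w") as marker:
            # Machine name and PID in ASCII, so the user can see where it runs.
            marker.write(f"{socket.gethostname()} {os.getpid()}\n")
        return slot
    raise RuntimeError("all slots are already in use")


def may_write(slot, now):
    """True only during the first 55 minutes of the hour assigned to `slot`."""
    return now.hour == slot and now.minute < 55


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as shared_dir:
        slot = claim_slot(shared_dir)
        print("claimed slot", slot)
        print("may write now:", may_write(slot, datetime.datetime.now()))
```

A process that finds `may_write` false simply keeps accumulating results in memory and checks again later, which matches the retry-next-day behaviour described above.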

An important feature of the programme is the way that one data file could be updated by any computer on the network, which consisted of both Sun workstations and PCs. Initially, the photon data was stored in the random access memory (RAM) of each computer in whatever format the computer normally used. This would be big-endian on a Sun computer, but little-endian on a PC [144], as shown in figure 8.6.

Figure 8.6 Diagram showing how a 4-byte integer (standard integer on 32 bit computers) would be stored in memory in both big-endian and little-endian formats. For the 4-byte integer 12345678 (base 16), bit positions 0-7, 8-15, 16-23 and 24-31 hold the bytes 12, 34, 56, 78 in big-endian format and 78, 56, 34, 12 in little-endian format.
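The two byte orders can be reproduced with Python's standard `struct` module. This is only an illustration of the layouts in the figure, using the same hexadecimal value; it is not part of the original programme.

```python
import struct

value = 0x12345678

# ">" forces big-endian (Sun) byte order, "<" forces little-endian (PC) order;
# "I" is a 4-byte unsigned integer.
big = struct.pack(">I", value)
little = struct.pack("<I", value)

print(big.hex(" "))     # 12 34 56 78
print(little.hex(" "))  # 78 56 34 12
```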

Before writing data to disk, there was a choice of using either binary or text. Binary is much more efficient in its use of disk space, using only 8 bytes to write a double precision floating point number (the data type used for most of the data), compared to 16 bytes if text were chosen. It is also simpler to read and write binary data, since one knows exactly where the data will be in a disk file, something that is not so easy to achieve when data is written as text. Unfortunately, since PCs and Suns store binary data in different formats, storing binary data in a way that both PCs and Suns can access is not without its problems. To circumvent this, data written to disk was always stored in the Sun (big-endian) format. Any PC version of the programme stored data internally in RAM in the little-endian format, but reversed the order of the bytes to convert to big-endian format before writing to disk.
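A minimal sketch of this convention in Python follows. The helper names are hypothetical and the original programme was not written this way. The sketch shows double precision values encoded as 8-byte big-endian records whatever the host byte order, the explicit byte reversal a little-endian machine would perform, and the fixed record offsets that make binary files easy to index.

```python
import struct
import sys


def to_disk_bytes(values):
    """Encode doubles as consecutive 8-byte big-endian records."""
    return struct.pack(f">{len(values)}d", *values)


def from_disk_bytes(data):
    """Decode consecutive 8-byte big-endian records back into doubles."""
    count = len(data) // 8
    return list(struct.unpack(f">{count}d", data))


def native_to_disk(value):
    """Explicit version of the conversion: reverse the bytes on a
    little-endian host, leave them unchanged on a big-endian host."""
    native = struct.pack("=d", value)  # "=" uses the host's own byte order
    return native[::-1] if sys.byteorder == "little" else native


def read_double_at(data, index):
    """Record number `index` always starts at byte offset 8 * index."""
    return struct.unpack_from(">d", data, 8 * index)[0]


if __name__ == "__main__":
    blob = to_disk_bytes([1.0, 2.5, -0.125])
    print(len(blob))                # 24
    print(read_double_at(blob, 1))  # 2.5
    print(native_to_disk(1.0) == to_disk_bytes([1.0]))  # True
```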

Before a programme wrote data to disk, it first ensured that the data already in the data file was for the same conditions, i.e. geometry, tissue optical properties etc. The numbers of launched and detected photons in the file were then incremented, along with all intensity and pathlength data, by the values accumulated by the programme.
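The update step can be sketched as follows. The `Results` record and its field names are hypothetical, chosen only to illustrate the check-then-accumulate logic; the actual layout of the results file is not given here.

```python
from dataclasses import dataclass, field


@dataclass
class Results:
    # Hypothetical fields: a tuple summarising the simulated conditions
    # (geometry, optical properties, ...), photon counts, and the
    # flattened intensity/pathlength data.
    conditions: tuple
    launched: int = 0
    detected: int = 0
    data: list = field(default_factory=list)


def merge(on_disk, local):
    """Return the contents of the results file after adding `local` to it."""
    if on_disk.conditions != local.conditions:
        raise ValueError("results file was produced for different conditions")
    if len(on_disk.data) != len(local.data):
        raise ValueError("data arrays have different sizes")
    return Results(
        conditions=on_disk.conditions,
        launched=on_disk.launched + local.launched,
        detected=on_disk.detected + local.detected,
        data=[a + b for a, b in zip(on_disk.data, local.data)],
    )


if __name__ == "__main__":
    conditions = (0.01, 1.0, 0.9)  # illustrative values only
    on_disk = Results(conditions, launched=5000, detected=3, data=[1.0, 0.5])
    local = Results(conditions, launched=1000, detected=1, data=[0.0, 0.25])
    merged = merge(on_disk, local)
    print(merged.launched, merged.detected, merged.data)  # 6000 4 [1.0, 0.75]
```

Because every contribution is a simple sum, the order in which the computers write their results does not affect the final totals.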

A paper describing this parallel Monte Carlo programme was published in 1997 [145].
