By using both parallelization via domain decomposition and via multi-spin coding, we can achieve very high speeds for simulating the Ising-model, up to 160 million sites per second on a Power-processor of the JUMP; with a clock frequency of 1.7 GHz, this translates to roughly ten cycles per spin. On an Opteron processor with 2.2 Ghz, speed is 90 million sites per second (25 cycles per spin).
When using a parallelized version on N processors, we are interested in the speedup S(N ) over a single processor. This speedup is often governed by Amdahl’s
law [Amda67], S(N ) = 1/(s+1−sN ), with s signifying the purely sequential portion of
the program, and 1 −s the purely parallel one. When keeping the overall lattice size L constant, we have so-called “strong scaling”, cf. figure 5.12; when using larger L for higher N , we have “weak scaling” (i. e. we adapt problem size to our computational capability).
One sequential part of the program is the communication for exchanging the border planes; as each processor needs to communicate with two neighbouring pro- cessors, the wall clock time spent on communicating does not depend on the number of processors, but on the amount of data that needs to be transferred (two times L sites for decomposition into strips of L × L/N).
5.7
Summary and outlook
Although it is possible to argue that the numerical data for very long times is doubt- ful, as the influence of fluctuations on the value of z increases, the current precision data seems bad for the Domany-Swendsen assumption. It would be possible to modify the fit to the data, but the trend for long times rather contradicts the value of z = 2. It is thus safe to say that the dynamical critical exponent is z > 2 with simple power-law behaviour, with current best estimate of z = 2.167(3).
0 5 10 15 20 25 30 35 40 0 5 10 15 20 25 30 35 40 S(N) N
Figure 5.12: Speedup for two-dimensional Ising model, + show median value of five runs. Solid line is Amdahl’s law with a sequential portion of s = 0.005. Total lattice size is kept constant at L = 10080.
generators is important, in order to find one which allows precise data, but is still
fast enough for large-scale simulations: for L = 1.5 · 105, averaged over 50 runs,
Ziff’s four-tap generator produces results which differ systematically from other
generators and system sizes. This is not the case for a single run with L = 2 · 106
and four-tap generator. Data for L = 5 · 104, averaged over 25 runs, showed the
same behavior as for L = 1.5 · 105. For these lattice sizes, R(471,1586,6988,9689)
seems not to be suited well.
Furthermore, the influence of finite-size effects for long times should be investi- gated in more detail, and the effects of fluctuations in the magnetization in general.
Summary and Outlook
6.1
Summary
In the previous chapters, we have examined several numerical methods for investi- gating the properties and the critical behaviour of percolation and the Ising model. In chapter 2, replication of the traditional Hoshen-Kopelman algorithm was pre- sented as a viable method for investigating fluctuations of cluster numbers. In chapter 3, a parallelized version of the Hoshen-Kopelman algorithm was presented, which uses domain decomposition of the hyperplane of investigation into strips, therefore reducing the amount of memory that needs to be stored per processor. This method allows for implementation on massively-parallel computers with dis- tributed memory, which utilize message-passing. Due to the chosen method of domain decomposition, it is possible to simulate huge lattices also for dimensions d > 2, while other approaches to domain decomposition published in literature are severely limited for d > 2. In chapter 4, a modification of the Hoshen-Kopelman algorithm was presented, which allows to simulate percolation on growing lattices, i. e. to simulate a lattice of size L1 and then extend it to size L2> L1 without re-
investigating the whole lattice; only newly added sites have to be investigated. This approach is very suitable for studying finite-size effects and generally all properties in dependence on system size. Chapter 5 deals with the Ising model. A method already used for chapter 3, i. e. domain decomposition, is also applied to this model, together with the method of multi-spin coding. This allowed for simulating huge lattices over many timesteps in order to investigate critical behaviour with high accuracy.
For all simulations of percolation, follwing values for pc were chosen:
d pc
2 0.5927464
3 0.311608
4 0.196889
5 0.1407966
When investigating fluctuations in percolation, we found that the distribution of
cluster numbers nsfor fixed size s is Gaussian for small s and large L. The position
of the maximum (i. e. hnsi) is in compliance to the power law ns∝ s−τ. For large
s or small L, the left-hand side of the distribution gets distorted and vanished,
while the right-hand side stays quasi-Gaussian, i. e. it fits exp(−const · sζ), with
ζ = 2 for small s, and ζ decaying for increasing s. Variance of cluster numbers shows the same behaviour as the mean cluster numbers, with deviations for small s, i. e. (hn2
si − hnsi2)/hnsi = 1 + k2s−∆2, with k2 = 0.25(5) and ∆2 = 1.2(2).
The skewness of the distribution grows linearly with s, while the kurtosis grows, 75
but without clear power-law behaviour. Both skewness and kurtosis are no good measures for large s, due to the strong distortion of the distributions.
By simulating huge lattices, we found (τ is the Fisher exponent, known exactly
for d = 2; ∆1is the exponent for corrections to scaling; nc is the number density):
d L τ ∆1 nc
2 7000000 187/91 0.73(2) 0.027597857(2)
3 25024 2.190(1) 0.65(5) 0.05243812(9)
4 1305 2.315(2) 0.48(8) 0.0519995(2)
5 225 2.41(1) 0.30(10) 0.0460321(2)
Simulations for d = 3, 4, 5 are world records. Results for d = 2 are for the largest published simulation.
By investigating growing lattices, we found (τ and ncas above; nspis the number
of spanning clusters; D is the fractal dimension of the largest cluster, known exactly for d = 2):
d τ nsp nc D
2 187/91 0.52(1) 0.0275979(4) 91/48
3 2.1895(10) 0.15(3) 0.0524381(1) 2.52(1)
nsp shows a delicate finite-size behaviour.
For the Ising model in two dimensions, we found with world record simulations that the dynamical critical exponent is not z = 2 with logarithmic corrections, but with high probability z > 2 with simple power-law behaviour, with the current best estimate z = 2.167(3).
This thesis has shown, like many other works in the recent years, that super- computing is a valuable tool for physics, giving rise to the branch of computational physics, sometimes considered to be the third branch after experimental and the- oretical physics. Due to ever increasing computer power, problems can now be investigated with numerical methods, which seemed to be completely inaccessible only one or few decades ago. Although purely analytical solutions are preferable to numerical work, it is now possible to decide questions that cannot be answered by paper and pencil alone.
Modern supercomputers are always massively-parallel; in order to use these ef- ficiently, it is necessary to choose parallel approaches. Two very different meth- ods have been demonstrated in this thesis: replication and domain decomposition. While domain decomposition is regarded to be the real thing and replication is gen- erally frowned upon (some even call it “poor man’s parallelization”), replication can be good science, too, and is suited very well for many problems. Domain decom- position enables us to simulate huge lattices, proved by world records obtained for this thesis.