3.2 The Phase Field Method
3.2.7 Banded Optimization
A good deal of the computation in the simulation is extraneous, because in many cases a large portion of the phase field grid is homogeneously ice or water. Eqns. 3.1 and 3.2 are nonzero only in regions where the phase field is heterogeneous, so any computation time spent in homogeneous regions is wasted.
Several optimization techniques have been proposed for phase field methods. For example, adaptive mesh refinement techniques (Provatas et al., 1999) have been used to increase the resolution of the solution around regions of interest. Additionally, diffusion Monte Carlo (DMC) techniques (Plapp and Karma, 2000) have been used to track the heat field far from the interface, resulting in significant computational savings. Far from the interface, heat is tracked as a set of particles whose dynamics are much cheaper to compute than fluxes over a mesh. However, these techniques also deal with the accurate simulation of solidification at scales much smaller than the mesh resolution. Since I am only concerned with visual simulation, these smaller scales are not of interest.
The optimization of interest in the adaptive mesh and DMC methods is the localization of computation to the grid cells along the interface. This goal can be achieved using a simple method, similar to the “narrow band” optimization used for level set methods (Adalsteinsson and Sethian, 1995a). In the narrow band level set method, instead of solving over the entire computational domain, computation is restricted to a narrow band of grid cells surrounding the region of interest. The region of interest in level set methods is usually a small neighborhood around the φ = 0 level set. A similar method can be used for phase fields, where the region of interest is the p = 0.5 isocontour. Since all computation takes place using finite differencing, the interface can move a maximum of one grid cell per iteration. If I restrict computation to grid cells that had a nonzero derivative on the previous iteration and their corresponding neighbors, then I restrict computation of Eqn. 3.1 and 3.2 to only those grid cells that could potentially change. This simple and effective optimization offers the same computational localization as the adaptive mesh and DMC methods, while adding minimal implementation complexity.
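The band bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the results in this section: the update rule `rate` is a stand-in discrete Laplacian rather than Eqn. 3.1 or 3.2, only a single field is tracked, the neighborhood is assumed to be the 4-connected stencil, and all function names (`rate`, `grow`, `initial_band`, `banded_step`) are hypothetical.

```python
# Hypothetical sketch of a banded update on an n x n grid.
# The update rule below is a stand-in, NOT Eqn. 3.1 or 3.2.

OFFSETS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connected stencil (assumed)


def rate(p, i, j):
    """Stand-in time derivative: discrete Laplacian with clamped borders."""
    n = len(p)
    total = 0.0
    for di, dj in OFFSETS:
        a = min(max(i + di, 0), n - 1)
        b = min(max(j + dj, 0), n - 1)
        total += p[a][b] - p[i][j]
    return total


def grow(cells, n):
    """Return `cells` plus their in-grid stencil neighbours."""
    band = set(cells)
    for i, j in cells:
        for di, dj in OFFSETS:
            a, b = i + di, j + dj
            if 0 <= a < n and 0 <= b < n:
                band.add((a, b))
    return band


def initial_band(p):
    """One full scan: every cell with a nonzero derivative, plus neighbours."""
    n = len(p)
    active = [(i, j) for i in range(n) for j in range(n)
              if rate(p, i, j) != 0.0]
    return grow(active, n)


def banded_step(p, band, dt):
    """Update only the cells in `band` in place; return the next band."""
    # Evaluate every derivative from the old field before writing anything.
    changes = {}
    for i, j in band:
        dp = rate(p, i, j)
        if dp != 0.0:
            changes[(i, j)] = p[i][j] + dt * dp
    for (i, j), value in changes.items():
        p[i][j] = value
    # Only cells that changed, or that border a changed cell, can change next.
    return grow(changes, len(p))
```

A caller would build the band once with `initial_band(p)` and then repeatedly assign `band = banded_step(p, band, dt)`. Because a cell with a zero derivative can only acquire a nonzero one after a value inside its stencil changes, the band built this way always covers every cell that a full-grid update would modify, at least for this stand-in rule.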
Table 3.2 compares banded and unbanded performance. I used the simulation from Figure 3.2(a) as the test case, and ran all of the simulations to the same physical time on a 1.73 GHz Pentium 4. As the resolution increases, the number of iterations increases because the timestep must be reduced to maintain numerical stability. As the resolution increases, the performance gain of the banded method appears to level off at about 5.5x. This performance gain will vary based on the input, but significant performance gains should be observed in all but the most pathological cases.
At first glance, it appears that the performance gain should continue to increase as the resolution increases. While this may be true for the narrow band level set method, it is not for narrow band phase fields. In narrow band level sets, a neighborhood several grid cells thick needs to be maintained around the interface. As the resolution increases, the grid cell sizes decrease and the physical thickness of the narrow band decreases as well. In the case of phase fields, I am tracking a band of values where the time derivative is nonzero, such as regions with significant heat flow. These regions do not shrink as the grid is refined. The fractional area of the simulation domain that they occupy remains roughly constant, which is why the speed gain levels off as the resolution increases. The narrow band takes up about 20% of the simulation domain, regardless of the grid resolution. The larger speedups at lower resolutions can probably be attributed to memory hierarchy effects.
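The plateau can be cross-checked with a back-of-the-envelope estimate. This is only a sketch under two assumptions that the text does not state: every grid cell costs the same amount of work c per iteration, and maintaining the band costs nothing. The symbols N (number of grid cells), f (fraction of cells inside the band), and S (speedup) are introduced here purely for illustration.

```latex
S \;=\; \frac{T_{\text{unbanded}}}{T_{\text{banded}}}
  \;\approx\; \frac{c\,N}{c\,f\,N}
  \;=\; \frac{1}{f},
\qquad
f \approx 0.2 \;\Rightarrow\; S \approx 5 .
```

Under these assumptions, a band covering about 20% of the domain predicts roughly a 5x gain, which is in line with the measured plateau of about 5.5x given that the 20% figure is itself approximate.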
Grid Size   Iterations   Unbanded   Banded    Speedup
128^2       500          8s         1s        8.0x
192^2       750          22s        2s        11.0x
256^2       1000         45s        4s        11.0x
320^2       1250         1m 14s     8s        9.2x
384^2       1500         2m 15s     19s       7.1x
448^2       1750         3m 10s     28s       6.8x
512^2       2000         4m 28s     42s       6.3x
576^2       2250         6m 2s      59s       6.1x
640^2       2500         8m 22s     1m 20s    6.3x
704^2       2750         10m 15s    1m 45s    5.9x
768^2       3000         12m 47s    2m 16s    5.6x
832^2       3250         15m 57s    2m 50s    5.6x
896^2       3500         20m 41s    3m 39s    5.7x
960^2       3750         23m 20s    4m 18s    5.4x
1024^2      4000         29m 39s    5m 19s    5.6x
Table 3.2: Banded vs. Unbanded Performance. I ran the same simulation at different resolutions. Higher resolutions required more iterations because of timestep restrictions. The speed gained from the banded version appears to level off at about 5.5x.
If the simulation is run long enough, the ice will expand to engulf most of the simulation domain. In this case, most of the domain will also be near a p = 0.5 isocontour, and the banded optimization will offer no significant speed advantage. However, almost all simulations start with little to no ice anywhere in the domain, and this is when the localization of the banded optimization offers the most drastic performance benefits. From a design standpoint, the initial stages of ice growth are also the part of the simulation that must run the fastest. When designing ice patterns, the ability to quickly preview the results of a parameter change is crucial to a smooth workflow.