
3.2 The Phase Field Method

3.2.7 Banded Optimization

A good deal of the computation in the simulation is extraneous, because in many cases a large portion of the phase field grid is homogeneously ice or water. Eqns. 3.1 and 3.2 are nonzero only in regions where the phase field is heterogeneous, so any computation time spent in homogeneous regions is wasted.

Several optimization techniques have been proposed for phase field methods. For example, adaptive mesh refinement techniques (Provatas et al., 1999) have been used to increase the resolution of the solution around regions of interest. Additionally, diffusion Monte Carlo (DMC) techniques (Plapp and Karma, 2000) have been used to track the heat field far from the interface, resulting in significant computational savings. Far from the interface, heat is tracked as a set of particles whose dynamics are much cheaper to compute than fluxes over a mesh. However, these techniques also address the accurate simulation of solidification at scales much smaller than the mesh resolution. Since I am only concerned with visual simulation, these smaller scales are not of interest.

The optimization that is of interest in the adaptive mesh and DMC methods is the localization of computation to the grid cells along the interface. This goal can be achieved using a simple method, similar to the “narrow band” optimization used for level set methods (Adalsteinsson and Sethian, 1995a). In the narrow band level set method, instead of solving over the entire computational domain, computation is restricted to a narrow band of grid cells surrounding the region of interest. The region of interest in level set methods is usually a small neighborhood around the φ = 0 level set. A similar method can be used for phase fields, where the region of interest is the neighborhood of the p = 0.5 isocontour. Since all computation takes place using finite differencing, the interface can move a maximum of one grid cell per iteration. If I restrict computation to grid cells that had a nonzero derivative on the previous iteration and their corresponding neighbors, then I will restrict computation of Eqns. 3.1 and 3.2 to only those grid cells that could potentially change. This simple and effective optimization offers the same computational localization as the adaptive mesh and DMC methods, while adding minimal implementation complexity.
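The band bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the timings in this section: a plain explicit diffusion update stands in for Eqns. 3.1 and 3.2, ice and water are assumed to be stored as 1.0 and 0.0, and the names banded_step, full_step, make_grid and run_banded are hypothetical.

```python
def laplacian(p, i, j):
    """5-point Laplacian; indices are clamped at the border (zero-flux)."""
    n, m = len(p), len(p[0])
    return (p[max(i - 1, 0)][j] + p[min(i + 1, n - 1)][j]
            + p[i][max(j - 1, 0)] + p[i][min(j + 1, m - 1)]
            - 4.0 * p[i][j])


def cell_and_neighbors(i, j, n, m):
    """Yield (i, j) and its in-grid 4-neighbors."""
    for di, dj in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):
        a, b = i + di, j + dj
        if 0 <= a < n and 0 <= b < m:
            yield a, b


def banded_step(p, band, dt=0.1):
    """Advance only the cells in `band`, in place; return the next band."""
    n, m = len(p), len(p[0])
    # Evaluate every derivative from the old field before writing anything.
    deltas = {(i, j): dt * laplacian(p, i, j) for (i, j) in band}
    next_band = set()
    for (i, j), d in deltas.items():
        if d != 0.0:
            p[i][j] += d
            # A cell that changed can only affect itself and its neighbors.
            next_band.update(cell_and_neighbors(i, j, n, m))
    return next_band


def full_step(p, dt=0.1):
    """Reference: advance every cell (the unbanded version)."""
    n, m = len(p), len(p[0])
    deltas = [[dt * laplacian(p, i, j) for j in range(m)] for i in range(n)]
    for i in range(n):
        for j in range(m):
            p[i][j] += deltas[i][j]


def make_grid(n):
    """All water (0.0) with a small 2x2 ice seed (1.0) in the center."""
    p = [[0.0] * n for _ in range(n)]
    for i in range(n // 2 - 1, n // 2 + 1):
        for j in range(n // 2 - 1, n // 2 + 1):
            p[i][j] = 1.0
    return p


def run_banded(n=32, steps=5):
    """Run the banded update; the first pass covers the whole grid."""
    p = make_grid(n)
    band = {(i, j) for i in range(n) for j in range(n)}
    for _ in range(steps):
        band = banded_step(p, band)
    return p, band
```

In this sketch the banded and unbanded updates produce identical fields, because a cell outside the band has a zero derivative and an unchanged stencil. Only the first pass touches every cell; afterwards the work per iteration is proportional to the size of the band.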

Table 3.2 compares banded and unbanded performance. I used the simulation from Figure 3.2(a) as the test case, and ran all of the simulations to the same physical time on a 1.73 GHz Pentium 4. As the resolution increases, the number of iterations also increases, because the timestep must be reduced to maintain numerical stability. At the higher resolutions, the performance gain of the banded method appears to level off at about 5.5x. This performance gain will vary based on the input, but significant performance gains should be observed in all but the most pathological cases.

At first glance, it appears that the performance gain should continue to increase as the resolution increases. While this may be true for the narrow band level set method, it is not for narrow band phase fields. In narrow band level sets, a neighborhood several grid cells thick needs to be maintained around the interface. As the resolution increases, the grid cell sizes decrease and the physical thickness of the narrow band decreases as well. In the case of phase fields, I am tracking a band of values where the time derivative is nonzero, such as regions with significant heat flow. These regions do not shrink as the grid is refined. The fractional area of the simulation domain that they occupy remains static, which is why the speed gain levels off as the resolution increases. The narrow band takes up about 20% of the simulation domain, regardless of the grid resolution. The larger speedups at lower resolutions can probably be attributed to memory hierarchy effects.
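The difference between the two kinds of band can be made concrete with a rough cost model. The sketch below assumes a straight interface spanning an n x n grid and a cost per iteration proportional to the number of cells updated. The 6-cell level set band is an arbitrary illustrative value; the 20% figure is the band fraction quoted above. All function names are hypothetical.

```python
def level_set_band_fraction(n, cells_thick=6):
    """Band a fixed number of cells thick: its share of the grid shrinks as 1/n."""
    return min(1.0, cells_thick * n / (n * n))


def phase_field_band_fraction(n, physical_width=0.2):
    """Band of fixed physical width (as a fraction of the domain width):
    the number of cells across grows with n, so its share stays constant."""
    cells_thick = physical_width * n
    return min(1.0, cells_thick * n / (n * n))


def ideal_speedup(fraction):
    """Upper bound on the speedup if cost scales with the cells updated."""
    return 1.0 / fraction


for n in (128, 256, 512, 1024):
    ls = ideal_speedup(level_set_band_fraction(n))
    pf = ideal_speedup(phase_field_band_fraction(n))
    print(f"{n:5d}  level set: {ls:7.1f}x  phase field: {pf:4.1f}x")
```

Under these assumptions the level set bound keeps growing with n, while the phase field bound stays at about 5x, which is roughly in line with the plateau reported in Table 3.2.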

Grid Size   Iterations   Unbanded   Banded    Speedup
128²        500          8s         1s        8.0x
192²        750          22s        2s        11.0x
256²        1000         45s        4s        11.0x
320²        1250         1m 14s     8s        9.2x
384²        1500         2m 15s     19s       7.1x
448²        1750         3m 10s     28s       6.8x
512²        2000         4m 28s     42s       6.3x
576²        2250         6m 2s      59s       6.1x
640²        2500         8m 22s     1m 20s    6.3x
704²        2750         10m 15s    1m 45s    5.9x
768²        3000         12m 47s    2m 16s    5.6x
832²        3250         15m 57s    2m 50s    5.6x
896²        3500         20m 41s    3m 39s    5.7x
960²        3750         23m 20s    4m 18s    5.4x
1024²       4000         29m 39s    5m 19s    5.6x

Table 3.2: Banded vs. Unbanded Performance. I ran the same simulation at different resolutions. Higher resolutions required more iterations because of timestep restrictions. The speed gained from the banded version appears to level off at about 5.5x.
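For readers who want to recompute the Speedup column from the printed timings, a small helper such as the following could be used. It is only a sketch: parse_duration and speedup are hypothetical names, and the two sample rows are the 384² and 1024² entries of Table 3.2. Because the timings are printed in whole seconds, ratios recomputed for the shortest runs may differ slightly from the printed values.

```python
import re


def parse_duration(text):
    """Convert a timing such as '2m 15s' or '42s' to whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m\s*)?(\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return 60 * int(minutes or 0) + int(seconds)


def speedup(unbanded, banded):
    """Ratio of unbanded to banded running time."""
    return parse_duration(unbanded) / parse_duration(banded)


# Two rows taken from Table 3.2: (grid size, unbanded, banded).
rows = [("384", "2m 15s", "19s"), ("1024", "29m 39s", "5m 19s")]
for size, unbanded, banded in rows:
    print(f"{size}^2: {speedup(unbanded, banded):.1f}x")
```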

If the simulation is run long enough, the ice will expand to engulf most of the simulation domain. In this case, most of the domain will also be near a p = 0.5 isocontour, and the banded optimization will offer no significant speed advantage. However, almost all simulations start with little to no ice anywhere in the domain, and this is when the localization of the banded optimization offers the most drastic performance benefits. From a design standpoint, the initial stages of ice growth are also the part of the simulation that must run the fastest. When designing ice patterns, the ability to quickly preview the results of a parameter change is crucial to a smooth workflow.

In document Physically-based simulation of ice formation (Page 68-70)