
Chapter 5 Efficient 3D PIV interrogation algorithms

5.2 Performance assessment

The performance is assessed in four layouts: synthetic distributions with a small uniform sub-pixel displacement; a shear displacement on distributions reconstructed by 5 MART iterations; synthetic distributions with a simulated jet profile; and real images of a swirling jet. Each layout provides information on a different feature of the algorithm. The small uniform displacement highlights the performance limit of the algorithm. The shear displacement on reconstructed distributions focuses on the effects of the ghost particles in the presence of spatial velocity gradients. The synthetically generated jet profile tests the algorithm in the presence of tuned, strong local gradients; in this case the ghost particles would modulate the velocity gradient along the depth direction and make the performance of the algorithm undesirably anisotropic, so the analysis is performed on the original synthetic distributions. The real images of a swirling jet combine all these aspects.

The tests are carried out on a single core of a 3.07 GHz i7 processor, so that the effects of the parallel implementation are reduced (with the exception of the last case, in which all 4 cores are employed). In many cases the results are presented in terms of speed-up with respect to a standard reference method (i.e. the full FFT analysis performed on the same grid and with the same IV size). This speed-up refers only to the predictor and corrector evaluation steps (i.e. the processing time to compute the dense predictor field, to interpolate the volumes and to execute the validation is neglected).

In the following, the search radius is set to 1 voxel for the direct correlation algorithms and to 2 voxels for the block FFT based algorithm.


5.2.1 Test case 1 - Uniform displacement

Synthetic distributions of spherical particles with a Gaussian profile and a diameter of 3 voxels are interrogated. A volume of is discretized with resolution, resulting in voxels; the IV is 40³ voxels. A uniform displacement of 0.2 voxels is imposed, to ensure that all the displacements are within the 1 voxel search radius and, therefore, that direct correlations always succeed in finding the correlation peak. Seeding concentrations of 1.25 and 3.75 particles/mm³ (resulting in approximately 0.5% and 1.5% of voxels with non-zero intensity, respectively, when a threshold at 0.5% of the peak intensity is applied) are tested. 20 iterations are executed to obtain well-converged time statistics.
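The test layout described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function names, the number of particles, the reading of the particle diameter as the e^-2 width (sigma = diameter / 4), the margin that keeps particles away from the IV border, and the three-point Gaussian peak fit are all assumptions made for illustration. The sketch renders Gaussian particles, imposes a uniform 0.2 voxel displacement along x, evaluates the direct correlation only within a search radius of 1 voxel, and recovers the sub-voxel displacement from the resulting 3 x 3 x 3 map.

```python
import numpy as np
from itertools import product


def gaussian_particles(shape, centers, diameter=3.0):
    """Render particles with a Gaussian intensity profile.

    Assumption: `diameter` is read as the e^-2 width, so sigma = diameter / 4.
    """
    sigma = diameter / 4.0
    grids = np.meshgrid(*(np.arange(n, dtype=float) for n in shape), indexing="ij")
    vol = np.zeros(shape)
    for c in centers:
        r2 = sum((g - ci) ** 2 for g, ci in zip(grids, c))
        vol += np.exp(-r2 / (2.0 * sigma ** 2))
    return vol


def direct_correlation(a, b, origin, size, radius=1):
    """Direct cross-correlation of one IV, only for shifts within `radius` voxels."""
    z0, y0, x0 = origin
    win_a = a[z0:z0 + size, y0:y0 + size, x0:x0 + size]
    n = 2 * radius + 1
    cmap = np.empty((n, n, n))
    for iz, iy, ix in product(range(n), repeat=3):
        z, y, x = z0 + iz - radius, y0 + iy - radius, x0 + ix - radius
        win_b = b[z:z + size, y:y + size, x:x + size]
        cmap[iz, iy, ix] = np.sum(win_a * win_b)
    return cmap


def gaussian_peak_fit(cmap):
    """Three-point Gaussian fit around the maximum, one axis at a time."""
    peak = np.unravel_index(np.argmax(cmap), cmap.shape)
    radius = (cmap.shape[0] - 1) // 2
    disp = []
    for axis in range(3):
        i = peak[axis]
        if i == 0 or i == cmap.shape[axis] - 1:
            disp.append(float(i - radius))  # peak on the map border: no fit possible
            continue
        lo, hi = list(peak), list(peak)
        lo[axis] -= 1
        hi[axis] += 1
        lm, l0, lp = np.log(cmap[tuple(lo)]), np.log(cmap[peak]), np.log(cmap[tuple(hi)])
        disp.append(i - radius + 0.5 * (lm - lp) / (lm - 2.0 * l0 + lp))
    return np.array(disp)


def demo(seed=0, size=40, radius=1, n_particles=50, shift=(0.0, 0.0, 0.2)):
    """Uniform displacement test on one IV; returns the estimated (z, y, x) shift."""
    rng = np.random.default_rng(seed)
    shape = (size + 2 * radius,) * 3
    # Assumption: centres are kept 4 voxels inside the IV to limit truncation effects.
    centers = rng.uniform(radius + 4, radius + size - 4, size=(n_particles, 3))
    a = gaussian_particles(shape, centers)
    b = gaussian_particles(shape, centers + np.asarray(shift))
    cmap = direct_correlation(a, b, (radius,) * 3, size, radius)
    return gaussian_peak_fit(cmap)
```

Calling `demo()` returns the estimated displacement in (z, y, x) order. Only 27 correlation values are evaluated per IV, which is why the imposed displacement must stay within the search radius for the peak to be found.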

The speed-up with respect to the equivalent process with computation of the full cross-correlation map by FFT is reported in Fig. 5.3. Standard direct correlations, without the aid of sparse matrices and redundancy avoidance, reduce the processing time by a factor of approximately 4, irrespective of the IV overlap. Among the other schemes, which reduce the redundant calculations and exploit the sparsity of the distributions, block direct correlations are very effective when the overlap ranges between 25% and 75%. Indeed, in this case the number of operations in the cross-correlation computation step, being almost independent of the overlap, is much smaller than that of both 1D DC and 2D DC. With 75% overlap, the processing time is reduced by factors of about 800 and 400 for the lowest and the highest tested density, respectively, with respect to the standard processing by FFT. For the highest tested overlap the best-performing approach is 1D DC, which achieves a speed-up of over 1000 for the lower source density.
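The benefit of exploiting the sparsity of the distributions can be sketched as follows. This is an illustrative Python example under stated assumptions, not the code used in this work: the function names and the coordinate-list storage of the non-zero voxels are hypothetical choices. Only the voxels of the first IV above a threshold enter the sums, so the number of multiplications per shift scales with the number of non-zero voxels instead of the IV volume, and the result matches the dense direct correlation.

```python
import numpy as np
from itertools import product


def dense_direct_correlation(win_a, region_b, radius=1):
    """Reference: direct correlation using every voxel of the IV.

    `region_b` is the IV of the second volume padded by `radius` voxels per side.
    """
    size = win_a.shape[0]
    n = 2 * radius + 1
    cmap = np.empty((n, n, n))
    for iz, iy, ix in product(range(n), repeat=3):
        win_b = region_b[iz:iz + size, iy:iy + size, ix:ix + size]
        cmap[iz, iy, ix] = np.sum(win_a * win_b)
    return cmap


def sparse_direct_correlation(win_a, region_b, radius=1, threshold=0.0):
    """Same map, but only voxels of `win_a` above `threshold` are visited."""
    coords = np.argwhere(win_a > threshold)   # (N, 3) indices of non-zero voxels
    values = win_a[tuple(coords.T)]           # their intensities, same order
    n = 2 * radius + 1
    cmap = np.empty((n, n, n))
    for iz, iy, ix in product(range(n), repeat=3):
        # Look up the second volume at the shifted positions of the non-zero voxels.
        shifted = region_b[coords[:, 0] + iz, coords[:, 1] + iy, coords[:, 2] + ix]
        cmap[iz, iy, ix] = np.dot(values, shifted)
    return cmap


def make_sparse_pair(size=20, radius=1, fill=0.015, seed=1):
    """Hypothetical test data: about `fill` of the voxels are non-zero."""
    rng = np.random.default_rng(seed)
    a = rng.random((size,) * 3) * (rng.random((size,) * 3) < fill)
    b = np.zeros((size + 2 * radius,) * 3)
    b[radius:radius + size, radius:radius + size, radius:radius + size] = a
    return a, b
```

With the default threshold of zero and non-negative intensities the two functions return the same map; a positive threshold would discard weak voxels and make the sparse result an approximation.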

The Block FFT algorithm, with a search radius of 2 voxels, is slower than the standard FFT-based method when the IVs overlap only slightly, mainly because of the overhead due to the summation of the maps. Increasing the overlap improves the efficiency of the Block FFT approach; however, the speed-up remains lower than that of the methods based on sparse direct cross-correlation and redundancy avoidance. Nevertheless, as already noted, a broader search area is still an advantage when the predictor is modulated, since a smaller number of cross-correlation maps have to be re-computed.
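The summation of block maps can be illustrated with a short Python sketch. It is a simplified stand-in and not the algorithm of this work: the block maps are computed here by direct correlation instead of FFT, the block size is assumed equal to the grid spacing (so an IV with 75% overlap spans 4 blocks per linear dimension), and all function names are hypothetical. Each block map is computed once for the whole volume; the map of any IV is then the sum of the maps of the blocks it contains, which is where both the saving and the summation overhead come from.

```python
import numpy as np
from itertools import product


def cube_correlation(a, b, origin, size, radius):
    """Direct correlation of the cube of `a` at `origin` for shifts within `radius`."""
    z0, y0, x0 = origin
    win_a = a[z0:z0 + size, y0:y0 + size, x0:x0 + size]
    n = 2 * radius + 1
    cmap = np.empty((n, n, n))
    for iz, iy, ix in product(range(n), repeat=3):
        z, y, x = z0 + iz - radius, y0 + iy - radius, x0 + ix - radius
        cmap[iz, iy, ix] = np.sum(win_a * b[z:z + size, y:y + size, x:x + size])
    return cmap


def all_block_maps(a, b, block, radius):
    """Correlation maps of every non-overlapping block (cubic volume assumed).

    A border of `radius` voxels is left around the blocks so that the shifted
    windows of the second volume stay inside the array.
    """
    nb = (a.shape[0] - 2 * radius) // block
    n = 2 * radius + 1
    maps = np.empty((nb, nb, nb, n, n, n))
    for i, j, k in product(range(nb), repeat=3):
        origin = (radius + i * block, radius + j * block, radius + k * block)
        maps[i, j, k] = cube_correlation(a, b, origin, block, radius)
    return maps


def iv_map_from_blocks(maps, first_block, blocks_per_iv):
    """Map of the IV whose first block is `first_block`: sum of its block maps."""
    i, j, k = first_block
    m = blocks_per_iv
    return maps[i:i + m, j:j + m, k:k + m].sum(axis=(0, 1, 2))
```

For example, with `block=4` and `blocks_per_iv=4` the IV is 16 voxels wide and neighbouring IVs overlap by 75%, so each block map is reused by up to 64 IVs. In this direct-correlation version the summed map equals the map computed on the whole IV to within round-off.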

As shown in Fig. 5.4, where the processing time, normalized with respect to the time to compute the standard FFT interrogation without overlapping windows, is plotted as a function of the number of vectors per linear dimension of the IV, the methods with pre-calculations along segments (1D DC) or planes (2D DC) perform better than the one with block cross-correlations only when the overlap is limited or very high. As expected, 1D DC is normally faster than 2D DC, since the redundancy of operations is minimized. The number of operations varies almost linearly with the number of vectors to be computed for each linear dimension of the IV in the case of 1D DC, while for 2D DC the dependence is approximately quadratic.

Fig. 5.3 Speed-up vs. overlap percentage for 1.25 (a) and 3.75 (b) particles/mm³.

Fig. 5.4 Normalized time (with respect to the time to compute the standard FFT interrogation without overlapping windows) as a function of the number of vectors per linear dimension of the IV for the lowest tested seeding density.

5.2.2 Test case 2 - One-dimensional shear displacement

In this second case a volume with the same geometric features as in Test case 1 is built starting from four independent views (4 cameras are placed on a horizontal xz plane, with a uniform angular separation of 15°). The magnification is approximately 0.135 and is almost uniform throughout the volume. The imaging is simulated with a pixel pitch of 6.67 µm and an f# of 13, so that the diameter of the particle images is about 2.9 pixels. Custom-made software reconstructs the intensity distributions by 5 MART iterations, starting from a uniform first guess. The volume is discretized with , so that the resolution ratio is close to
