6.2 Concurrent k-NN searches using MMF
6.2.3 Concurrent search for k-NN
When computing the k-NN of a given point, our approach ensures that the correct k-nearest neighbours are returned. In general, given two sets of points P_g and P_n, with P_n ⊂ P_g, the method ensures that P_g contains the k-nearest neighbours of all points in P_n. Unlike the approach of Sankaranarayanan et al. (2007), which pre-computes the set P_g before searching for the k-NN of points in P_n, the method verifies that this is the case for each point in P_n once its k-NN have been determined. Since each point is located in an axis-aligned cell, the shortest distance d between the position of the point and any one of the boundary planes of the cell can be determined very efficiently. Figure 6.4 describes how this is done in 2D. After determining the k-NN, the algorithm checks whether the distance between the k-th neighbour and the current point is smaller than d. If it is, no point outside the cell can be closer than the k-th neighbour, so the currently chosen neighbours are correct and can be returned; otherwise the point is flagged for re-computation of its k-NN, taking into consideration a larger set of adjacent ghost cells. Algorithm 12 describes in detail how the search for k-NN works.

Algorithm 11 Sort points in P and persist to files
Input: P, G with counts for each cell, clusters OC
for each cluster OC_i do
    Create MMF to store points in OC_i
    Update G with file position offsets of cells in OC_i
    cnt_total = number of points in OC_i
    cnt_written = 0
    for each point p_j ∈ P do
        if p_j falls within this cluster then
            Retrieve cell C_k where p_j is located
            Write p_j to file at position offset indicated at C_k
            cnt_written = cnt_written + 1
            Increment offset at C_k
        end if
        if cnt_written == cnt_total then
            Flush MMF of OC_i
            Continue with the next cluster
        end if
    end for
end for
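The per-cell offset bookkeeping of Algorithm 11 can be illustrated with a minimal Python sketch for a single cluster. It is not the implementation evaluated here: the float32 record layout and the names `cell_key`, `persist_cluster` and `read_cell` are assumptions made for illustration, and the per-cell counts are recomputed locally rather than taken from G.

```python
import mmap
import struct

POINT = struct.Struct("<3f")  # assumed record layout: x, y, z as float32


def cell_key(point, cell_size):
    """Integer key of the axis-aligned grid cell that contains the point."""
    return tuple(int(c // cell_size) for c in point)


def persist_cluster(points, cell_size, path):
    """Write one cluster's points to a memory-mapped file, grouped by cell.

    Returns {cell key: (byte offset of the cell's first point, point count)}.
    """
    # Per-cell counts; Algorithm 11 receives these as part of its input G.
    counts = {}
    for p in points:
        key = cell_key(p, cell_size)
        counts[key] = counts.get(key, 0) + 1

    # A running sum over the counts gives each cell its starting byte offset.
    layout, cursor = {}, 0
    for key in sorted(counts):
        layout[key] = (cursor, counts[key])
        cursor += counts[key] * POINT.size
    if cursor == 0:
        return layout  # nothing to write; an empty file cannot be mapped

    with open(path, "wb") as f:
        f.truncate(cursor)  # size the file before mapping it
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as mm:
        write_pos = {key: offset for key, (offset, _) in layout.items()}
        for p in points:
            key = cell_key(p, cell_size)
            POINT.pack_into(mm, write_pos[key], *p)  # write at the cell's offset
            write_pos[key] += POINT.size             # then advance that offset
        mm.flush()
    return layout


def read_cell(path, layout, key):
    """Read back all points of one cell through a read-only memory map."""
    offset, count = layout[key]
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return [POINT.unpack_from(mm, offset + i * POINT.size)
                    for i in range(count)]
```

Because every cell occupies a contiguous byte range, a later pass can read exactly the points of one cell (and of its ghost cells) without touching the rest of the file.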
Each processing element (PE) in the system atomically retrieves the next available cell in the currently active OC cluster and performs a k-NN search for every point in that cell. The searches are carried out on a temporary kd-tree built over the points in the currently active grid cell. When all searches for the cell are done, the kd-tree is deleted from memory. Temporary kd-trees are created and deleted in this way for all cells in G.
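The dispatch scheme described above can be sketched as follows. This is an illustrative sketch, not the evaluated implementation: a lock-protected counter stands in for the atomic retrieval of the next cell, and a brute-force search (`knn_brute_force`, a hypothetical helper) stands in for the temporary kd-tree. The name `process_cells` is likewise an assumption.

```python
import heapq
import math
import threading


def knn_brute_force(points, query, k):
    """Stand-in for a per-cell kd-tree query: the k points nearest to query."""
    others = (p for p in points if p is not query)
    return heapq.nsmallest(k, others, key=lambda p: math.dist(p, query))


def process_cells(cells, k, num_workers=4):
    """cells: list of point lists, one per grid cell.

    Returns, for each cell, the k-NN list of each of its points.
    """
    results = [None] * len(cells)
    next_idx = 0
    lock = threading.Lock()

    def worker():
        nonlocal next_idx
        while True:
            with lock:  # atomic fetch-and-increment of the shared cell index
                idx = next_idx
                next_idx += 1
            if idx >= len(cells):
                return
            pts = cells[idx]  # a temporary kd-tree would be built over pts here
            results[idx] = [knn_brute_force(pts, q, k) for q in pts]
            # the temporary search structure is dropped before the next cell

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each worker only ever holds the search structure of the cell it is currently processing, which is what keeps the memory footprint bounded by the cell size rather than by the size of the point cloud.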
6.3 Results
The out-of-core extension to PaRSe is evaluated on a number of point clouds ranging in size from 53K to 333M points. All experiments are carried out on an Intel Core 2 Quad machine running Windows 7 with SATA2 hard disks. In order to evaluate performance under different memory configurations, the same machine is fitted with 1 GB, 2 GB, 4 GB and 8 GB of system RAM. Experiments are conducted to evaluate the scalability of the approach as the size of the point cloud is increased across these memory configurations.

In addition to an implementation of the concurrent grid-based multi kd-tree (GridXKd) approach described above, two further implementations are evaluated for comparison. The first takes the traditional in-core approach, where a single kd-tree is constructed over all points in the data set, and is referred to as in-core kd-tree (ICKd). This implementation should provide the best possible performance whenever enough memory is available to hold the kd-tree. The PCL library (Rusu & Cousins, 2011) is used for this implementation, which also uses memory-mapped binary files to store points. The second implementation works exactly like GridXKd, but does not use memory-mapped files; instead, it loads all points into the sparse grid data structure (rather than just the required number of points) before starting to compute k-NN, and is referred to as in-core concurrent grid-based multi kd-tree (ICGridXKd). In all cases the FLANN library (Muja & Lowe, 2009a) is used to implement kd-tree based k-NN searches, and the error-bound parameter is set to zero. Moreover, all implementations use all four processing elements available on the machine to compute k-NN concurrently.

Algorithm 12 Compute k-NN for all points p_i ∈ P
Input: G, cluster set OC
for each cluster OC_i do
    Memory map file with points in OC_i
    Update file position offsets of cells in OC_i
    Generate array CellArr storing keys of cells in OC_i
    cellCount = size(CellArr) - no. of ghost cells in OC_i
    crtCellIdx = index of first non-ghost cell
    while crtCellIdx < cellCount do
        Atomically assign crtCellIdx to a PE
        PE generates kd-tree on points in CellArr[crtCellIdx]
        for each point p_j in CellArr[crtCellIdx] do
            Search for k-NN of p_j
            d = shortest dist(p_j, CellArr[crtCellIdx] planes)
            if dist(p_j, NN_k) > d then
                Add p_j to k-NN re-computation list RL
            end if
        end for
        while sizeof(RL) > 0 do
            Update kd-tree with points from adjacent cells
            Compute k-NN for p_j
            d += extent of CellArr[crtCellIdx]
            if dist(p_j, NN_k) < d then
                Remove p_j from re-computation list RL
            end if
        end while
        Delete kd-tree
        Atomically increment crtCellIdx
    end while
end for
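The correctness test used in Algorithm 12 (compare the distance to the k-th neighbour against the shortest distance d to the cell boundary) can be sketched in a few lines. The function names and the representation of a cell by its minimum and maximum corners are assumptions made for illustration. The test relies on the fact that any point outside the cell lies at a distance of at least d from the query point.

```python
import math


def boundary_distance(p, cell_min, cell_max):
    """Shortest distance from p to any boundary plane of its axis-aligned cell."""
    return min(min(c - lo, hi - c) for c, lo, hi in zip(p, cell_min, cell_max))


def needs_recomputation(p, neighbours, cell_min, cell_max):
    """True if the k-th (farthest) neighbour is farther away than the boundary.

    In that case a point in an adjacent cell could be closer than the k-th
    neighbour, so the result cannot be trusted yet (the `> d` test in
    Algorithm 12).
    """
    d = boundary_distance(p, cell_min, cell_max)
    kth_distance = max(math.dist(p, q) for q in neighbours)
    return kth_distance > d
```

For example, a point at the centre of a unit cell whose neighbours all lie within 0.1 of it passes the test, whereas a point 0.1 away from a cell face whose k-th neighbour is 0.4 away is flagged for re-computation.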
Table 6.1 lists the point clouds used in the experiments. In all cases except Mnajdra and Songo, the data has been generated from polygonal models. In the case of SongoX2, SongoX4 and SongoX8, the original point cloud was up-sampled (using Algorithm 3) in order to increase the number of points. For each point in the original point cloud, an additional point is created as the spatial average of the two nearest neighbours. Figure 6.5 illustrates three of the point clouds used.
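The up-sampling step described above can be sketched as follows. This is only an illustration, not Algorithm 3 itself: it assumes that "the two nearest neighbours" means the two points closest to the current point (excluding the point itself), and it uses a brute-force neighbour search. The name `upsample` is hypothetical.

```python
import heapq
import math


def upsample(points):
    """Return the original points plus one extra point per original point.

    Assumed reading: each extra point is the midpoint of the two nearest
    neighbours of the original point.
    """
    if len(points) < 3:
        raise ValueError("need at least three points to find two neighbours")
    extra = []
    for p in points:
        a, b = heapq.nsmallest(
            2,
            (q for q in points if q is not p),
            key=lambda q: math.dist(p, q),
        )
        extra.append(tuple((x + y) / 2 for x, y in zip(a, b)))
    return list(points) + extra
```

Each pass adds one point per input point and so doubles the point count, which is in line with the sizes reported for Songo, SongoX2, SongoX4 and SongoX8.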
Model name    Size (M points)    Cell count in grid
obelisk       0.053              1,097
mnajdra       0.579              6,087
conference    2.3                6,338
sibenik       6                  201,756
songo         41                 95,999
songoX2       83                 96,940
songoX4       166                96,853
songoX8       333                97,253
Table 6.1: Point clouds, corresponding number of points and number of cells created during loading phase in sparse grid