Geometry representations and data structures evaluation

As mentioned in a previous chapter, selecting the right geometry representation and underlying data structure is a key component of a geometry processing system. In 5-axis milling simulation the workpiece geometry is far less constrained than in 3-axis milling, and the height map representation used earlier in this work does not fit the new purpose naturally. A new geometry representation therefore has to be selected or specially designed for this work.

Before discussing possible solutions, it is important to define the set of features and properties that a new geometry representation has to provide.

• First of all, it has to be able to represent any possible geometry with high precision and without topological limitations, while using a reasonable amount of memory.

• From the parallel processing point of view, the underlying data structure has to allow geometry to be rendered and edited with a high level of parallelism and without significant synchronization overhead.

• From the scalability point of view, it is important to be able to split a model between multiple devices connected by very limited communication channels.

It is important to note that scalability and parallel processing matter more than the performance of serial algorithms that process this data structure. In contrast to CPU performance, GPU performance continues to grow over time and is less constrained by physical limitations. If a geometry representation scales efficiently, it is therefore much easier to increase the available performance linearly, just by using more GPUs or newer processors.

Existing geometry representations can now be evaluated against the formulated requirements.

BREP, the most popular geometry representation in the CAD/CAM field, meets the accuracy and memory usage criteria but fails both the parallelization and the scalability requirements. The problem with parallelization, and especially with GPGPU processing of BREP, is related to the mathematical complexity of the surface element representation and to the absence of spatial bounds on surface elements. For example, there is no bound on the number of surfaces that represent a given region of the workpiece surface: a region can be represented either by a single surface or by thousands of surfaces. It means that there is no way to guarantee a high number of elements that can be processed independently in parallel and provide enough GPU load. Differences in the mathematical description of surface elements make the situation even worse, since even independent surface elements cannot be processed by the same algorithm. This results in inefficient GPU utilization, because threads in the same warp that follow different code paths have to wait for each other and cannot process their elements concurrently.
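The cost of mixing surface types within a warp can be illustrated with a deliberately simplified model. The sketch below assumes a warp of 32 threads, one surface element per thread, and one serialized pass per distinct surface type present in the warp. The surface type names and the `serialized_passes` helper are hypothetical and only serve the illustration; real hardware behavior is more nuanced.

```python
WARP_SIZE = 32  # assumed warp width (32 threads on NVIDIA GPUs)


def serialized_passes(surface_types, warp_size=WARP_SIZE):
    """Count execution passes under a simplified SIMT divergence model.

    Elements are assigned to threads in order. Each warp needs one pass
    per distinct surface type among its elements, because threads that
    take different code paths cannot run at the same time.
    """
    passes = 0
    for start in range(0, len(surface_types), warp_size):
        warp = surface_types[start:start + warp_size]
        passes += len(set(warp))
    return passes


# Hypothetical BREP face list that mixes four kinds of surfaces (64 faces).
mixed_faces = ["plane", "cylinder", "nurbs", "torus"] * 16
# The same number of elements, all described by one surface type.
uniform_faces = ["plane"] * 64

print(serialized_passes(mixed_faces))    # 2 warps x 4 types = 8
print(serialized_passes(uniform_faces))  # 2 warps x 1 type  = 2
```

Under these assumptions the mixed list needs four times as many passes as the uniform one for the same number of elements, which is the utilization loss described above.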

The triangular mesh represents a tradeoff between geometric accuracy and the complexity of the surface representation. It can be viewed as a BREP whose surface elements are simplified to planar faces, each connected to exactly three neighbors (triangles). Using triangles allows the same algorithm to process all surface elements. This results in much more efficient geometry processing on the GPU, especially in mesh rendering, but it does not help with the geometry editing issues that BREP has. In addition, approximating a surface by triangles requires an extremely large amount of memory to achieve high precision.

In contrast to boundary geometry representations such as BREP and the triangular mesh, volumetric geometry representations have a completely different set of tradeoffs.

Probably one of the simplest volume representations is the voxel model. The voxel model subdivides an entire volume into a 3-dimensional grid of independent elements called voxels. Each voxel has a value that may represent some property of that part of the volume, such as the distance to the closest surface, the amount of material or simply the presence of material. The most important property of voxels from the GPGPU point of view is their independence: each voxel can be processed completely independently of the other voxels, and this can be done in parallel on multiple devices. Another important property of the traditional voxel model is that the volume is sampled regularly. There is a constant, predefined number of voxels for a given volume and a given resolution, which results in very simple memory management.

Although the voxel model looks like a perfect choice for GPGPU computing, since it offers both parallelizability and scalability, it has an extremely important drawback. The amount of memory required for storing the voxel model is proportional to the third power of the model resolution. This makes it completely unfeasible for precision tool path planning without additional algorithms that can overcome this limitation. For example, a 500 mm cube represented as a voxel model with 2 micron resolution contains 250,000 voxels along each axis and requires ~14 petabytes of storage. This is approximately the amount of data that Google or AT&T processes in a day, and it is definitely not feasible for CAM applications.
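The storage estimate can be reproduced with a few lines of arithmetic. The sketch below assumes one byte per voxel and binary petabytes (2^50 bytes); neither assumption is stated in the text, but together they give a figure close to the one quoted. The function names are hypothetical.

```python
PIB = 2 ** 50  # bytes in one binary petabyte (assumed unit)


def voxel_count(edge_mm, resolution_um):
    """Number of voxels in a cube sampled on a regular grid."""
    cells_per_axis = round(edge_mm * 1000 / resolution_um)
    return cells_per_axis ** 3


def voxel_storage_bytes(edge_mm, resolution_um, bytes_per_voxel=1):
    """Storage for a dense voxel grid; one byte per voxel is an assumption."""
    return voxel_count(edge_mm, resolution_um) * bytes_per_voxel


count = voxel_count(500, 2)
size = voxel_storage_bytes(500, 2)

print(count)                 # 15625000000000000 voxels
print(round(size / PIB, 1))  # 13.9

# Cubic growth: halving the voxel size multiplies the voxel count by 8.
print(voxel_count(500, 1) // count)  # 8
```

The last line shows why the drawback is so severe: every doubling of the resolution costs a factor of eight in memory.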

In contrast to the regularly sampled voxel model, which provides perfect parallelizability and scalability, there is a class of volume representations based on irregular sampling. Irregularly sampled approaches are usually represented by trees, such as the octree, which subdivides cells at a constant ratio, or the k-d tree, which subdivides cells at a variable ratio.
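The difference between the two subdivision rules can be sketched as follows. This is a minimal illustration that assumes axis-aligned boxes given as `(lo, hi)` corner tuples; the helper names `octree_children` and `kd_children` are hypothetical.

```python
def octree_children(lo, hi):
    """Split a box into 8 equal children: the split ratio is always 1/2."""
    mid = tuple((a + b) / 2 for a, b in zip(lo, hi))
    children = []
    for i in range(8):
        # Bit k of i selects the lower or upper half along axis k.
        c_lo = tuple(mid[k] if (i >> k) & 1 else lo[k] for k in range(3))
        c_hi = tuple(hi[k] if (i >> k) & 1 else mid[k] for k in range(3))
        children.append((c_lo, c_hi))
    return children


def kd_children(lo, hi, axis, position):
    """Split a box into 2 children along one axis at a chosen position."""
    if not lo[axis] < position < hi[axis]:
        raise ValueError("split position must lie strictly inside the box")
    left_hi = tuple(position if k == axis else hi[k] for k in range(3))
    right_lo = tuple(position if k == axis else lo[k] for k in range(3))
    return [(lo, left_hi), (right_lo, hi)]


unit = ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
print(len(octree_children(*unit)))    # 8
print(kd_children(*unit, 0, 0.25))
```

The octree split needs no stored parameters because the ratio is fixed, while each k-d tree node has to store its split axis and position.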

Irregularly sampled models provide a tradeoff between memory requirements, parallelizability and the complexity of memory management. They need much less memory than the traditional voxel model, but tree processing is usually implemented with recursive algorithms, which are poorly suited to GPU processing because GPU kernels cannot launch other kernels (at least, this feature is not available on the GPUs on the market at the time of writing). Processing a tree on a GPU therefore involves a tradeoff between the tree height, which determines the number of kernel launches, and the overhead of launching each kernel and planning its work. On the one hand, taller (deeper) trees use resources more efficiently and may provide higher accuracy; on the other hand, every additional level requires another kernel launch and more job planning time.

An additional problem of all tree-based algorithms is memory management. In the case of CPU processing there is virtually no significant penalty for allocating or releasing memory, and every thread can manage its memory independently. There is no such natural mechanism on the GPU, and implementing a memory management system can significantly increase the complexity of an algorithm and add extra performance penalties.

Although irregularly sampled volume representations have significant drawbacks related to GPU computing, and their implementation is not trivial, it is important to note that they still provide a high level of parallelizability and scalability. It means that an irregularly sampled volume represented by a tree can be a good starting point for designing a data structure for GPGPU-accelerated simulation and tool path planning, but additional changes are required since the available implementations cannot be ported to the GPU efficiently.
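Both the memory saving and the level-by-level processing pattern can be illustrated with a small CPU-side sketch. It refines an octree only where the surface of a sphere passes through a cell, and it replaces recursion with a breadth-first loop in which each iteration stands in for one kernel launch. The sphere test, the function names and the cell counting are assumptions made for this illustration; it is not the data structure developed in this work.

```python
def cell_touches_sphere(lo, size, center, radius):
    """True if the sphere surface passes through the cube [lo, lo + size]^3."""
    near_sq = 0.0  # squared distance from center to the nearest cube point
    far_sq = 0.0   # squared distance from center to the farthest cube corner
    for k in range(3):
        a = lo[k] - center[k]
        b = lo[k] + size - center[k]
        if a > 0:
            near_sq += a * a
        elif b < 0:
            near_sq += b * b
        far_sq += max(a * a, b * b)
    return near_sq <= radius * radius <= far_sq


def refine_by_levels(root_size, center, radius, depth):
    """Breadth-first octree refinement, one pass per tree level.

    Every cell in the frontier is independent work, so one pass models
    one kernel launch with one thread per frontier cell.
    """
    frontier = [((0.0, 0.0, 0.0), root_size)]
    stored_cells = 1  # the root cell
    launches = 0
    for _ in range(depth):
        next_frontier = []
        for lo, size in frontier:
            if not cell_touches_sphere(lo, size, center, radius):
                continue  # fully inside or outside: stays a leaf
            half = size / 2
            for i in range(8):
                child_lo = tuple(lo[k] + half * ((i >> k) & 1) for k in range(3))
                next_frontier.append((child_lo, half))
        stored_cells += len(next_frontier)
        frontier = next_frontier
        launches += 1
    return stored_cells, launches


depth = 6
stored, launches = refine_by_levels(1.0, (0.5, 0.5, 0.5), 0.4, depth)
dense = 8 ** depth  # voxels in a regular grid of the same finest resolution
print(launches, stored, dense)
```

In this sketch the number of passes equals the tree depth, and the number of stored cells stays well below the size of the equivalent dense grid because only cells near the surface are subdivided. Collecting `next_frontier` in a growing list is exactly the kind of dynamic allocation that is cheap on a CPU and needs an explicit scheme on a GPU.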

Table V-1: Geometry representations comparison (surviving column headers: Approach, Z-map, BREP, Trian.)

In document Automated CNC Tool Path Planning and Machining Simulation on Highly Parallel Computing Architectures (Page 108-113)