## Delaunay Triangulation in R3 on the GPU

Department of Computer Science, National University of Singapore

National University of Singapore, 2012

@article{nanjappa2012delaunay,

title={DELAUNAY TRIANGULATION IN R 3 ON THE GPU},

author={NANJAPPA, ASHWIN},

year={2012}

}

The Delaunay triangulation of points in R3 is a fundamental computational geometry structure that is useful for representing and studying objects from the physical world. The 3D Delaunay triangulation has desirable qualities that make it useful in many applications like FEM, surface reconstruction and tessellating solids. Algorithms for 3D Delaunay have been devised that utilize a multitude of techniques and are suitable for single and multi-core CPUs and distributed memory systems. With the ubiquity of the GPU in cellphones, tablets, workstations and cloud computers, there has been a growing interest in 3D Delaunay triangulation algorithms for the GPU. This thesis presents 3D Delaunay triangulation algorithms that effectively utilize the massive parallelism of the GPU. The gFlip3D algorithm is designed to enable massively parallel point insertion and flipping in 3D on the GPU. The algorithm achieves a high level of parallelism performing one point insertion per thread and one flip operation per thread. For any type of input, less than 0.0001 of the facets in the output from this algorithm are not locally Delaunay. The CUDA implementation of this algorithm achieves a speedup of up to 6 times over the 3D Delaunay triangulator of CGAL. To provide a better quality triangulation as input to massively parallel flipping algorithms, this thesis examines the coloring and dualization of the digital grid in R3. We show that it is difficult to color a digital grid in 3D such that the dualized triangulation is topologically and geometrically valid. We also show that dualizing a 3D digital Voronoi vertex is not possible. As an alternative technique, we demonstrate the utility of grid perturbation to coloring and dualization so that a triangulation can be obtained from it. This thesis presents the gStar4D algorithm that constructs the 3D Delaunay triangulation by using the neighbourhood information in the digital grid as an approximation of the Delaunay triangulation. It achieves this by the massively parallel creation of stars of each input point lifted to R4 and the use of an unique star splaying approach to splay these 4D stars in parallel and make them consistent. The result is a convex hull of the lifted points and the 3D Delaunay triangulation can be obtained from its lower hull. The algorithm introduces a concept of reciprocated insertions that simplifies the inconsistency handling and an elegant technique to find the confinement proof of a point in a star. The CUDA implementation of gStar4D achieves a speedup of up to 5 times over the 3D Delaunay triangulator of CGAL. gDel3D is a heterogeneous GPU-CPU algorithm that repairs the near-Delaunay output of gFlip3D using a conservative star splaying approach on the CPU to obtain the 3D Delaunay triangulation. Stars are created only for the points in non-locally-Delaunay facets by using working sets from the triangulation. The star splaying approach conservatively creates other stars directly from the triangulation and once they are consistent repairs only the affected portion of the triangulation to obtain the 3D Delaunay triangulation. Our implementation of gDel3D achieves a speedup of up to 6 times over the 3D Delaunay triangulator of CGAL. The running time of gDel3D includes both the time taken by gFlip3D and that for fixing its output to Delaunay. The massively parallel techniques presented in this thesis are not only useful for 3D Delaunay triangulation, but can be extended and adopted to solve other computational geometry problems in R3 and R4 using the GPU. To demonstrate this, we extend the star splaying concepts of gStar4D and gDel3D algorithms to devise the gReg3D algorithm that can construct the 3D regular triangulation on the GPU. This algorithm allows stars to die, finds their death certificate and uses methods to propagate this information to other stars. The implementation of this algorithm achieves a speedup of up to 4 times over the 3D regular triangulator of CGAL. We also explore the concept of non-optimal flipping as a means to improve the quality of triangulation constructed from massively parallel point insertion. The algorithms described in this thesis show that the massive parallelism of the GPU can be harnessed to construct the Delaunay and regular triangulation in R3 for all types of inputs. We also show that these techniques can be adapted easily to solve other computational geometry problems in R3 and R4 using the GPU. This thesis also contributes the optimized and robust implementation in CUDA of all its algorithms that can be used with all types of inputs. This is made freely available on the internet to anybody from the scientific and engineering community. With these contributions this thesis lays the foundation for further work on computing the 3D Delaunay triangulation on the GPU.

June 18, 2013 by hgpu

Delaunay Triangulation in R3 on the GPU,