Posts
Dec, 31
Fast Computing Adaptively Sampled Distance Field on GPU
In this paper we present an efficient method to compute the signed distance field for a large triangle mesh, which runs interactively with GPU acceleration. Restricted by the absence of flexible pointer addressing on the GPU, we design a novel multi-layer hash table to organize the voxel/triangle overlap pairs as two-tuples; this strategy provides an efficient […]
Dec, 31
Efficient Triangle and Quadrilateral Clipping within Shaders
Clipping a triangle or a convex quadrilateral to a plane is a common operation in computer graphics. This clipping is implemented by fixed-function units within the graphics pipeline under most rasterization APIs. It is increasingly interesting to perform clipping in programmable stages as well. For example, to clip bounding volumes generated in the Geometry unit […]
Dec, 31
Fast Speaker Diarization Using a Specialization Framework for Gaussian Mixture Model Training
Most current speaker diarization systems use agglomerative clustering of Gaussian Mixture Models (GMMs) to determine "who spoke when" in an audio recording. While state-of-the-art in accuracy, this method is computationally costly, mostly due to the GMM training, and thus limits the performance of current approaches to be roughly real-time. Increased sizes of current datasets require […]
Dec, 31
A GPU Accelerated Volumetric Ray Tracer for Incandescent Gas
The initial goal of this project was to create a physically accurate GPU-accelerated simulation of fire. Due to the limited time available in the semester (combined with the inherent difficulty of debugging CUDA code) we ended up reducing the scope somewhat and focusing on a realistic GPU-accelerated technique for rendering incandescent gas, as in flames, without […]
Dec, 31
Boosting quantum evolutions using Trotter-Suzuki algorithms on GPUs
The evolution calculation of quantum systems represents a great challenge nowadays. Numerical implementations typically scale exponentially with the size of the system, demanding high amounts of resources. General Purpose Graphics Processor Units (GPGPUs) enable a new range of possibilities for numerical simulations of quantum systems. In this work we implemented, optimized and compared the quantum […]
Dec, 31
Fast K-selection Algorithms for Graphics Processing Units
Finding the kth largest value in a list of n values is a well-studied problem for which many algorithms have been proposed. A naive approach is to sort the list and then simply select the kth term in the sorted list. However, when the sorted list is not needed, this method has done quite a […]
Dec, 31
Mapping the SBR and TW-ILDCs to Heterogeneous CPU-GPU Architecture for Fast Computation of Electromagnetic Scattering
In this paper, the shooting and bouncing ray (SBR) method in combination with the truncated wedge incremental length diffraction coefficients (TW-ILDCs) is implemented on the heterogeneous CPU-GPU architecture to effectively solve the electromagnetic scattering problems. The SBR is mapped to the GPU because numerous independent ray tubes can make full use of the massively parallel […]
Dec, 31
Hierarchical Stochastic Motion Blur Rasterization
We present a hierarchical traversal algorithm for stochastic rasterization of motion blur, which efficiently reduces the number of inside tests needed to resolve spatio-temporal visibility. Our method is based on novel tile against moving primitive tests that also provide temporal bounds for the overlap. The algorithm works entirely in homogeneous coordinates, supports MSAA, facilitates efficient […]
Dec, 31
Comparison of Fragmentation/Dispersion Models for Asteroid Nuclear Disruption Mission Design
This paper considers the problem of developing statistical orbit predictions of near-Earth object (NEO) fragmentation for nuclear disruption mission design and analysis. The critical component of NEO fragmentation modeling is developed for a momentum-preserving hypervelocity impact of a spacecraft carrying a nuclear payload. The results of the fragmentation process are compared to static models and results […]
Dec, 31
Optimising the DBCSR GPU Implementation
The DBCSR library performs the sparse matrix multiplication required for atomistic simulations using the CP2K software. The GPU implementation of DBCSR was targeted for optimisation, and its scope was increased to allow it to function with larger block sizes. It was found that the main kernel could be sped up by 16% by augmenting […]
Dec, 31
Torch7: A Matlab-like Environment for Machine Learning
Torch7 is a versatile numeric computing framework and machine learning library that extends Lua. Its goal is to provide a flexible environment to design and train learning machines. Flexibility is obtained via Lua, an extremely lightweight scripting language. High performance is obtained via efficient OpenMP/SSE and CUDA implementations of low-level numeric routines. Torch7 can easily […]
Dec, 31
Hardware-Assisted High-Efficiency Ray Casting of Unstructured Time-Varying Flows Using Temporal Coherence
Advances in computational power are enabling high-precision numerical simulations of unsteady flows using unstructured grids. The dynamic ray casting technique with the aid of texture hardware can achieve high-accuracy volume rendering of unstructured time-varying data from these simulations. However, the existing approach does not pay enough attention to temporal coherence, which depresses the rendering rate. […]

