Posts
Jul, 28
Speeding up LIP-Canny with CUDA programming
The LIP-Canny algorithm outperforms traditional Canny edge detection under varying illumination. This method is based on a robust mathematical model (the LIP paradigm), which is closer to the human vision system. However, this model requires more computations and more complex operations than the traditional paradigm. Non-parallel implementations of LIP-Canny do not […]
Jul, 28
Computational modeling of synthetic microbial biofilms
Microbial biofilms are complex, self-organized communities of bacteria, which employ physiological cooperation and spatial organization to increase both their metabolic efficiency and their resistance to changes in their local environment. These properties make biofilms an attractive target for engineering, particularly for the production of chemicals such as pharmaceutical ingredients or biofuels, with the potential to […]
Jul, 27
A virtual memory based runtime to support multi-tenancy in clusters with GPUs
Graphics Processing Units (GPUs) are increasingly becoming part of HPC clusters. Nevertheless, cloud computing services and resource management frameworks targeting heterogeneous clusters including GPUs are still in their infancy. Further, GPU software stacks (e.g., CUDA driver and runtime) currently provide very limited support for concurrency. In this paper, we propose a runtime system that provides […]
Jul, 27
Speeding up Large-Scale Point-in-Polygon Test Based Spatial Join on GPUs
The Point-in-Polygon (PIP) test is fundamental to spatial databases and GIS. Motivated by the slow response times in joining large-scale point locations with polygons using traditional spatial databases and GIS and the massively data parallel computing power of commodity GPU devices, we have designed and developed an end-to-end system completely on GPUs to associate points with […]
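For readers unfamiliar with the PIP test mentioned in this excerpt, here is a minimal CPU sketch of the standard ray-casting variant. It is not the GPU system the post describes; the function name `point_in_polygon` and the sample square are assumptions chosen for illustration only.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting PIP test (illustrative sketch, not the post's GPU code).

    polygon is a list of (x, y) vertices in order. A horizontal ray is cast
    from the query point toward +x; an odd number of edge crossings means
    the point lies inside.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the ray's y coordinate can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets the horizontal line through y.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical sample data: an axis-aligned square.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
center_inside = point_in_polygon(2, 2, square)
far_point_inside = point_in_polygon(5, 2, square)
```

Each query point is tested independently of the others, which is one reason the test suits data-parallel hardware.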
Jul, 27
High-Performance Spatial Join Processing on GPGPUs with Applications to Large-Scale Taxi Trip Data
Spatially joining GPS-recorded locations with infrastructure data, such as points of interest, road networks, land cover and different types of zones, and assigning each point to its nearest polyline or polygon is a prerequisite for trip-related analysis, which is becoming increasingly important in ubiquitous computing. However, existing spatial databases and GIS are incapable […]
Jul, 27
CUDA Kernel Design for GPU-Based Beam Dynamics Simulations
Efficient implementation of general purpose particle tracking on GPUs can result in significant performance benefits to large scale particle tracking and tracking-based accelerator optimization simulations. We present our work on accelerating Argonne National Lab’s accelerator simulation code ELEGANT [1, 2] using CUDA-enabled GPUs [3]. In particular, we provide an overview of beamline elements ported to […]
Jul, 27
Using OpenGL State History for Graphics Debugging
Graphics programming sees widespread use across a number of industries, from video games to medical imaging. Because of the unique needs of graphics programming, specialised tools are required to aid in the debugging of graphics programs. While there are a number of well-maintained and widespread general-purpose debugging tools, the range of available graphics debuggers is […]
Jul, 26
SnuCL: an OpenCL framework for heterogeneous CPU/GPU clusters
In this paper, we propose SnuCL, an OpenCL framework for heterogeneous CPU/GPU clusters. We show that the original OpenCL semantics naturally fit the heterogeneous cluster programming environment, and the framework achieves high performance and ease of programming. The target cluster architecture consists of a designated, single host node and many compute nodes. They are […]
Jul, 26
Efficient Implementation of the CPR Formulation for the Navier-Stokes Equations on GPUs
The correction procedure via reconstruction (CPR) formulation for the Euler and Navier-Stokes equations is implemented on an NVIDIA graphics processing unit (GPU) using CUDA C with both explicit and implicit time-stepping schemes for 2D unstructured triangular grids. For the implicit time integration, a first order time approximation with Newton iteration and Gauss elimination is used […]
Jul, 26
Bidimensional Median Filter for Parallel Computing Architectures
The median filter is a non-linear filter used for removal of salt and pepper noise from images. Each pixel of the image is replaced by the median of its surrounding elements; the median value is calculated by sorting the data. The complexity of the sorting algorithms used in median filters is O(n^2) or O(n), […]
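The per-pixel rule in this excerpt can be sketched in a few lines. This is a sequential illustration under stated assumptions, not the parallel implementation from the post: the 3x3 window, the choice to copy border pixels unchanged, and the name `median_filter_3x3` are all assumptions made for the example.

```python
def median_filter_3x3(image):
    """Replace each interior pixel with the median of its 3x3 neighbourhood.

    image is a 2D list of numbers. Border pixels are copied unchanged
    (an assumption made to keep the sketch short).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Gather the nine neighbourhood values, sort them, take the middle.
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1)
                      for dx in (-1, 0, 1)]
            window.sort()
            result[y][x] = window[4]
    return result


# Hypothetical sample: a flat image with one "salt" and one "pepper" pixel.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 0, 10],
    [10, 10, 10, 10],
]
cleaned = median_filter_3x3(noisy)
```

Every output pixel depends only on the input image, so the two loops carry no dependencies between iterations.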
Jul, 26
Parallelization of Data Intensive Code Using Computer Unified Device Architecture (CUDA)
Parallel processing is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently. Parallelism has been employed for many years, mainly in high-performance computing. As power consumption by computers has become a concern in […]
Jul, 26
Homunculus Warping: Conveying importance using self-intersection-free non-homogeneous mesh deformation
Size matters. Human perception most naturally relates relative extent, area or volume to importance, nearness and weight. Conversely, conveying importance of something by depicting it at a different size is a classic artistic principle, in particular when importance varies across a domain. One striking example is the neuronal homunculus, a human figure where the size […]