Posts
Jun, 18
Gdev: First-Class GPU Resource Management in the Operating System
Graphics processing units (GPUs) have become a very powerful platform embracing the concept of heterogeneous many-core computing. However, application domains of GPUs are currently limited to specific systems, largely due to a lack of "first-class" GPU resource management for general-purpose multi-tasking systems. We present Gdev, a new ecosystem of GPU resource management in the operating […]
Jun, 18
An Improved CUDA-Based Implementation of Differential Evolution on GPU
Modern GPUs enable widely affordable personal computers to carry out massively parallel computation tasks. NVIDIA’s CUDA technology provides a wieldy parallel computing platform. Many state-of-the-art algorithms arising from different fields have been redesigned based on CUDA to achieve computational speedup. Differential evolution (DE), as a very promising evolutionary algorithm, is highly suitable for parallelization owing […]
Jun, 18
OpenCL for programming shared memory multicore CPUs
Shared memory multicore processor technology is pervasive in mainstream computing. This new architecture challenges programmers to write code that scales over these many cores to exploit the full computational power of these machines. OpenMP and Intel Threading Building Blocks (TBB) are two of the popular frameworks used to program these architectures. Recently, OpenCL has been […]
Jun, 18
Solving the Vlasov equation for one-dimensional models with long range interactions on a GPU
We present a GPU parallel implementation of the numerical integration of the Vlasov equation in one spatial dimension based on a second-order time-split algorithm with a local modified cubic-spline interpolation. We apply our approach to three different systems with long-range interactions: the Hamiltonian Mean Field, Ring and the self-gravitating sheet models. Speedups and accuracy […]
Jun, 18
OpenACC – First Experiences with Real-World Applications
Today’s trend toward using accelerators like GPGPUs in heterogeneous computer systems has given rise to several low-level APIs for accelerator programming. However, programming with these APIs is often tedious and therefore unproductive. To tackle this problem, recent approaches employ directive-based high-level programming for accelerators. In this work, we present our first experiences with OpenACC, an API consisting of […]
Jun, 17
GPUmotif: An Ultra-Fast and Energy-Efficient Motif Analysis Program Using Graphics Processing Units
Computational detection of TF binding patterns has become an indispensable tool in functional genomics research. With the rapid advance of new sequencing technologies, large amounts of protein-DNA interaction data have been produced. Analyzing this data can provide substantial insight into the mechanisms of transcriptional regulation. However, the massive amount of sequence data presents daunting challenges. […]
Jun, 17
Branch and Data Herding: Reducing Control and Memory Divergence for Error-tolerant GPU Applications
Control and memory divergence between threads within the same execution bundle, or warp, have been shown to cause significant performance bottlenecks for GPU applications. In this paper, we exploit the observation that many GPU applications exhibit error tolerance to propose branch and data herding. Branch herding eliminates control divergence by forcing all threads in a […]
Jun, 16
ScatterAlloc: Massively Parallel Dynamic Memory Allocation for the GPU
In this paper, we analyze the special requirements of a dynamic memory allocator that is designed for massively parallel architectures such as Graphics Processing Units (GPUs). We show that traditional strategies, which work well on CPUs, are not well suited for use on GPUs and present the thorough design of ScatterAlloc, which can efficiently […]
Jun, 16
E-MOGA: A General Purpose Platform for Multi Objective Genetic Algorithm running on CUDA
This paper introduces an Enhanced Multi Objective Genetic Algorithm (E-MOGA) running on Compute Unified Device Architecture (CUDA) hardware, as a general purpose tool that can solve conflict optimization problems. The tool demonstrates significant speed gains using affordable, scalable and commercially available hardware. The objectives of this research are: to enhance the general purpose Multi Objective […]
Jun, 16
Accelerating Lambert’s Problem on the GPU in MATLAB
The challenges and benefits of using the GPU to compute solutions to Lambert’s Problem are discussed. Three algorithms (Universal Variables, Gooding’s algorithm, and Izzo’s algorithm) were adapted for GPU computation directly within MATLAB. The robustness of each algorithm was considered, along with the speed at which it could be computed on each of three computers. […]
Jun, 16
Parallel Primitives based Spatial Join of Geospatial Data on GPGPUs
Modern GPU architectures closely resemble supercomputers. Commodity GPUs, which already ship in many personal and cluster computers, can be used to boost the performance of spatial databases and GIS. In this study, we report our preliminary work on designing and implementing a spatial join algorithm on GPUs by using generic parallel primitives that have […]
Jun, 16
GiST Scan Acceleration using Coprocessors
Efficient lookups in huge, possibly multi-dimensional datasets are crucial for the performance of numerous use cases that generate multiple search operations at the same time, like point queries in ray tracing or spatial joins in collision detection of interactive 3D applications. These applications greatly benefit from index structures that quickly filter relevant candidates for further […]