Posts
Dec, 7
The 2011 International Conference on High Performance Computing & Simulation, HPCS 2011
The conference is to address, explore and exchange information on the state-of-the-art in high performance and large scale computing systems, their use in modeling and simulation and data intensive applications. We encourage papers with both an application or technology flavor (and their multidisciplinary integration). The scope covers architecture, performance, algorithms, middleware, and applications. Work on […]
Dec, 7
Performance evaluation of image processing algorithms on the GPU
The graphics processing unit (GPU), which originally was used exclusively for visualization purposes, has evolved into an extremely powerful co-processor. In the meanwhile, through the development of elaborate interfaces, the GPU can be used to process data and deal with computationally intensive applications. The speed-up factors attained compared to the central processing unit (CPU) are […]
Dec, 7
Fast support vector machine training and classification on graphics processors
Recent developments in programmable, highly parallel Graphics Processing Units (GPUs) have enabled high performance implementations of machine learning algorithms. We describe a solver for Support Vector Machine training running on a GPU, using the Sequential Minimal Optimization algorithm and an adaptive first and second order working set selection heuristic, which achieves speedups of 9-35x over […]
Dec, 7
Bandwidth intensive 3-D FFT kernel for GPUs using CUDA
Most GPU performance “hypes” have focused around tightly-coupled applications with small memory bandwidth requirements e.g., N-body, but GPUs are also commodity vector machines sporting substantial memory bandwidth; however, effective programming methodologies thereof have been poorly studied. Our new 3-D FFT kernel, written in NVIDIA CUDA, achieves nearly 80 GFLOPS on a top-end GPU, being more […]
Dec, 7
BSGP: bulk-synchronous GPU programming
We present BSGP, a new programming language for general purpose computation on the GPU. A BSGP program looks much the same as a sequential C program. Programmers only need to supply a bare minimum of extra information to describe parallel processing on GPUs. As a result, BSGP programs are easy to read, write, and maintain. […]
Dec, 7
Pangaea: a tightly-coupled IA32 heterogeneous chip multiprocessor
Moore’s Law and the drive towards performance efficiency have led to the on-chip integration of general-purpose cores with special-purpose accelerators. Pangaea is a heterogeneous CMP design for non-rendering workloads that integrates IA32 CPU cores with non-IA32 GPU-class multi-cores, extending the current state-of-the-art CPU-GPU integration that physically “fuses” existing CPU and GPU designs. Pangaea introduces (1) […]
Dec, 7
A single-pass GPU ray casting framework for interactive out-of-core rendering of massive volumetric datasets
We present an adaptive out-of-core technique for rendering massive scalar volumes employing single-pass GPU ray casting. The method is based on the decomposition of a volumetric dataset into small cubical bricks, which are then organized into an octree structure maintained out-of-core. The octree contains the original data at the leaves, and a filtered representation of […]
Dec, 7
Vector graphics depicting marbling flow
We present an efficient framework for generating marbled textures that can be exported into a vector graphics format based on an explicit surface tracking method (see Figure 1). The proposed method enables artists to create complex and realistic marbling textures that can be used for design purposes. Our algorithm is unique in that the marbling […]
Dec, 7
A Real-Time Multigrid Finite Hexahedra Method for Elasticity Simulation using CUDA
We present a multigrid approach for simulating elastic deformable objects in real time on recent NVIDIA GPU architectures. To accurately simulate large deformations we consider the co-rotated strain formulation. Our method is based on a finite element discretization of the deformable object using hexahedra. It draws upon recent work on multigrid schemes for the efficient […]
Dec, 7
GPU-based Monte Carlo simulation in neutron transport and finite differences heat equation evaluation
Graphics Processing Units (GPU) are high performance co-processors originally intended to improve the use and quality of computer graphics applications. Since researchers and practitioners realized the potential of using GPU for general purpose, their application has been extended to other fields out of computer graphics scope. The main objective of this work is to evaluate […]
Dec, 7
Simulation of Coarse-Grained Protein-Protein Interactions with Graphics Processing Units
We report a hybrid parallel central and graphics processing units (CPU-GPU) implementation of a coarse-grained model for replica exchange Monte Carlo (REMC) simulations of protein assemblies. We describe the design, optimization, validation, and benchmarking of our algorithms, particularly the parallelization strategy, which is specific to the requirements of GPU hardware. Performance evaluation of our hybrid […]
Dec, 6
Massive parallel LDPC decoding on GPU
Low-Density Parity-Check (LDPC) codes are powerful error correcting codes (ECC). They have recently been adopted by several data communication standards such as DVB-S2 and WiMax. LDPCs are represented by bipartite graphs, also called Tanner graphs, and their decoding demands very intensive computation. For that reason, VLSI dedicated architectures have been investigated and developed over the […]