Posts
Dec, 7
BSGP: bulk-synchronous GPU programming
We present BSGP, a new programming language for general-purpose computation on the GPU. A BSGP program looks much the same as a sequential C program. Programmers only need to supply a bare minimum of extra information to describe parallel processing on GPUs. As a result, BSGP programs are easy to read, write, and maintain. […]
Dec, 7
Pangaea: a tightly-coupled IA32 heterogeneous chip multiprocessor
Moore’s Law and the drive towards performance efficiency have led to the on-chip integration of general-purpose cores with special-purpose accelerators. Pangaea is a heterogeneous CMP design for non-rendering workloads that integrates IA32 CPU cores with non-IA32 GPU-class multi-cores, extending the current state-of-the-art CPU-GPU integration that physically “fuses” existing CPU and GPU designs. Pangaea introduces (1) […]
Dec, 7
A single-pass GPU ray casting framework for interactive out-of-core rendering of massive volumetric datasets
We present an adaptive out-of-core technique for rendering massive scalar volumes employing single-pass GPU ray casting. The method is based on the decomposition of a volumetric dataset into small cubical bricks, which are then organized into an octree structure maintained out-of-core. The octree contains the original data at the leaves, and a filtered representation of […]
Dec, 7
Vector graphics depicting marbling flow
We present an efficient framework for generating marbled textures that can be exported into a vector graphics format based on an explicit surface tracking method (see Figure 1). The proposed method enables artists to create complex and realistic marbling textures that can be used for design purposes. Our algorithm is unique in that the marbling […]
Dec, 7
A Real-Time Multigrid Finite Hexahedra Method for Elasticity Simulation using CUDA
We present a multigrid approach for simulating elastic deformable objects in real time on recent NVIDIA GPU architectures. To accurately simulate large deformations we consider the co-rotated strain formulation. Our method is based on a finite element discretization of the deformable object using hexahedra. It draws upon recent work on multigrid schemes for the efficient […]
Dec, 7
GPU-based Monte Carlo simulation in neutron transport and finite differences heat equation evaluation
Graphics Processing Units (GPUs) are high-performance co-processors originally intended to improve the use and quality of computer graphics applications. Since researchers and practitioners realized the potential of using GPUs for general-purpose computing, their application has been extended to fields outside the scope of computer graphics. The main objective of this work is to evaluate […]
Dec, 7
Simulation of Coarse-Grained Protein-Protein Interactions with Graphics Processing Units
We report a hybrid parallel central and graphics processing unit (CPU-GPU) implementation of a coarse-grained model for replica exchange Monte Carlo (REMC) simulations of protein assemblies. We describe the design, optimization, validation, and benchmarking of our algorithms, particularly the parallelization strategy, which is specific to the requirements of GPU hardware. Performance evaluation of our hybrid […]
Dec, 6
Massive parallel LDPC decoding on GPU
Low-Density Parity-Check (LDPC) codes are powerful error-correcting codes (ECC). They have recently been adopted by several data communication standards such as DVB-S2 and WiMAX. LDPC codes are represented by bipartite graphs, also called Tanner graphs, and their decoding demands very intensive computation. For that reason, dedicated VLSI architectures have been investigated and developed over the […]
Dec, 6
GPU-MEME: Using Graphics Hardware to Accelerate Motif Finding in DNA Sequences
Discovery of motifs that are repeated in groups of biological sequences is a major task in bioinformatics. Iterative methods such as expectation maximization (EM) are used as a common approach to find such patterns. However, corresponding algorithms are highly compute-intensive due to the small size and degenerate nature of biological motifs. Runtime requirements are likely […]
Dec, 6
Parallel SimRank computation on large graphs with iterative aggregation
Recently there has been a lot of interest in graph-based analysis. One of the most important aspects of graph-based analysis is to measure similarity between nodes in a graph. SimRank is a simple and influential measure of this kind, based on a solid graph theoretical model. However, existing methods on SimRank computation suffer from two […]
Dec, 6
BLAS Comparison on FPGA, CPU and GPU
High Performance Computing (HPC) or scientific codes are being executed across a wide variety of computing platforms from embedded processors to massively parallel GPUs. We present a comparison of the Basic Linear Algebra Subroutines (BLAS) using double-precision floating point on an FPGA, CPU and GPU. On the CPU and GPU, we utilize standard libraries on […]
Dec, 6
Skinning with dual quaternions
Skinning of skeletally deformable models is extensively used for real-time animation of characters, creatures and similar objects. The standard solution, linear blend skinning, has some serious drawbacks that require artist intervention. Therefore, a number of alternatives have been proposed in recent years. All of them successfully combat some of the artifacts, but none challenge the […]