## Posts

Nov, 8

### Large-scale mixer simulations using massively parallel GPU architectures

Granular flows are extremely important for the pharmaceutical and chemical industries, as well as for other scientific areas. Understanding the impact of particle size and related effects on the mean and fluctuating flow fields in granular flows is therefore critical for the design and optimization of powder processing operations. We […]

Nov, 8

### Parallel Banding Algorithm to compute exact distance transform with the GPU

We propose a Parallel Banding Algorithm (PBA) on the GPU to compute the exact Euclidean Distance Transform (EDT) for a binary image in 2D and higher dimensions. By partitioning the image into small bands, processing them, and then merging them concurrently, PBA computes the exact EDT with optimal linear total work, a high level of parallelism and […]

Nov, 7

### COMPASS: a programmable data prefetcher using idle GPU shaders

The traditional fixed-function graphics accelerator has evolved into a programmable general-purpose graphics processing unit over the last few years. These powerful computing cores are mainly used for accelerating graphics applications or enabling low-cost scientific computing. To further reduce cost and form factor, an emerging trend is to integrate the GPU along with the memory controllers […]

Nov, 7

### Multifactor dimensionality reduction for graphics processing units enables genome-wide testing of epistasis in sporadic ALS

MOTIVATION: Epistasis, the presence of gene-gene interactions, has been hypothesized to be at the root of many common human diseases, but current genome-wide association studies largely ignore its role. Multifactor dimensionality reduction (MDR) is a powerful model-free method for detecting epistatic relationships between genes, but computational costs have made its application to genome-wide data difficult. […]

Nov, 7

### Parallel hyperbolic PDE simulation on clusters: Cell versus GPU

Increasingly, high-performance computing is looking towards data-parallel computational devices to enhance computational performance. Two technologies that have received significant attention are IBM’s Cell Processor and NVIDIA’s CUDA programming model for graphics processing unit (GPU) computing. In this paper we investigate the acceleration of parallel hyperbolic partial differential equation simulation on structured grids with explicit time […]

Nov, 7

### PIConGPU: A Fully Relativistic Particle-in-Cell Code for a GPU Cluster

The particle-in-cell (PIC) algorithm is one of the most widely used algorithms in computational plasma physics. With the advent of graphics processing units (GPUs), large-scale plasma simulations on inexpensive GPU clusters are within reach. We present an implementation of a fully relativistic plasma PIC algorithm for GPUs based on the NVIDIA CUDA library. It supports […]

Nov, 7

### Parallel implementation of a spatio-temporal visual saliency model

Human vision has been studied in depth in recent years, and several different models have been proposed to simulate it on a computer. Some of these models concern visual saliency, which is potentially very useful in many applications such as robotics, image analysis, compression, and video indexing. Unfortunately they are compute intensive with tight real-time […]

Nov, 7

### Real-time multi-agent path planning on arbitrary surfaces

Path planning is an active topic in the literature, and efficient navigation over non-planar surfaces is an open research question. In this work we present a novel technique for navigation of multiple agents over arbitrary triangular domains. The proposed solution uses a fast hierarchical computation of geodesic distances over triangular meshes to allow interactive frame […]

Nov, 7

### Real-time path-based surface detail

We present a GPU algorithm to render path-based 3D surface detail in real time. Our method models these features using a vector representation that is efficiently stored in two textures. The first texture specifies the position of the features, while the second texture contains their paths, profiles and material information. A fragment shader is […]

Nov, 7

### Performance Analysis of General-Purpose Computation on Commodity Graphics Hardware: A Case Study Using Bioinformatics

Using modern graphics processing units for non-graphics high-performance computing is motivated by their enhanced programmability, attractive cost/performance ratio and incredible growth in speed. Although the pipeline of a modern graphics processing unit (GPU) permits high throughput and more concurrency, it also makes the performance of GPU-based applications more complex to analyze. In this paper, we […]

Nov, 7

### Implementation of a High Throughput Soft MIMO Detector on GPU

Multiple-input multiple-output (MIMO) significantly increases the throughput of a communication system by employing multiple antennas at the transmitter and the receiver. To extract maximum performance from a MIMO system, a computationally intensive search-based detector is needed. To meet the challenge of MIMO detection, typical suboptimal MIMO detectors are ASIC or FPGA designs. We aim […]

Nov, 7

### GPU Acceleration for General Conservation Equations and its Application to several Engineering Problems

Presented is a general method for conservation equations called SHLL (split HLL) applied using Graphics Processing Unit (GPU) acceleration. The SHLL method is a purely vector-split approximation of the classical HLL method [Harten, Lax and van Leer, 1983] which assumes the presence of local wave propagation in the algebraic derivation of fluxes across cell surfaces. […]