## Posts

Nov, 8

### Phase diagram and critical behavior of the square-lattice Ising model with competing nearest- and next-nearest-neighbor interactions

Using the parallel tempering algorithm and GPU-accelerated techniques, we have performed large-scale Monte Carlo simulations of the Ising model on a square lattice with antiferromagnetic (repulsive) nearest-neighbor (NN) and next-nearest-neighbor (NNN) interactions of the same strength and subject to a uniform magnetic field. Both transitions from the (2×1) and row-shifted (2×2) ordered phases to the paramagnetic […]

Nov, 8

### High-performance cone beam reconstruction using CUDA compatible GPUs

Compute unified device architecture (CUDA) is a software development platform that allows us to run C-like programs on the NVIDIA graphics processing unit (GPU). This paper presents an acceleration method for cone beam reconstruction using CUDA-compatible GPUs. The proposed method accelerates the Feldkamp, Davis, and Kress (FDK) algorithm using three techniques: (1) off-chip memory […]

Nov, 8

### Accelerating glassy dynamics using graphics processing units

Modern graphics hardware offers peak performances close to 1 Tflop/s, and NVIDIA’s CUDA provides a flexible and convenient programming interface to exploit these immense computing resources. We demonstrate the ability of GPUs to perform high-precision molecular dynamics simulations for nearly a million particles running stably over many days. Particular emphasis is put on the numerical […]

Nov, 8

### The CUBLAS and CULA based GPU acceleration of adaptive finite element framework for bioluminescence tomography

In molecular imaging (MI), especially optical molecular imaging, bioluminescence tomography (BLT) has emerged as an effective modality for small animal imaging. Finite element methods (FEMs), especially the adaptive finite element (AFE) framework, play an important role in BLT. The processing speed of the FEMs and the AFE framework still needs to be improved, […]

Nov, 8

### Parallelized computation for computer simulation of electrocardiograms using personal computers with multi-core CPU and general-purpose GPU

Biological computations like electrocardiological modelling and simulation usually require high-performance computing environments. This paper introduces an implementation of parallel computation for computer simulation of electrocardiograms (ECGs) in a personal computer environment with an Intel Core 2 Quad Q6600 CPU and a GeForce 8800 GT GPU, with software support by OpenMP and CUDA. It […]

Nov, 8

### Theory of square, rectangular, and microband electrodes through explicit GPU simulation

The use of microband electrodes in electrochemistry has expanded in recent years due to enhanced current densities, ease of fabrication, and available theory. Using explicit three-dimensional finite difference GPU simulation, this paper models mass transport to square and rectangular (finite band) microelectrodes and quantifies the response of a finite band at any given length to […]

Nov, 8

### Towards Dense Linear Algebra for Hybrid GPU Accelerated Manycore Systems

We highlight the trends leading to the increased appeal of using hybrid multicore+GPU systems for high performance computing. We present a set of techniques that can be used to develop efficient dense linear algebra algorithms for these systems. We illustrate the main ideas with the development of a hybrid LU factorization algorithm where we split […]

Nov, 8

### Bayesian real-time perception algorithms on GPU

In this text we present the real-time implementation of a Bayesian framework for robotic multisensory perception on a graphics processing unit (GPU) using the Compute Unified Device Architecture (CUDA). As an additional objective, we intend to show the benefits of parallel computing for similar problems (i.e. probabilistic grid-based frameworks), and the user-friendly nature of CUDA […]

Nov, 8

### Large-scale mixer simulations using massively parallel GPU architectures

Granular flows are extremely important for the pharmaceutical and chemical industries, as well as for other scientific areas. Understanding how particle size and related effects influence both the mean and the fluctuating flow field in granular flows is therefore critical for the design and optimization of powder processing operations. We […]

Nov, 8

### Parallel Banding Algorithm to compute exact distance transform with the GPU

We propose a Parallel Banding Algorithm (PBA) on the GPU to compute the exact Euclidean Distance Transform (EDT) for a binary image in 2D and higher dimensions. By partitioning the image into small bands, processing them, and then merging them concurrently, PBA computes the exact EDT with optimal linear total work, a high level of parallelism and […]

Nov, 7

### COMPASS: a programmable data prefetcher using idle GPU shaders

A traditional fixed-function graphics accelerator has evolved into a programmable general-purpose graphics processing unit over the last few years. These powerful computing cores are mainly used for accelerating graphics applications or enabling low-cost scientific computing. To further reduce the cost and form factor, an emerging trend is to integrate GPU along with the memory controllers […]

Nov, 7

### Multifactor dimensionality reduction for graphics processing units enables genome-wide testing of epistasis in sporadic ALS

MOTIVATION: Epistasis, the presence of gene-gene interactions, has been hypothesized to be at the root of many common human diseases, but current genome-wide association studies largely ignore its role. Multifactor dimensionality reduction (MDR) is a powerful model-free method for detecting epistatic relationships between genes, but computational costs have made its application to genome-wide data difficult. […]