A Survey of Neural Computation on Graphics Processing Hardware

Modern graphics processing units (GPUs) are used for much more than 3D graphics applications. From machine vision to finite element analysis, GPUs are being used in diverse applications, collectively called general purpose graphics processor utilization. This paper explores the capabilities and limitations of modern GPUs and surveys the neural computation technologies that have been […]
Fat vs. Thin Threading Approach on GPUs: Application to Stochastic Simulation of Chemical Reactions

We explore two different threading approaches on a graphics processing unit (GPU), each exploiting a different characteristic of the current GPU architecture. The fat thread approach tries to minimise data access time by relying on shared memory and registers, potentially sacrificing parallelism. The thin thread approach maximises parallelism and tries to hide access latencies. We apply […]
Design of a programmable micro-ultrasound research platform

To foster innovative uses of micro-ultrasound in biomedicine, it is beneficial to develop flexible research-purpose systems that allow researchers to easily reconfigure their system-level operations, such as the transmit firing sequence and receive processing. In this paper, we present the development of a programmable micro-ultrasound research platform that is capable of realizing various micro-imaging algorithms. The […]
Fast parallel algorithm for audio content retrieval on GPUs

Audio content search techniques in MIR (music information retrieval) face two major challenges: the robustness of the algorithm and the speed of the search operation. This article proposes a fast algorithm model for extracting audio data by the fingerprinting technique, which is implemented on a CPU-based platform and then parallelized to run […]
Fast and Efficient FPGA-Based Feature Detection Employing the SURF Algorithm

Feature detectors are schemes that locate and describe points or regions of ‘interest’ in an image. Today, numerous machine vision applications need efficient feature detectors that can work in real time; moreover, since this detection is one of the most time-consuming tasks in several vision devices, the speed of the feature detection schemes […]
Computing spike-based convolutions on GPUs

In spiking neural networks, asynchronous spike events are processed in parallel by neurons. Emulations of such networks are traditionally computed by CPUs or realized using dedicated neuromorphic hardware. In many neuromorphic systems, the address-event representation (AER) is used for spike communication. In this paper we present the acceleration of AER-based spike processing using a graphics […]
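
As an illustration of event-driven convolution over address events, the following minimal sketch adds a kernel to a map of neuron potentials around each incoming spike address and emits an output event when a neuron crosses a threshold. The function name, the reset-to-zero rule, and the parameters are assumptions made for illustration; this is a sequential sketch, not the GPU implementation described in the paper.

```python
def process_events(events, kernel, height, width, threshold):
    """Apply each (x, y) address event to a 2D map of neuron potentials.

    Hypothetical sketch: every event adds the kernel, centred on the event
    address, to the potentials. A neuron that reaches the threshold emits
    an output event and is reset to zero.
    """
    potential = [[0.0] * width for _ in range(height)]
    out_events = []
    kh, kw = len(kernel), len(kernel[0])
    for ex, ey in events:
        for dy in range(kh):
            for dx in range(kw):
                y = ey + dy - kh // 2
                x = ex + dx - kw // 2
                # Skip kernel taps that fall outside the neuron array.
                if 0 <= y < height and 0 <= x < width:
                    potential[y][x] += kernel[dy][dx]
                    if potential[y][x] >= threshold:
                        out_events.append((x, y))
                        potential[y][x] = 0.0
    return out_events, potential


kernel = [[0.0, 1.0, 0.0],
          [1.0, 2.0, 1.0],
          [0.0, 1.0, 0.0]]
spikes, state = process_events([(2, 2), (2, 2)], kernel, 5, 5, threshold=3.0)
print(spikes)
```

Each event touches only the neurons under the kernel footprint, which is what makes the per-event work independent and a natural candidate for parallel execution.
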
A Sparse Matrix Personality for the Convey HC-1

In this paper we describe a double precision floating point sparse matrix-vector multiplier (SpMV) and its performance as implemented on a Convey HC-1 reconfigurable computer. The primary contributions of this work are a novel streaming reduction architecture for floating point accumulation, a novel on-chip cache optimized for streaming compressed sparse row (CSR) matrices, and end-to-end […]
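
For readers unfamiliar with the compressed sparse row (CSR) layout mentioned above, here is a minimal reference sketch of sparse matrix-vector multiplication over CSR arrays. The array names `values`, `col_idx`, and `row_ptr` are conventional but chosen here for illustration; the code is a plain sequential version and says nothing about the streaming architecture of the HC-1 design.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Return y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, row by row
    col_idx : column index of each nonzero
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i
    """
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        # Accumulate the dot product of this sparse row with x.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# The 3x3 matrix [[4, 0, 1], [0, 2, 0], [3, 0, 5]] in CSR form.
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
print(csr_spmv(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))  # [7.0, 4.0, 18.0]
```

The inner loop is a running floating point accumulation per row, which is the reduction step that a hardware implementation has to pipeline.
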
Fast circuit simulation on graphics processing units

SPICE-based circuit simulation is a traditional workhorse in the VLSI design process. Given the pivotal role of SPICE in the IC design flow, there has been significant interest in accelerating SPICE. Since a large fraction (on average 75%) of the SPICE runtime is spent in evaluating transistor model equations, a significant speedup can be […]
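
The 75% figure can be put in context with Amdahl's law, which bounds the overall speedup when only part of a program is accelerated. The sketch below is a general calculation, not a result from the paper; the function name and the example kernel speedup of 10x are assumptions for illustration.

```python
def overall_speedup(accelerated_fraction, kernel_speedup):
    """Amdahl's law: speedup of the whole run when only a fraction is accelerated."""
    serial_part = 1.0 - accelerated_fraction
    return 1.0 / (serial_part + accelerated_fraction / kernel_speedup)


# If 75% of the runtime is sped up 10x, the whole run is about 3.08x faster.
print(overall_speedup(0.75, 10.0))
# However fast the accelerated part becomes, the bound is 1 / 0.25 = 4x.
print(overall_speedup(0.75, 1e12))
```

Under this model, accelerating model evaluation alone cannot exceed a fourfold overall gain when the remaining 25% of the runtime is left unchanged.
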
Efficient visual hull computation for real-time 3D reconstruction using CUDA

In this paper we present two efficient GPU-based visual hull computation algorithms. We compare them in terms of performance using image sets of varying size and different voxel resolutions. In addition, we present a real-time 3D reconstruction system which uses the proposed GPU-based reconstruction method to achieve real-time performance (30 fps) using 16 cameras and […]
Option pricing with multi-dimensional quadrature architectures

Quadrature-based methods for numerical integration provide a means of quickly and accurately pricing financial products such as options. These methods can be applied to multi-dimensional products, such as options on multiple underlying assets, but suffer from an exponential increase in computational complexity as the dimension increases. This paper examines the theoretical complexity of quadrature […]
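
The exponential growth in cost can be seen in a minimal sketch of a tensor-product midpoint rule on the unit hypercube, which needs `n ** dim` integrand evaluations for `n` nodes per dimension. The function name and the linear test integrand are assumptions for illustration; the integrand is a stand-in, not an option payoff, and this is not the architecture examined in the paper.

```python
from itertools import product


def midpoint_product_rule(f, dim, n):
    """Integrate f over [0, 1]^dim with n midpoint nodes per dimension.

    Returns the estimate and the number of integrand evaluations (n ** dim).
    """
    nodes = [(i + 0.5) / n for i in range(n)]
    weight = (1.0 / n) ** dim
    total = 0.0
    evaluations = 0
    # Visit every point of the dim-dimensional grid of nodes.
    for point in product(nodes, repeat=dim):
        total += f(point)
        evaluations += 1
    return total * weight, evaluations


for dim in (1, 2, 3, 4):
    value, count = midpoint_product_rule(lambda p: sum(p), dim, 4)
    print(dim, count, value)
```

With four nodes per dimension the evaluation count is 4, 16, 64, and 256 for one to four dimensions, so each extra underlying asset multiplies the work by `n`.
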
Video Coding on Multicore Graphics Processors

In this article, we investigate using multi-core graphics processing units (GPUs) for video encoding and decoding. After an overview of video coding and GPUs, we review some previous work on structuring video coding modules so that the massive parallel processing capability of GPUs can be harnessed. We also review previous work on partitioning the video […]
Accelerating the Nonequispaced Fast Fourier Transform on Commodity Graphics Hardware

We present a fast parallel algorithm to compute the nonequispaced fast Fourier transform on commodity graphics hardware (the GPU). We focus particularly on a novel implementation of the convolution step in the transform, as it was previously the most time-consuming part. We describe the performance for two common sample distributions in medical imaging (radial […]
