Posts
Jul, 14
Design of a programmable micro-ultrasound research platform
To foster innovative uses of micro-ultrasound in biomedicine, it is beneficial to develop flexible research-purpose systems that allow researchers to easily reconfigure system-level operations such as the transmit firing sequence and receive processing. In this paper, we present the development of a programmable micro-ultrasound research platform that is capable of realizing various micro-imaging algorithms. The […]
Jul, 14
Fast parallel algorithm for audio content retrieval on GPUs
Audio content search techniques in MIR (music information retrieval) face two major challenges: the robustness of the algorithm and the speed of the operation. This article proposes a model of a fast algorithm for extracting audio data by the fingerprinting technique, which is implemented on a CPU-based platform and then parallelized to run […]
Jul, 14
Fast and Efficient FPGA-Based Feature Detection Employing the SURF Algorithm
Feature detectors are schemes that locate and describe points or regions of ‘interest’ in an image. Today there are numerous machine vision applications that need efficient feature detectors able to work in real time; moreover, since this detection is one of the most time-consuming tasks in several vision devices, the speed of the feature detection schemes […]
Jul, 14
Computing spike-based convolutions on GPUs
In spiking neural networks, asynchronous spike events are processed in parallel by neurons. Emulations of such networks are traditionally computed by CPUs or realized using dedicated neuromorphic hardware. In many neuromorphic systems, the address-event representation (AER) is used for spike communication. In this paper, we present the acceleration of AER-based spike processing using a graphics […]
Jul, 14
A Sparse Matrix Personality for the Convey HC-1
In this paper we describe a double precision floating point sparse matrix-vector multiplier (SpMV) and its performance as implemented on a Convey HC-1 reconfigurable computer. The primary contributions of this work are a novel streaming reduction architecture for floating point accumulation, a novel on-chip cache optimized for streaming compressed sparse row (CSR) matrices, and end-to-end […]
Jul, 14
Fast circuit simulation on graphics processing units
SPICE-based circuit simulation is a traditional workhorse in the VLSI design process. Given the pivotal role of SPICE in the IC design flow, there has been significant interest in accelerating SPICE. Since a large fraction (on average 75%) of the SPICE runtime is spent in evaluating transistor model equations, a significant speedup can be […]
Jul, 14
Efficient visual hull computation for real-time 3D reconstruction using CUDA
In this paper we present two efficient GPU-based visual hull computation algorithms. We compare them in terms of performance using image sets of varying size and different voxel resolutions. In addition, we present a real-time 3D reconstruction system which uses the proposed GPU-based reconstruction method to achieve real-time performance (30 fps) using 16 cameras and […]
Jul, 14
Option pricing with multi-dimensional quadrature architectures
Quadrature based methods for numerical integration provide a means of quickly and accurately pricing financial products such as options. These methods can be applied to multi-dimensional products, such as options on multiple underlying assets, but suffer from an exponential increase in computational complexity as the dimension increases. This paper examines the theoretical complexity of quadrature […]
Jul, 14
Video Coding on Multicore Graphics Processors
In this article, we investigate using multi-core graphics processing units (GPUs) for video encoding and decoding. After an overview of video coding and GPUs, we review some previous work on structuring video coding modules so that the massive parallel processing capability of GPUs can be harnessed. We also review previous work on partitioning the video […]
Jul, 14
Accelerating the Nonequispaced Fast Fourier Transform on Commodity Graphics Hardware
We present a fast parallel algorithm to compute the nonequispaced fast Fourier transform on commodity graphics hardware (the GPU). We focus particularly on a novel implementation of the convolution step in the transform, as it was previously the most time-consuming part. We describe the performance for two common sample distributions in medical imaging (radial […]
Jul, 14
Real-Time Depth-of-Field Rendering Using Anisotropically Filtered Mipmap Interpolation
This article presents a real-time GPU-based post-filtering method for rendering acceptable depth-of-field effects suited for virtual reality. Blurring is achieved by nonlinearly interpolating mipmap images generated from a pinhole image. Major artifacts common in post-filtering techniques, such as bilinear magnification artifacts, intensity leakage, and blurring discontinuity, are practically eliminated via magnification with a circular […]
Jul, 14
Large-scale transient stability simulation on graphics processing units
Graphics processing units (GPUs) have recently attracted a lot of interest in several fields struggling with very large computation tasks. This paper presents the application of a GPU to fast and accurate transient stability simulation of large-scale power systems. The computationally intensive parts of the simulation were offloaded to the GPU to […]