Posts
Jun, 14
Real-time numerical dispersion compensation using graphics processing unit for Fourier-domain optical coherence tomography
Numerical dispersion compensation for both standard and full-range Fourier-domain optical coherence tomography (FD-OCT) has been implemented on the graphics processing unit (GPU) architecture. Data acquisition, processing and image display were performed on a multi-threaded, CPU-GPU heterogeneous computing system. Real-time ultra-high-resolution full-range complex-conjugate-free FD-OCT imaging was demonstrated at 68.4 frames/s with a frame size […]
Jun, 14
FPGA Based High Performance and Scalable Block LU Decomposition Architecture
Decomposition of a matrix into lower and upper triangular matrices (LU decomposition) is a vital part of many scientific and engineering applications, and the block LU decomposition algorithm is an approach well suited to parallel hardware implementation. This paper presents an approach to speed up implementation of the block LU decomposition algorithm using FPGA hardware. […]
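To make the factorization itself concrete, here is a minimal sketch of plain (unblocked) Doolittle LU decomposition in Python. It is an illustration only: it is not the block algorithm or the FPGA architecture from the paper, it does no pivoting, and the names `lu_decompose` and `matmul` are hypothetical helpers introduced for this example.

```python
def lu_decompose(a):
    """Doolittle LU without pivoting: return (lower, upper) with a = lower * upper.

    `a` is a square matrix given as a list of rows. `lower` has a unit diagonal.
    Assumption: every pivot is nonzero (no row exchanges are performed).
    """
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Fill row i of the upper triangular factor.
        for j in range(i, n):
            upper[i][j] = a[i][j] - sum(lower[i][k] * upper[k][j] for k in range(i))
        if upper[i][i] == 0.0:
            raise ZeroDivisionError("zero pivot; this sketch does no pivoting")
        # Fill column i of the lower triangular factor.
        for j in range(i + 1, n):
            partial = sum(lower[j][k] * upper[k][i] for k in range(i))
            lower[j][i] = (a[j][i] - partial) / upper[i][i]
    return lower, upper


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


if __name__ == "__main__":
    lower, upper = lu_decompose([[4.0, 3.0], [6.0, 3.0]])
    print(lower)
    print(upper)
```

A block variant applies the same idea to sub-matrices instead of scalars, which is what makes it attractive for parallel hardware.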
Jun, 14
Parallelizing Peptide-Spectrum scoring using modern graphics processing units
Tandem mass spectrometry is a powerful experimental tool used in molecular biology to determine the composition of protein mixtures. In a tandem mass experiment, peptide ion selection algorithms generally select only the most abundant peptide ions for further fragmentation. Because of this, the low-abundance proteins in a sample rarely get identified. A Real-Time Peptide-Spectrum Matching […]
Jun, 14
A Highly Scalable Solution of an NP-Complete Problem Using CUDA
NP-complete problems are among the hardest problems in computer science, but their many real-world applications keep pushing researchers to explore new ways to solve them. We extended the original definition of the Boolean Satisfiability Problem to finding all satisfying solutions of a given problem instance and used massively parallel […]
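The "find all solutions" variant of Boolean satisfiability can be sketched as an exhaustive search over assignments. The snippet below is a minimal sequential illustration in Python, not the authors' CUDA implementation (the teaser is truncated before it describes their method). The function name `all_sat` and the signed-integer clause encoding are assumptions made for this example.

```python
from itertools import product


def all_sat(num_vars, clauses):
    """Return every satisfying assignment of a CNF formula by exhaustive search.

    `clauses` is a list of clauses; each clause is a list of nonzero integers,
    where k means "variable k is true" and -k means "variable k is false"
    (variables are numbered from 1). This encoding is an assumption of the sketch.
    """
    solutions = []
    for bits in product([False, True], repeat=num_vars):
        # A clause holds if at least one of its literals matches the assignment.
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            solutions.append(bits)
    return solutions


if __name__ == "__main__":
    # (x1 OR x2) AND (NOT x1 OR x2)
    print(all_sat(2, [[1, 2], [-1, 2]]))
```

Each candidate assignment is checked independently of the others, which is the property that makes this kind of search a natural fit for massively parallel hardware.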
Jun, 14
In Situ Power Analysis of General Purpose Graphical Processing Units
This paper presents an in situ method for profiling the power consumption of general-purpose graphics processing units (GPGPUs) over time. Based on this method, the power consumption of different modes of operation, such as data transfer between GPU and host CPU, basic single-precision floating-point arithmetic operations (addition, subtraction, multiplication) on the multiprocessor units and […]
Jun, 14
Realistic real-time rendering for large-scale forest scenes
Fast rendering of a large-scale forest landscape scene is important in many applications, such as video games, Internet graphics applications, landscape or cityscape scene design and visualization, and virtual forestry. A challenge in virtual reality is realistic rendering of large-scale scenes consisting of complex plant models. A series of level-of-detail tree models are […]
Jun, 14
An FPGA Implementation of Information Theoretic Visual-Saliency System and Its Optimization
Biological vision systems use saliency-based visual attention mechanisms to limit higher-level vision processing to the most visually salient subsets of an input image. Among several computational models that capture visual saliency in biological systems, the information-theoretic AIM (Attention based on Information Maximization) algorithm has been demonstrated to predict human gaze patterns better than other existing models. […]
Jun, 14
A comparative study of GPU programming models and architectures using neural networks
Recently, General Purpose Graphical Processing Units (GP-GPUs) have been identified as an intriguing technology to accelerate numerous data-parallel algorithms. Several GPU architectures and programming models are beginning to emerge and establish their niche in the High-Performance Computing (HPC) community. New massively parallel architectures such as Nvidia’s Fermi and AMD/ATi’s Radeon pack tremendous computing power […]
Jun, 14
A Tuned and Scalable Fast Multipole Method as a Preeminent Algorithm for Exascale Systems
Achieving computing at the exascale means accelerating today’s applications by one thousand times. Clearly, this cannot be accomplished by hardware alone, at least not in the short time frame expected for reaching this performance milestone. Thus, a lively discussion has begun in the last couple of years about programming models, software components and tools, and […]
Jun, 14
A sparse octree gravitational N-body code that runs entirely on the GPU processor
We present parallel algorithms for constructing and traversing sparse octrees on graphics processing units (GPUs). The algorithms are based on parallel-scan and sort methods. To test their performance and feasibility, we implemented them in CUDA in the form of a gravitational tree-code which runs entirely on the GPU. (The code is publicly available at: http://castle.strw.leidenuniv.nl/software.html) The […]
Jun, 14
Design Exploration of Quadrature Methods in Option Pricing
This paper presents a novel parallel architecture for accelerating quadrature methods used for pricing complex multi-dimensional options, such as discrete barrier, Bermudan and American options. We explore different designs of the quadrature evaluation core including optimized pipelined hardware designs in reconfigurable logic and a compute unified device architecture (CUDA)-based graphics processing unit (GPU) design. A […]
Jun, 14
Accelerating Parameter Sweep Applications Using CUDA
This paper proposes a parallelization scheme for parameter sweep (PS) applications using the compute unified device architecture (CUDA). Our scheme focuses on PS applications with irregular access patterns, which usually result in lower performance on the GPU. The key idea to resolve this irregularity is to exploit the similarity of data accesses between different parameters. […]