Posts
Aug, 15
Parallel Morphological Endmember Extraction Using Commodity Graphics Hardware
Spatial/spectral algorithms have been shown in previous work to be a promising approach to the problem of extracting image endmembers from remotely sensed hyperspectral data. Such algorithms map nicely onto high-performance systems such as massively parallel clusters and networks of computers. Unfortunately, these systems are generally expensive and difficult to adapt to onboard data […]
Aug, 15
A novel FPGA-based SVM classifier
Support Vector Machines (SVMs) are a powerful supervised learning tool, providing state-of-the-art accuracy at the cost of high computational complexity. SVM classification suffers from a linear dependence on the number of support vectors and on the problem’s dimensionality. In this work, we propose a scalable FPGA architecture for the acceleration of SVM classification, which exploits […]
Aug, 15
Equalizer: A Scalable Parallel Rendering Framework
Continuing improvements in CPU and GPU performance, as well as increasing multi-core processor and cluster-based parallelism, demand flexible and scalable parallel rendering solutions that can exploit multipipe hardware-accelerated graphics. In fact, to achieve interactive visualization, scalable rendering systems are essential to cope with the rapid growth of data sets. However, parallel rendering systems […]
Aug, 15
Volumetric Ambient Occlusion
This paper presents a new GPU-based algorithm to compute ambient occlusion. We first examine how ambient occlusion is related to the physically founded rendering equation. The correspondence is made by introducing a fuzzy membership function that defines what “near occlusions” means. Then we develop a method to calculate ambient occlusion in real time without any pre-computation. […]
Aug, 15
Fast parallel simulation of fiber optical communication systems accelerated by a graphics processing unit
A parallel implementation of the split-step Fourier method using CUDA, the general-purpose parallel computing architecture for graphics processing units, is presented. Results of the GPU implementation are compared to a conventional CPU-based approach regarding computation time and accuracy. We developed a novel implementation with a significantly higher accuracy than the CUDA intrinsic FFT in single […]
Aug, 12
Calculation of fermion loops for eta-prime and nucleon scalar and electromagnetic form factors
The exact evaluation of the disconnected diagram contributions to the flavor-singlet pseudoscalar meson mass, the nucleon sigma term, and the nucleon electromagnetic form factors is carried out utilizing GPGPU technology with the NVIDIA CUDA platform. The disconnected loops are also computed using stochastic methods with several noise reduction techniques. Various dilution schemes as well as […]
Aug, 12
Using the physics-based rendering toolkit for medical reconstruction
In this paper we cast the problem of tomography in the realm of computer graphics. By using PBRT (physically based rendering toolkit) we create a scripting environment that simplifies the programming of tomography algorithms such as maximum-likelihood expectation maximization (ML-EM) or simultaneous algebraic reconstruction technique (SART, a variant of ART). This allows the rapid development […]
Aug, 12
The Sharing Tracker: Using Ideas from Cache Coherence Hardware to Reduce Off-Chip Memory Traffic with Non-Coherent Caches
Graphics Processing Units (GPUs) have recently emerged as a new platform for high performance, general-purpose computing. Because current GPUs employ deep multithreading to hide latency, they only have small, per-core caches to capture reuse and eliminate unnecessary off-chip accesses. This paper shows that for general-purpose workloads, the ability to copy cache lines between private caches […]
Aug, 12
Network-on-Chip Hardware Accelerators for Biological Sequence Alignment
The most pervasive compute operation carried out in almost all bioinformatics applications is pairwise sequence homology detection (or sequence alignment). Due to exponentially growing sequence databases, computing this operation at large scale is becoming expensive. An effective approach to speed up this operation is to integrate a very high number of processing elements in a […]
Aug, 12
Data Parallelism Exploiting for H.264 Encoder
Real-time H.264 encoding of high-definition (HD) video (up to 1080p) is a challenging workload for most existing programmable processors. In contrast, novel programmable parallel processors such as stream processors, graphics processing units (GPUs), and DSPs offer a different and very promising technology for these demands. Thus, parallel computing for H.264 encoding on these processors is […]
Aug, 12
Swept Volume approximation of polygon soups
We present a fast GPU-based algorithm to approximate the swept volume (SV) boundary of arbitrary polygon soup models. Despite the extensive research on calculating the volume swept by an object along a trajectory, the efficient algorithms described have imposed constraints on both the trajectories and geometric models. By proposing a general algorithm that handles flat […]
Aug, 12
Cardiac tissue simulation using graphics hardware
As video cards become faster and more programmable, physical simulations implemented on graphics processors become possible. This paper examines different programmable stages of the NVIDIA graphics processor (GPU), and their use to simulate electrical activation of cells in a tissue sample using a cellular automaton model. Comparable tissue simulation programs were written to run on […]