Posts
Sep, 6
A Survey on GPU System Considering its Performance on Different Applications
In this paper we study the NVIDIA graphics processing unit (GPU) along with its computational power and applications. Although these units are specially designed for graphics applications, we can employ their computational power for non-graphics applications too. The GPU has high parallel processing power, low cost of computation and less time utilization; it gives good result […]
Sep, 6
Phase Aware Memory Scheduling
Computer architecture is on the brink of convergence with the integration of the general-purpose multi-core CPU architecture and the special-purpose accelerated graphics architecture (GPU). Semiconductor giants like Intel and AMD have already brought to the market next-generation integrated heterogeneous processors in the form of the Sandy Bridge and the Fusion architectures, respectively. However, with […]
Sep, 6
Skew Handling in Aggregate Streaming Queries on GPUs
Nowadays, the data to be processed by database systems has grown so large that any conventional, centralized technique is inadequate. At the same time, general-purpose computation on GPUs (GPGPU) has recently drawn attention from the data management community due to its ability to achieve significant speed-ups at a small cost. Efficient skew handling […]
Sep, 6
Percolation study of samples on 2D lattices using GPUs
We study the percolation problem of sites on 2D lattices of various geometries, using general-purpose graphics processing units (GPGPU). The implementation of a parallel component-labeling algorithm in CUDA and its generalization to different geometries are discussed. The performance results for this algorithm on a GPU versus the corresponding sequential implementation of reference […]
Sep, 5
Efficient Implementation of RLS-Based Adaptive Filters on nVIDIA GeForce Graphics Processing Unit
This paper presents an efficient implementation of RLS-based adaptive filters with a large number of taps on an nVIDIA GeForce graphics processing unit (GPU) with the CUDA software development environment. Modification of the order and the combination of calculations reduces the number of accesses to slow off-chip memory. Assigning tasks into multiple threads also takes memory access order […]
Sep, 5
Real-Time Motion Artifact Compensation for PMD-ToF Images
Time-of-Flight (ToF) cameras have gained a lot of scientific attention and become a vivid field of research in recent years. A remaining problem of ToF cameras is motion artifacts in dynamic scenes. This paper presents a new preprocessing method for fast motion artifact compensation. We introduce a flow-like algorithm that supports motion […]
Sep, 5
Work in Progress: Vortex Detection and Visualization for Design of Micro Air Vehicles and Turbomachinery
Vortex detection and visualization is an important technique for computational fluid dynamics (CFD) modelers and analysts. Since vortices are often not just local phenomena, algorithms for detecting the vortex core can be expanded by the use of streamline placement and termination methodologies to appropriately visualize the vortex. We are enhancing an existing VCDetect software tool […]
Sep, 5
Transparent CPU-GPU Collaboration for Data-Parallel Kernels on Heterogeneous Systems
Heterogeneous computing on CPUs and GPUs has traditionally used fixed roles for each device: the GPU handles data-parallel work by taking advantage of its massive number of cores, while the CPU handles non-data-parallel work, such as the sequential code or data transfer management. Unfortunately, this work distribution can be a poor solution as […]
Sep, 5
GPU & CPU implementation of Young – Van Vliet’s Recursive Gaussian Smoothing Filter
This document describes a GPU and CPU implementation of Young and Van Vliet’s recursive Gaussian smoothing as an external module for the Insight Toolkit (ITK), version 4.* (www.itk.org). In the absence of an OpenCL-capable platform, the code will run the CPU implementation as an alternative to the existing Deriche recursive Gaussian smoothing filter in […]
Sep, 4
Generation of the Scrambled Halton Sequence Using Accelerators
The Halton sequence is one of the most popular low-discrepancy sequences. In order to satisfy some practical requirements, the original sequence is usually modified in some way. The scrambling algorithm, proposed by Owen, has several theoretical advantages, but on the other hand is difficult to implement in practice due to the trade-off between high memory […]
Sep, 4
The discrete dipole approximation code DDscat.C++: features, limitations and plans
We present new, freely available open-source C++ software for the numerical solution of electromagnetic wave absorption and scattering problems within the Discrete Dipole Approximation paradigm. The code is based upon the famous and free Fortran-90 code DDSCAT by B. Draine and P. Flatau. Started as a teaching project, the presented code DDscat.C++ differs from […]
Sep, 4
Detecting multiple periodicities in observational data with the multi-frequency periodogram. II. Frequency Decomposer, a parallelized time-series analysis algorithm
This is a parallelized algorithm that decomposes a noisy time series into a number of frequency components. The algorithm analyses all suspicious periodicities that can be revealed, including the ones that look like an alias or noise at first glance, but may later prove to be a real variation. After selection of the […]