
Posts

Jun, 13

Real-time planar flow velocity measurements using an optical flow algorithm implemented on GPU

This paper presents a high-speed implementation of an optical flow algorithm which computes planar velocity fields in an experimental flow. Real-time computation of the flow velocity field gives the experimentalist instantaneous access to quantitative features of the flow. This can be very useful in many situations: fast evaluation of the performance and […]
Jun, 12

The Hierarchical Memory Machine Model for GPUs

The Discrete Memory Machine (DMM) and the Unified Memory Machine (UMM) are theoretical parallel computing models that capture the essence of the shared memory access and the global memory access of GPUs. The main contribution of this paper is to introduce the Hierarchical Memory Machine (HMM), which consists of multiple DMMs and a single UMM. […]
Jun, 12

FastSpMM: An Efficient Library for Sparse Matrix Matrix Product on GPUs

Sparse matrix matrix (SpMM) multiplication is involved in a wide range of scientific and technical applications. The computational requirements for this kind of operation are enormous, especially for large matrices. This paper analyzes and evaluates a method to efficiently compute the SpMM product in a computing environment that includes graphics processing units (GPUs). Some libraries […]
Jun, 12

FFT-SPA Non-Binary LDPC Decoding on GPU

It is well known that non-binary LDPC codes achieve better BER performance than binary LDPC codes of the same code length. The superior BER performance of non-binary codes comes at the expense of more complex decoding algorithms that demand higher computational power. In this paper, we propose parallel signal processing algorithms for performing the FFT-SPA […]
Jun, 12

OpenCL Implementation of a Color Based Object Tracking

In this paper we present an algorithm for real-time object tracking based on color. First, a two-layer perceptron is trained to cope with scene illumination changes. Based on this training, a piece of OpenCL code is generated to harness the power of GPU computing. Then, color based object tracking is done […]
Jun, 12

Performance of a GPU-based Direct Summation Algorithm for Computation of Small Angle Scattering Profile

Small Angle Scattering (SAS) of X-rays or neutrons is an experimental technique that provides valuable structural information for biological macromolecules under physiological conditions and with no limitation on the molecular size. In order to refine molecular structure against experimental SAS data, ab initio prediction of the scattering profile must be recomputed hundreds of thousands of […]
Jun, 10

OCLoptimizer: An Iterative Optimization Tool for OpenCL

Nowadays, computers include several computational devices with parallel capabilities, such as multicore processors and Graphics Processing Units (GPUs). OpenCL enables the programming of all these kinds of devices. An OpenCL program consists of host code, which discovers the computational devices available in the host system and queues up commands to the devices, and […]
Jun, 10

Accelerating Genetic Programming Using Graphics Processing Units

Evolution through natural selection offers the possibility of automatically generating functionally complex solutions to a wide range of problems. Methods such as Genetic Programming (GP) show the promise of this approach but tend to stagnate after relatively few generations. To research this issue, execution speed must be substantially improved. This thesis presents work to accelerate […]
Jun, 10

Processing XPath Structural Constraints on GPU

Technologies such as CUDA and OpenCL have popularized the use of graphics cards (GPUs) for general purpose programming, often with impressive performance gains. However, using such cards to speed up XML database processing is yet to be fully explored. XML databases offer much flexibility for Web-oriented systems. Nonetheless, such flexibility comes at a considerable computational […]
Jun, 10

A flexible algorithm for calculating pair interactions on SIMD architectures

Calculating interactions or correlations between pairs of particles is typically the most time-consuming task in particle simulation or correlation analysis. Straightforward implementations using a double loop over particle pairs have traditionally worked well, especially since compilers usually do a good job of unrolling the inner loop. In order to reach high performance on modern CPU […]
Jun, 10

Recent Advances on GPU Computing in Operations Research

In the last decade, Graphics Processing Units (GPUs) have gained increasing popularity as accelerators for High Performance Computing (HPC) applications. Recent GPUs are not only powerful graphics engines but also highly threaded parallel computing processors that can achieve sustainable speedup as compared with CPUs. In this context, researchers try to exploit the capability of […]
Jun, 9

GPU Acceleration of Algebraic Multigrid for Low-Frequency Finite Element Methods

This paper introduces a GPU acceleration of a Wavelet-based Algebraic Multigrid used as a preconditioner for solving Laplace’s equation discretized by the Finite Element Method. We conduct some tests using a CPU-based direct solver, a CPU-based Preconditioned Conjugate Gradient (PCG), and a GPU-based PCG. Finally, we report the solution time and the speed-up achieved in solving […]


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
