Posts

May, 31

Connected component identification and cluster update on GPU

Cluster identification tasks occur in a multitude of contexts in physics and engineering, such as cluster algorithms for simulating spin models, percolation simulations, segmentation problems in image processing, or network analysis. While it has been shown that graphics processing units (GPUs) can result in speedups of two to three orders of magnitude as […]
May, 30

Parallel ant colony for nonlinear function optimization with graphics hardware acceleration

This paper presents a massively parallel ant colony optimization – pattern search (ACO-PS) algorithm with graphics hardware acceleration on nonlinear function optimization problems. The objective of this study is to determine the effectiveness of using graphics processing units (GPU) as a hardware platform for ACO-PS. GPU, the common graphics hardware found in modern personal computers, […]
May, 30

A Compute Unified System Architecture for Graphics Clusters Incorporating Data Locality

We present a development environment for distributed GPU computing targeted for multi-GPU systems, as well as graphics clusters. Our system is based on CUDA and logically extends its parallel programming model for graphics processors to higher levels of parallelism, namely, the PCI bus and network interconnects. While the extended API mimics the full function set […]
May, 30

A Micro-benchmark Suite for AMD GPUs

Optimizing programs for Graphics Processing Units (GPUs) requires thorough knowledge about the values of architectural features for the new computing platform. However, this knowledge is frequently unavailable, e.g., due to insufficient documentation, which is probably a result of the infancy of general purpose computing on the GPU. What makes the modeling of program performance on […]
May, 30

Real-Time Rendering and Manipulation of Large Terrains

Terrains are challenging geometric objects for real-time rendering and interactive manipulation. State-of-the-art terrain rendering systems use custom multi-resolution representations like geometry clipmaps for fast rendering on the GPU. In this paper, we present a system that exploits the power and flexibility of modern GPUs to store, render, and manipulate terrains with minimal CPU involvement. […]
May, 30

Fast Parallel Markov Clustering in Bioinformatics Using Massively Parallel Graphics Processing Unit Computing

Markov clustering is becoming a key algorithm within bioinformatics for determining clusters in networks. For instance, clustering protein interaction networks is helping find genes implicated in diseases such as cancer. However, with fast sequencing and other technologies generating vast amounts of data on biological networks, performance and scalability issues are becoming a critical limiting […]
May, 30

Evaluating the potential of graphics processors for high performance embedded computing

Today’s high performance embedded computing applications are posing significant challenges for processing throughput. Traditionally, such applications have been realized on application specific integrated circuits (ASICs) and/or digital signal processors (DSPs). However, ASICs’ advantage in performance and power often could not justify the fast-increasing fabrication cost, while current DSPs offer a limited processing throughput that […]
May, 30

High performance comparison-based sorting algorithm on many-core GPUs

Sorting is a kernel algorithm for a wide range of applications. In this paper, we present a new algorithm, GPU-Warpsort, to perform comparison-based parallel sort on Graphics Processing Units (GPUs). It mainly consists of a bitonic sort followed by a merge sort. Our algorithm achieves high performance by efficiently mapping the sorting tasks to GPU […]
May, 30

VHF SAR image formation implemented on a GPU

This paper will describe how off-the-shelf 3D graphics cards can be used for scientific computation like SAR processing. In particular, a highly efficient one-dimensional FFT and a fast direct (global) backprojection implementation will be presented and analyzed.
May, 30

GPU-Accelerated Background Generation Algorithm with Low Latency

A background model is constructed to detect moving objects in video sequences. Subsequent frames are stacked on top of each other using associated registration information that must be obtained in a preprocessing step. The change of the brightness value in each pixel over time might be caused by a moving object. Current graphics processing units […]
May, 30

Performance Acceleration of Kernel Polynomial Method Applying Graphics Processing Units

The Kernel Polynomial Method (KPM) is one of the fast diagonalization methods used for simulations of quantum systems in research fields of condensed matter physics and chemistry. The algorithm is difficult to parallelize on a cluster computer or a supercomputer due to its fine-grained recursive calculations. This paper proposes an implementation of the […]
May, 29

GPU-based parallel-beam and cone-beam forward- and backprojection using CUDA

Analytic and iterative tomographic image reconstruction suffers from the time-consuming forward- and backprojection steps. We recently presented a set of acceleration techniques that improve the performance of forward- and backprojection on CPU (central processing unit) and CBE (Cell Broadband Engine)-based platforms compared to straightforward implementations by several orders of magnitude.

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org