
Posts

Oct, 29

An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness

GPU architectures are increasingly important in the multi-core era due to their high number of parallel processors. Programming thousands of massively parallel threads is a big challenge for software engineers, but understanding the performance bottlenecks of those parallel programs on GPU architectures to improve application performance is even more difficult. Current approaches rely on programmers […]
Oct, 29

GPU-based Video Feature Tracking and Matching

This paper describes novel implementations of the KLT feature tracking and SIFT feature extraction algorithms that run on the graphics processing unit (GPU) and are suitable for video analysis in real-time vision systems. While significant acceleration over standard CPU implementations is obtained by exploiting parallelism provided by modern programmable graphics hardware, the CPU is […]
Oct, 28

GPU acceleration of cutoff pair potentials for molecular modeling applications

The advent of systems biology requires the simulation of ever-larger biomolecular systems, demanding a commensurate growth in computational power. This paper examines the use of the NVIDIA Tesla C870 graphics card programmed through the CUDA toolkit to accelerate the calculation of cutoff pair potentials, one of the most prevalent computations required by many different molecular […]
Oct, 28

Real-time eye blink detection with GPU-based SIFT tracking

This paper reports on the implementation of a GPU-based, real-time eye blink detector on very low contrast images acquired under near-infrared illumination. This detector is part of a multi-sensor data acquisition and analysis system for driver performance assessment and training. Eye blinks are detected inside regions of interest that are aligned with the subject’s eyes […]
Oct, 28

AUTO-GC: Automatic translation of data mining applications to GPU clusters

Because of the very favorable price-to-performance ratio of GPUs, a popular parallel programming configuration today is a cluster of GPUs. However, extracting performance on such a configuration would typically require programming in both MPI and CUDA, thus requiring a high degree of expertise and effort. It is clearly desirable to be able […]
Oct, 28

GPU-based streaming architectures for fast cone-beam CT image reconstruction and demons deformable registration

This paper shows how to significantly accelerate cone-beam CT reconstruction and 3D deformable image registration using the stream-processing model. We describe data-parallel designs for the Feldkamp, Davis and Kress (FDK) reconstruction algorithm, and the demons deformable registration algorithm, suitable for use on a commodity graphics processing unit. The streaming versions of these algorithms are implemented […]
Oct, 28

GPU Gems: Programming Techniques, Tips and Tricks for Real-Time Graphics

GPU Gems has won a prestigious Front Line Award from Game Developer Magazine. The Front Line Awards recognize products that enable faster and more efficient game development, advancing the state of the art. FULL COLOR THROUGHOUT! “This collection of articles is particularly impressive for its depth and breadth. The book includes product-oriented case studies, previously unpublished […]
Oct, 28

CUDA cuts: Fast graph cuts on the GPU

Graph cuts has become a powerful and popular optimization tool for energies defined over an MRF and has found applications in image segmentation, stereo vision, image restoration, etc. The maxflow/mincut algorithm to compute graph cuts is computationally heavy. The best-reported implementation of graph cuts takes over 100 milliseconds even on images of size 640×480 and cannot […]
Oct, 28

Browsing a Large Collection of Community Photos Based on Similarity on GPU

A novel approach is proposed in this paper to facilitate browsing a large collection of community photos based on visual similarities. Using extracted feature vectors, the approach maps photos onto a 2D rectangular area such that the ones with similar features are close to each other. When a user browses the collection, a subset of […]
Oct, 28

GPU packet classification using OpenCL: a consideration of viable classification methods

Packet analysis is an important aspect of network security, which typically relies on a flexible packet filtering system to extrapolate important packet information from each processed packet. Packet analysis is a computationally intensive, highly parallelisable task, and as such, classification of large packet sets, such as those collected by a network telescope, can require significant […]
Oct, 28

An adaptive performance modeling tool for GPU architectures

This paper presents an analytical model to predict the performance of general-purpose applications on a GPU architecture. The model is designed to provide performance information to an auto-tuning compiler and assist it in narrowing down the search to the more promising implementations. It can also be incorporated into a tool to help programmers better assess […]
Oct, 28

An efficient implementation of Smith Waterman algorithm on GPU using CUDA, for massively parallel scanning of sequence databases

The Smith Waterman algorithm for sequence alignment is one of the main tools of bioinformatics. It is used for sequence similarity searches and alignment of similar sequences. High-end graphics processing units (GPUs), used for processing graphics on desktop computers, deliver computational capabilities exceeding those of CPUs by an order of magnitude. Recently these […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org