Posts

May, 12

Accelerating QDP++ using GPUs

Graphics Processing Units (GPUs) are becoming increasingly important as target architectures in scientific High Performance Computing (HPC). NVIDIA established CUDA as a parallel computing architecture for controlling and making use of the compute power of GPUs. CUDA provides sufficient support for C++ language elements to enable the Expression Template (ET) technique in the device memory domain. […]
May, 11

A GPU-based iterated tabu search for solving the quadratic 3-dimensional assignment problem

The quadratic 3-dimensional assignment problem (Q3AP) is an extension of the well-known NP-hard quadratic assignment problem. It has been proved to be one of the most difficult combinatorial optimization problems. Local search (LS) algorithms are a class of heuristics which have been successfully applied to solve such hard optimization problems. These methods handle with a […]
May, 11

Optimized GPU Framework for Speckle Reduction Using Histogram Matching and Region Growing

A GPU framework for ultrasound speckle reduction by region growing based on local statistics extracted from the histogram shape is presented. The required image processing is computationally intensive, involving histogram calculation, region growing, box filtering using different sizes of windows, and more. In this paper, we describe the use of a graphics processing unit for […]
May, 11

Real-Time Simulation and Rendering of 3D Smoke on GPU Programme

Natural scene simulation has long been a hot topic in computer graphics. This paper addresses the physically-based simulation of smoke on the graphics processing unit (GPU). The work is set against the background of an actual project. The simulation is based on Stam’s semi-Lagrangian scheme; the MacCormack scheme is used to solve the advection term of the Navier-Stokes […]
May, 11

Fast GPU implementation of large scale dictionary and sparse representation based vision problems

Recently, Computer Vision problems like Face Recognition and Super-Resolution solved using sparse representation based methods with large dictionaries have shown state-of-the-art results. However, such methods are computationally prohibitive for typical CPUs, especially for a large dictionary size. We present a fast implementation of these methods by exploiting the massively parallel processing capabilities of a GPU within […]
May, 11

Where is the data? Why you cannot debate CPU vs. GPU performance without the answer

General purpose GPU Computing (GPGPU) has taken off in the past few years, with great promises for increased desktop processing power due to the large number of fast computing cores on high-end graphics cards. Many publications have demonstrated phenomenal performance and have reported speedups of as much as 1000x over code running on multi-core CPUs. Other […]
May, 11

Fast JND-Based Video Carving With GPU Acceleration for Real-Time Video Retargeting

A recently developed image resizing technique, seam carving, has been proved to be a useful tool for content-adaptive spatially nonuniform image resizing with the purpose of optimal display on a screen of reduced resolution or different aspect ratio. In this paper, we present a fast algorithm for real-time content-aware video retargeting based on the improved […]
May, 11

Differential evolution algorithm on the GPU with C-CUDA

Several areas of knowledge are benefiting from the reduction in computing time achieved by using Graphics Processing Units (GPUs) and the Compute Unified Device Architecture (CUDA) platform. In the case of evolutionary algorithms, which are inherently parallel, this technology may be advantageous for running experiments that demand high computing time. In this paper, […]
May, 11

Whole-function vectorization

Data-parallel programming languages are an important component in today’s parallel computing landscape. Among those are domain-specific languages like shading languages in graphics (HLSL, GLSL, RenderMan, etc.) and “general-purpose” languages like CUDA or OpenCL. Current implementations of those languages on CPUs rely solely on multi-threading to implement parallelism and ignore the additional intra-core parallelism provided by […]
May, 11

Gemma in April: A matrix-like parallel programming architecture on OpenCL

Nowadays, the Graphics Processing Unit (GPU), a kind of massively parallel processor, is widely used in general-purpose computing tasks. Although mature development tools exist, writing GPU programs is not a trivial task for programmers. Based on this consideration, we propose a novel parallel computing architecture. The architecture includes a […]
May, 11

High performance memetic algorithm particle filter for multiple object tracking on modern GPUs

This work presents an effective approach to visual tracking using a graphics processing unit (GPU) for computation purposes. To obtain a performance improvement over other platforms, it is convenient to select suitable algorithms such as population-based ones. They have a parallel-friendly nature, requiring many independent evaluations that map well to the parallel […]
May, 10

Data-aware scheduling of legacy kernels on heterogeneous platforms with distributed memory

In this paper, we describe a runtime to automatically enhance the performance of applications running on heterogeneous platforms consisting of a multi-core (CPU) and a throughput-oriented many-core (GPU). The CPU and GPU are connected by a non-coherent interconnect such as PCI-E, and as such do not have shared memory. Heterogeneous platforms available today such as […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
