Posts

Sep, 24

Sound Speed Optimization Using Image Texture on CUDA

The Compute Unified Device Architecture (CUDA) is a brand new parallel processing platform making use of the unified shader design of the most current Graphics Processing Units (GPUs) from NVIDIA. In this paper, we apply this revolutionary new technology to implement the sound speed optimization (SSO) with image texture analysis for medical ultrasound imaging. The […]
Sep, 24

Resolution of the Vlasov-Maxwell system by PIC Discontinuous Galerkin method on GPU with OpenCL

We present an implementation of a Vlasov-Maxwell solver for multicore processors. The Vlasov equation describes the evolution of charged particles in an electromagnetic field that is itself a solution of the Maxwell equations. The Vlasov equation is solved by a Particle-In-Cell (PIC) method, while the Maxwell system is computed by a Discontinuous Galerkin method. We use the OpenCL framework, […]
Sep, 24

A Hardware-Accelerated Parallel Implementation of a Two-Dimensional Scheme for Free Surface Flows

This contribution concerns the verification and performance assessment of a hardware-accelerated parallel implementation of an algorithm for the semi-implicit finite difference method for solving the vertically integrated shallow water equations including a non-linear treatment of wetting and drying and conservative advection schemes. Instead of adapting an existing serial, OpenMP-, or MPI-parallelised code with all necessary […]
Sep, 24

ACO on Multiple GPUs with CUDA for Faster Solution of QAPs

In this paper, we implement ACO algorithms on a PC with four GTX 480 GPUs. We implement two types of ACO models: the island model and the master/slave model. When we compare the two, the island model shows promising speedup values on class (iv) QAP instances. On the other […]
Sep, 24

GPU-based Offset Surface Computation using Point Samples

We present an efficient algorithm to perform approximate offsetting operations on geometric models using GPUs. Our approach approximates the boundary of an object with point samples and computes the offset by merging the balls centered at these points. The underlying approach uses Layered Depth Images (LDI) to organize the samples into structured points and performs […]
Sep, 23

Exploring Multi-level Parallelism for Large-Scale Spiking Neural Networks

Several biologically inspired applications have been motivated by Spiking Neural Networks (SNNs) such as the Hodgkin-Huxley (HH) and Izhikevich models, owing to their high biological accuracy. The inherently massively parallel nature of SNN simulations makes them a good fit for heterogeneous computing resources such as General-Purpose Graphics Processing Unit (GPGPU) clusters. In […]
Sep, 23

Adaptive Treelet Meshes for Efficient Streak-Surface Visualization on the GPU

We describe a novel adaptive mesh representation for streak-surfaces. The surface is represented as a mesh of small trees of initial depth zero (treelets). This mesh representation allows for efficient integration, refinement, coarsening and appending of surface patches utilizing the computational capacities of modern GPUs. Integration, refinement, and rendering are strictly separated into effectively parallelizable […]
Sep, 23

Task Performance with List-Mode Data

This dissertation investigates the application of list-mode data to detection, estimation, and image reconstruction problems, with an emphasis on emission tomography in medical imaging. We begin by introducing a theoretical framework for list-mode data and we use it to define two observers that operate on list-mode data. These observers are applied to the problem of […]
Sep, 23

Computer Vision Application in Graphic Processors

Largely driven by the gaming industry, research and development of hardware for image generation, such as graphics cards (GPUs, Graphics Processing Units), has experienced tremendous growth in recent years. The increased power and flexibility and the low price of these GPUs have resulted in unexpected use in areas other than graphics. […]
Sep, 23

A Quantitative Study of Irregular Programs on GPUs

GPUs have been used to accelerate many regular applications and, more recently, irregular applications in which the control flow and memory access patterns are data-dependent and statically unpredictable. This paper defines two measures of irregularity called control-flow irregularity and memory-access irregularity, and investigates, using performance-counter measurements, how irregular GPU kernels differ from regular kernels with […]
Sep, 22

Computing of high breakdown regression estimators without sorting on graphics processing units

We present an approach to computing high-breakdown regression estimators in parallel on graphics processing units (GPU). We show that sorting the residuals is not necessary, and it can be substituted by calculating the median. We present and compare various methods to calculate the median and order statistics on GPUs. We introduce an alternative method based […]
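
The idea of replacing a full sort of the residuals with a median computation can be illustrated on the CPU. The sketch below is not the paper's GPU method; it assumes a least-median-of-squares style objective for a straight-line fit and uses a simple selection routine. The names `select_kth` and `median_squared_residual` are hypothetical helpers introduced only for this illustration.

```python
import random


def select_kth(values, k):
    """Return the k-th smallest value (0-based) without fully sorting.

    Quickselect-style selection: each round keeps only the part of the
    data that can still contain the k-th smallest element.
    """
    items = list(values)
    while True:
        pivot = random.choice(items)
        lows = [v for v in items if v < pivot]
        n_equal = sum(1 for v in items if v == pivot)
        if k < len(lows):
            items = lows
        elif k < len(lows) + n_equal:
            return pivot
        else:
            k -= len(lows) + n_equal
            items = [v for v in items if v > pivot]


def median_squared_residual(xs, ys, slope, intercept):
    """Objective value for one candidate line (assumed LMS-style criterion).

    Uses the (n // 2 + 1)-th smallest squared residual, so no sort of
    the residual vector is needed.
    """
    squared = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return select_kth(squared, len(squared) // 2)


# Four points lie on y = x; the fifth is a gross outlier.
xs = [0, 1, 2, 3, 4]
ys = [0, 1, 2, 3, 100]
objective = median_squared_residual(xs, ys, slope=1, intercept=0)
```

Because only one order statistic is needed per candidate fit, selection does expected linear work in the number of residuals, whereas a full sort does more than the objective requires.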
Sep, 22

SpMV: A Memory-Bound Application on the GPU Stuck Between a Rock and a Hard Place

In this paper, we investigate the relative merits of GPGPUs and multicores in the context of sparse matrix-vector multiplication (SpMV). While GPGPUs possess impressive capabilities in terms of raw compute throughput and memory bandwidth, their performance varies significantly with application tuning as well as sparse input and format characteristics. Furthermore, several emerging technological and workload […]
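
For readers unfamiliar with the kernel, a minimal serial sketch of SpMV follows. It assumes the compressed sparse row (CSR) format, which is one common choice and not necessarily the format studied in the paper; the function name `spmv_csr` and the sample matrix are illustrative only.

```python
def spmv_csr(row_ptr, col_idx, vals, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i] .. row_ptr[i + 1] delimits the nonzeros of row i;
    col_idx and vals hold their column indices and values.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # One multiply-add per nonzero, but three loads:
            # the value, its column index, and an indirectly addressed x entry.
            acc += vals[k] * x[col_idx[k]]
        y[row] = acc
    return y


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
y = spmv_csr(row_ptr, col_idx, vals, [1.0, 2.0, 3.0])  # [7.0, 6.0, 19.0]
```

The inner loop does very little arithmetic per byte loaded, and the access to `x` depends on the sparsity pattern, which is why the kernel is usually limited by memory traffic rather than by compute throughput.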

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors