Posts

Mar, 19

GPU Enhanced Simulation of Angiogenesis

In this paper we present the use of graphics processing units to accelerate the most time-consuming stages of a simulation of angiogenesis and tumor growth. By using advanced CUDA mechanisms such as shared memory, textures and atomic operations, we managed to speed up the CUDA kernels by a factor of 57. However, in […]
Mar, 19

Parallelization of Particle Filter Algorithms

This paper presents the parallelization of the particle filter algorithm in a single-target video tracking application. In this document we demonstrate the process by which we parallelized the particle filter algorithm, beginning with a MATLAB implementation. The final CUDA program provided an approximately 71x speedup over the initial MATLAB implementation.
Mar, 19

Computational Intelligence on Consumer Games and Graphics Hardware CIGPU-2012

The fifth International workshop and tutorial on Computational Intelligence on Consumer Games and Graphics Hardware (CIGPU 2012) will be held as a hybrid special session of the IEEE WCCI 2012 conference in Brisbane, 10-15 June 2012. WCCI 2012, the IEEE world congress on computational intelligence, joins together three international conferences: IJCNN 2012, FUZZ-IEEE 2012 and […]
Mar, 18

Improving Cache Locality for Ray Casting with CUDA

In this paper, we present an acceleration method for texture-based ray casting on the compute unified device architecture (CUDA) compatible graphics processing unit (GPU). Since ray casting is a memory-intensive application, our method increases the hit rate of the texture cache during rendering. To achieve this, our method dynamically selects the width and height of […]
Mar, 18

Towards user transparent parallel multimedia computing on GPU-clusters

The research area of Multimedia Content Analysis (MMCA) considers all aspects of the automated extraction of knowledge from multimedia archives and data streams. To satisfy the increasing computational demands of MMCA problems, the use of High Performance Computing (HPC) techniques is essential. As most MMCA researchers are not HPC experts, there is an urgent need […]
Mar, 18

Enabling Fast, Noncontiguous GPU Data Movement in Hybrid MPI+GPU Environments

Lack of efficient and transparent interaction with GPU data in hybrid MPI+GPU environments challenges GPU acceleration of large-scale scientific and engineering computations. A particular challenge is the efficient transfer of noncontiguous data to and from GPU memory. MPI supports such transfers through the use of datatypes; however, an efficient means of utilizing datatypes for […]
Mar, 18

Usable assembly language for GPUs: a success story

The NVIDIA compilers nvcc and ptxas leave the programmer with only very limited control over register allocation, register spills, instruction selection, and instruction scheduling. In theory a programmer can gain control by writing an entire kernel in van der Laan’s cudasm assembly language, but this requires tedious, error-prone tracking of register assignments. This paper introduces […]
Mar, 18

VOCL: An Optimized Environment for Transparent Virtualization of Graphics Processing Units

Graphics processing units (GPUs) have been widely used for general purpose computation acceleration. However, current programming models such as CUDA and OpenCL can support GPUs only on the local computing node, where the application execution is tightly coupled to the physical GPU hardware. In this work, we propose a virtual OpenCL (VOCL) framework to support […]
Mar, 16

Globally scheduled real-time multiprocessor systems with GPUs

Graphics processing units, GPUs, are powerful processors that can offer significant performance advantages over traditional CPUs. The last decade has seen rapid advancement in GPU computational power and generality. Recent technologies make it possible to use GPUs as co-processors to CPUs. The performance advantages of GPUs can be great, often outperforming traditional CPUs by orders […]
Mar, 16

CUDA 2D Stencil Computations for the Jacobi Method

We are witnessing the consolidation of the GPU streaming paradigm in parallel computing. This paper explores stencil operations in CUDA to optimize the Jacobi method for solving Laplace’s differential equation on GPUs. The code keeps the access pattern constant through a large number of loop iterations, making it representative of a wide set of […]
Mar, 16

Parallel Sparse Linear Algebra for Multi-core and Many-core Platforms: Parallel Solvers and Preconditioners

Partial differential equations are typically solved by means of finite difference, finite volume or finite element methods resulting in large, highly coupled, ill-conditioned and sparse (non-)linear systems. In order to minimize the computing time we want to exploit the capabilities of modern parallel architectures. The rapid hardware shifts from single core to multi-core and many-core […]
Mar, 16

Developing a CUDA solver for large sparse matrices for MARIN

This master's thesis has been written for the degree of Master of Science in Applied Mathematics at the faculty of Electrical Engineering, Mathematics and Computer Sciences of Delft University of Technology. The report concludes a nine-month internship carried out at Maritime Research Institute Netherlands (MARIN). MARIN supplies innovative products for the offshore industry and […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org