
Nov, 4

Benchmarking GPUs to tune dense linear algebra

We present performance results for dense linear algebra using recent NVIDIA GPUs. Our matrix-matrix multiply routine (GEMM) runs up to 60% faster than the vendor’s implementation and approaches the peak of hardware capabilities. Our LU, QR and Cholesky factorizations achieve up to 80–90% of the peak GEMM rate. Our parallel LU running on two GPUs […]
Nov, 4

Accelerated regular grid traversals using extended anisotropic chessboard distance fields on a parallel stream processor

Modern graphics processing units (GPUs) are an implementation of parallel stream processors. In recent years, there have been a few studies on mapping ray tracing to the GPU. Since graphics processors are not designed to process complex data structures, it is crucial to explore data structures and algorithms for efficient stream processing. In particular ray […]
Nov, 4

Feature tracking and matching in video using programmable graphics hardware

This paper describes novel implementations of the KLT feature tracking and SIFT feature extraction algorithms that run on the graphics processing unit (GPU) and are suitable for video analysis in real-time vision systems. While significant acceleration over standard CPU implementations is obtained by exploiting parallelism provided by modern programmable graphics hardware, the CPU is freed […]
Nov, 4

Multilevel Multidimensional Scaling on the GPU

We present Glimmer, a new multilevel visualization algorithm for multidimensional scaling designed to exploit modern graphics processing unit (GPU) hardware. We also present GPU-SF, a parallel, force-based subsystem used by Glimmer. Glimmer organizes input into a hierarchy of levels and recursively applies GPU-SF to combine and refine the levels. The multilevel nature of the algorithm […]
Nov, 4

High performance volume splatting for visualization of neurovascular data

A new technique is presented to increase the performance of volume splatting by using hardware accelerated point sprites. This allows creating screen aligned elliptical splats for high quality volume splatting at very low cost on the GPU. Only one vertex per splat is stored on the graphics card. GPU generated point sprite texture coordinates are […]
Nov, 4

Fast multipole methods on graphics processors

The fast multipole method allows the rapid approximate evaluation of sums of radial basis functions. For a specified accuracy, ε, the method scales as O(N) in both time and memory compared to the direct method with complexity O(N^2), which allows the solution of larger problems with given resources. Graphical processing units (GPUs) are now increasingly […]
Nov, 4

Solving Dense Linear Systems on Graphics Processors

We present several algorithms to compute the solution of a linear system of equations on a GPU, as well as general techniques to improve their performance, such as padding and hybrid GPU-CPU computation. We also show how iterative refinement with mixed-precision can be used to regain full accuracy in the solution of linear systems. Experimental […]
Nov, 4

Parallel Computing Experiences with CUDA

The CUDA programming model provides a straightforward means of describing inherently parallel computations, and NVIDIA’s Tesla GPU architecture delivers high computational throughput on massively parallel problems. This article surveys experiences gained in applying CUDA to a diverse set of problems and the parallel speedups over sequential codes running on traditional CPU architectures attained by executing […]
Nov, 4

Issues and challenges in compiling for graphics processors

Graphics has been one of the best success stories of parallel processing. Using a unique combination of specialized hardware and a specialized programming model, game developers routinely write high performance code using millions of threads. Each generation of graphics processors (GPUs) delivers higher performance and is more programmable than the last. Unlike CPUs, these processors are […]
Nov, 4

GPUCV: A Framework for Image Processing Acceleration with Graphics Processors

This paper presents a state-of-the-art report on using graphics hardware for image processing and computer vision. Then we describe GPUCV, an open library for easily developing GPU-accelerated image processing and analysis operators and applications.
Nov, 4

GAMER with out-of-core computation

GAMER is a GPU-accelerated Adaptive-MEsh-Refinement code for astrophysical simulations. In this work, two further extensions of the code are reported. First, we have implemented the MUSCL-Hancock method with Roe’s Riemann solver for the hydrodynamic evolution, by which the accuracy, overall performance and the GPU versus CPU speed-up factor are improved. Second, we have implemented […]
Nov, 4

Relational joins on graphics processors

We present a novel design and implementation of relational join algorithms for new-generation graphics processing units (GPUs). The most recent GPU features include support for writing to random memory locations, efficient inter-processor communication, and a programming model for general-purpose computing. Taking advantage of these new features, we design a set of data-parallel primitives such as […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
