high performance computing on graphics processing units: hgpu.org

Posts

Mar, 24

Digital beamforming using a GPU

In this paper we investigate the use of GPUs as digital beamformers. We specify a parallel implementation of a beamformer in the time and frequency domains and measure its performance. We also give examples of the processing limits of the NVIDIA GeForce 8800 GPU with respect to application parameters: number of sensors, sampling frequency, bandwidth, and number […]

CUDA

Mar, 24

A realtime GPU subdivision kernel

By organizing the control mesh of subdivision in texture memory so that irregularities occur strictly inside independently refinable fragment meshes, all major features of subdivision algorithms can be realized in the framework of highly parallel stream processing. Our implementation of Catmull-Clark subdivision as a GPU kernel in programmable graphics hardware can model features like semi-smooth […]

OpenGL

Mar, 24

General-purpose GPU computing: practice and experience

This workshop will cover advances and innovations in graphics processing unit (GPU) capabilities and functionality for nontraditional, general-purpose computing as an adjunct vector/matrix processor. Examples include game physics, image processing, scientific computing, sorting and database query processing, to name a few. This workshop consists of invited speakers and poster presenters who provide insights into GP2 practice […]

Mar, 24

Accelerating Simulations of Light Scattering Based on Finite-Difference Time-Domain Method with General Purpose GPUs

Simulations of light scattering from nano-structured surface areas require a substantial amount of computing time. The emergence of General Purpose Graphics Processing Units (GPGPUs) as affordable PC SIMD arithmetic coprocessors brings the necessary computing power to modern desktop PCs. In this paper we examine how the computation time of the Finite-Difference Time-Domain (FDTD), a classic numerical […]

CUDA
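To make the FDTD entries in this listing more concrete, here is a minimal one-dimensional sketch of the standard leapfrog (Yee) update in normalized units. It is an illustration only, not the method of any listed paper: the function name `fdtd_1d`, the grid size, the Courant number of 0.5, and the Gaussian source are all assumptions chosen for the example. Each cell update depends only on its immediate neighbours, which is what makes the scheme a natural fit for GPUs.

```python
import math


def fdtd_1d(num_cells=200, num_steps=100, courant=0.5):
    """Run a normalized 1D FDTD simulation and return the electric field.

    Hypothetical helper for illustration. The end cells of ez are never
    updated, so they act as perfectly conducting (reflecting) boundaries.
    """
    ez = [0.0] * num_cells          # electric field at integer grid points
    hy = [0.0] * (num_cells - 1)    # magnetic field at half grid points
    source_cell = num_cells // 2

    for step in range(num_steps):
        # Update H from the spatial difference of E.
        for i in range(num_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])
        # Update interior E from the spatial difference of H.
        for i in range(1, num_cells - 1):
            ez[i] += courant * (hy[i] - hy[i - 1])
        # Inject a Gaussian pulse (assumed source, centred at step 30).
        ez[source_cell] += math.exp(-(((step - 30) / 10.0) ** 2))

    return ez


if __name__ == "__main__":
    field = fdtd_1d()
    print(max(abs(value) for value in field))
```

On a GPU, each of the two inner loops would typically map to one kernel launch with one thread per cell, since all cells in a loop can be updated independently.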

Mar, 24

Fast in-place sorting with CUDA based on bitonic sort

State-of-the-art graphics processors provide high processing power, and the high programmability of GPUs offered by frameworks like CUDA increases their usability as high-performance coprocessors for general-purpose computing. Sorting is well investigated in computer science in general, but (because of this new field of application for GPUs) there is a demand for high-performance […]

CUDA
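As background for the sorting entry above, the following is a minimal sequential sketch of a bitonic sorting network. It is not the paper's CUDA implementation, which the truncated abstract does not describe; the function name `bitonic_sort` and the power-of-two length restriction are assumptions made to keep the example short. The compare-and-swap pattern is fixed in advance and independent of the data, which is why every iteration of the innermost loop could run as a separate GPU thread.

```python
def bitonic_sort(values):
    """Sort a list in place, ascending, with an iterative bitonic network.

    Hypothetical helper for illustration. Assumes len(values) is a power
    of two (or zero); other lengths raise ValueError.
    """
    n = len(values)
    if n & (n - 1) != 0:
        raise ValueError("length must be a power of two")

    k = 2  # size of the bitonic sequences being merged
    while k <= n:
        j = k // 2  # distance between compared elements
        while j > 0:
            # Every i in this loop is independent of the others.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (values[i] > values[partner]) == ascending:
                        values[i], values[partner] = values[partner], values[i]
            j //= 2
        k *= 2
    return values


if __name__ == "__main__":
    print(bitonic_sort([7, 3, 9, 1, 8, 2, 6, 4]))
```

The sort needs no auxiliary buffer, which matches the "in-place" property named in the title.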

Mar, 24

A GPU approach to FDTD for Radio Coverage Prediction

The benefits of using finite-difference-like methods for coverage prediction include highly accurate electromagnetic simulations that serve as a reliable input for wireless network planning and optimization algorithms. These algorithms usually require several thousands of iterations in order to find the optimal network configuration, so to obtain results within reasonable computation times, the applied propagation […]

CUDA

Mar, 24

A Novel Scheme for High Performance Finite-Difference Time-Domain (FDTD) Computations Based on GPU

Finite-Difference Time-Domain (FDTD) has been proved to be a very useful computational electromagnetic algorithm. However, the scheme based on traditional general purpose processors can be computationally prohibitive and require thousands of CPU hours, which hinders the large-scale application of FDTD. With rapid progress on GPU hardware capability and its programmability, we propose in this paper […]

Mar, 23

Modeling GPU-CPU Workloads and Systems

Heterogeneous systems, systems with multiple processors tailored for specialized tasks, are challenging programming environments. While it may be possible for domain experts to optimize a high performance application for a very specific and well documented system, it may not perform as well or even function on a different system. Developers who have less experience with […]

Mar, 23

GViM: GPU-accelerated virtual machines

The use of virtualization to abstract underlying hardware can aid in sharing such resources and in efficiently managing their use by high performance applications. Unfortunately, virtualization also prevents efficient access to accelerators, such as Graphics Processing Units (GPUs), that have become critical components in the design and architecture of HPC systems. Supporting General Purpose computing […]

CUDA

Mar, 23

GPU-Assisted Computation of Centroidal Voronoi Tessellation

Centroidal Voronoi tessellations (CVT) are widely used in computational science and engineering. The most commonly used method is Lloyd's method, and recently the L-BFGS method was shown to be faster than Lloyd's method for computing the CVT. However, these methods run on the CPU and are still too slow for many practical applications. We present […]

Mar, 23

GPU Random Numbers via the Tiny Encryption Algorithm

Random numbers are extensively used on the GPU. As more computation is ported to the GPU, it can no longer be treated as rendering hardware alone. Random number generators (RNGs) are expected to cater to general-purpose and graphics applications alike. Such diversity adds to the expected requirements of an RNG. A good GPU RNG should be […]

CUDA
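The idea named in the title above can be sketched as follows: run the Tiny Encryption Algorithm (TEA) block cipher over a counter and treat the ciphertext as random bits. This is a minimal sketch under stated assumptions, not the paper's implementation. The abstract is truncated, so the round count, the key, the helper names `tea_encrypt` and `tea_random`, and the way a counter and stream index are packed into the input block are all choices made here for illustration.

```python
MASK32 = 0xFFFFFFFF
DELTA = 0x9E3779B9  # standard TEA key-schedule constant


def tea_encrypt(v0, v1, key, rounds=32):
    """Encrypt the 64-bit block (v0, v1) with TEA under a 4-word key."""
    total = 0
    for _ in range(rounds):
        total = (total + DELTA) & MASK32
        v0 = (v0 + (((v1 << 4) + key[0]) ^ (v1 + total) ^ ((v1 >> 5) + key[1]))) & MASK32
        v1 = (v1 + (((v0 << 4) + key[2]) ^ (v0 + total) ^ ((v0 >> 5) + key[3]))) & MASK32
    return v0, v1


def tea_random(counter, stream=0, key=(0xA341316C, 0xC8013EA4, 0xAD90777D, 0x7E95761E)):
    """Map (counter, stream) to a float in [0, 1).

    Hypothetical helper: the key is an arbitrary constant and the packing
    of counter and stream into the block is an assumption.
    """
    v0, _ = tea_encrypt(counter & MASK32, stream & MASK32, key)
    return v0 / 2.0**32


if __name__ == "__main__":
    # Each value depends only on its own counter, so no state is shared.
    print([tea_random(i) for i in range(4)])
```

Because each output is a pure function of its counter, every GPU thread could generate its own value from its thread index without any shared generator state. Reducing `rounds` trades statistical quality for speed.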

Mar, 23

Aspects of GPU for general purpose high performance computing

We discuss hardware and software aspects of GPGPU, specifically focusing on NVIDIA cards and CUDA, from the viewpoint of parallel computing. The major weak points of GPUs compared with the newest supercomputers are identified and summarized in four points: large SIMD vector length, small memory, absence of fast L2 cache, and high register spill […]

CUDA

gpu_tracker: Context manager and CLI that tracks the computational-resource-usage of a code block or shell command, particularly the GPU usage

gpu_tracker: Python package for tracking and profiling GPU utilization in both desktop and high-performance computing environments

* * *

Recent source codes

QArray

Celerity: High-level C++ for Accelerator Clusters

CIFAR-10 Airbench: 94% on CIFAR-10 in 3.29 second

gpu_tracker: Context manager and CLI that tracks the computational-resource-usage of a code block or shell command, particularly the GPU usage

LOOPer: a polyhedral compiler for expressing fast and portable data parallel algorithms

OpenMC Monte Carlo Code

Polygeist: C/C++ frontend for MLIR

Parallel Gaussian process with kernel approximation in CUDA

Optical flow algorithms for SYCL

OpenMP5-Offload-OpenMC-Intel-PVC

Most viewed papers (last 30 days)