high performance computing on graphics processing units: hgpu.org

Posts

Mar, 29

SAR raw signal simulation based on GPU parallel computation

In this paper we present a raw signal simulator for synthetic aperture radar (SAR) based on GPU parallel computation. We describe in detail a mathematical model of SAR simulation based on the FFT and implement it through GPU parallel computation. The GPU performs better than the CPU in complex calculations. It supports parallel computation and raises the […]

Mar, 29

An improved visual inspection system using visual servo

In this paper we present an improved automatic visual inspection system. In this system, homography based visual servo is used to accurately locate the camera position and attitude so that a template matching inspection can be realized. To improve the visual servo system’s performance, we propose a combination strategy of a GPU based Efficient Second-order […]

Mar, 29

Rendering of 3D Dynamic Virtual Environments

In this paper we present a framework for the rendering of dynamic 3D virtual environments which can be integrated in the development of videogames. It includes methods to manage sounds and particle effects, paged static geometries, the support of a physics engine and various input systems. It has been designed with a modular structure to […]

OpenGL

Mar, 29

Scandalously Parallelizable Mesh Generation

We propose a novel approach which employs random sampling to generate an accurate non-uniform mesh for numerically solving Partial Differential Equation Boundary Value Problems (PDE-BVPs). From a uniform probability distribution U over a 1D domain, we sample M discretizations of size N where M>>N. The statistical moments of the solutions to a given BVP on […]

Mar, 29

Multi-mass solvers for lattice QCD on GPUs

Graphical Processing Units (GPUs) are more and more frequently used for lattice QCD calculations. Lattice studies often require computing the quark propagators for several masses. These systems can be solved using multi-shift inverters but these algorithms are memory intensive which limits the size of the problem that can be solved using GPUs. In this paper, […]

CUDA

Mar, 28

GPU-Based Shooting and Bouncing Ray Method for Fast RCS Prediction

The shooting and bouncing ray (SBR) method is highly effective for radar cross section (RCS) prediction. For electrically large and complex targets, computing scattered fields is still time-consuming in many applications such as range profile and ISAR simulation. In this paper, we propose a GPU-based SBR that is fully implemented on the graphics processing unit […]

CUDA

Mar, 28

An Empirically Optimized Radix Sort for GPU

In this paper, we propose an empirical optimization technique for one of the most important sorting routines on GPU, the radix sort, that generates highly efficient code for a number of representative NVIDIA GPUs with a wide variety of architectural specifications. Our study has been focused on the algorithmic parameters of radix sort that can […]
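As a rough illustration of what an algorithmic parameter of radix sort can look like, the sketch below is a sequential least-significant-digit radix sort in Python in which the number of bits consumed per pass is tunable. It is not the paper's GPU implementation: the names `radix_sort` and `bits_per_pass` are assumptions made for this example, and the truncated abstract does not say which parameters the authors actually tune.

```python
def radix_sort(values, bits_per_pass=4):
    """LSD radix sort for non-negative integers (illustrative sketch only).

    bits_per_pass is a hypothetical tunable: more bits means fewer passes
    but a larger histogram per pass.
    """
    if bits_per_pass < 1:
        raise ValueError("bits_per_pass must be at least 1")
    out = list(values)
    if not out:
        return out
    if min(out) < 0:
        raise ValueError("this sketch only handles non-negative integers")

    radix = 1 << bits_per_pass
    mask = radix - 1
    max_value = max(out)
    shift = 0

    # Always run at least one pass, then continue while higher digits remain.
    while shift == 0 or (max_value >> shift) > 0:
        # Step 1: histogram of the current digit.
        counts = [0] * radix
        for v in out:
            counts[(v >> shift) & mask] += 1

        # Step 2: exclusive prefix sum turns counts into start offsets.
        offsets = [0] * radix
        running = 0
        for digit in range(radix):
            offsets[digit] = running
            running += counts[digit]

        # Step 3: stable scatter into the output buffer.
        scattered = [0] * len(out)
        for v in out:
            digit = (v >> shift) & mask
            scattered[offsets[digit]] = v
            offsets[digit] += 1

        out = scattered
        shift += bits_per_pass

    return out
```

Calling `radix_sort(data, bits_per_pass=8)` instead of the default trades fewer passes for a 256-entry histogram; on a GPU the histogram, prefix sum, and scatter steps are the parts that are typically parallelized.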

Mar, 28

GPU architecture evaluation for multispectral and hyperspectral image analysis

Graphics processing unit (GPU) architectures are widely used for resource-intensive computation. Initially dedicated to imaging, vision and graphics, these architectures nowadays serve a wide range of multi-purpose applications. The GPU structure, however, does not suit all applications, which can lead to performance shortfalls. Among several applications, the aim of this work is to analyze GPU […]

Mar, 28

GPU accelerated real time polarimetric image processing through the use of CUDA

Recent advancements in semiconductor fabrication have led to a dramatic increase in the size of data sets from advanced imaging sensors. While increased pixel counts lead to greater area coverage and higher resolution, they also result in higher image processing time. If real-time image processing is required, power and size requirements go up as large […]

CUDA

Mar, 28

GPU Based Spot Noise Parallel Algorithm for 2D Vector Field Visualization

The Graphics Processing Unit (GPU) has evolved into a parallel computation device thanks to its massively multithreaded architecture. Due to its high computational power, the GPU has been used to deal with many problems that can be easily parallelized. This paper presents a GPU-based spot noise parallel algorithm for 2D vector field visualization. It uses spot […]

CUDA

Mar, 28

A Chunking Method for Euclidean Distance Matrix Calculation on Large Dataset Using Multi-GPU

Calculating a Euclidean distance matrix is a data-intensive operation and becomes computationally prohibitive for large datasets. Recent development of Graphics Processing Units (GPUs) has produced superb performance on scientific computing problems using massively parallel processing cores. However, due to the limited size of device memory, many GPU-based algorithms have low capability in solving problems […]
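The general idea of chunking a distance matrix so that only one tile has to fit in limited memory at a time can be sketched as follows. This is a CPU-only Python illustration under stated assumptions, not the paper's multi-GPU method: `distance_tile`, `chunked_distance_matrix`, and `chunk_size` are hypothetical names, and `distance_tile` stands in for whatever kernel would run on a device.

```python
import math


def distance_tile(row_points, col_points):
    """Distances between every point in row_points and every point in col_points.

    In a GPU setting this tile is the unit of work sent to one device.
    """
    return [[math.dist(p, q) for q in col_points] for p in row_points]


def chunked_distance_matrix(points, chunk_size):
    """Build the full n x n Euclidean distance matrix tile by tile.

    Only one chunk_size x chunk_size tile is computed at a time, which is
    the property that matters when device memory is limited.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    n = len(points)
    result = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, chunk_size):
        row_points = points[i0:i0 + chunk_size]
        for j0 in range(0, n, chunk_size):
            col_points = points[j0:j0 + chunk_size]
            tile = distance_tile(row_points, col_points)
            # Copy the finished tile into its place in the full matrix.
            for di, tile_row in enumerate(tile):
                result[i0 + di][j0:j0 + len(col_points)] = tile_row
    return result
```

Because each tile depends only on its own row and column chunks, tiles are independent of one another, so in principle they could be distributed across several devices.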

Mar, 28

GPU-Based Fast Minimum Spanning Tree Using Data Parallel Primitives

The minimum spanning tree is a classical problem in graph theory that plays a key role in a broad domain of applications. This paper proposes a minimum spanning tree algorithm using Prim’s approach on Nvidia GPUs under the CUDA architecture. By using a newly developed GPU-based Min-Reduction data parallel primitive in the key step of the algorithm, higher […]

CUDA
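To make the role of a min-reduction in Prim's algorithm concrete, here is a small sequential Python sketch. It assumes that the "key step" is selecting the cheapest vertex not yet in the tree, which is the usual bottleneck in Prim's approach; the abstract is truncated, so this is an interpretation rather than the authors' design. The names `min_reduce` and `prim_mst` and the adjacency-matrix input (with `None` for a missing edge) are choices made for this example.

```python
import math


def min_reduce(pairs):
    """Pairwise (tree-shaped) reduction returning the smallest (key, vertex) pair.

    Each round halves the number of candidates, mirroring how a parallel
    reduction combines neighbouring elements level by level.
    """
    level = list(pairs)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            next_level.append(min(level[i], level[i + 1]))
        if len(level) % 2:
            next_level.append(level[-1])  # odd element carries over unchanged
        level = next_level
    return level[0]


def prim_mst(weights):
    """Prim's algorithm on an adjacency matrix; None marks a missing edge.

    Returns (total_weight, edges) with edges as (parent, vertex, weight).
    """
    n = len(weights)
    in_tree = [False] * n
    key = [math.inf] * n
    parent = [None] * n
    if n:
        key[0] = 0.0
    total, edges = 0.0, []
    for _ in range(n):
        # Key step: find the cheapest vertex outside the tree via min-reduction.
        candidates = [(key[v], v) for v in range(n) if not in_tree[v]]
        best_key, u = min_reduce(candidates)
        if math.isinf(best_key):
            raise ValueError("graph is not connected")
        in_tree[u] = True
        if parent[u] is not None:
            edges.append((parent[u], u, best_key))
            total += best_key
        # Relax the edges leaving the newly added vertex.
        for v in range(n):
            w = weights[u][v]
            if not in_tree[v] and w is not None and w < key[v]:
                key[v] = w
                parent[v] = u
    return total, edges
```

The reduction runs in a logarithmic number of rounds over the candidates, which is why it maps naturally onto data-parallel hardware, while the outer loop of Prim's algorithm stays sequential.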

gpu_tracker: Context manager and CLI that tracks the computational-resource-usage of a code block or shell command, particularly the GPU usage

gpu_tracker: Python package for tracking and profiling GPU utilization in both desktop and high-performance computing environments

* * *

Recent source codes

QArray

Celerity: High-level C++ for Accelerator Clusters

CIFAR-10 Airbench: 94% on CIFAR-10 in 3.29 seconds

gpu_tracker: Context manager and CLI that tracks the computational-resource-usage of a code block or shell command, particularly the GPU usage

LOOPer: a polyhedral compiler for expressing fast and portable data parallel algorithms

OpenMC Monte Carlo Code

Polygeist: C/C++ frontend for MLIR

Parallel Gaussian process with kernel approximation in CUDA

Optical flow algorithms for SYCL

OpenMP5-Offload-OpenMC-Intel-PVC
