high performance computing on graphics processing units: hgpu.org

Posts

Nov, 14

Multi GPU Implementation of the Simplex Algorithm

The Simplex algorithm is a well-known method for solving linear programming (LP) problems. In this paper, we propose a CUDA implementation of the Simplex method on a multi-GPU architecture. Computational tests have been carried out on randomly generated instances of non-sparse LP problems. The tests show a maximum speedup of 24.5 with […]

CUDA
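To make the tableau operations behind the dense Simplex method concrete, here is a minimal sequential sketch in Python. It is not the implementation from the paper: the function name `simplex_max`, the restriction to problems of the form "maximize c·x subject to Ax ≤ b, x ≥ 0, b ≥ 0", and the use of Dantzig's entering rule are all assumptions made for illustration, and there is no anti-cycling safeguard. The row-elimination loop in the pivot step is the kind of regular, dense work that a GPU version would plausibly distribute across threads.

```python
def simplex_max(c, A, b, eps=1e-9):
    """Maximize c.x subject to A x <= b, x >= 0, assuming every b[i] >= 0.

    Illustrative dense-tableau sketch only (hypothetical helper, no
    anti-cycling rule); returns (x, objective_value).
    """
    m, n = len(A), len(c)
    # Constraint rows are [A | I | b]; the last row is the objective [-c | 0 | 0].
    T = [[float(v) for v in A[i]]
         + [1.0 if i == j else 0.0 for j in range(m)]
         + [float(b[i])]
         for i in range(m)]
    T.append([-float(cj) for cj in c] + [0.0] * (m + 1))
    basis = [n + i for i in range(m)]  # slack variables start in the basis

    while True:
        # Entering column: most negative reduced cost (Dantzig's rule).
        col = min(range(n + m), key=lambda j: T[m][j])
        if T[m][col] >= -eps:
            break  # no improving column left: current basis is optimal
        # Leaving row: minimum ratio test over strictly positive entries.
        rows = [i for i in range(m) if T[i][col] > eps]
        if not rows:
            raise ValueError("LP is unbounded")
        row = min(rows, key=lambda i: T[i][-1] / T[i][col])
        # Pivot: scale the pivot row, then eliminate the column elsewhere.
        p = T[row][col]
        T[row] = [v / p for v in T[row]]
        for i in range(m + 1):
            f = T[i][col]
            if i != row and f != 0.0:
                T[i] = [v - f * w for v, w in zip(T[i], T[row])]
        basis[row] = col

    x = [0.0] * n
    for i, j in enumerate(basis):
        if j < n:
            x[j] = T[i][-1]
    return x, T[m][-1]
```

For example, `simplex_max([3, 2], [[1, 1], [1, 3], [1, 0]], [4, 6, 3])` maximizes 3x + 2y under x + y ≤ 4, x + 3y ≤ 6 and x ≤ 3.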

Nov, 14

GPU-accelerated power pattern synthesis of aperiodic linear arrays

We deal with the development of a computationally effective approach for the synthesis of equivalently tapered, aperiodic linear arrays, i.e. arrays matching the requirements on the power pattern by acting only on the element positions and excitation phases. The computational effectiveness of the algorithm is reached by the development of a parallel Non Uniform Fast […]

CUDA

Nov, 14

AVSS2011 demo session: GPU enabled Smart Video Node

This paper presents an All-in-One video analytics system, a compact, multi-channel, real-time, video monitoring, event detection, alarm notification, event recording and browsing solution implemented on low cost hardware, taking advantage of NVIDIA’s GPU CUDA platform. An inventive distribution of video object detection and tracking processing chain between the GPUs and the CPU provides maximum efficiency […]

CUDA

Nov, 14

Seismic Wave Propagation Simulation Using Accelerated Support Operator Rupture Dynamics on Multi-GPU

The Support Operator Method (SOM) is a numerical method based on the finite difference method. Support Operator Rupture Dynamics (SORD) is an application built on it. It can be used to simulate 3D elastic wave propagation and spontaneous rupture on a hexahedral mesh, and it can be applied to various surface boundary conditions. The original application […]

CUDA

Nov, 14

Fast RCS prediction using multiresolution shooting and bouncing ray method on the GPU

This paper presents a GPU-based multiresolution shooting and bouncing ray (MSBR) method with the kd-tree acceleration structure for the fast radar cross section (RCS) prediction of electrically large and complex targets. The multiresolution grid algorithm can greatly reduce the total number of ray tubes, as it adaptively adjusts the density of ray tubes for regions […]

CUDA

Nov, 14

Efficient Implementation of the Simplex Method on a CPU-GPU System

The Simplex algorithm is a well-known method for solving linear programming (LP) problems. In this paper, we propose a parallel implementation of the Simplex method on a CPU-GPU system via CUDA. A double-precision implementation is used in order to improve the quality of solutions. Computational tests have been carried out on randomly generated instances for […]

CUDA

Nov, 14

A fast and robust seed flooding algorithm on GPU for Voronoi diagram generation

The Voronoi diagram (VD) is a fundamental data structure in computational geometry. With the rapid development of programmable graphics processing units, using the GPU to construct VDs has become an attractive strategy. Considering the limitations of state-of-the-art algorithms, a seed flooding algorithm (SFA) is presented to achieve both robustness and high performance. The experimental results show that SFA can […]

CUDA

Nov, 14

B-CALM: An open-source GPU-based 3D-FDTD with multi-pole dispersion for plasmonics

Numerical calculations with finite-difference time-domain (FDTD) on metallic nanostructures in a broad optical spectrum require an accurate approximation of the permittivity of dispersive materials. Here, we present the algorithms behind B-CALM (Belgium-California Light Machine), an open-source 3D-FDTD solver operating on Graphical Processing Units (GPUs) with multi-pole dispersion models. Our modified architecture shows a reduction in […]

Nov, 14

GPU Based Tissue Doppler Imaging

Tissue Doppler imaging is a routinely used diagnostic tool for assessing myocardial function in real time. The required signal processing is computationally intensive, including modified auto-correlation, scan conversion, and image mapping. Parallel algorithms and implementations based on the GPU platform are proposed in this paper to increase the computational efficiency. The experimental signal data is acquired from […]

CUDA

Nov, 14

Extinction-Based Shading and Illumination in GPU Volume Ray-Casting

Direct volume rendering has become a popular method for visualizing volumetric datasets. Even though computers are continually getting faster, it remains a challenge to incorporate sophisticated illumination models into direct volume rendering while maintaining interactive frame rates. In this paper, we present a novel approach for advanced illumination in direct volume rendering based on GPU […]

CUDA

Nov, 13

Comprehensive Performance Monitoring for GPU Cluster Systems

Accelerating applications with GPUs has recently garnered a lot of interest from the scientific computing community. While tools for optimizing individual kernels are readily available, there is a lack of support for the specific needs of the HPC area. Most importantly, integration with existing parallel programming models (MPI and threading) and scalability to the full […]

CUDA

Nov, 13

Advantages and GPU implementation of high-performance indexed DNA search based on suffix arrays

A comparative analysis is presented of high-performance implementations of two state-of-the-art index structures that are of particular interest in bioinformatics applications for accelerating the alignment of DNA sequences. The two indexes are based on suffix trees and suffix arrays and were implemented on two different platforms: a quad-core CPU […]

CUDA
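As a small illustration of how a suffix array supports indexed DNA search, the sketch below builds an index by sorting suffix start positions and then locates every occurrence of a pattern with two binary searches. This is a plain CPU sketch under stated assumptions, not the implementation evaluated in the paper: the function names `build_suffix_array` and `find_occurrences` are hypothetical, and the naive construction is only suitable for short strings.

```python
def build_suffix_array(text):
    """Return start positions of all suffixes of text in lexicographic order.

    Naive construction by direct sorting; fine for short illustrative inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return sorted start positions of pattern in text using suffix array sa."""
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Suffixes in sa[start:lo] all begin with pattern.
    return sorted(sa[start:lo])
```

Each query costs O(m log n) character comparisons for a pattern of length m and a text of length n, and independent queries can be processed in parallel, which is one reason the structure is a natural candidate for batch search on parallel hardware.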

* * *

Recent source codes

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

SYCL Container
