Papers on hgpu.org (.txt-file)
gSuite: A Flexible and Framework Independent Benchmark Suite for Graph Neural Network Inference on GPUs

GT4Py: High Performance Stencils for Weather and Climate Applications using Python

Guardian: Safe GPU Sharing in Multi-Tenant Environments

GUESS-ing Polygenic Associations with Multiple Phenotypes Using a GPU-Based Evolutionary Stochastic Search Algorithm

Guided Profiling for Auto-Tuning Array Layouts on GPUs

Gunrock: A High-Performance Graph Processing Library on the GPU

GViM: GPU-accelerated virtual machines

Gyrofluid Modeling of Turbulent, Kinetic Physics

Gyrokinetic Particle-in-Cell Optimization on Emerging Multi- and Manycore Platforms

Gyrokinetic Toroidal Simulations on Leading Multi- and Manycore HPC Systems

gZCCL: Compression-Accelerated Collective Communication Framework for GPU Clusters

H- and C-level WFST-based large vocabulary continuous speech recognition on Graphics Processing Units

H-LU Factorization on Many-Core Systems

H.264 Parallel Optimization on Graphics Processors

H.264/AVC motion estimation implementation on Compute Unified Device Architecture (CUDA)

HACC: Simulating Sky Surveys on State-of-the-Art Supercomputing Architectures

HAccRG: Hardware-Accelerated Data Race Detection in GPUs

Hacking Neural Networks: A Short Introduction

Hadoop Mapreduce OpenCL Plugin

Hadoop+Aparapi: Making heterogenous MapReduce programming easier

HadoopCL: MapReduce on Distributed Heterogeneous Platforms Through Seamless Integration of Hadoop and OpenCL

HadoopCL2: Motivating the design of a distributed, heterogeneous programming system with machine-learning applications

HALF: Holistic Auto Machine Learning for FPGAs

HALO 1.0: A Hardware-agnostic Accelerator Orchestration Framework for Enabling Hardware-agnostic Programming with True Performance Portability for Heterogeneous HPC

Halo Gathering Scalability for Large Scale Multi-dimensional Sznajd Opinion Models Using Data Parallelism with GPUs

HAM – Heterogenous Active Messages for Efficient Offloading on the Intel Xeon Phi

Hand Tracking based on Hierarchical Clustering of Range Data

Handwritten Digit Recognition with a Committee of Deep Neural Nets on GPUs

HaoCL: Harnessing Large-scale Heterogeneous Processors Made Easy

HAP: SPMD DNN Training on Heterogeneous GPU Clusters with Automated Program Synthesis

Haptic and graphic rendering of deformable objects based on GPUs

Haptic feedback for the GPU-based surgical simulator

Haptic guided 3-D deformable image registration

Hard Data on Soft Errors: A Large-Scale Assessment of Real-World Error Rates in GPGPU

Hard-Sphere Collision Simulations with Multiple GPUs, PCIe Extension Buses and GPU-GPU Communications

Hardware Accelerated Molecular Docking: A Survey

Hardware accelerated multi-resolution geometry synthesis

Hardware Accelerated Skin Deformation for Animated Crowds

Hardware accelerated symmetric condensed node TLM procedure for NVIDIA graphics processing units

Hardware Acceleration for Unstructured Big Data and Natural Language Processing

Hardware Acceleration of EDA Algorithms: Custom ICs, FPGAs and GPUs

Hardware Acceleration of EDA Algorithms: GPU Architecture and the CUDA Programming Model

Hardware Acceleration of HPC Computational Flow Dynamics using HBM-enabled FPGAs

Hardware Acceleration Technologies in Computer Algebra: Challenges and Impact

Hardware acceleration vs. algorithmic acceleration: can GPU-based processing beat complexity optimization for CT?

Hardware Accelerators for Artificial Intelligence

Hardware accelerators for biocomputing: A survey

Hardware Accelerators for Cartesian Genetic Programming

Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead

Hardware Checkpointing and Productive Debugging Flows for FPGAs

Hardware Compute Partitioning on NVIDIA GPUs for Composable Systems

Hardware Implementation and Quantization of Tiny-Yolo-v2 using OpenCL

Hardware thread reordering to boost OpenCL throughput on FPGAs

Hardware Transactional Memory for GPU Architectures

Hardware-accelerated 3D visualization of mass spectrometry data

Hardware-Accelerated Adaptive EWA Volume Splatting

Hardware-accelerated parallel non-photorealistic volume rendering

Hardware-Accelerated Raycasting: Towards an Effective Brain MRI Visualization

Hardware-Accelerated Volume Rendering for Real-Time Medical Data Visualization

Hardware-assisted feature analysis and visualization of procedurally encoded multifield volumetric data

Hardware-Assisted High-Efficiency Ray Casting of Unstructured Time-Varying Flows Using Temporal Coherence

Hardware-Assisted Projected Tetrahedra

Hardware-assisted Rendering of CSG Models

Hardware-Assisted Software Testing and Debugging for Heterogeneous Computing

Hardware-assisted visibility sorting for unstructured volume rendering

Hardware-based nonlinear filtering and segmentation using high-level shading languages

Hardware-based simulation and collision detection for large particle systems

Hardware-Efficient Belief Propagation

Hardware-Oblivious Parallelism for In-Memory Column-Stores

Hardware-Oriented Multigrid Finite Element Solvers on GPU-Accelerated Clusters

Hardware/Software Co-Design for Data-Intensive Genomics Workloads

Hardware/Software Co-design for Energy-Efficient Seismic Modeling

Hardware/Software Vectorization for Closeness Centrality on Multi-/Many-Core Architectures

Harmonic CUDA: Asynchronous Programming on GPUs

Harnessing Aspect Oriented Programming on GPU: Application to Warp-Level Parallelism (WLP)

Harnessing Batched BLAS/LAPACK Kernels on GPUs for Parallel Solutions of Block Tridiagonal Systems

Harnessing GPU Computing in System-Level Software

Harnessing Integrated CPU-GPU System Memory for HPC: a first look into Grace Hopper

Harnessing the GPU for Real-Time Haptic Tissue Simulation

Harnessing the Power of GPUs without Losing Abstractions in SaC and ArrayOL: A Comparative Study

Harnessing the power of idle GPUs for acceleration of biological sequence alignment

Harvesting graphics power for MD simulations

Hash-Based Authentication Revisited in the Age of High-Performance Computers

HashGraph – Scalable Hash Tables Using A Sparse Graph Data Structure

Hashing, Caching, and Synchronization: Memory Techniques for Latency Masking Multithreaded Applications

Hauberk: Lightweight Silent Data Corruption Error Detector for GPGPU

Have GPUs made FPGAs redundant in the field of video processing?

HCudaBLAST: an implementation of BLAST on Hadoop and Cuda

HCW 2009 keynote talk: GPU computing: Heterogeneous computing for future systems

HDArray: Parallel Array Interface for Distributed Heterogeneous Devices

Head Pose Tracking Using GPU Based Real-time 3D Registration

Heat Load Modelling for District Heating Plants Using an OpenCL-based Algorithm

HEATS: Heterogeneity- and Energy-Aware Task-based Scheduling

HELIOS-K: An Ultrafast, Open-source Opacity Calculator for Radiative Transfer

Helix: Distributed Serving of Large Language Models via Max-Flow on Heterogeneous GPUs

Hera-JVM: a runtime system for heterogeneous multi-core architectures

Hercules: A Compiler for Productive Programming of Heterogeneous Systems

Titles: 100
open PDFs: 94
packages: 17
