Papers on hgpu.org
Graphics Processing Unit-Based Computer-Aided Design Algorithms for Electronic Design Automation
Graphics Processing Units and Genetic Programming: An overview
Graphics Processing Units and High-Dimensional Optimization
Graphics Processing Units for Handhelds
Graphics Processing Units for the Real-time Linear Elastostatic Simulation of Liver
Graphics Processing Units in Acceleration of Bandwidth Selection for Kernel Density Estimation
Graphics Processing Units: More Than the Pathway to Realistic Video-Games
Graphics Processor Clusters for High Speed Backpropagation
Graphics Processor Unit (GPU) Acceleration of Finite-Difference Frequency-Domain (FDFD) Method
Graphics processor unit (GPU) acceleration of finite-difference time-domain (FDTD) algorithm
Graphics Programming on the Web WebCL Course Notes
Graphics Supercomputing Applied to Brain Image Analysis with NiftyReg
Graphtoy: Fast Software Simulation of Applications for AMD’s AI Engines
GraphVite: A High-Performance CPU-GPU Hybrid System for Node Embedding
GRATER: An Approximation Workflow for Exploiting Data-Level Parallelism in FPGA Acceleration
GraviDy: a GPU modular, parallel N-body integrator
Gravitational tree-code on graphics processing units: implementation in CUDA
Gravitational wave astrophysics, data analysis and multimessenger astronomy
GrAVity: a massively parallel antivirus engine
GRay: a Massively Parallel GPU-Based Code for Ray Tracing in Relativistic Spacetimes
Green AI: A Preliminary Empirical Study on Energy Consumption in DL Models Across Different Runtime Infrastructures
GreenGPU: A Holistic Approach to Energy Efficiency in GPU-CPU Heterogeneous Architectures
Grex: An efficient MapReduce framework for graphics processing units
Grid-based SAH BVH construction on a GPU
Grids, Clouds and Virtualization
grim: A Flexible, Conservative Scheme for Relativistic Fluid Theories
GRIM: A General, Real-Time Deep Learning Inference Framework for Mobile Devices based on Fine-Grained Structured Weight Sparsity
GrIP: A Framework for Experiments with Screen Space Algorithms
GRN: Gated Relation Network to Enhance Convolutional Neural Network for Named Entity Recognition
GROMACS on AMD GPU-Based HPC Platforms: Using SYCL for Performance and Portability
GROMACS on Hybrid CPU-GPU and CPU-MIC Clusters: Preliminary Porting Experiences, Results and Next Steps
GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers
GROPHECY: GPU performance projection from CPU code skeletons
Group Marching Tree: Sampling-Based Approximately Optimal Motion Planning on GPUs
Grover: Looking for Performance Improvement by Disabling Local Memory Usage in OpenCL Kernels
GRS – GPU radix sort for multifield records
gScan: Accelerating Graham Scan on the GPU
gSLIC: a real-time implementation of SLIC superpixel segmentation
gSLICr: SLIC superpixels at over 250Hz
gSMat: A Scalable Sparse Matrix-based Join for SPARQL Query Processing
GSNP: A DNA Single-Nucleotide Polymorphism Detection System with GPU Acceleration
GSParLib: A multi-level programming interface unifying OpenCL and CUDA for expressing stream and data parallelism
GStream: A General-Purpose Data Streaming Framework on GPU Clusters
gSuite: A Flexible and Framework Independent Benchmark Suite for Graph Neural Network Inference on GPUs
GT4Py: High Performance Stencils for Weather and Climate Applications using Python
Guardian: Safe GPU Sharing in Multi-Tenant Environments
GUESS-ing Polygenic Associations with Multiple Phenotypes Using a GPU-Based Evolutionary Stochastic Search Algorithm
Guided Profiling for Auto-Tuning Array Layouts on GPUs
Gunrock: A High-Performance Graph Processing Library on the GPU
Gvim: Gpu-accelerated virtual machines
Gyrofluid Modeling of Turbulent, Kinetic Physics
Gyrokinetic Particle-in-Cell Optimization on Emerging Multi- and Manycore Platforms
Gyrokinetic Toroidal Simulations on Leading Multi-and Manycore HPC Systems
gZCCL: Compression-Accelerated Collective Communication Framework for GPU Clusters
H- and C-level WFST-based large vocabulary continuous speech recognition on Graphics Processing Units
H-LU Factorization on Many-Core Systems
H.264 Parallel Optimization on Graphics Processors
H.264/AVC motion estimation implementation on Compute Unified Device Architecture (CUDA)
HACC: Simulating Sky Surveys on State-of-the-Art Supercomputing Architectures
HAccRG: Hardware-Accelerated Data Race Detection in GPUs
Hacking Neural Networks: A Short Introduction
Hadoop Mapreduce OpenCL Plugin
Hadoop+Aparapi: Making heterogenous MapReduce programming easier
HadoopCL: MapReduce on Distributed Heterogeneous Platforms Through Seamless Integration of Hadoop and OpenCL
Hadoopcl2: Motivating the design of a distributed, heterogeneous programming system with machine-learning applications
HALF: Holistic Auto Machine Learning for FPGAs
HALO 1.0: A Hardware-agnostic Accelerator Orchestration Framework for Enabling Hardware-agnostic Programming with True Performance Portability for Heterogeneous HPC
Halo Gathering Scalability for Large Scale Multi-dimensional Sznajd Opinion Models Using Data Parallelism with GPUs
HAM – Heterogenous Active Messages for Efficient Offloading on the Intel Xeon Phi
Hand Tracking based on Hierarchical Clustering of Range Data
Handwritten Digit Recognition with a Committee of Deep Neural Nets on GPUs
HaoCL: Harnessing Large-scale Heterogeneous Processors Made Easy
HAP: SPMD DNN Training on Heterogeneous GPU Clusters with Automated Program Synthesis
Haptic and graphic rendering of deformable objects based on GPUs
Haptic feedback for the GPU-based surgical simulator
Haptic guided 3-D deformable image registration
Hard Data on Soft Errors: A Large-Scale Assessment of Real-World Error Rates in GPGPU
Hard-Sphere Collision Simulations with Multiple GPUs, PCIe Extension Buses and GPU-GPU Communications
Hardware Accelerated Molecular Docking: A Survey
Hardware accelerated multi-resolution geometry synthesis
Hardware Accelerated Skin Deformation for Animated Crowds
Hardware accelerated symmetric condensed node TLM procedure for NVIDIA graphics processing units
Hardware Acceleration for Unstructured Big Data and Natural Language Processing
Hardware Acceleration of EDA Algorithms: Custom ICs, FPGAs and GPUs
Hardware Acceleration of EDA Algorithms: GPU Architecture and the CUDA Programming Model
Hardware Acceleration of HPC Computational Flow Dynamics using HBM-enabled FPGAs
Hardware Acceleration Technologies in Computer Algebra: Challenges and Impact
Hardware acceleration vs. algorithmic acceleration: can GPU-based processing beat complexity optimization for CT?
Hardware Accelerators for Artificial Intelligence
Hardware accelerators for biocomputing: A survey
Hardware Accelerators for Cartesian Genetic Programming
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead
Hardware Checkpointing and Productive Debugging Flows for FPGAs
Hardware Implementation and Quantization of Tiny-Yolo-v2 using OpenCL
Hardware thread reordering to boost OpenCL throughput on FPGAs
Hardware Transactional Memory for GPU Architectures
Hardware-accelerated 3D visualization of mass spectrometry data
Hardware-Accelerated Adaptive EWA Volume Splatting
Titles: 100
open PDFs: 95
packages: 21