Papers on hgpu.org
PyCOOL – a Cosmological Object-Oriented Lattice code written in Python
PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation
PyCUDA: GPU Run-Time Code Generation for High-Performance Computing
PyFAI, a versatile library for azimuthal regrouping
PyFAI: a Python library for high performance azimuthal integration on GPU
PyFR: An Open Source Framework for Solving Advection-Diffusion Type Problems on Streaming Architectures using the Flux Reconstruction Approach
PyGraph: Robust Compiler Support for CUDA Graphs in PyTorch
pyGSL: A Graph Structure Learning Toolkit
pyJac: analytical Jacobian generator for chemical kinetics
PyMatting: A Python Library for Alpha Matting
pyMIC: A Python Offload Module for the Intel Xeon Phi Coprocessor
PyOMP: Parallel programming for CPUs and GPUs with OpenMP and Python
pyPaSWAS: Python-based multi-core CPU and GPU sequence alignment
PyPs, a programmable pass manager
Pyramid Methods in GPU-Based Image Processing
Pyramidal Image Blending Using CUDA Framework
PySAGES: flexible, advanced sampling methods accelerated with GPUs
PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems
PySPH: A Python framework for SPH
PySPH: a Python-based framework for smoothed particle hydrodynamics
Python for Development of OpenMP and CUDA Kernels for Multidimensional Data
Python Non-Uniform Fast Fourier Transform (PyNUFFT): An Accelerated Non-Cartesian MRI Package on a Heterogeneous Platform (CPU/GPU)
Python Workflows on HPC Systems
Python-Based Quantum Chemistry Calculations with GPU Acceleration
PyTorch Hyperparameter Tuning – A Tutorial for spotPython
PyTorch: An Imperative Style, High-Performance Deep Learning Library
PyTorchPipe: a framework for rapid prototyping of pipelines combining language and vision
PyTransit: Fast and Easy Exoplanet Transit Modelling in Python
q-state Potts model metastability study using optimized GPU-based Monte Carlo algorithms
QArray: a GPU-accelerated constant capacitance model simulator for large quantum dot arrays
QCD on GPUs: cost effective supercomputing
QCD simulations with staggered fermions on GPUs
QCDGPU: open-source package for Monte Carlo lattice simulations on OpenCL-compatible multi-GPU systems
qecGPT: decoding Quantum Error-correcting Codes with Generative Pre-trained Transformers
QGTC: Accelerating Quantized GNN via GPU Tensor Core
Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping
QMCPACK: An open source ab initio Quantum Monte Carlo package for the electronic structure of atoms, molecules, and solids
QP: A Heterogeneous Multi-Accelerator Cluster
QPACE 2 and Domain Decomposition on the Intel Xeon Phi
QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators
QSL Squasher: A Fast Quasi-Separatrix Layer Map Calculator
Quadratic Pseudo-Boolean Optimization for Scene Analysis using CUDA
Qualcomm Snapdragon Mobile Platform OpenCL General Programming and Optimization
Quality comparison and acceleration for digital hologram generation method based on segmentation
Quality-score guided error correction for short-read sequencing data using CUDA
Quantifying NUMA and contention effects in multi-GPU systems
Quantifying OpenMP: Statistical Insights into Usage and Adoption
Quantifying the Energy Efficiency of FFT on Heterogeneous Platforms
Quantifying the Energy Efficiency of Object Recognition and Optical Flow
Quantifying the Impact of GPUs on Performance and Energy Efficiency in HPC Clusters
Quantile Mechanics II: Changes of Variables in Monte Carlo methods and a GPU-Optimized Normal Quantile
Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations
Quantum Boolean Image Denoising
Quantum chemical many-body theory on heterogeneous nodes
Quantum Chemistry for Solvated Molecules on Graphical Processing Units (GPUs) using Polarizable Continuum Models
Quantum Chemistry on Graphical Processing Units. 1. Strategies for Two-Electron Integral Evaluation
Quantum Chemistry on Graphical Processing Units. 2. Direct Self-Consistent-Field Implementation
Quantum Chemistry on Graphical Processing Units. 3. Analytical Energy Gradients, Geometry Optimization, and First Principles Molecular Dynamics
Quantum computer simulation using the CUDA programming model
Quantum Monte Carlo on graphical processing units
Quantum.Ligand.Dock: protein-ligand docking with quantum entanglement refinement on a GPU system
Quartile and Outlier Detection on Heterogeneous Clusters Using Distributed Radix Sort
Quasars spectra classification with the help of GPU computing
Quasi-maximum Accuracy Floating-point Computations with GPGPU for Applications in Digital Signal Processing
Quasi-real-time analysis of dynamic near field scattering data using a graphics processing unit
QUDA programming for staggered quarks
Query Optimization in Heterogeneous CPU/GPU Environment for Time Series Databases
Query Processing on Tensor Computation Runtimes
Query-Driven Visualization of Time-Varying Adaptive Mesh Refinement Data
Quick-CULLIDE: fast inter- and intra-object collision culling using graphics hardware
QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference
QuickProbs – A Fast Multiple Sequence Alignment Algorithm Designed for Graphics Processors
Quine-McCluskey algorithm on GPGPU
QYMSYM: A GPU-Accelerated Hybrid Symplectic Integrator That Permits Close Encounters
R2GUESS: A Graphics Processing Unit-Based R Package for Bayesian Variable Selection Regression of Multivariate Responses
Radeon PRO Solid State Graphics (SSG) API User Manual
Radial Basis Function Networks GPU-Based Implementation
Radiation Modeling Using the Uintah Heterogeneous CPU/GPU Runtime System
Radiative Heat Transfer Simulation Using Programmable Graphics Hardware
Radio astronomy beam forming on GPUs
Radio Astronomy Beam Forming on Many-Core Architectures
Radiometric Compensation through Inverse Light Transport
Radionuclides migration modelling using artificial neural networks and parallel computing
RadixBoost: A Hardware Acceleration Structure for Scalable Radix Sort on Graphic Processors
Rain Scene Animation through Particle Systems and Surface Flow Simulation by SPH
Raising the Bar for Using GPUs in Software Packet Processing
Raising the level of many-core programming with compiler technology: meeting a grand challenge
Raising the Performance of the Tinker-HP Molecular Modeling Package on Intel’s HPC Architectures: a Living Review [Article v1.0]
Random Address Permute-Shift Technique for the Shared Memory on GPUs
Random Fields Generation on the GPU with the Spectral Turning Bands Method
Random Finite Set Based Bayesian Filtering with OpenCL in a Heterogeneous Platform
Random Forests of Very Fast Decision Trees on GPU for Mining Evolving Big Data Streams
Random number generators for massively parallel simulations on GPU
Random Walks based Multi-Image Segmentation: Quasiconvexity Results and GPU-based Solutions
Random Walks for Image Cosegmentation
Random Walks for Interactive Organ Segmentation in Two and Three Dimensions: Implementation and Validation
Random-access rendering of general vector graphics
Randomized selection on the GPU
Range Cell Migration Correction using texture mapping on GPU
Titles: 100
Open PDFs: 94
Packages: 41