Papers on hgpu.org (.txt-file)
Programming in CUDA for Kepler and Maxwell Architecture
Programming issues for video analysis on Graphics Processing Units
Programming Massively Parallel Architectures using MARTE: a Case Study
Programming massively parallel processors: A Hands-on approach
Programming Massively Parallel Processors with CUDA (audio course)
Programming model for a heterogeneous x86 platform
Programming Models and Runtimes for Heterogeneous Systems
Programming Models and Scheduling Techniques for Heterogeneous Architectures
Programming Models and Tools for Many-Core Platforms
Programming NVIDIA cards by means of transitive closure based parallelization algorithms
Programming of shared memory GPUs shared memory systems
Programming on Parallel Machines: GPU, Multicore, Clusters and More
Programming video cards for computational electromagnetics applications
Programming with Explicit Dependencies. A Framework for Portable Parallel Programming
Programming-Model Centric Debugging for Multicore Embedded Systems
Progressive Clustering of Big Data with GPU Acceleration and Visualization
Progressive High-Quality Response Surfaces for Visually Guided Sensitivity Analysis
Progressive Photon Mapping on GPUs
Progressive Semantic Segmentation
Projected tetrahedra revisited: a barycentric formulation applied to digital radiograph reconstruction using higher-order attenuation functions
Projectile Monte-Carlo Trajectory Analysis Using a Graphics Processing Unit
Projecting Tetrahedra with a Simplified Basis Graph
PROJECTION Algorithm for Motif Finding on GPUs
Promise of embedded system with GPU in artificial leg control: Enabling time-frequency feature extraction from electromyography
Proposition for propagated occupation grids for non-rigid moving objects tracking
Prospects for scalable 3D FFTs on heterogeneous exascale systems
Prospects of GPGPU in the Auger Offline Software Framework
pROST: A Smoothed Lp-norm Robust Online Subspace Tracking Method for Realtime Background Subtraction in Video
PROST: Parallel robust online simple tracking
Protecting Real-Time GPU Applications on Integrated CPU-GPU SoC Platforms
Protein alignment algorithms with an efficient backtracking routine on multiple GPUs
Proteus: Efficient Resource Use in Heterogeneous Architectures
Proteus: Exploiting Numerical Precision Variability in Deep Neural Networks
Prototyping methodology of image processing applications on heterogeneous parallel systems
Provably Efficient GPU Algorithms
Providing performance portable numerics for Intel GPUs
Providing Source Code Level Portability Between CPU and GPU with MapCG
PSCToolkit: solving sparse linear systems with a large number of GPUs
Pseudo Random Number Generators on Graphics Processing Units, with Applications in Finance
Pseudo-random number generation for Brownian Dynamics and Dissipative Particle Dynamics simulations on GPU devices
Pseudo-Random Number Generation on GP-GPU
Pseudo-random number generators for Monte Carlo simulations on ATI Graphics Processing Units
Pseudo-random number generators for Monte Carlo simulations on Graphics Processing Units
Pseudorandom number generation on the GPU
Pseudorandom Numbers Generation for Monte Carlo Simulations on GPUs: OpenCL Approach
Pseudoscalar Meson in Two Flavors QCD with the Optimal Domain-Wall Fermion
pSTL-Bench: A Micro-Benchmark Suite for Assessing Scalability of C++ Parallel STL Implementations
PTask: Operating System Abstractions To Manage GPUs as Compute Devices
PTX2Kernel: Converting PTX Code into Compilable Kernels
PUGACE, a cellular Evolutionary Algorithm framework on GPUs
Pulsar Acceleration Searches on the GPU for the Square Kilometre Array
Pulsar search acceleration using FPGAs and OpenCL templates
Pulse-coupled neural network performance for real-time identification of vegetation during forced landing
Purine: A bi-graph based deep learning framework
Pushing the Envelope: Extreme Network Coding on the GPU
Pushing the limit of molecular dynamics with ab initio accuracy to 100 million atoms with machine learning
Pushing the limits for medical image reconstruction on recent standard multicore processors
Putting Automatic Polyhedral Compilation for GPGPU to Work
pVOCL: Power-Aware Dynamic Placement and Migration in Virtualized GPU Environments
PVR: Patch-to-Volume Reconstruction for Large Area Motion Correction of Fetal MRI
PyCOOL – a Cosmological Object-Oriented Lattice code written in Python
PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation
PyCUDA: GPU Run-Time Code Generation for High-Performance Computing
PyFAI, a versatile library for azimuthal regrouping
PyFAI: a Python library for high performance azimuthal integration on GPU
PyFR: An Open Source Framework for Solving Advection-Diffusion Type Problems on Streaming Architectures using the Flux Reconstruction Approach
pyGSL: A Graph Structure Learning Toolkit
pyJac: analytical Jacobian generator for chemical kinetics
PyMatting: A Python Library for Alpha Matting
pyMIC: A Python Offload Module for the Intel Xeon Phi Coprocessor
pyPaSWAS: Python-based multi-core CPU and GPU sequence alignment
PyPs, a programmable pass manager
Pyramid Methods in GPU-Based Image Processing
Pyramidal Image Blending Using CUDA Framework
PySAGES: flexible, advanced sampling methods accelerated with GPUs
PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems
PySPH: A Python framework for SPH
PySPH: a Python-based framework for smoothed particle hydrodynamics
Python for Development of OpenMP and CUDA Kernels for Multidimensional Data
Python Non-Uniform Fast Fourier Transform (PyNUFFT): An Accelerated Non-Cartesian MRI Package on a Heterogeneous Platform (CPU/GPU)
Python Workflows on HPC Systems
Python-Based Quantum Chemistry Calculations with GPU Acceleration
PyTorch Hyperparameter Tuning – A Tutorial for spotPython
PyTorch: An Imperative Style, High-Performance Deep Learning Library
PyTorchPipe: a framework for rapid prototyping of pipelines combining language and vision
PyTransit: Fast and Easy Exoplanet Transit Modelling in Python
q-state Potts model metastability study using optimized GPU-based Monte Carlo algorithms
QArray: a GPU-accelerated constant capacitance model simulator for large quantum dot arrays
QCD on GPUs: cost effective supercomputing
QCD simulations with staggered fermions on GPUs
QCDGPU: open-source package for Monte Carlo lattice simulations on OpenCL-compatible multi-GPU systems
qecGPT: decoding Quantum Error-correcting Codes with Generative Pre-trained Transformers
QGTC: Accelerating Quantized GNN via GPU Tensor Core
Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping
QMCPACK: An open source ab initio Quantum Monte Carlo package for the electronic structure of atoms, molecules, and solids
QP: A Heterogeneous Multi-Accelerator Cluster
QPACE 2 and Domain Decomposition on the Intel Xeon Phi
Titles: 100
open PDFs: 92
packages: 37