Papers on hgpu.org (.txt-file)

PCIeHLS: an OpenCL HLS framework

PConG: A novel platform available for pervasive computing based on GPU

PDAWL: Profile-based Iterative Dynamic Adaptive WorkLoad Balance on Heterogeneous Architectures

PEAK: A Performance Engineering AI-Assistant for GPU Kernels Powered by Natural Language Transformations

Pedestrian Detection at Warp Speed: Exceeding 500 Detections per Second

Pedestrian detection system based on stereo vision for mobile robot

Pegasus: coordinated scheduling for virtualized accelerator-based systems

PENCIL: A Platform-Neutral Compute Intermediate Language for Accelerator Programming

People detection method using graphics processing units for a mobile robot with an omnidirectional camera

PEPPHER: Efficient and Productive Usage of Hybrid Computing Systems

PEPSC: A Power-Efficient Processor for Scientific Computing

PErasure: a Parallel Cauchy Reed-Solomon Coding Library for GPUs

Perception of Acoustical Spatial Attributes and Impression in Virtually Rendered Sound Field

Perception-aware Depth Cueing for Illustrative Vascular Visualization

Perceptual enhancement of two-level volume rendering

Perceptually Optimized Real-Time Computer Graphics

PERCH 2.0: Fast and Accurate GPU-based Perception via Search for Object Pose Estimation

Percolation study of samples on 2D lattices using GPUs

perf4sight: A toolflow to model CNN training performance on Edge GPUs

Perfect Hashing Structures for Parallel Similarity Searches

PerforatedCNNs: Acceleration through Elimination of Redundant Convolutions

Performance Acceleration of Kernel Polynomial Method Applying Graphics Processing Units

Performance Analysis and Automatic Tuning of Hash Aggregation on GPUs

Performance Analysis and Benchmarking of the Intel SCC

Performance Analysis and Efficient Execution on Systems with multi-core CPUs, GPUs and MICs

Performance Analysis and Improvement of Parallel Differential Evolution

Performance Analysis and Optimisation of the OP2 Framework on Many-core Architectures

Performance analysis and optimization of a CFD application

Performance Analysis and Optimization of a Distributed Processing Framework for Data Mining Accelerated with Graphics Processing Units

Performance Analysis and Optimization of Hermite Methods on NVIDIA GPUs Using CUDA

Performance analysis and optimization of highly diverging algorithms on GPUs

Performance analysis and optimization of the OP2 framework on many-core architectures

Performance analysis and optimization of three-dimensional FDTD on GPU using roofline model

Performance Analysis and Optimization Opportunities for NVIDIA Automotive GPUs

Performance Analysis and Tuning For: General-Purpose Graphics Processing Units (GPGPU)

Performance Analysis Cluster and GPU Computing Environment on Molecular Dynamic Simulation of BRV-1 and REM2 with GROMACS

Performance Analysis for GPU-based Ray-triangle Algorithms

Performance analysis of a 240 thread tournament level MCTS Go program on the Intel Xeon Phi

Performance Analysis of a High-level Abstractions-based Hydrocode on Future Computing Systems

Performance Analysis of a Hybrid MPI/CUDA Implementation of the NAS-LU Benchmark

Performance analysis of a hybrid MPI/CUDA implementation of the NASLU benchmark

Performance Analysis of a Large Memory Application on Multiple Architectures

Performance Analysis of a New Real-Time Elastographic Time Constant Estimator

Performance Analysis of a Novel GPU Computation-to-core Mapping Scheme for Robust Facet Image Modeling

Performance Analysis of a Particle-in-Cell Plasma Physics Code on Homogeneous and Heterogeneous HPC Systems

Performance Analysis of a Stereo Matching Implementation in OpenCL

Performance Analysis of a Symmetric Cryptographic Algorithm on Multicore Architectures

Performance Analysis of a Symmetric Cryptography Algorithm on GPU and GPU Cluster

Performance analysis of accelerated image registration using GPGPU

Performance Analysis of an Astrophysical Simulation Code on the Intel Xeon Phi Architecture

Performance Analysis of an Ultrasound Reconstruction Algorithm for Non Destructive Testing

Performance Analysis of CUDA and OpenCL By Implementation of Cryptographic Algorithms

Performance Analysis of Deep Learning Workloads on Leading-edge Systems

Performance analysis of GPGPU and CPU On AES Encryption

Performance Analysis of GPU Accelerators with Realizable Utilization of Computational Density

Performance Analysis of GPU compared to Single-core and Multi-core CPU for Natural Language Applications

Performance Analysis of GPU-Accelerated Filter-Based Source Finding for HI Spectral Line Image Data

Performance Analysis of GPU-based SAR and Interferometric SAR image processing

Performance Analysis of IBM Cell Broadband Engine on Sequence Alignment

Performance Analysis of Join Algorithms on GPUs

Performance Analysis of kNN on large datasets using CUDA & Pthreads

Performance analysis of matrix-free conjugate gradient kernels using SYCL

Performance analysis of memory transfers and GEMM subroutines on NVIDIA Tesla GPU cluster

Performance analysis of multi-core CPUs and GPU computing on SF-FDTD scheme for third order nonlinear materials and periodic media

Performance Analysis of Open Source Machine Learning Frameworks for Various Parameters in Single-Threaded and Multi-Threaded Modes

Performance analysis of parallel gravitational N-body codes on large GPU cluster

Performance Analysis of Parallel Sorting Algorithms using GPU Computing

Performance Analysis of Roberts Edge Detection Using CUDA and OpenGL

Performance Analysis of Sobel Edge Detection Filter on GPU using CUDA & OpenGL

Performance Analysis of Sobel Edge Filter on Heterogeneous System Using OpenCL

Performance Analysis of Sparse Matrix-Vector Multiplication (SpMV) on Graphics Processing Units (GPUs)

Performance analysis of SSE instructions in multi-core CPUs and GPU computing on FDTD scheme for solid and fluid vibration problems

Performance Analysis of the OP2 Framework on Many-core Architectures

Performance Analysis on Energy Efficient High-Performance Architectures

Performance Analysis on Several GPU Architectures of an Algorithm for Noise Removal

Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations

Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations (Part 2: Double Precision GPUs)

Performance and accuracy of Lattice-Boltzmann kernels on multi- and manycore architectures

Performance and Efficiency Analysis of Modern Accelerators: Fine-Grained Parallelism on the Intel Xeon Phi

Performance and energy footprint assessment of FPGAs and GPUs on HPC systems using Astrophysics application

Performance and energy optimization of the iterative solution of sparse linear systems on multicore processors

Performance and numerical accuracy evaluation of heterogeneous multicore systems for Krylov orthogonal basis computation

Performance and Numerical Aspects of Decompositional Factorizations with FP64 Floating-Point Emulation in INT8

Performance and Portability of Accelerated Lattice Boltzmann Applications with OpenACC

Performance and Power Analysis of ATI GPU: A Statistical Approach

Performance and Power Comparisons Between Fermi and Cypress GPUs

Performance and Power Comparisons Between Nvidia and ATI GPUs

Performance and Power Evaluation of AI Accelerators for Training Deep Learning Models

Performance and Power Optimization of GPU Architectures for General-purpose Computing

Performance and Productivity of Parallel Python Programming: A study with a CFD Test Case

Performance and Quality of Random Number Generators

Performance and scalability of Fourier domain optical coherence tomography acceleration using graphics processing units

Performance and Scalability of GPU-Based Convolutional Neural Networks

Performance Assessment of A Multi-block Incompressible Navier-Stokes Solver using Directive-based GPU Programming in a Cluster Environment

Titles: 100
open PDFs: 91
packages: 9
