Papers on hgpu.org (.txt-file)
On the Portability of GPU-Accelerated Applications via Automated Source-to-Source Translation
On the Portability of the OpenCL Dwarfs on Fixed and Reconfigurable Parallel Platforms
On the Programmability and Performance of Heterogeneous Platforms
On the programmability of multi-GPU computing systems
On the Relation between Anisotropic Diffusion and Iterated Adaptive Filtering
On the Representation of Partially Specified Implementations and its Application to the Optimization of Linear Algebra Kernels on GPU
On the Robust Mapping of Dynamic Programming onto a Graphics Processing Unit
On the Simulations of Evolution-Communication P Systems with Energy without Antiport Rules for GPUs
On the technology roadmap of Free-Viewpoint 3DTV receivers
On the Three P’s of Parallel Programming for Heterogeneous Computing: Performance, Productivity, and Portability
On the type of the temperature phase transition in phi-4 model
On the Usage of GPUs for Efficient Motion Estimation in Medical Image Sequences
On the Use of a GPU-Accelerated Mobile Device Processor for Sound Source Localization
On the Use of an Algebraic Language Interface for Waveform Definition
On the use of deep Boltzmann machines for road signs classification
On the Use of GPUs in Realizing Cost-Effective Distributed RAID
On the Use of Graphic Processing Units for the Efficient Implementation of MIMO Detectors
On the Use of Graphics Processing Units (GPUs) for Molecular Dynamics Simulation of Spherical Particles
On the Use of Remote GPUs and Low-Power Processors for the Acceleration of Scientific Applications
On the Use of Small 2D Convolutions on GPUs
On the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods
On the Validation and Applications of a Parallel Flexible Multi-Body Dynamics Implementation
On the Visualization of Social and other Scale-Free Networks
On the Way to Future’s High Energy Particle Physics Transport Code
On Using GPU to Compute Options and Derivatives
On Vectorization of Deep Convolutional Neural Networks for Vision Tasks
On-Demand Generating and Scheduling Optimised Parallel Applications on Heterogeneous Platforms
On-Demand Source Code Generation & Scheduling Optimised Parallel Applications on Heterogeneous Platforms
On-line free-viewpoint video: From single to multiple view rendering
On-the-Fly Computing on Many-Core Processors in Nuclear Applications
On-the-fly elimination of dynamic irregularities for GPU computing
On-the-fly Generation and Rendering of Infinite Cities on the GPU
On-The-Fly Parallel Data Shuffling for Graph Processing on OpenCL-based FPGAs
Oncilla: A GAS Runtime for Efficient Resource Allocation and Data Movement in Accelerated Clusters
One machine, one minute, three billion tetrahedra
One Stone Two Birds: Synchronization Relaxation and Redundancy Removal in GPU-CPU Translation
One weird trick for parallelizing convolutional neural networks
One-shot tuner for deep learning compilers
oneDNN Graph Compiler: A Hybrid Approach for High-Performance Deep Learning Compilation
Onesweep: A Faster Least Significant Digit Radix Sort for GPUs
Online Adaptive Code Generation and Tuning
Online Energy Optimization in GPUs: A Multi-Armed Bandit Approach
Online Performance Projection for Clusters with Heterogeneous GPUs
Online rapid prototyping of 3D objects using GPU-based 3D cloud computing: Application to 3D face modelling
Online video synthesis for removing occluding objects using multiple uncalibrated cameras via plane sweep algorithm
OP2: An Active Library Framework for Solving Unstructured Mesh-based Applications on Multi-Core and Many-Core Architectures
Open Source Face Recognition API
Open SYCL on heterogeneous GPU systems: A case of study
Open-source FPGA-ML codesign for the MLPerf Tiny Benchmark
OpenABLext: An automatic code generation framework for agent-based simulations on CPU-GPU-FPGA heterogeneous platforms
OpenACC – First Experiences with Real-World Applications
OpenACC cache Directive: Opportunities and Optimizations
OpenACC Implementations Comparison
OpenACC offloading of the MFC compressible multiphase flow solver on AMD and NVIDIA GPUs
OpenACC-based GPU Acceleration of a 3-D Unstructured Discontinuous Galerkin Method
OpenCL – An effective programming model for data parallel computations at the Cell Broadband Engine
OpenCL + OpenSHMEM Hybrid Programming Model for the Adapteva Epiphany Architecture
OpenCL 2.0 for FPGAs using OCLAcc
OpenCL Accelerated Multi-GPU Cone-Beam Reconstruction
OpenCL Acceleration for TensorFlow
OpenCL Actors – Adding Data Parallelism to Actor-based Programming with CAF
OpenCL and parallel primitives for digital TV applications
OpenCL and the 13 Dwarfs: A Work in Progress
OpenCL API Extensions to achieve Multi-level Parallelism for Efficient Implementation of Strassen’s Matrix Multiplication on GPUs
OpenCL Based Digital Image Projection Acceleration
OpenCL Based High-Quality HEVC Motion Estimation on GPU
OpenCL based machine learning labeling of biomedical datasets
OpenCL embedded profile prototype in mobile device
OpenCL Evaluation for Numerical Linear Algebra Library Development
OpenCL Floating Point Software on Heterogeneous Architectures – Portable or Not?
OpenCL for Database Query Processing
OpenCL for FPGAs: Prototyping a Compiler
OpenCL for programming shared memory multicore CPUs
OpenCL FPGA Optimization guided by memory accesses and roofline model analysis applied to tomography acceleration
OpenCL framework for a CPU, GPU, and FPGA Platform
OpenCL Implementation of a Color Based Object Tracking
OpenCL Implementation of a Parallel Universal Kriging Algorithm for Massive Spatial Data Interpolation on Heterogeneous Systems
OpenCL Implementation of LiDAR Data Processing
OpenCL Implementation of Montgomery Multiplication on FPGA
OpenCL Implementation of Motion Estimation for Cloud Video Processing
OpenCL in Action: How to Accelerate Graphics and Computations
OpenCL JIT Compilation for Dynamic Programming Languages
OpenCL Library for Parallel Graph Search Algorithms
OpenCL Numerical Simulations of Two-Fluid Compressible Flows With a 2D Random Choice Method
OpenCL parallel Processing using General Purpose Graphical Processing units – TiViPE software development
OpenCL Parallel Programming Development Cookbook
OpenCL Performance Evaluation on Modern Multi Core CPUs
OpenCL Performance on the Intel Heterogeneous Architecture Research Platform
OpenCL Performance Prediction using Architecture-Independent Features
OpenCL Programming Guide for Mac
OpenCL programming using Python syntax