Papers on hgpu.org (.txt-file)
On-the-Fly Computing on Many-Core Processors in Nuclear Applications

On-the-fly elimination of dynamic irregularities for GPU computing

On-the-fly Generation and Rendering of Infinite Cities on the GPU

On-The-Fly Parallel Data Shuffling for Graph Processing on OpenCL-based FPGAs

Oncilla: A GAS Runtime for Efficient Resource Allocation and Data Movement in Accelerated Clusters

One machine, one minute, three billion tetrahedra

One Stone Two Birds: Synchronization Relaxation and Redundancy Removal in GPU-CPU Translation

One weird trick for parallelizing convolutional neural networks

One-shot tuner for deep learning compilers

oneDNN Graph Compiler: A Hybrid Approach for High-Performance Deep Learning Compilation

Onesweep: A Faster Least Significant Digit Radix Sort for GPUs

Online Adaptive Code Generation and Tuning

Online Energy Optimization in GPUs: A Multi-Armed Bandit Approach

Online Performance Projection for Clusters with Heterogeneous GPUs

Online rapid prototyping of 3D objects using GPU-based 3D cloud computing: Application to 3D face modelling

Online video synthesis for removing occluding objects using multiple uncalibrated cameras via plane sweep algorithm

OP2: An Active Library Framework for Solving Unstructured Mesh-based Applications on Multi-Core and Many-Core Architectures

Opal: A Modular Framework for Optimizing Performance using Analytics and LLMs

Open Source Face Recognition API

Open SYCL on heterogeneous GPU systems: A case of study

Open-source FPGA-ML codesign for the MLPerf Tiny Benchmark

OpenABLext: An automatic code generation framework for agent-based simulations on CPU-GPU-FPGA heterogeneous platforms

OpenACC – First Experiences with Real-World Applications

OpenACC cache Directive: Opportunities and Optimizations

OpenACC Implementations Comparison

OpenACC offloading of the MFC compressible multiphase flow solver on AMD and NVIDIA GPUs

OpenACC-based GPU Acceleration of a 3-D Unstructured Discontinuous Galerkin Method

OpenCL – An effective programming model for data parallel computations at the Cell Broadband Engine

OpenCL + OpenSHMEM Hybrid Programming Model for the Adapteva Epiphany Architecture

OpenCL 2.0 for FPGAs using OCLAcc

OpenCL Accelerated Multi-GPU Cone-Beam Reconstruction

OpenCL Acceleration for TensorFlow

OpenCL Actors – Adding Data Parallelism to Actor-based Programming with CAF

OpenCL and parallel primitives for digital TV applications

OpenCL and the 13 Dwarfs: A Work in Progress

OpenCL API Extensions to achieve Multi-level Parallelism for Efficient Implementation of Strassen’s Matrix Multiplication on GPUs

OpenCL Based Digital Image Projection Acceleration

OpenCL Based High-Quality HEVC Motion Estimation on GPU

OpenCL based machine learning labeling of biomedical datasets

OpenCL embedded profile prototype in mobile device

OpenCL Evaluation for Numerical Linear Algebra Library Development

OpenCL Floating Point Software on Heterogeneous Architectures – Portable or Not?

OpenCL for Database Query Processing

OpenCL for FPGAs: Prototyping a Compiler

OpenCL for programming shared memory multicore CPUs

OpenCL FPGA Optimization guided by memory accesses and roofline model analysis applied to tomography acceleration

OpenCL framework for a CPU, GPU, and FPGA Platform

OpenCL Implementation of a Color Based Object Tracking

OpenCL Implementation of a Parallel Universal Kriging Algorithm for Massive Spatial Data Interpolation on Heterogeneous Systems

OpenCL Implementation of LiDAR Data Processing

OpenCL Implementation of Montgomery Multiplication on FPGA

OpenCL Implementation of Motion Estimation for Cloud Video Processing

OpenCL in Action: How to Accelerate Graphics and Computations

OpenCL JIT Compilation for Dynamic Programming Languages

OpenCL Library for Parallel Graph Search Algorithms

OpenCL Numerical Simulations of Two-Fluid Compressible Flows With a 2D Random Choice Method

OpenCL parallel Processing using General Purpose Graphical Processing units – TiViPE software development

OpenCL Parallel Programming Development Cookbook

OpenCL Performance Evaluation on Modern Multi Core CPUs

OpenCL Performance on the Intel Heterogeneous Architecture Research Platform

OpenCL Performance Prediction using Architecture-Independent Features

OpenCL Programming Guide for Mac

OpenCL programming using Python syntax

OpenCL simulations of two-fluid compressible flows with a random choice method

OpenCL Sparse Linear Solver for Circuit Simulation

OpenCL Task Partitioning in the Presence of GPU Contention

OpenCL Vector Swizzling Optimization under Global Value Numbering

OpenCL vs: Accelerated Finite-Difference Digital Synthesis

OpenCL vs. OpenMP: A Programmability Debate

OpenCL-Accelerated Computation of a 3D SPECT Projection Operator for the Content Adaptive Mesh Model

OpenCL-accelerated object classification in video streams using Spatial Pooler of Hierarchical Temporal Memory

OpenCL-accelerated Point Feature Histogram and Its Application in Railway Track Point Cloud Data Processing

OpenCL-Accelerated Simplified General Perturbations 4 Algorithm

OpenCL-based Algorithm for Heat Load Modelling of District Heating System

OpenCL-based design methodology for application-specific processors

OpenCL-Based Design of an FPGA Accelerator for Phase-Based Correspondence Matching

OpenCL-Based Erasure Coding on Heterogeneous Architectures

OpenCL-Based FPGA Accelerator for 3D FDTD with Periodic and Absorbing Boundary Conditions

OpenCL-Based Implementation of an FPGA Accelerator for Molecular Dynamics Simulation

OpenCL-Based Mobile GPGPU Benchmarking: Methods and Challenges

OpenCL-based optimizations for acceleration of object tracking on FPGAs and GPUs

OpenCL-Darknet: implementation and optimization of OpenCL-based deep learning object detection framework

OpenCL-ready High Speed FPGA Network for Reconfigurable High Performance Computing

OpenCL-Z Android Released on Google Play

OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems

OpenCL: a viable solution for high-performance medical image reconstruction?

OpenCL: Make Ubiquitous Supercomputing Possible

OpenCL/CUDA algorithms for parallel decoding of any irregular LDPC code using GPU

OpenCL/OpenGL aproach for studying active Brownian motion

OpenCLIPER: an OpenCL-based C++ Framework for Overhead-Reduced Medical Image Processing and Reconstruction on Heterogeneous Devices

Titles: 100
open PDFs: 95
packages: 21
