Papers on hgpu.org (.txt-file)
Regularity versus Load-Balancing on GPU for treefix computations
Regularization and nonlinearities for neural language models: when are they needed?
Reinforcement Learning Strategies for Compiler Optimization in High level Synthesis
Reionization simulations powered by GPUs I: the structure of the Ultraviolet radiation field
Relational Algorithms for Multi-Bulk-Synchronous Processors
Relational joins on graphics processors
Relational query coprocessing on graphics processors
Relativistic Hydrodynamics on Graphic Cards
Relativistic hydrodynamics on graphics processing units
Relax-Miracle: GPU Parallelization of Semi-Analytic Fourier-Domain solvers for Earthquake Modeling
Reliability modeling of MEMS devices on CUDA based HPC setup
Reliable Initialization of GPU-enabled Parallel Stochastic Simulations Using Mersenne Twister for Graphics Processors
REMODE: Probabilistic, Monocular Dense Reconstruction in Real Time
Remote GPU-Accelerated Online Pre-processing of Raster Maps for Terrain Rendering
Remote Sensing Processing: From Multicore to GPU
Remotely Keyed Cryptographics Secure Remote Display Access Using (Mostly) Untrusted Hardware
Removing the Barrier for FPGA-Based OpenCL Data Center Servers
RenderAnts: Interactive REYES Rendering on GPUs
Rendering Forest Scenes in Real-Time
Rendering of 3D Dynamic Virtual Environments
Rendering Volumetric Haptic Shapes in Mid-Air using Ultrasound
RenderKernel: High-level programming for real-time rendering systems
REOH: Runtime Energy Optimization for Heterogeneous Systems
Reordering GPU Kernel Launches to Enable Efficient Concurrent Execution
Reordering strategy for blocking optimization in sparse linear solvers
Report on the Feasibility of Implementing PIC Codes on a GPU
Report: Performance comparison between C2075 and P100 GPU cards using cosmological correlation functions
Representing Higher-Order Singularities in Vector Fields on Piecewise Linear Surfaces
Reproducible and Accurate Matrix Multiplication for GPU Accelerators
Reproducible Study and Performance Analysis of GPU Programming Paradigms: OpenACC vs. CUDA in Key Linear Algebra Computations
Reproducible Triangular Solvers for High-Performance Computing
Research and Application of Parallel Computing Technologies based on CUDA and OpenCL
Research and Development of Porting SYCL on QNX Operating System for High Parallelism
Research for Chinese Spam Filtering Based on GPU
Research on a Parallel BD-tree Index Structure
Research on ATI-CAL for accelerating FBP reconstruction
Research on CUDA-based Kriging Interpolation Algorithm
Research on double negative materials by using FDTD method based on GPUs
Research on DSP-GPU Heterogeneous Computing System
Research on GPU-accelerated algorithm in 3D finite difference neutron diffusion calculation method
Research on OpenCL optimization for FPGA deep learning application
Research on Parallel DVH Statistic Based on CUDA
Research on Real-Time LLL Imaging Generation Method Based on GPU
Research on the fast Fourier transform of image based on GPU
Research on the simulation of PF-LBM model based on MPI+CUDA mixed granularity parallel
Research on Three-Dimensional Playing Video Technology in Virtual Education Environment
Reservoir Simulation on NVIDIA Tesla GPUs
Resolution of Linear Algebra for the Discrete Logarithm Problem using GPU and Multi-core Architectures
Resolution of the Vlasov-Maxwell system by PIC Discontinuous Galerkin method on GPU with OpenCL
Resolving the conflict between generality and plausibility in verified computation
Resource Centered Computing delivering high parallel performance
Resource Elastic Virtualization for FPGAs using OpenCL
Resource Sharing in GPU-Accelerated Windowing Systems
Resource-Aware Compiler Prefetching for Fine-Grained Many-Cores
Resource-Aware Just-in-Time OpenCL Compiler for Coarse-Grained FPGA Overlays
ReSYCLator: Transforming CUDA C++ source code into SYCL
Retargeting and Respecializing GPU Workloads for Performance Portability
Rethinking resampling in the particle filter on graphics processing units
Rethinking Runtime Verification on Hundreds of Cores: Challenges and Opportunities
Rethinking the Union of Computed Tomography Reconstruction and GPGPU Computing
Returning control to the programmer: SIMD intrinsics for virtual machines
RETURNN: The RWTH Extensible Training framework for Universal Recurrent Neural Networks
Reusable OpenCL FPGA Infrastructure
Reusable software components for accelerator-based clusters
Reuse and Refactoring of GPU Kernels to Design Complex Applications
Reusing Auto-Schedules for Efficient DNN Compilation
Reveal training performance mystery between TensorFlow and PyTorch in the single GPU environment
Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature
Reverse Computation for Rollback-based Fault Tolerance in Large Parallel Systems: Evaluating the Potential Gains and Systems Effects
Reverse-Mode AD of Reduce-by-Index and Scan in Futhark
Review and Comparative Study of Ray Traversal Algorithms on a Modern GPU Architecture
Review of Memory/Cache Management Technologies used on Heterogeneous Computing Systems
Review: Kd-tree Traversal Algorithms for Ray Tracing
Reviewing GPU architectures to build efficient back projection for parallel geometries
Revision of Relational Joins for Multi-Core and Many-Core Architectures
Revisit Long Short-Term Memory: An Optimization Perspective
Revisiting Actor Programming in C++
Revisiting Co-Processing for Hash Joins on the Coupled CPU-GPU Architecture
Revisiting Edge and Node Parallelism for Dynamic GPU Graph Analytics
Revisiting Online Autotuning for Sparse-Matrix Vector Multiplication Kernels on High-Performance Accelerators
Revisiting Online Autotuning for Sparse-Matrix Vector Multiplication Kernels on Next-Generation Architectures
Revisiting Query Performance in GPU Database Systems
Revisiting sorting for GPGPU stream architectures
Revisiting the Case of ARM SoCs in High-Performance Computing Clusters
Revolutionary technologies for acceleration of emerging petascale applications
RGEM: A Responsive GPGPU Execution Model for Runtime Engines
Rgtsvm: Support Vector Machines on a GPU in R
Ringing: Frugal Subdivision of Curves and Surfaces
Rinnegan: Efficient Resource Use in Heterogeneous Architectures
Ripple: Simplified Large-Scale Computation on Heterogeneous Architectures with Polymorphic Data Layout
Rise of the Graphics Processor
Risk Estimation Without Using Stein’s Lemma — Application to Image Denoising
Ristretto: Hardware-Oriented Approximation of Convolutional Neural Networks
RNA secondary structure prediction using dynamic programming algorithm – A review and proposed work
RNS-Based Elliptic Curve Point Multiplication for Massive Parallel Architectures
RoadRunner: a fast and flexible exoplanet transit model
Roberts edge detection algorithm based on GPU
Robotic approach to multi-beam optical tweezers with Computer Generated Hologram
Robust Adaptive 3-D Segmentation of Vessel Laminae From Fluorescence Confocal Microscope Images and Parallel GPU Implementation
Titles: 100
open PDFs: 93
packages: 16