Papers on hgpu.org (.txt-file)
Realtime Two-Way Coupling of Meshless Fluids and Nonlinear FEM
Recent Advances on GPU Computing in Operations Research
Recent algorithm and machine developments for lattice QCD
Recent progress and challenges in exploiting graphics processors in computational fluid dynamics
Recent trends in software and hardware for GPGPU computing: A comprehensive survey
Reconfigurable Control Variate Monte-Carlo Designs for Pricing Exotic Options
Reconfigurable real-time MIMO detector on GPU
Reconstructing hash reversal based proof of work schemes
Reconstruction and visualization of planetary nebulae
Record Setting Software Implementation of DES Using CUDA
Recovering Historical Climate Records using Artificial Neural Networks in GPU
Recurrence quantification analysis in images with CUDA
Recurrent Neural Networks for anomaly detection in the Post-Mortem time series of LHC superconducting magnets
Recurrent neural networks for language modeling
Recurrent Neural Networks Hardware Implementation on FPGA
Recursive MIS Computation for Streaming BDPT on the GPU
Redco: A Lightweight Tool to Automate Distributed Training of LLMs on Any GPU/TPUs
Redefining the Role of the CPU in the Era of CPU-GPU Integration
Redução de Complexidade de Tempo em GPUs
Reduce, Reuse, Recycle (R^3): a Design Methodology for Sparse Matrix Vector Multiplication on Reconfigurable Platforms
Reduced Vlasov-Maxwell simulations
Reducing Beamforming Calculation Time with GPU Accelerated Algorithms
Reducing branch divergence in GPU programs
Reducing branch divergence to speed up parallel execution of unit testing on GPUs
Reducing data access latency in SDSM systems using runtime optimizations
Reducing GPU Offload Latency via Fine-Grained CPU-GPU Synchronization
Reducing IO bandwidth for GPU based moment invariant classifier systems
Reducing overheads of dynamic scheduling on heterogeneous chips
Reducing shading on GPUs using quad-fragment merging
Reducing Synchronous GPU Memory Transfers: Design and implementation of a Futhark compiler optimisation
Reducing the Code Degree Of Parallelism to Increase GPUs Reliability
Reducing the Cost of Heuristic Generation with Machine Learning
Reducing the Disk IO Bandwidth Bottleneck through Fast Floating Point Compression using Accelerators
Reducing the Size of Nurbs Controls Nets Using Genetic Algorithms and CUDA
Reducing thread divergence in a GPU-accelerated branch-and-bound algorithm
Reducing Thread Divergence in GPU-based B and B Applied to the Flow-shop problem
Reducing Thread Divergence in GPU-based B&B Applied to the Flow-shop problem
Reduction of a Symmetrical Matrix to Tridiagonal Form on GPUs
Redwood: Flexible and Portable Heterogeneous Tree Traversal Workloads
Refinements in Syntactic Parsing
Refining HPCToolkit for application performance analysis at exascale
Reflective Shadow Map Clustering for Real-Time Global Illumination
Reflector Antenna Analysis using Physical Optics on Graphics Processing Units
Refresh Rate Modulation for Perceptually Optimized Computer Graphics
ReGen: Optimizing Genetic Selection Algorithms for Heterogeneous Computing
Region Templates: Data Representation and Management for Large-Scale Image Analysis
Regional Heritability Advanced Complex Trait Analysis for GPU and Traditional Parallel Architectures
Register packing for cyclic reduction: a case study
Register-leaning kernels in CUDA
Regression Modelling of Power Consumption for Heterogeneous Processors
Regular Expression Matching and Operational Semantics
Regular Expression Matching on Graphics Hardware for Intrusion Detection
Regular Lattice and Small-World Spin Model Simulations Using CUDA and GPUs
Regularity versus Load-Balancing on GPU for treefix computations
Regularization and nonlinearities for neural language models: when are they needed?
Reinforcement Learning Strategies for Compiler Optimization in High level Synthesis
Reionization simulations powered by GPUs I: the structure of the Ultraviolet radiation field
Relational Algorithms for Multi-Bulk-Synchronous Processors
Relational joins on graphics processors
Relational query coprocessing on graphics processors
Relativistic Hydrodynamics on Graphic Cards
Relativistic hydrodynamics on graphics processing units
Relax-Miracle: GPU Parallelization of Semi-Analytic Fourier-Domain solvers for Earthquake Modeling
Reliability modeling of MEMS devices on CUDA based HPC setup
Reliable Initialization of GPU-enabled Parallel Stochastic Simulations Using Mersenne Twister for Graphics Processors
REMODE: Probabilistic, Monocular Dense Reconstruction in Real Time
Remote GPU-Accelerated Online Pre-processing of Raster Maps for Terrain Rendering
Remote Sensing Processing: From Multicore to GPU
Remotely Keyed Cryptographics: Secure Remote Display Access Using (Mostly) Untrusted Hardware
Removing the Barrier for FPGA-Based OpenCL Data Center Servers
RenderAnts: Interactive REYES Rendering on GPUs
Rendering Forest Scenes in Real-Time
Rendering of 3D Dynamic Virtual Environments
Rendering Volumetric Haptic Shapes in Mid-Air using Ultrasound
RenderKernel: High-level programming for real-time rendering systems
REOH: Runtime Energy Optimization for Heterogeneous Systems
Reordering GPU Kernel Launches to Enable Efficient Concurrent Execution
Reordering strategy for blocking optimization in sparse linear solvers
Report on the Feasibility of Implementing PIC Codes on a GPU
Report: Performance comparison between C2075 and P100 GPU cards using cosmological correlation functions
Representing Higher-Order Singularities in Vector Fields on Piecewise Linear Surfaces
Reproducible and Accurate Matrix Multiplication for GPU Accelerators
Reproducible Study and Performance Analysis of GPU Programming Paradigms: OpenACC vs. CUDA in Key Linear Algebra Computations
Reproducible Triangular Solvers for High-Performance Computing
Research and Application of Parallel Computing Technologies based on CUDA and OpenCL
Research and Development of Porting SYCL on QNX Operating System for High Parallelism
Research for Chinese Spam Filtering Based on GPU
Research on a Parallel BD-tree Index Structure
Research on ATI-CAL for accelerating FBP reconstruction
Research on CUDA-based Kriging Interpolation Algorithm
Research on double negative materials by using FDTD method based on GPUs
Research on DSP-GPU Heterogeneous Computing System
Research on GPU-accelerated algorithm in 3D finite difference neutron diffusion calculation method
Research on OpenCL optimization for FPGA deep learning application
Research on Parallel DVH Statistic Based on CUDA
Research on Real-Time LLL Imaging Generation Method Based on GPU
Research on the fast Fourier transform of image based on GPU
Research on the simulation of PF-LBM model based on MPI+CUDA mixed granularity parallel
Titles: 100
open PDFs: 93
packages: 15