Papers on hgpu.org (.txt-file)

Divergence Analysis and Optimizations

Divergence Analysis with Affine Constraints

Divide and Conquer G-Buffer Ray Tracing

Divide-and-Conquer 3D Convex Hulls on the GPU

DiVinE-CUDA – A Tool for GPU Accelerated LTL Model Checking

DjiNN and Tonic: DNN as a Service and Its Implications for Future Warehouse Scale Computers

DL: A data layout transformation system for heterogeneous computing

DLIO: A Data-Centric Benchmark for Scientific Deep Learning Applications

DLL: A Blazing Fast Deep Neural Network Library

DMA-Assisted, Intranode Communication in GPU Accelerated Systems

dMath: A Scalable Linear Algebra and Math Library for Heterogeneous GP-GPU Architectures

dMath: Distributed Linear Algebra for DL

DNA sequence alignment: An assignment for OpenMP, MPI, and CUDA/OpenCL

DNN is not all you need: Parallelizing Non-Neural ML Algorithms on Ultra-Low-Power IoT Processors

DNNVM: End-to-End Compiler Leveraging Heterogeneous Optimizations on FPGA-based CNN Accelerators

Doctor AI: Interpretable Deep Learning for Modeling Electronic Health Records

Document Classification Using KNN on GPU

Document Image Binarization Using Image Segmentation Algorithm in Parallel Environment

Document Stream Clustering using GPUs

Dogwild! – Distributed Hogwild for CPU & GPU

Domain Decomposition method on GPU cluster

Domain Specific Languages for High Performance Computing

Domain-Specific Acceleration and Auto-Parallelization of Legacy Scientific Code in FORTRAN 77 using Source-to-Source Compilation

Domain-Specific Code Language Models: Unraveling the Potential for HPC Codes and Tasks

Domain-Specific Languages for Heterogeneous Parallel Computing

Domain-Specific On-Device Object Detection Method

Domain-Specific Optimizations Supporting Real-Time Image Compression

DOPA: GPU-based protein alignment using database and memory access optimizations

dOpenCL – Evaluation of an API-Forwarding Implementation

Dopia: Online Parallelism Management for Integrated CPU/GPU Architectures

Double-Precision Floating-Point Data Visualizations Using Vulkan API

Double-precision FPUs in High-Performance Computing: an Embarrassment of Riches?

Dr.Jit: A Just-In-Time Compiler for Differentiable Rendering

Dragon-Alpha&cu32: A Java-based Tensor Computing Framework With its High-Performance CUDA Library

DRAM Scheduling Policy for GPGPU Architectures Based on a Potential Function

DRiVE: An Example of Distributed Rendering in Virtual Environments

Dropbear: Machine Learning Marketplaces made Trustworthy with Byzantine Model Agreement

Drug Drug Interaction Extraction from Biomedical Literature Using Syntax Convolutional Neural Network

DSDP: A Blind Docking Strategy Accelerated by GPUs

DSPSR: Digital Signal Processing Software for Pulsar Astronomy

DTAM: Dense tracking and mapping in real-time

Dual-RBF based surface reconstruction

Duality based optical flow algorithms with applications

DUODECIM – a structure for point scan compression and rendering

Dust-Dust Collisional Charging and Lightning in Protoplanetary Discs

Dwarfs on Accelerators: Enhancing OpenCL Benchmarking for Heterogeneous Computing Architectures

Dymaxion: Optimizing Memory Access Patterns for Heterogeneous Systems

Dymaxion++: A Directive-based API to Optimize Data Layout and Memory Mapping for Heterogeneous Systems

Dynamic adaptation and distribution of binaries to heterogeneous architectures

Dynamic adaptation of broad phase collision detection algorithms

Dynamic Adaptation Techniques and Opportunities to Improve HPC Runtimes

Dynamic Application Autotuning for Self-Aware Approximate Computing

Dynamic autotuning of adaptive fast multipole methods on hybrid multicore CPU & GPU systems

Dynamic autotuning of SpMV kernel in CUSP library

Dynamic Buffer Overflow Detection for GPGPUs

Dynamic Compilation of Data-Parallel Kernels for Vector Processors

Dynamic Data Management Among Multiple Databases for Optimization of Parallel Computations in Heterogeneous HPC Systems

Dynamic Data Structures for Taskgraph Scheduling Policies with Applications in OpenCL Accelerators

Dynamic deformation textures: GPU-accelerated simulation of deformable models in contact

Dynamic detection of uniform and affine vectors in GPGPU computations

Dynamic Distribution Pruning for Efficient Network Architecture Search

Dynamic Feature-Adaptive Subdivision

Dynamic Fine-Grain Scheduling of Pipeline Parallelism

Dynamic GPU Energy Optimization for Machine Learning Training Workloads

Dynamic Heterogeneous Scheduling Decisions Using Historical Runtime Data

Dynamic IBR Techniques for Fixed Cost Stereoscopic Support

Dynamic Instrumentation and Optimization for GPU Applications

Dynamic Kernel/Device Mapping Strategies for GPU-assisted HPC Systems

Dynamic label placement for improved interactive exploration

Dynamic Load Balancing in GPU-Based Systems – Early Experiments

Dynamic load balancing on heterogeneous multicore/multiGPU systems

Dynamic Load Balancing on Massively Parallel Computer Architectures

Dynamic load balancing on single- and multi-GPU systems

Dynamic Load Balancing Strategies for Graph Applications on GPUs

Dynamic Load Balancing using Graphics Processors

Dynamic loop vectorization for executing OpenCL kernels on CPUs

Dynamic Memory Allocation for OpenCL

Dynamic Memory Management on GPUs with SYCL

Dynamic Orchestration of Massively Data Parallel Execution

Dynamic Overset Grid Computations for CFD Applications on Graphics Processing Units

Dynamic Parallelism in GPU Optimized Barnes Hut Trees for Molecular Dynamics Simulations

Dynamic particle coupling for GPU-based fluid simulation

Dynamic Partitioning-based JPEG Decompression on Heterogeneous Multicore Architectures

Dynamic Programming with CUDA – Part II

Dynamic real-time 4D cardiac MDCT image display using GPU-accelerated volume rendering

Dynamic Sampling and Rendering of Algebraic Point Set Surfaces

Dynamic Scheduling for Large-Scale Distributed-Memory Ray Tracing

Dynamic Scheduling for Work Agglomeration on Heterogeneous Clusters

Dynamic scheduling Monte-Carlo framework for multi-accelerator heterogeneous clusters

Dynamic Scheduling of Parallel Code for Heterogeneous Systems

Dynamic Self-Rescheduling of Tasks over a Heterogeneous Platform

Dynamic Shader Generation for Flexible Multi-Volume Visualization

Dynamic Sparse-Matrix Allocation on GPUs

Dynamic Task Parallelism with a GPU Work-Stealing Runtime System

Dynamic Task-Scheduling and Resource Management for GPU Accelerators in Medical Imaging

Dynamic Translation of Runtime Environments for Heterogeneous Computing

Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow

Titles: 100
open PDFs: 96
packages: 24
