Papers on hgpu.org (.txt-file)
DNN is not all you need: Parallelizing Non-Neural ML Algorithms on Ultra-Low-Power IoT Processors
DNNVM: End-to-End Compiler Leveraging Heterogeneous Optimizations on FPGA-based CNN Accelerators
Doctor AI: Interpretable Deep Learning for Modeling Electronic Health Records
Document Classification Using KNN on GPU
Document Image Binarization Using Image Segmentation Algorithm in Parallel Environment
Document Stream Clustering using GPUs
Dogwild! – Distributed Hogwild for CPU & GPU
Domain Decomposition method on GPU cluster
Domain Specific Languages for High Performance Computing
Domain-Specific Acceleration and Auto-Parallelization of Legacy Scientific Code in FORTRAN 77 using Source-to-Source Compilation
Domain-Specific Code Language Models: Unraveling the Potential for HPC Codes and Tasks
Domain-Specific Languages for Heterogeneous Parallel Computing
Domain-Specific On-Device Object Detection Method
Domain-Specific Optimizations Supporting Real-Time Image Compression
DOPA: GPU-based protein alignment using database and memory access optimizations
dOpenCL – Evaluation of an API-Forwarding Implementation
Dopia: Online Parallelism Management for Integrated CPU/GPU Architectures
Double-Precision Floating-Point Data Visualizations Using Vulkan API
Double-precision FPUs in High-Performance Computing: an Embarrassment of Riches?
Dr.Jit: A Just-In-Time Compiler for Differentiable Rendering
Dragon-Alpha&cu32: A Java-based Tensor Computing Framework With its High-Performance CUDA Library
DRAM Scheduling Policy for GPGPU Architectures Based on a Potential Function
DRiVE: An Example of Distributed Rendering in Virtual Environments
Dropbear: Machine Learning Marketplaces made Trustworthy with Byzantine Model Agreement
Drug Drug Interaction Extraction from Biomedical Literature Using Syntax Convolutional Neural Network
DSDP: A Blind Docking Strategy Accelerated by GPUs
DSPSR: Digital Signal Processing Software for Pulsar Astronomy
DTAM: Dense tracking and mapping in real-time
Dual-RBF based surface reconstruction
Duality based optical flow algorithms with applications
DUODECIM – a structure for point scan compression and rendering
Dust-Dust Collisional Charging and Lightning in Protoplanetary Discs
Dwarfs on Accelerators: Enhancing OpenCL Benchmarking for Heterogeneous Computing Architectures
Dymaxion: Optimizing Memory Access Patterns for Heterogeneous Systems
Dymaxion++: A Directive-based API to Optimize Data Layout and Memory Mapping for Heterogeneous Systems
Dynamic adaptation and distribution of binaries to heterogeneous architectures
Dynamic adaptation of broad phase collision detection algorithms
Dynamic Adaptation Techniques and Opportunities to Improve HPC Runtimes
Dynamic Application Autotuning for Self-Aware Approximate Computing
Dynamic autotuning of adaptive fast multipole methods on hybrid multicore CPU & GPU systems
Dynamic autotuning of SpMV kernel in CUSP library
Dynamic Buffer Overflow Detection for GPGPUs
Dynamic Compilation of Data-Parallel Kernels for Vector Processors
Dynamic Data Management Among Multiple Databases for Optimization of Parallel Computations in Heterogeneous HPC Systems
Dynamic Data Structures for Taskgraph Scheduling Policies with Applications in OpenCL Accelerators
Dynamic deformation textures: GPU-accelerated simulation of deformable models in contact
Dynamic detection of uniform and affine vectors in GPGPU computations
Dynamic Distribution Pruning for Efficient Network Architecture Search
Dynamic Feature-Adaptive Subdivision
Dynamic Fine-Grain Scheduling of Pipeline Parallelism
Dynamic GPU Energy Optimization for Machine Learning Training Workloads
Dynamic Heterogeneous Scheduling Decisions Using Historical Runtime Data
Dynamic IBR Techniques for Fixed Cost Stereoscopic Support
Dynamic Instrumentation and Optimization for GPU Applications
Dynamic Kernel/Device Mapping Strategies for GPU-assisted HPC Systems
Dynamic label placement for improved interactive exploration
Dynamic Load Balancing in GPU-Based Systems – Early Experiments
Dynamic load balancing on heterogeneous multicore/multiGPU systems
Dynamic Load Balancing on Massively Parallel Computer Architectures
Dynamic load balancing on single- and multi-GPU systems
Dynamic Load Balancing Strategies for Graph Applications on GPUs
Dynamic Load Balancing using Graphics Processors
Dynamic loop vectorization for executing OpenCL kernels on CPUs
Dynamic Memory Allocation for OpenCL
Dynamic Memory Management on GPUs with SYCL
Dynamic Orchestration of Massively Data Parallel Execution
Dynamic Overset Grid Computations for CFD Applications on Graphics Processing Units
Dynamic Parallelism in GPU Optimized Barnes Hut Trees for Molecular Dynamics Simulations
Dynamic particle coupling for GPU-based fluid simulation
Dynamic Partitioning-based JPEG Decompression on Heterogeneous Multicore Architectures
Dynamic Programming with CUDA – Part II
Dynamic real-time 4D cardiac MDCT image display using GPU-accelerated volume rendering
Dynamic Sampling and Rendering of Algebraic Point Set Surfaces
Dynamic Scheduling for Large-Scale Distributed-Memory Ray Tracing
Dynamic Scheduling for Work Agglomeration on Heterogeneous Clusters
Dynamic scheduling Monte-Carlo framework for multi-accelerator heterogeneous clusters
Dynamic Scheduling of Parallel Code for Heterogeneous Systems
Dynamic Self-Rescheduling of Tasks over a Heterogeneous Platform
Dynamic Shader Generation for Flexible Multi-Volume Visualization
Dynamic Sparse-Matrix Allocation on GPUs
Dynamic Task Parallelism with a GPU Work-Stealing Runtime System
Dynamic Task-Scheduling and Resource Management for GPU Accelerators in Medical Imaging
Dynamic Translation of Runtime Environments for Heterogeneous Computing
Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow
Dynamic warp formation: Efficient MIMD control flow on SIMD graphics hardware
Dynamic Warp Resizing in High-Performance SIMT
Dynamic Workload Division in GPU-CPU Heterogeneous Systems
Dynamical heterogeneities as fingerprints of a backbone structure in Potts models
Dynamical simulations of extrasolar planetary systems with debris disks using a GPU accelerated N-body code
Dynamically Finding Optimal Kernel Launch Parameters for CUDA Programs
Dynamically Managed Data for CPU-GPU Architectures
Dynamically scheduled Cholesky factorization on multicore architectures with GPU accelerators
Dynamically tuned push-relabel algorithm for the maximum flow problem on CPU-GPU-Hybrid platforms
DynaProg for Scala: A Scala DSL for Dynamic Programming on CPU and GPU
DySel: Lightweight Dynamic Selection for Kernel-based Data-parallel Programming Model
E-MOGA: A General Purpose Platform for Multi Objective Genetic Algorithm running on CUDA
E(A+M)PEC – An OpenCL Atomic and Molecular Plasma Emission Code For Interstellar Medium Simulations
E2C: A Visual Simulator to Reinforce Education of Heterogeneous Computing Systems
Titles: 100
open PDFs: 96
packages: 24