Papers on hgpu.org (.txt-file)
Developing acquisition systems based on FPGA with OpenCL
Developing an OO Model for Generalized Matrix Multiplication: Preliminary Considerations
Developing and Deploying Advanced Algorithms to Novel Supercomputing Hardware
Developing and Evaluating clOpenCL Applications for Heterogeneous Clusters
Developing Extensible Lattice-Boltzmann Simulators for General-Purpose Graphics-Processing Units
Developing Performance-Portable Molecular Dynamics Kernels in OpenCL
Development and evaluation of a GPU-optimized N-body term for the simulation of biomolecules
Development and evaluation of scalable video motion estimators on GPU
Development methodologies for GPU and cluster of GPUs
Development of a Chemically Reacting Flow Solver on the Graphic Processing Units
Development of a CUDA Implementation of the 3D FDTD Method
Development of a Flow Solver with Complex Kinetics on the Graphic Processing Units
Development of a GPU based two-way time transfer modem
Development of a GPU-accelerated MIKE 21 Solver for Water Wave Dynamics
Development of a GPU-based Monte Carlo dose calculation code for coupled electron-photon transport
Development of a GPU-based multithreaded software application to calculate digitally reconstructed radiographs for radiotherapy
Development of a Restricted Additive Schwarz Preconditioner for Sparse Linear Systems on NVIDIA GPU
Development of a volume rendering system using 3D texture compression techniques on general-purpose personal computers
Development of an Algorithm for Extracting Parallelism and Pipeline Structure from Stream-based Processing flow with Spanning Tree
Development of an explicit pressure-based unstructured solver for three-dimensional incompressible flows with graphics hardware acceleration
Development of an unified FDTD-FEM library for electromagnetic analysis with CPU and GPU computing
Development of Bayesian analysis program for extraction of polarisation observables at CLAS
Development of Generic Scheduling Concepts for OpenGL ES 2.0
Development of High-Performance Software Components for Emerging Architectures
Development of JavaScript-based deep learning platform and application to distributed training
Development of Krylov and AMG linear solvers for large-scale sparse matrices on GPUs
Development of methods for the processing of mining images using genetic algorithms
Development of nonlinear filter bank system for real-time beautification of facial video using GPGPU
Development of Parallel Architectures for Radar/Video Signal Processing Applications
Development of Parallel Computation Tools
Development of Virtual Machine Tool for Simulation and Evaluation
Developmental Directions in Parallel Accelerators
Device Placement Optimization with Reinforcement Learning
Device specialization in heterogeneous multi-GPU environments
Devito: automated fast finite difference computation
DFG Implementation on Multi GPU Cluster with Computation-Communication Overlap
DGEMM on Integer Matrix Multiplication Unit
Diagnosing Performance Bottlenecks in HPC Applications
Diagnosis, Tuning, and Redesign for Multicore Performance: A Case Study of the Fast Multipole Method
Diagrammatic Determinantal Quantum Monte Carlo Calculations on GPUs
DIANNE: Distributed Artificial Neural Networks for the Internet of Things
Diderot: A Parallel DSL for Image Analysis and Visualization
Different Optimization Strategies and Performance Evaluation of Reduction on Multicore CUDA Architecture
Differential evolution algorithm on the GPU with C-CUDA
Differential Evolution with parallelised objective functions using CUDA
Diffusion Curves: A Vector Representation for Smooth-Shaded Images
Digital beamforming using a GPU
Digital Marbling: a GPU Approach with Precomputed Velocity Field
Digital Signal Processing using Stream High Performance Computing: A 512-input Broadband Correlator for Radio Astronomy
Digitize Your Body and Action in 3-D at Over 10 FPS: Real Time Dense Voxel Reconstruction and Marker-less Motion Tracking via GPU Acceleration
Diplomat: Mapping of multi-kernel applications using a static dataflow abstraction
Direct Communication Methods for Distributed GPUs
Direct deconvolution of radio synthesis images using L1 minimisation
Direct evaluation of NURBS curves and surfaces on the GPU
Direct GPU Compilation and Execution for Host Applications with OpenMP Parallelism
Direct GPU/FPGA Communication Via PCI Express
Direct N-body code on low-power embedded ARM GPUs
Direct N-body Kernels for Multicore Platforms
Direct N-body simulations of globular clusters: (I) Palomar 14
Direct Numeric Simulation of Sheared Convective Boundary Layer Entrainment with GPUs
Direct Numerical Simulation and Large Eddy Simulation on a Turbulent Wall-Bounded Flow Using Lattice Boltzmann Method and Multiple GPUs
Direct numerical simulation of sub-grid structures in gas-solid flow — GPU implementation of macro-scale pseudo-particle modeling
Direct Numerical Simulation of Turbulence on Heterogenous Computer Systems: Architectures, Algorithms, and Applications
Direct Numerical Simulation of Turbulent Flows with Parallel Algorithms for Various Computing Architectures
Direct Self-Consistent Field Computations on GPU Clusters
Direct solution of the Boltzmann equation for a binary mixture on GPUs
Direct Visualization of Particle-Partition of Unity Data
Direct-to-indirect transfer for cinematic relighting
directCell: hybrid systems with tightly coupled accelerators
Directionally Unsplit Hydrodynamic Schemes with Hybrid MPI/OpenMP/GPU Parallelization in AMR
Directive-based Approach to Heterogeneous Computing
Directive-Based Compilers for GPUs
Directive-Based Data Partitioning and Pipelining and Auto-Tuning for High-Performance GPU Computing
Directive-Based Partitioning and Pipelining for Graphical Processing Units
Directive-Based, High-Level Programming and Optimizations for High-Performance Computing with FPGAs
Directives Based Programming of GPU Accelerated Systems
DISC: A Dynamic Shape Compiler for Machine Learning Workloads
Disc: Approximative Nearest Neighbor Search using Ellipsoids for Photon Mapping on GPUs
Discontinuous Galerkin Methods on Graphics Processing Units for Nonlinear Hyperbolic Conservation Laws
Discontinuous Galerkin Time Domain for Maxwell’s equations on GPUs
Discrete Fourier transform on multicore
Discrete Planning Unit Look-ahead Velocity Control Strategy and Parallelization Research based on GPU
Discrete Shearlet Transform on GPU with Applications in Anomaly Detection and Denoising
Discrete Wavelet Transform on Consumer-Level Graphics Hardware
Discrete-event Execution Alternatives on General Purpose Graphical Processing Units (GPGPUs)
Discriminative Convolutional Sum-Product Networks on GPU
Dispersion Simulation and Visualization For Urban Security
Displacement Mapping on the GPU – State of the Art
Dissecting GPU Memory Hierarchy through Microbenchmarking
Dissecting Tensor Cores via Microbenchmarks: Latency, Throughput and Numerical Behaviors
Dissecting the NVidia Turing T4 GPU via Microbenchmarking
Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking
DISTAL: The Distributed Tensor Algebra Compiler
Distance field transform with an adaptive iteration method
Distance Fields Accelerated with OpenCL
Distance Threshold Similarity Searches on Spatiotemporal Trajectories using GPGPU
Titles: 100
Open PDFs: 92
Packages: 15