Papers on hgpu.org (.txt-file)
Deployment of CPU and GPU-based genetic programming on heterogeneous devices
Deployment of parallel linear genetic programming using GPUs on PC and video game console platforms
Depth Estimation using Open Compute Language (OpenCL)
Depth Images: Representations and Real-Time Rendering
Depth Map Based Superresolution Method in 3D Reconstruction
Depth map enhanced macroblock partitioning for H.264 video coding of computer graphics content
Depth-Dependent Halos: Illustrative Rendering of Dense Line Data
Depth-First Search versus Jurema Search on GPU Branch-and-Bound Algorithms: a case study
Depth-of-Field Blur Effects for First-Person Navigation in Virtual Environments
Deriving Shape Grammars on the GPU
Descend: A Safe GPU Systems Programming Language
Design and Analysis of Soft-Error Resilience Mechanisms for GPU Register File
Design and Development of an Efficient H.264 Video Encoder for CPU/GPU using OpenCL
Design and Development of Optical Flow Based Obstacle Avoidance Using CUDA
Design and evaluation of a parallel k-nearest neighbor algorithm on CUDA-enabled GPU
Design and Evaluation of Scalable Concurrent Queues for Many-Core Architectures
Design and implementation of a high-performance stream-based computing platform on multigenerational GPUs
Design and Implementation of a PTX Emulation Library
Design and Implementation of Centrally-Coordinated Peer-to-Peer Live-streaming
Design and Implementation of CNN-FPGA accelerator based on Open Computing Language
Design and Implementation of GPU-Based Prim’s Algorithm
Design and implementation of MPEG audio layer III decoder using graphics processing units
Design and Implementation of ShenWei Universal C/C++
Design and implementation of software-managed caches for multicores with local memory
Design and Implementation of the Futhark Programming Language
Design and implementation of the Smith-Waterman algorithm on the CUDA-compatible GPU
Design and Modeling of a Non-blocking Checkpointing System
Design and optimization of a portable LQCD Monte Carlo code using OpenACC
Design and optimization of DBSCAN Algorithm based on CUDA
Design and Optimization of Hybrid MD5-Blowfish Encryption on GPUs
Design and Optimization of Image Processing Algorithms on Mobile GPU
Design and Optimization of OpenFOAM-based CFD Applications for Hybrid and Heterogeneous HPC Platforms
Design and Optimization of OpenFOAM-based CFD Applications for Modern Hybrid and Heterogeneous HPC Platforms
Design and Performance Analysis of Parallel Processing of SRTP Packets
Design and performance evaluation of a digital wideband receiver on a hybrid computing platform
Design and Performance Evaluation of a Software Framework for Multi-Physics Simulations on Heterogeneous Supercomputers
Design and Performance Evaluation of Image Processing Algorithms on GPUs
Design and Performance Evaluation of Optimizations for OpenCL FPGA Kernels
Design and Performance of the OP2 Library for Unstructured Mesh Applications
Design and Storage Optimization of GPU-based Parallel Program of Image Registration for Remote Sensing
Design and study of a massively multi threaded shared memory architecture
Design Exploration of AES Accelerators on FPGAs and GPUs
Design Exploration of Quadrature Methods in Option Pricing
Design of 3D FFT on Multi-GPU Clusters
Design of a fully programmable shader processor for low power mobile devices
Design of a Hybrid Memory System for General-Purpose Graphics Processing Units
Design of a parallel AES for graphics hardware using the CUDA framework
Design of a programmable micro-ultrasound research platform
Design of an FPGA-Based FDTD Accelerator Using OpenCL
Design of FPGA-Based Accelerator for Convolutional Neural Network under Heterogeneous Computing Framework with OpenCL
Design of Hardware Accelerator for Lempel-Ziv 4 (LZ4) Compression
Design of high-performance parallelized gene predictors in MATLAB
Design of MILC Lattice QCD Application for GPU Clusters
Design Principles for Sparse Matrix Multiplication on the GPU
Design Space Exploration for GPU-Based Architecture
Design Space Exploration of an OpenCL Based SAXPY Kernel Implementation on FPGAs
Design Space Exploration of Concurrency Mapping to FPGAs in Weather and Climate Applications with Xilinx SDSoC OpenCL, SDSoC C++ and Vivad
Design Space Exploration of OpenCL Applications on Heterogeneous Parallel Platforms
Design Space Exploration of Real-time Bedside and Portable Medical Ultrasound Adaptive Beamformer Acceleration
Design space exploration towards a realtime and energy-aware GPGPU-based analysis of biosensor data
Design Tools for Accelerating Development and Usage of Multi-Core Computing Platforms
Design, Implementation and Performance Evaluation of a Stochastic Gradient Descent Algorithm on CUDA
Design, Implementation and Test of Efficient GPU to GPU Communication Methods
Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs
Designing a high-performance boundary element library with OpenCL and Numba
Designing a Modern Skeleton Programming Framework for Parallel and Heterogeneous Systems
Designing a Unified Programming Model for Heterogeneous Machines
Designing and optimizing compute kernels on NVIDIA GPUs
Designing Bit-Reproducible Portable High-Performance Applications
Designing Efficient Barriers and Semaphores for Graphics Processing Units
Designing Efficient Many-Core Parallel Algorithms for All-Pairs Shortest-Paths Using CUDA
Designing Efficient MPI and UPC Runtime for Multicore Clusters with InfiniBand, Accelerators and Co-Processors
Designing efficient sorting algorithms for manycore GPUs
Designing Fast Architecture Sensitive Tree Search on Modern Multi-Core/Many-Core Processors
Designing Fast LTL Model Checking Algorithms for Many-Core GPUs
Designing Numerical Solvers for Next Generation High Performance Computing
Designing OP2 for GPU architectures
Designing scalable many-core parallel algorithms for min graphs using CUDA
Designing Scientific Applications on GPUs
Designing the Language Liszt for Building Portable Mesh-based PDE Solvers
Detecting Computer Viruses using GPUs
Detecting Data Races on OpenCL Kernels with Symbolic Execution
Detecting multiple periodicities in observational data with the multi-frequency periodogram. II. Frequency Decomposer, a parallelized time-series analysis algorithm
Detecting parametric objects in large scenes by Monte Carlo sampling
Detection of a faint fast-moving near-Earth asteroid using synthetic tracking technique
Detection of collisions and self-collisions using image-space techniques
Detection of retransmissions in 10G Ethernet using GPUs
Determinant Computation on the GPU using the Condensation Method
Determining the difficulty of accelerating problems on a GPU
Deterministic Sample Sort For GPUs
Developing a compiler for the XeonPhi
Developing a CUDA solver for large sparse matrices for MARIN
Developing a High Performance GPGPU Compiler Using Cetus
Developing a High Performance Software Library with MPI and CUDA for Matrix Computations
Developing a massive real-time crowd simulation framework on the GPU
Developing a New Storage Format and a Warp-Based SpMV Kernel for Configuration Interaction Sparse Matrices on the GPU
Titles: 100
open PDFs: 89
packages: 10