Papers on hgpu.org (.txt-file)

Design and implementation of the Smith-Waterman algorithm on the CUDA-compatible GPU

Design and Modeling of a Non-blocking Checkpointing System

Design and optimization of a portable LQCD Monte Carlo code using OpenACC

Design and optimization of DBSCAN Algorithm based on CUDA

Design and Optimization of Hybrid MD5-Blowfish Encryption on GPUs

Design and Optimization of Image Processing Algorithms on Mobile GPU

Design and Optimization of OpenFOAM-based CFD Applications for Hybrid and Heterogeneous HPC Platforms

Design and Optimization of OpenFOAM-based CFD Applications for Modern Hybrid and Heterogeneous HPC Platforms

Design and Performance Analysis of Parallel Processing of SRTP Packets

Design and performance evaluation of a digital wideband receiver on a hybrid computing platform

Design and Performance Evaluation of a Software Framework for Multi-Physics Simulations on Heterogeneous Supercomputers

Design and Performance Evaluation of Image Processing Algorithms on GPUs

Design and Performance Evaluation of Optimizations for OpenCL FPGA Kernels

Design and Performance of the OP2 Library for Unstructured Mesh Applications

Design and Storage Optimization of GPU-based Parallel Program of Image Registration for Remote Sensing

Design and study of a massively multi threaded shared memory architecture

Design Exploration of AES Accelerators on FPGAs and GPUs

Design Exploration of Quadrature Methods in Option Pricing

Design of 3D FFT on Multi-GPU Clusters

Design of a fully programmable shader processor for low power mobile devices

Design of a Hybrid Memory System for General-Purpose Graphics Processing Units

Design of a parallel AES for graphics hardware using the CUDA framework

Design of a programmable micro-ultrasound research platform

Design of an FPGA-Based FDTD Accelerator Using OpenCL

Design of FPGA-Based Accelerator for Convolutional Neural Network under Heterogeneous Computing Framework with OpenCL

Design of Hardware Accelerator for Lempel-Ziv 4 (LZ4) Compression

Design of high-performance parallelized gene predictors in MATLAB

Design of MILC Lattice QCD Application for GPU Clusters

Design Principles for Sparse Matrix Multiplication on the GPU

Design Space Exploration for GPU-Based Architecture

Design Space Exploration of an OpenCL Based SAXPY Kernel Implementation on FPGAs

Design Space Exploration of Concurrency Mapping to FPGAs in Weather and Climate Applications with Xilinx SDSoC OpenCL, SDSoC C++ and Vivad

Design Space Exploration of OpenCL Applications on Heterogeneous Parallel Platforms

Design Space Exploration of Real-time Bedside and Portable Medical Ultrasound Adaptive Beamformer Acceleration

Design space exploration towards a realtime and energy-aware GPGPU-based analysis of biosensor data

Design Tools for Accelerating Development and Usage of Multi-Core Computing Platforms

Design, Implementation and Performance Evaluation of a Stochastic Gradient Descent Algorithm on CUDA

Design, Implementation and Test of Efficient GPU to GPU Communication Methods

Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs

Designing a high-performance boundary element library with OpenCL and Numba

Designing a Modern Skeleton Programming Framework for Parallel and Heterogeneous Systems

Designing a Unified Programming Model for Heterogeneous Machines

Designing and optimizing compute kernels on NVIDIA GPUs

Designing Bit-Reproducible Portable High-Performance Applications

Designing Efficient Barriers and Semaphores for Graphics Processing Units

Designing Efficient Many-Core Parallel Algorithms for All-Pairs Shortest-Paths Using CUDA

Designing Efficient MPI and UPC Runtime for Multicore Clusters with InfiniBand, Accelerators and Co-Processors

Designing efficient sorting algorithms for manycore GPUs

Designing Fast Architecture Sensitive Tree Search on Modern Multi-Core/Many-Core Processors

Designing Fast LTL Model Checking Algorithms for Many-Core GPUs

Designing Numerical Solvers for Next Generation High Performance Computing

Designing OP2 for GPU architectures

Designing scalable many-core parallel algorithms for min graphs using CUDA

Designing Scientific Applications on GPUs

Designing the Language Liszt for Building Portable Mesh-based PDE Solvers

Detecting Computer Viruses using GPUs

Detecting Data Races on OpenCL Kernels with Symbolic Execution

Detecting multiple periodicities in observational data with the multi-frequency periodogram. II. Frequency Decomposer, a parallelized time-series analysis algorithm

Detecting parametric objects in large scenes by Monte Carlo sampling

Detection of a faint fast-moving near-Earth asteroid using synthetic tracking technique

Detection of collisions and self-collisions using image-space techniques

Detection of retransmissions in 10G Ethernet using GPUs

Determinant Computation on the GPU using the Condensation Method

Determining the difficulty of accelerating problems on a GPU

Deterministic Sample Sort For GPUs

Developing a compiler for the XeonPhi

Developing a CUDA solver for large sparse matrices for MARIN

Developing a High Performance GPGPU Compiler Using Cetus

Developing a High Performance Software Library with MPI and CUDA for Matrix Computations

Developing a massive real-time crowd simulation framework on the GPU

Developing a New Storage Format and a Warp-Based SpMV Kernel for Configuration Interaction Sparse Matrices on the GPU

Developing acquisition systems based on FPGA with OpenCL

Developing an OO Model for Generalized Matrix Multiplication: Preliminary Considerations

Developing and Deploying Advanced Algorithms to Novel Supercomputing Hardware

Developing and Evaluating clOpenCL Applications for Heterogeneous Clusters

Developing Extensible Lattice-Boltzmann Simulators for General-Purpose Graphics-Processing Units

Developing Performance-Portable Molecular Dynamics Kernels in OpenCL

Development and evaluation of a GPU-optimized N-body term for the simulation of biomolecules

Development and evaluation of scalable video motion estimators on GPU

Development methodologies for GPU and cluster of GPUs

Development of a Chemically Reacting Flow Solver on the Graphic Processing Units

Development of a CUDA Implementation of the 3D FDTD Method

Development of a Flow Solver with Complex Kinetics on the Graphic Processing Units

Development of a GPU based two-way time transfer modem

Development of a GPU-accelerated MIKE 21 Solver for Water Wave Dynamics

Development of a GPU-based Monte Carlo dose calculation code for coupled electron-photon transport

Development of a GPU-based multithreaded software application to calculate digitally reconstructed radiographs for radiotherapy

Development of a new framework for high performance volunteer computing

Development of a Restricted Additive Schwarz Preconditioner for Sparse Linear Systems on NVIDIA GPU

Development of a volume rendering system using 3D texture compression techniques on general-purpose personal computers

Development of an Algorithm for Extracting Parallelism and Pipeline Structure from Stream-based Processing flow with Spanning Tree

Development of an explicit pressure-based unstructured solver for three-dimensional incompressible flows with graphics hardware acceleration

Development of an unified FDTD-FEM library for electromagnetic analysis with CPU and GPU computing

Development of Bayesian analysis program for extraction of polarisation observables at CLAS

Development of Generic Scheduling Concepts for OpenGL ES 2.0

Development of High-Performance Software Components for Emerging Architectures

Development of JavaScript-based deep learning platform and application to distributed training

Titles: 100
open PDFs: 90
packages: 11
