Papers on hgpu.org (.txt-file)
A tool for mapping Single Nucleotide Polymorphisms using Graphics Processing Units

A tool set for random number generation on GPUs in R

A Toolkit for Building Dynamic Compilers for Array-Based Languages Targeting CPUs and GPUs

A toolkit to describe and interactively display three-manifolds embedded in four-space

A Training Framework and Architectural Design for Distributed Deep Learning

A training roadmap for new HPC users

A training-free nose tip detection method from face range images
A Translation Framework for Executing the Sequential Binary Code on CPU/GPU Based Architectures

A Translation Framework from RVC-CAL Dataflow Programs to OpenCL/SYCL based Implementations

A translation system for enabling data mining applications on GPUs

A translator framework for Dynamic Programming problems

A trigger system based on Graphics Processing Unit (GPU)
A Tuned and Scalable Fast Multipole Method as a Preeminent Algorithm for Exascale Systems

A Tuned, Concurrent-Kernel Approach to Speed Up the APSP Problem

A Tuning Framework for Software-Managed Memory Hierarchies

A tutorial on the implementations of linear image filters in CPU and GPU

A tutorial overview on the properties of the discrete cosine transform for encoded image and video processing

A two-fluid finite-volume solver based on OpenCL

A two-level real-time vision machine combining coarse- and fine-grained parallelism

A two-level simulator for spaceborne SAR
A two-level task scheduler on Multiple DSP system for OpenCL

A Two-stage Query by Singing/Humming System on GPU

A Unified Approach for Registration and Depth in Depth from Defocus

A Unified Approach to Variable Renaming for Enhanced Vectorization

A Unified FPGA Virtualization Framework for General-Purpose Deep Neural Networks in the Cloud

A Unified Framework for Multi-Sensor HDR Video Reconstruction

A Unified Iteration Space Transformation Framework for Sparse and Dense Tensor Algebra

A Unified Optimization Approach for CNN Model Inference on Integrated GPUs

A Unified Optimization Approach for Sparse Tensor Operations on GPUs

A Unified Optimizing Compiler Framework for Different GPGPU Architectures

A Unified Rolling Shutter and Motion Blur Model for 3D Visual Registration

A Unified Runtime System for Heterogeneous Multi-core Architectures

A unified sparse matrix data format for modern processors with wide SIMD units

A Unified, Hardware-Fitted, Cross-GPU Performance Model

A uniform approach for programming distributed heterogeneous computing systems

A Uniform Platform to Support Multigenerational GPUs for High Performance Stream-based Computing

A University-Industry Collaboration Case Study: Intel Real-Time Multi-View Face Detection Capstone Design Projects

A User’s Guide to KSig: GPU-Accelerated Computation of the Signature Kernel

A Validation Testsuite for OpenACC 1.0

A Variant of Concurrent Constraint Programming on GPU

A Variant of Mersenne Twister Suitable for Graphic Processors

A Variant RSA Acceleration with Parallelization

A Variational Model for Interactive Shape Prior Segmentation and Real-Time Tracking

A Versatile Software Systolic Execution Model for GPU Memory-Bound Kernels

A very fast census-based stereo matching implementation on a graphics processing unit

A Very Simple Approach for 3-D to 2-D Mapping

A Video Deblurring Optimization Algorithm Based on Motion Detection

A view-dependent adaptivity metric for real time mesh tessellation

A Virtual Machine Model for Accelerating Relational Database Joins using a General Purpose GPU

A virtual memory based runtime to support multi-tenancy in clusters with GPUs

A visibility-based approach for occupancy grid computation in disparity space

A Vision for GPU-accelerated Parallel Computation on Geo-Spatial Datasets

A Visual Approach to Investigating Shared and Global Memory Behavior of CUDA Kernels

A volume segmentation approach based on GrabCut

A Watermarking Co-Processor for New Generation Graphics Processing Units

A Way For Accelerating The DNA Sequence Reconstruction Problem By CUDA

A work-efficient GPU algorithm for level set segmentation

A Workload Balanced MapReduce Framework on GPU Platforms

A Wrapper of OpenCL library for gVirtus Framework

A Yoke of Oxen and a Thousand Chickens for Heavy Lifting Graph Processing

ab-Stream: A Framework for programming Many-core

ABC-SysBio–approximate Bayesian computation in Python with GPU support

Abelian: A Compiler for Graph Analytics on Distributed, Heterogeneous Platforms

Abstracting OpenCL for Multi-Application Workloads on CPU-FPGA Clusters

Abstraction and Implementation of Unstructured Grid Algorithms on Massively Parallel Heterogeneous Architectures

Abstractions for C++ code optimizations in parallel high-performance applications

Abstractions for Programming Graphics Processors in High-Level Programming Languages

Abundance Estimation Algorithms using NVIDIA CUDA Technology

ACC Saturator: Automatic Kernel Optimization for Directive-Based GPU Code

Accelerate Cache Simulation with Generic GPU
Accelerate Deep Learning Inference with MCTS in the game of Go on the Intel Xeon Phi

Accelerate Local Tone Mapping for High Dynamic Range Images Using OpenCL with GPU

Accelerate micromagnetic simulations with GPU programming in MATLAB

Accelerate Scientific Deep Learning Models on Heterogeneous Computing Platform with FPGA

Accelerate Smoothed Particle Hydrodynamics using GPU
Accelerate video decoding with generic GPU

Accelerated 2D Image Processing on GPUs
Accelerated Approximate Nearest Neighbors Search Through Hierarchical Product Quantization

Accelerated Combinatorial Optimization using Graphics Processing Units and C++ AMP

Accelerated composite distribution function methods for computational fluid dynamics using GPU

Accelerated Computation of Minimum Enclosing Balls by GPU Parallelization and Distance Filtering

Accelerated cone beam CT reconstruction based on OpenCL
Accelerated cryo-EM structure determination with parallelisation using GPUs in relion-2

Accelerated Deep Learning using Intel Xeon Phi

Accelerated Dictionary Learning with GPU/Multicore CPU and Its Application to Music Classification

Accelerated dimension-independent adaptive Metropolis

Accelerated discovery and design of Fe-Co-Zr magnets with tunable magnetic anisotropy through machine learning and parallel computing

Accelerated Dynamic Programming on GPU: A Study of Speed Up and Programming Approach

Accelerated Event-by-Event Neutrino Oscillation Reweighting with Matter Effects on a GPU

Accelerated Flow Visualization of Advective-Diffusive Mixing Processes Using GPUs

Accelerated GPU Powered Methods for Auditing Security of Wireless Networks Using Probabilistic Password Generation

Accelerated GPU Simulation of Compressible Flow by the Discontinuous Evolution Galerkin Method

Accelerated Large-Scale Multiple Sequence Alignment

Accelerated Matrix Element Method with Parallel Computing

Accelerated MD Program Using CUDA Technology

Accelerated molecular dynamics force evaluation on graphics processing units for thermal conductivity calculations

Accelerated multi-view stereo using parallel processing capabilities of the GPUs

Accelerated Network Coding with Dynamic Stream Decomposition on Graphics Processing Unit

Accelerated Neural Networks on OpenCL Devices Using SYCL-DNN

Titles: 100
open PDFs: 92
packages: 20
