Papers on hgpu.org (.txt-file)
A high performance computing framework for physics-based modeling and simulation of military ground vehicles
A High Performance Framework for Coupled Urban Microclimate Models
A High Performance Image Authentication Algorithm on GPU with CUDA
A High Performance Massively Parallel Approach for Real Time Deformable Body Physics Simulation
A High Performance Parallel FDTD Method Enhanced By Using SSE Instruction Set
A High Performance Parallel Sparse Linear Equation Solver Using CUDA
A High Performance Random Number Generator Using Heterogeneous Computing Platform
A High Quality Reflectance Model in Medical Image Visualization
A High-efficiency FPGA-based Accelerator for Convolutional Neural Networks using Winograd Algorithm
A High-Performance Brownian Bridge for GPUs: Lessons for Bandwidth Bound Applications
A High-Performance Computing Cluster for Distributed Deep Learning: A Practical Case of Weed Classification Using Convolutional Neural Network Models
A high-performance fault-tolerant software framework for memory on commodity GPUs
A High-Performance Parallel FDTD Method Enhanced by Using SSE Instruction Set
A High-productivity Framework for Multi-GPU computation of Mesh-based applications
A High-resolution approach for Tsunami impact simulation on graphics processing units
A high-speed multi-GPU implementation of bottom-up attention using CUDA
A high-throughput screening approach to discovering good forms of biologically inspired visual representation
A Highly Efficient Distributed Deep Learning System For Automatic Speech Recognition
A Highly Efficient GPU-CPU Hybrid Parallel Implementation of Sparse LU Factorization
A Highly Extensible Framework for Molecule Dynamic Simulation on GPUs
A Highly Parallel Reuse Distance Analysis Algorithm on GPUs
A Highly Parameterizable Framework for Conditional Restricted Boltzmann Machine Based Workloads Accelerated With FPGAs and OpenCL
A Highly Scalable Solution of an NP-Complete Problem Using CUDA
A Highly-Efficient Memory-Compression Scheme for GPU-Accelerated Intrusion Detection Systems
A hybrid algorithm for parallel molecular dynamics simulations
A Hybrid Analytical DRAM Performance Model
A Hybrid Approach to Parallel Connected Component Labeling Using CUDA
A Hybrid Circular Queue Method for Iterative Stencil Computations on GPUs
A Hybrid Computational Grid Architecture for Comparative Genomics
A Hybrid Computing Platform Digital Wideband Receiver Design and Performance Measurement
A hybrid condensed finite element model with GPU acceleration for interactive 3D soft tissue cutting
A hybrid CPU-GPU parallelization scheme of variable neighborhood search for inventory optimization problems
A Hybrid CPU/GPU Cluster for Encryption and Decryption of Large Amounts of Data
A Hybrid CPU/GPU Pattern-Matching Algorithm for Deep Packet Inspection
A Hybrid Framework for Fast and Accurate GPU Performance Estimation through Source-Level Analysis and Trace-Based Simulation
A Hybrid GPU-FPGA-based Computing Platform for Machine Learning
A Hybrid GPU/CPU FFT Library for Large FFT Problems
A hybrid Hermitian general eigenvalue solver
A Hybrid Method for Computing Apparent Ridges
A Hybrid Multi-GPU Implementation of Simplex Algorithm with CPU Collaboration
A Hybrid Parallel Algorithm for Computing and Tracking Level Set Topology
A hybrid parallel framework for computational solid mechanics
A Hybrid Parallel Implementation of the Aho-Corasick and Wu-Manber Algorithms Using NVIDIA CUDA and MPI Evaluated on a Biological Sequence Database
A Hybrid Parallelization Approach for Distributed and Scalable Deep Learning
A Hybrid Programming Model for Compressible Gas Dynamics Using OpenCL
A Hybrid Software Framework for the GPU Acceleration of Multi-Threaded Monte Carlo Applications
A Hybrid-parallel Architecture for Applications in Bioinformatics
A Hyperelastic Finite-Element Model of Human Skin for Interactive Real-Time Surgical Simulation
A journey from single-GPU to optimized multi-GPU SPH with CUDA
A Kinetic Vlasov Model for Plasma Simulation Using Discontinuous Galerkin Method on Many-Core Architectures
A Language for Describing Optimization Strategies
A Language for Nested Data Parallel Design-space Exploration on GPUs
A Lattice Boltzmann Method Simulator for Microfluidics on GPU Cluster
A Lattice-Preserving Multigrid Method for Solving the Inhomogeneous Poisson Equations Used in Image Analysis
A Light-weight API for Portable Multicore Programming
A Light-Weight Approach to Dynamical Runtime Linking Supporting Heterogenous, Parallel, and Reconfigurable Architectures
A lighting model for fast rendering of forest ecosystems
A Lightweight Approach to Performance Portability with targetDP
A Lightweight, GPU-Based Software RAID System
A Linear Algebra Approach to Fast DNA Mixture Analysis Using GPUs
A linguistic approach to concurrent, distributed, and adaptive programming across heterogeneous platforms
A load balance multi-scheduling model for OpenCL kernel tasks in an integrated cluster
A local diffusion wavelet approach for scattered data registration based on GPU
A Locality-Aware Memory Hierarchy for Energy-Efficient GPU Architectures
A low-cost 3D human interface device using GPU-based optical flow algorithms
A Low-Cost Solution For Excavator Simulation With Realistic Visual Effect
A low-power handheld GPU using logarithmic arithmetic and triple DVFS power domains
A Low-Power Hybrid CPU-GPU Sort
A low-power integrated x86-64 and graphics processor for mobile computing devices
A Machine-Learning Framework for Design for Manufacturability
A Many Threaded CUDA Interpreter for Genetic Programming
A Many-core Machine Model for Designing Algorithms with Minimum Parallelism Overheads
A map reduce framework for programming graphics processors
A Map-Reduce-Like System for Programming and Optimizing Data-Intensive Computations on Emerging Parallel Architectures
A mapping path for multi-GPGPU accelerated computers from a portable high level programming abstraction
A MapReduce Framework for Heterogeneous Computing Architectures
A Markovian event-based framework for stochastic spiking neural networks
A Massive Data Parallel Computational Framework on Petascale/Exascale Hybrid Computer Systems
A massively multicore parallelization of the Kohn-Sham energy gradients
A Massively Parallel Adaptive Fast Multipole Method on Heterogeneous Architectures
A massively parallel adaptive fast-multipole method on heterogeneous architectures
A Massively Parallel Algorithm for Cell Classification Using CUDA
A massively parallel algorithm for constructing the BWT of large string sets
A Massively Parallel Approach for Nonlinear Interdependency Analysis of Multivariate Signals with GPGPU
A Massively Parallel Architecture for Bioinformatics
A Massively Parallel Associative Memory Based on Sparse Neural Networks
A massively parallel framework using P systems and GPUs
A massively parallel implementation of QC-LDPC decoder on GPU
A massively parallel program to solve the phase field formulation for crack propagation
A master-slave robotic simulator based on GPUDirect
A matrix approach to tomographic reconstruction and its implementation on GPUs
A mechanism for balancing accuracy and scope in cross-machine black-box GPU performance modeling
A memory access model for highly-threaded many-core architectures
A Memory Bandwidth-Efficient Hybrid Radix Sort on GPUs
A Memory Centric Kernel Framework for Accelerating Short-Range, Interactive Particle Simulation
A Memory Efficient Algorithm for Adaptive Multidimensional Integration with Multiple GPUs
A Memory Efficient and Fast Sparse Matrix Vector Product on a GPU
A Memory Model for Scientific Algorithms on Graphics Processors
Titles: 100
open PDFs: 87
packages: 10