Papers on hgpu.org (.txt-file)
An investigation of GPU-based stiff chemical kinetics integration methods
An Investigation of the Performance Portability of OpenCL
An Investigation of Unified Memory Access Performance in CUDA
An MDE Approach for Automatic Code Generation from MARTE to OpenCL
An MPI-Based Python Framework for Distributed Training with Keras
An MPI-CUDA Implementation and Optimization for Parallel Sparse Equations and Least Squares (LSQR)
An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computations on Multi-GPU Clusters
An MPI-CUDA Implementation for the Compression of DEM
An MPI-CUDA implementation of an improved Roe method for two-layer shallow water systems
An N log N Parallel Fast Direct Solver for Kernel Matrices
An octree-based proxy for collision detection in large-scale particle systems
An On-Demand Fast Parallel Pseudo Random Number Generator with Applications
An open framework for rapid prototyping of signal processing applications
An open source finite-difference time-domain solver for room acoustics using graphics processing units
An open source MATLAB program for fast numerical Feynman integral calculations for open quantum system dynamics on GPUs
An Open-source FPGA Library for Data Sorting
An Open-Source GPU-Accelerated Feature Extraction Tool
An OpenCL 3D FFT for Molecular Dynamics Simulations on Multiple FPGAs
An OpenCL design of the Bob Jenkins lookup3 hash function using the Xilinx SDAccel Development Environment
An OpenCL Fast Fourier Transformation
An OpenCL framework for heterogeneous multicores with local memory
An OpenCL implementation for the solution of TDSE on GPU and CPU architectures
An OpenCL implementation of a forward sampling algorithm for CP-logic
An OpenCL Method of Parallel Sorting Algorithms for GPU Architecture
An OpenCL Runtime and Scheduler for Embedded Multicore DSP Parallel Systems
An OpenCL-Based FPGA Accelerator for Faster R-CNN
An OpenCL-based Implementation of H.264 Encoder
An OpenCL-based Monte Carlo dose calculation engine (oclMC) for coupled photon-electron transport
An OpenCL(TM) Deep Learning Accelerator on Arria 10
An OpenMP Programming Environment on Mobile Devices
An optimal k-exclusion real-time locking protocol motivated by multi-GPU systems
An Optimal Offline Permutation Algorithm on the Hierarchical Memory Machine, with the GPU implementation
An optimised multi-baseline approach for on-line MR-temperature monitoring on commodity graphics hardware
An optimised radial basis function algorithm for fast non-rigid registration of medical images
An Optimization for Fast Generation of Digital Hologram
An optimized algorithm for discrete element system analysis using CUDA
An optimized GPU implementation of a 2D free surface simulation model on unstructured meshes
An Optimized GPU Memory Hierarchy Design for an OpenCL Kernel
An Optimized Large-Scale Hybrid DGEMM Design for CPUs and ATI GPUs
An Optimized Multiple Right-Hand Side Dslash Kernel for Intel Xeon Phi
An Optimized Parallel IDCT on Graphics Processing Units
An optimizing multi-platform source-to-source compiler framework for the NEURON MODeling Language
An Out-of-core GPU Approach for Accelerating Geostatistical Interpolation
An Overview of Miscellaneous Applications of GPU Computing
An Overview of Selected Hybrid and Reconfigurable Architectures
An overview of techniques for predicting the performance of GPU accelerated applications
An Overview on the Latest Nature-Inspired and Metaheuristics-Based Image Registration Algorithms
An Ultra-Fast, Optimized and Massively-Parallelized Curvelet Transform Algorithm on GP-GPUs
An Ultrafast Scalable Many-core Motif Discovery Algorithm for Multiple GPUs
An ultrasonic imaging system based on a new SAFT approach and a GPU beamformer
An unsupervised parallel genetic cluster algorithm for graphics processing units
Analysing Astronomy Algorithms for GPUs and Beyond
Analysing the Performance of GPU Hash Tables for State Space Exploration
Analysis & Design of Efficient Cryptographic Systems
Analysis Acceleration in TMVA for the ATLAS Experiment at CERN using GPU Computing
Analysis and Comparison of Performance and Power Consumption of Neural Networks on CPU, GPU, TPU and FPGA
Analysis and implementation of a BLAST-Like algorithm for MIC architectures
Analysis and Implementation of eSTREAM and SHA-3 Cryptographic Algorithms
Analysis and Modeling of the Timing Behavior of GPU Architectures
Analysis and optimization of power consumption in the iterative solution of sparse linear systems on multi-core and many-core platforms
Analysis and Optimization Techniques for Massively Parallel Processors
Analysis and Parameter Prediction of Compiler Transformation for Graphics Processors
Analysis and performance estimation of the conjugate gradient method on multiple GPUs
Analysis and Review of Sorting Algorithms
Analysis of 3-dimensional electromagnetic fields in dispersive media using CUDA
Analysis of a Computational Biology Simulation Technique on Emerging Processing Architectures
Analysis of A Splitting Approach for the Parallel Solution of Linear Systems on GPU Cards
Analysis of Genetic Expression with Microarrays using GPU Implemented Algorithms
Analysis of GPGPU Platforms Efficiency in General-Purpose Computations
Analysis of GPU accelerated OpenCL applications on the Intel HD 4600 GPU
Analysis of GPU Parallel Computing based on Matlab
Analysis of GPU-based convolution for acoustic wave propagation modeling with finite differences: Fortran to CUDA-C step-by-step
Analysis of High Level implementations for Recursive Methods on GPUs
Analysis of illumination conditions at the lunar south pole using parallel computing techniques
Analysis of KECCAK Tree Hashing on GPU Architectures
Analysis of Metallic Nanostructures by a Discontinuous Galerkin Time-Domain Maxwell Solver on Graphics Processing Units
Analysis of Multicore CPU and GPU Toward Parallelization of Total Focusing Method Ultrasound Reconstruction
Analysis of Parallel Montgomery Multiplication in CUDA
Analysis of Parallel Sorting Algorithms on Heterogeneous Processors with OpenCL
Analysis of periodic anisotropic media by means of split-field FDTD method and GPU computing
Analysis of periodic structures with GPU accelerating
Analysis of Real-Time Stereo Vision Algorithms On GPU
Analysis of RSA algorithm using GPU programming
Analysis of Single Phase Fluid Flow and Heat Transfer in Slip Flow Regime by Parallel Implementation of Lattice Boltzmann Method on GPUs
Analysis of SuperLU Solvers on Intel MIC Architecture
Analysis of Surface Folding Patterns of DICCCOLS Using the GPU-Optimized Geodesic Field Estimate
Analysis of the Performance of the Fish School Search Algorithm Running in Graphic Processing Units
Analysis-Driven Design of Parallel Floating-Point Matrix Multiplication for Implementation in Reconfigurable Logic
Analysis-driven Engineering of Comparison-based Sorting Algorithms on GPUs
Analytic Anti-Aliasing of Linear Functions on Polytopes
Analytic Antialiasing for Selective High Fidelity Rendering
Analytic Visibility on the GPU
Analytical motion blur rasterization with compression
Analytical Performance Estimation during Code Generation on Modern GPUs
Analytical Study of Various High Performance Computing Paradigms
Analyzing and Improving the Performance of Spatial Database Processing
Analyzing CUDA workloads using a detailed GPU simulator
Analyzing CUDA’s Compiler through the Visualization of Decoded GPU Binaries
Analyzing GPU Performance in Virtualized Environments: A Case Study
Analyzing GPU Tensor Core Potential for Fast Reductions
Titles: 100
open PDFs: 97
packages: 15