Papers on hgpu.org (.txt-file)
A Comparative Study of Neighborhood Filters for Artifact Reduction in Iterative Low-Dose CT
A Comparative Study of OpenACC Implementations
A Comparative Study of Parallel Algorithms for the Girth Problem
A Comparative Study on ASIC, FPGAs, GPUs and General Purpose Processors in the O(N^2) Gravitational N-body Simulation
A Comparative Study on Exact Triangle Counting Algorithms on the GPU
A Comparison between GPU-based Volume Ray Casting Implementations: Fragment Shader, Compute Shader, OpenCL, and CUDA
A comparison between parallelization approaches in molecular dynamics simulations on GPUs
A Comparison of Algebraic Multigrid Preconditioners using Graphics Processing Units and Multi-Core Central Processing Units
A comparison of CPU and GPU performance for Fourier pseudospectral simulations of the Navier-Stokes, Cubic Nonlinear Schrodinger and Sine Gordon Equations
A Comparison of CPU and OpenCL Parallelization Methods for Correlation and Graph Layout Algorithms used in the Network Analysis of High Dimensional Data
A comparison of CPUs, GPUs, FPGAs, and massively parallel processor arrays for random number generation
A Comparison of GPU Execution Time Prediction using Machine Learning and Analytical Modeling
A Comparison of Gradient Estimation Methods for Volume Rendering on Unstructured Meshes
A Comparison of High-Level Design Tools for SoC-FPGA on Disparity Map Calculation Example
A Comparison of Many-threaded Differential Evolution and Genetic Algorithms on CUDA
A Comparison of Massively Parallel Programming Models Through Applications in Sound Propagation and Jitter Measurement
A Comparison of Modern GPU and CPU Architectures: And the Common Convergence of Both
A Comparison of OpenCL, CUDA, and HIP as Compilation Targets for a Functional Array Language
A Comparison of Optimal Scanline Voxelization Algorithms
A comparison of period finding algorithms
A Comparison of Potential Interfaces for Batched BLAS Computations
A Comparison of Sequential and GPU Implementations of Iterative Methods to Compute Reachability Probabilities
A Comparison of Serial & Parallel Particle Filters for Time Series Analysis
A Comparison of Statistical Techniques for Detecting Side-Channel Information Leakage in Cryptographic Devices
A Comparison of Support Vector Machines Training GPU-Accelerated Open Source Implementations
A Comparison of the performance of HPC Accelerators
A Comparison of the Performance of the Molecular Dynamics Simulation Package GROMACS Implemented in the SYCL and CUDA Programming Models
A Comparison of Two Methods for Geometric Milling Simulation Accelerated by GPU
A Comparison of xPU Platforms Exemplified with Ray Tracing Algorithms
A Compile-Time Managed Multi-Level Register File Hierarchy
A Compiler and Runtime for Heterogeneous Computing
A compiler for high performance computing with many-core accelerators
A Compiler for Throughput Optimization of Graph Algorithms on GPUs
A compiler framework for optimization of affine loop nests for gpgpus
A Compiler Framework for Optimizing Dynamic Parallelism on GPUs
A Compiler Infrastructure for Accelerator Generators
A Compiler Infrastructure for Embedded Multicore SoCs
A compiler toolkit for array-based languages targeting CPU/GPU hybrid systems
A Complete and Efficient CUDA-Sharing Solution for HPC Clusters
A Complete Description of the UnPython and Jit4GPU Framework
A complete modular resultant algorithm targeted for realization on graphics hardware
A comprehensive analysis and parallelization of an image retrieval algorithm
A Comprehensive Benchmark of Deep Learning Libraries on Mobile Devices
A Comprehensive Deep Learning Library Benchmark and Optimal Library Selection
A Comprehensive Performance Analysis of HSA and OpenCL 2.0
A Comprehensive Performance Comparison of CUDA and OpenCL
A comprehensive study of Dynamic Memory Management in OpenCL kernels
A Comprehensive Survey on Various Evolutionary Algorithms on GPU
A Computational Comparison of Basis Updating Schemes for the Simplex Algorithm on a CPU-GPU System
A Computational Model of Afterimages
A Computational Realization of a Semi-Lagrangian Method for Solving the Advection Equation
A computationally efficient and scalable approach for privacy preserving kNN classification
A Computationally Efficient Approach for Exemplar-based Color Image Inpainting using GPU
A Computationally Efficient Parallel Kernel Regression for Image Reconstruction
A Compute Unified System Architecture for Graphics Clusters Incorporating Data Locality
A Computing Kernel for Network Binarization on PyTorch
A computing origami: Optimized code generation for emerging parallel platforms
A constant-space belief propagation algorithm for stereo matching
A Consumer Application for GPGPUs: Desktop Search
A Container-Based Workflow for Distributed Training of Deep Learning Algorithms in HPC Clusters
A Contour-Guided Deformable Image Registration Algorithm for Adaptive Radiotherapy
A control-structure splitting optimization for GPGPU
A convex formulation for color image segmentation in the context of passive emitter localization
A Convex Relaxation Approach to Space Time Multi-view 3D Reconstruction
A Convolutional Neural Network Cascade for Face Detection
A CPU and GPU Heterogeneous Processing of Multimedia Data by using OpenCL
A CPU-GPU Hybrid Runtime for the Aeminium Language
A Cross-Input Adaptive Framework for GPU Programs Optimization
A Cross-platform Evaluation of Graphics Shader Compiler Optimization
A CUDA Back-End for the Equelle Compiler
A CUDA Based Implementation of an Image Authentication Algorithm
A CUDA based Solution to the Multidimensional Knapsack Problem Using the Ant Colony Optimization
A CUDA Implementation of Independent Component Analysis in the Time-Frequency Domain
A CUDA implementation of the High Performance Conjugate Gradient benchmark
A CUDA Kernel Scheduler Exploiting Static Data Dependencies
A CUDA Monte Carlo simulator for radiation therapy dosimetry based on Geant4
A CUDA SIMT Interpreter for Genetic Programming
A CUDA SIMT interpreter for genetic programming. Revised
A CUDA-Based Cooperative Evolutionary Multi-Swarm Optimization Applied to Engineering Problems
A CUDA-Based Implementation of Stable Fluids in 3D with Internal and Moving Boundaries
A CUDA-based parallel implementation of K-nearest neighbor algorithm
A CUDA-Based Real Parameter Optimization Benchmark
A CUDA-enabled Parallel Implementation of Collaborative Filtering
A curved-element unstructured discontinuous Galerkin method on GPUs for the Euler equations
A Customized 3D GPU Poisson Solver for Free BCs
A Data Communication Scheduler for Stream Programs on CPU-GPU Platform
A Data Parallel Algorithm for Seismic Raytracing
A data parallel approach to genetic programming using programmable graphics hardware
A data parallel view on polyhedral process networks
A Data-Driven Model for Anisotropic Heterogeneous Subsurface Scattering
A Data-oriented Method for Scheduling Dependent Tasks on High-density Multi-GPU Systems
A Data-Parallel Algorithmic Modelica Extension for Efficient Execution on Multi-Core Platforms
A Data-Parallel Extension to Ruby for GPGPU
A Data-Parallel Graphics Pipeline Implemented in OpenCL
A dataflow-like programming model for future hybrid clusters
A declarative API for particle systems
A decompression pipeline for accelerating out-of-core volume rendering of time-varying data
Titles: 100
open PDFs: 92
packages: 21