Papers on hgpu.org (.txt-file)
Comparison and Analysis of GPU Energy Efficiency For CUDA and OpenCL
Comparison and Analysis of GPU Energy Efficiency For CUDA and OpenCL
Comparison based sorting for systems with multiple GPUs
Comparison between GPU and parallel CPU optimizations in viewshed analysis
Comparison of Cilk, Kaapi and CUDA for the Jacobi Method
Comparison of CPML Implementations for the GPU-Accelerated FDTD Solver
Comparison of different n-body algorithms on various hardware platforms using SYCL
Comparison of Different Parallel Implementations of the 2+1-Dimensional KPZ Model and the 3-Dimensional KMC Model
Comparison of FPGA and GPU implementations of real-time stereo vision
Comparison of Fragmentation/Dispersion Models for Asteroid Nuclear Disruption Mission Design
Comparison of GPU Architectures for Asynchronous Communication with Finite-Differencing Applications
Comparison of HPC Architectures for Computing All-Pairs Shortest Paths. Intel Xeon Phi KNL vs NVIDIA Pascal
Comparison of Hybrid Sorting Algorithms Implemented on Different Parallel Hardware Platforms
Comparison of OpenCL performance on different platforms using VexCL and Blaze
Comparison of OpenMP & OpenCL Parallel Processing Technologies
Comparison of OpenMP and OpenCL Parallel Processing Technologies
Comparison of parallel sorting algorithms
Comparison of Parallelisation Approaches, Languages, and Compilers for Unstructured Mesh Algorithms on GPUs
Comparison of Random Number Generators in Particle Swarm Optimization Algorithm
Comparison of Rectangular Matrix Multiplication with and without Border Conditions
Comparison of several parallel API for cloth modelling on modern GPUs
Comparison of SPMV performance on matrices with different matrix format using CUSP, cuSPARSE and ViennaCL
Comparison of Technologies for General-Purpose Computing on Graphics Processing Units
Comparison of Thread Execution Methods for GPU-oriented OpenCL Programs on Multicore Processors
COMPASS: a programmable data prefetcher using idle GPU shaders
Compensated Visual Hull for Defective Segmentation and Occlusion
Compensated Visual Hull with GPU-Based Optimization
Compensating Indirect Scattering for Immersive and Semi-Immersive Projection Displays
Competing computational approaches to reaction-diffusion equations in clusters of cells
Compilation and Design Space Exploration of Dataflow Programs for Heterogeneous CPU-GPU Platforms
Compilation for Heterogeneous Computing: Automating Analyses, Transformations and Decisions
Compilation techniques and language support to facilitate dependence-driven computation
Compile-time GPU memory access optimizations
Compiler and runtime support for enabling generalized reduction computations on heterogeneous parallel configurations
Compiler and runtime techniques for bulk-synchronous programming models on CPU architectures
Compiler Assisted Runtime Adaptation
Compiler Fuzzing through Deep Learning
Compiler optimizations for directive-based programming for accelerators
Compiler Optimizations for Industrial Unstructured Mesh CFD Applications on GPUs
Compiler Optimizations for SIMD/GPU/Multicore Architectures
Compiler support for general-purpose computation on GPUs
Compiler Support for High-level GPU Programming
Compiler Technologies in Deep Learning Co-Design: A Survey
Compiler-assisted distribution of OpenMP code for improved scalability
Compiler-Assisted Workload Consolidation For Efficient Dynamic Parallelism on GPU
Compiler-based Data Prefetching and Streaming Non-temporal Store Generation for the Intel Xeon Phi Coprocessor
Compiler-Based Tools to Aid in Data Transfer Optimization and On-Chip Debug of Heterogeneous Compute Systems
Compiler-centric across-stack deep learning acceleration
Compiler-directed memory management for heterogeneous MPSoCs
Compiler-Driven Performance on Heterogeneous Computing Platforms
Compiler-Level Explicit Cache for a GPGPU Programming Framework
CompilerGym: Robust, Performant Compiler Optimization Environments for AI Research
Compilers for Portable Programming of Heterogeneous Parallel & Approximate Computing Systems
Compiling a High-level Directive-Based Programming Model for GPGPUs
Compiling a high-level language for GPUs: (via language support for architectures and compilers)
Compiling an Array Language to a Graphics Processor
Compiling and Optimizing Java 8 Programs for GPU Execution
Compiling and Optimizing OpenMP 4.X Programs to OpenCL and SPIR
Compiling for a heterogeneous vector image processor
Compiling High Performance Recursive Filters
Compiling Parallel Functional Code with Data Parallel Idealised Algol
Compiling Python to a hybrid execution environment
Compiling Stream Applications for Heterogeneous Architectures
Complete PISO and SIMPLE solvers on Graphics Processing Units
Complexity Analysis and Algorithm Design for Reorganizing Data to Minimize Non-Coalesced Memory Accesses on GPU
Complexity effective memory access scheduling for many-core accelerator architectures
Composability of parallel codes on heterogeneous architectures
Composing Distributed Computations Through Task and Kernel Fusion
Composing multiple StarPU applications over heterogeneous machines: a supervised approach
Composition and Reuse with Compiled Domain-Specific Languages
Compositional Compilation for Sparse, Irregular Data Parallelism
Compositional Deep Learning in Futhark
Compound Word Transformer: Learning to Compose Full-Song Music over Dynamic Directed Hypergraphs
Compoundly weighted Voronoi: a sequential and parallel implementation
Comprehensive Analysis of High-Performance Computing Methods for Filtered Back-Projection
Comprehensive Evaluation of OpenCL-based Convolutional Neural Network Accelerators in Xilinx and Altera FPGAs
Comprehensive Evaluations of Cone-beam CT dose in Image-guided Radiation Therapy via GPU-based Monte Carlo simulations
Comprehensive Optimization of Parametric Kernels for Graphics Processing Units
Comprehensive Performance Monitoring for GPU Cluster Systems
Compressed Dynamic Mode Decomposition for Real-Time Object Detection
Compressed Facade Displacement Maps
Compressed Learning of Deep Neural Networks for OpenCL-Capable Embedded Systems
Compressed Multiple-Row Storage Format
Compressed Real Numbers for AI: a case-study using a RISC-V CPU
Compressed sensing using hidden Markov models with application to vision based aircraft tracking
Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks
Compressing Floating-Point Number Stream for Numerical Applications
Compression Domain Volume Rendering
Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications
Compressive Phase Contrast Tomography
Computation of Air-Vortices Based on GPU Technology: Optimizing and Parallelizing a Model for Wake-Vortex Prediction Using OpenCL
Computation of electron quantum transport in graphene nanoribbons using GPU
Computation of Galois field expressions for quaternary logic functions on GPUs
Computation of gray-level co-occurrence matrix based on CUDA and its optimization
Computation of Large Covariance Matrices by SAMMY on Graphical Processing Units and Multicore CPUs
Computation of the Isogeometric Analysis Stiffness Matrix on GPU
Computation of the Spatial Impulse Response for Ultrasonic Fields on the Graphics Processing Units (GPU)
Computation of Troposphere Slant Delays on a GPU
Computation of Voronoi diagrams using a graphics processing unit
Computation on GPU of Eigenvalues and Eigenvectors of a Large Number of Small Hermitian Matrices
Titles: 100
open PDFs: 94
packages: 16