Papers on hgpu.org (.txt-file)
Evaluation of autoparallelization toolkits for commodity graphics hardware
Evaluation of computational and energy performance in matrix multiplication algorithms on CPU and GPU using MKL, cuBLAS and SYCL
Evaluation of DGEMM Implementation on Intel Xeon Phi Coprocessor
Evaluation of disconnected quark loops for hadron structure using GPUs
Evaluation of Fermi Features for Data Mining Algorithms
Evaluation of FPGA-based high performance computing platforms
Evaluation of GPU Architectures Using Spiking Neural Networks
Evaluation of GPU-based track-triggering for the CMS detector at CERN’s HL-LHC
Evaluation of Intel’s DPC++ Compatibility Tool in heterogeneous computing
Evaluation of Libraries for Parallel Computing in Haskell – A Case Study with a Super-resolution Application
Evaluation of likelihood functions on CPU and GPU devices
Evaluation of Machine Learning Frameworks on Finis Terrae II
Evaluation of Multi-Threading in Vulkan
Evaluation of OpenAI Codex for HPC Parallel Programming Models Kernel Generation
Evaluation of P-Scheme/G Algorithm for Solving Recurrence Equations
Evaluation of parallel particle swarm optimization algorithms within the CUDA architecture
Evaluation of Pseudo-Random Number Generation on GPU Cards
Evaluation of Rust for GPGPU high-performance computing
Evaluation of Speedup of Monte Carlo Calculations of Two Simple Reactor Physics Problems Coded for the GPU/CUDA Environment
Evaluation of Standardized Password-based Key Derivation against Parallel Processing Platforms
Evaluation of state-of-the-art polyhedral tools for automatic code generation on GPUs
Evaluation of streaming aggregation on parallel hardware architectures
Evaluation of the Intel Xeon Phi and NVIDIA K80 as accelerators for two-dimensional panel codes
Evaluation of the Stability and Performance of a Multi-Stage Riemann Solver in Relativistic Hydrodynamic Simulations
Evaluation of Two Parallel Finite Element Implementations of the Time-Dependent Advection Diffusion Problem: GPU versus Cluster Considering Time and Energy Consumption
Evenly Spaced Streamlines for Surfaces: An Image-Based Approach
Event-Based OpenMP Tasks for Time-Sensitive GPU-Accelerated Systems
Event-driven gate-level simulation with GP-GPUs
Evolution of a double-front Rayleigh-Taylor system using a GPU-based high resolution thermal Lattice-Boltzmann model
Evolution of image filters on graphics processor units using Cartesian Genetic Programming
Evolution of thread-level parallelism in desktop applications
Evolutionary Algorithm for Optimizing Parameters of GPGPU-based Image Segmentation
Evolutionary Clustering on CUDA
Evolutionary Computing on Consumer-Level Graphics Hardware
Evolutionary Quantum Logic Synthesis of Boolean Reversible Logic Circuits Embedded in Ternary Quantum Space using Heuristics
Evolutionary Simulation of Life Using CUDA
Evolving a CUDA kernel from an nVidia template
Evolving CUDA PTX programs by quantum inspired linear genetic programming
Evolving GeneChip correlation predictors on parallel graphics hardware
Evolving gzip matches Kernel from an nVidia CUDA Template
Evolving Neural Networks on GPUs
Evolving Soft Robotic Locomotion in PhysX
EvoTorch: Scalable Evolutionary Computation in Python
EXA2PRO: A Framework for High Development Productivity on Heterogeneous Computing Systems
Exact and complete short read alignment to microbial genomes using GPU programming
Exact and complete short-read alignment to microbial genomes using Graphics Processing Unit programming
Exact calculation of disconnected loops
Exact diagonalization of quantum lattice models on coprocessors
Exact diagonalization of the Hubbard model on graphics processing units
Exact Selectivity Computation for Modern In-Memory Database Query Optimization
Exact Sparse Matrix-Vector Multiplication on GPU’s and Multicore Architectures
Exact Symbolic-Numeric Computation of Planar Algebraic Curves
Examining the Analytic Structure of Green’s Functions: Massive Parallel Complex Integration using GPUs
Example-based volume illustrations
ExaNBody: a HPC framework for N-Body applications
Exascale Deep Learning for Climate Analytics
Exascale Deep Learning for Scientific Inverse Problems
Executing Dynamic Data Rate Actor Networks on OpenCL Platforms
Executing Process Networks on Heterogeneous Platforms using OpenCL
Execution of Compound Multi-Kernel OpenCL Computations in Multi-CPU/Multi-GPU Environments
Exercising high-level parallel programming on streams: a systems biology use case
EXOCHI: architecture and programming environment for a heterogeneous multi-core multithreaded system
Expanding the boundaries of GPU computing
Expanding the VPE-qGM Environment Towards a Parallel Quantum Simulation of Quantum Processes Using GPUs
Expansion Techniques for Collisionless Stellar Dynamical Simulations
Experience Applying Fortran GPU Compilers to Numerical Weather Prediction
Experience Migrating OpenCL to SYCL: A Case Study on Searches for Potential Off-Target Sites of Cas9 RNA-Guided Endonucleases on AMD GPUs
Experience of Migrating a Parallel Graph Coloring Program from CUDA to SYCL
Experience of parallelizing cryo-EM 3D reconstruction on a CPU-GPU heterogeneous system
Experience Report: Writing A Portable GPU Runtime with OpenMP 5.1
Experience with Intel’s Many Integrated Core architecture in ATLAS software
Experiences Building an MLIR-based SYCL Compiler
Experiences Developing the OpenUH Compiler and Runtime Infrastructure
Experiences in Building a Composable and Functional API for Runtime SPIR-V Code Generation
Experiences in Data-Parallel Simulation and Analysis of Complex Systems with Irregular Graph Structures
Experiences in Speeding Up Computer Vision Applications on Mobile Computing Platforms
Experiences in Teaching a Specialty Multicore Computing Course
Experiences Migrating CUDA to SYCL: A Molecular Docking Case Study
Experiences Porting a Molecular Dynamics Code to GPUs on a Cray XK7
Experiences with Achieving Portability across Heterogeneous Architectures
Experiences with Cell-BE and GPU for Tomography
Experiences with High-Level Programming Directives for Porting Applications to GPUs
Experiences with hybrid clusters
Experiences with implementing Kokkos’ SYCL backend
Experiences with Mapping Non-linear Memory Access Patterns into GPUs
Experimental Evaluation of Multiprecision Strategies for GMRES on GPUs
Experimental Evaluation of Thread Distribution Effects on Multiple Output Errors in GPUs
Experimental Fault-Tolerant Synchronization for Reliable Computation on Graphics Processors
Experimentation Procedure for Offloaded Mini-Apps Executed on Cluster Architectures with Xeon Phi Accelerators
Experiments on Parallel Training of Deep Neural Network using Model Averaging
Experiments with Massively Parallel Matrix Multiplication
Experiments with Single Core, Multi-core, and GPU Based Computation of Cellular Automata
Explainable Deep Behavioral Sequence Clustering for Transaction Fraud Detection
Explicit Cache Management for Volume Ray-Casting on Parallel Architectures
Explicit caching HYB: a new high-performance SpMV framework on GPGPU
Explicit Control of Vector Field Based Shape Deformations
Explicit Fourth-Order Runge-Kutta Method on Intel Xeon Phi Coprocessor
Explicit Integration with GPU Acceleration for Large Kinetic Networks
Explicit platform descriptions for heterogeneous many-core architectures
Titles: 100
open PDFs: 95
packages: 20