Papers on hgpu.org (.txt-file)
Evaluating Performance Portability of OpenACC

Evaluating Performance Tradeoffs on the Radeon Open Compute Platform

Evaluating polynomials in several variables and their derivatives on a GPU computing processor

Evaluating Reconfigurable Dataflow Computing Using the Himeno Benchmark

Evaluating the Arm Ecosystem for High Performance Computing

Evaluating the capabilities of the Xeon Phi platform in the context of software-only, thread-level speculation

Evaluating the cell broadband engine as a platform to run estimation of distribution algorithms

Evaluating the Efficiency of CPUs, GPUs and FPGAs on a Near-Duplicate Document Detection Via OpenCL

Evaluating the Energy Efficiency of OpenCL-accelerated AutoDock Molecular Docking

Evaluating the impact of reordering unstructured meshes on the performance of finite volume GPU solvers

Evaluating the Performance and Energy Efficiency of N-Body Codes on Multi-Core CPUs and GPUs

Evaluating the Performance and Portability of Contemporary SYCL Implementations

Evaluating the Performance and Portability of OpenCL

Evaluating the Performance Impact of Multiple Streams on the MIC-based Heterogeneous Platform

Evaluating the performance of HPC-style SYCL applications

Evaluating the Performance of Legacy Applications on Emerging Parallel Architectures

Evaluating the Performance of NVIDIA’s A100 Ampere GPU for Sparse Linear Algebra Computations

Evaluating the Performance of Processing Medical Volume Data on Graphics Hardware

Evaluating the Performance of the DeepSeek Model in Confidential Computing Environment

Evaluating the performance portability of SYCL across CPUs and GPUs on bandwidth-bound applications

Evaluating the potential of graphics processors for high performance embedded computing

Evaluating the Power of GPU Acceleration for IDW Interpolation Algorithm

Evaluating the use of GPUs in liver image segmentation and HMMER database searches

Evaluating the Viability of Application-Driven Cooperative CPU/GPU Fault Detection

Evaluating the Wide Area Classroom After 24,000 HPC Students

Evaluating tradeoff between recall and performance of GPU permutation index

Evaluation and enhancement of memory efficiency targeting general-purpose computations on scalable data-parallel GPU architectures

Evaluation and Improvement of GPU Ray Tracing with a Thread Migration Technique

Evaluation and tuning of the Level 3 CUBLAS for graphics processors

Evaluation Framework for GPU Performance Based on OpenCL Standard

Evaluation iterative solver for pCDR on GPU accelerator

Evaluation of an accelerator architecture for Speckle Reducing Anisotropic Diffusion

Evaluation of an OpenCL-Based FPGA Platform for Particle Filter

Evaluation of autoparallelization toolkits for commodity graphics hardware

Evaluation of computational and energy performance in matrix multiplication algorithms on CPU and GPU using MKL, cuBLAS and SYCL

Evaluation of DGEMM Implementation on Intel Xeon Phi Coprocessor

Evaluation of disconnected quark loops for hadron structure using GPUs

Evaluation of Fermi Features for Data Mining Algorithms

Evaluation of FPGA-based high performance computing platforms

Evaluation of GPU Architectures Using Spiking Neural Networks

Evaluation of GPU-based track-triggering for the CMS detector at CERN’s HL-LHC

Evaluation of Intel’s DPC++ Compatibility Tool in heterogeneous computing

Evaluation of Libraries for Parallel Computing in Haskell – A Case Study with a Super-resolution Application

Evaluation of likelihood functions on CPU and GPU devices

Evaluation of Machine Learning Frameworks on Finis Terrae II

Evaluation of Multi-Threading in Vulkan

Evaluation of OpenAI Codex for HPC Parallel Programming Models Kernel Generation

Evaluation of P-Scheme/G Algorithm for Solving Recurrence Equations

Evaluation of parallel particle swarm optimization algorithms within the CUDA architecture

Evaluation of Pseudo-Random Number Generation on GPU Cards

Evaluation of Rust for GPGPU high-performance computing

Evaluation of Speedup of Monte Carlo Calculations of Two Simple Reactor Physics Problems Coded for the GPU/CUDA Environment

Evaluation of Standardized Password-based Key Derivation against Parallel Processing Platforms

Evaluation of state-of-the-art polyhedral tools for automatic code generation on GPUs

Evaluation of streaming aggregation on parallel hardware architectures

Evaluation of the Intel Xeon Phi and NVIDIA K80 as accelerators for two-dimensional panel codes

Evaluation of the Stability and Performance of a Multi-Stage Riemann Solver in Relativistic Hydrodynamic Simulations

Evaluation of Two Parallel Finite Element Implementations of the Time-Dependent Advection Diffusion Problem: GPU versus Cluster Considering Time and Energy Consumption

Evenly Spaced Streamlines for Surfaces: An Image-Based Approach

Event-Based OpenMP Tasks for Time-Sensitive GPU-Accelerated Systems

Event-driven gate-level simulation with GP-GPUs

EvoEngineer: Mastering Automated CUDA Kernel Code Evolution with Large Language Models

Evolution of a double-front Rayleigh-Taylor system using a GPU-based high resolution thermal Lattice-Boltzmann model

Evolution of image filters on graphics processor units using Cartesian Genetic Programming

Evolution of Kernels: Automated RISC-V Kernel Optimization with Large Language Models

Evolution of thread-level parallelism in desktop applications

Evolutionary Algorithm for Optimizing Parameters of GPGPU-based Image Segmentation

Evolutionary Clustering on CUDA

Evolutionary Computing on Consumer-Level Graphics Hardware

Evolutionary Quantum Logic Synthesis of Boolean Reversible Logic Circuits Embedded in Ternary Quantum Space using Heuristics

Evolutionary Simulation of Life Using CUDA

Evolving a CUDA kernel from an nVidia template

Evolving CUDA PTX programs by quantum inspired linear genetic programming

Evolving GeneChip correlation predictors on parallel graphics hardware

Evolving gzip matches Kernel from an nVidia CUDA Template

Evolving Neural Networks on GPUs

Evolving Soft Robotic Locomotion in PhysX

EvoTorch: Scalable Evolutionary Computation in Python

exa-AMD: An Exascale-Ready Framework for Accelerating the Discovery and Design of Functional Materials

EXA2PRO: A Framework for High Development Productivity on Heterogeneous Computing Systems

Exact and complete short read alignment to microbial genomes using GPU programming

Exact and complete short-read alignment to microbial genomes using Graphics Processing Unit programming

Exact calculation of disconnected loops

Exact diagonalization of quantum lattice models on coprocessors

Exact diagonalization of the Hubbard model on graphics processing units

Exact Selectivity Computation for Modern In-Memory Database Query Optimization

Exact Sparse Matrix-Vector Multiplication on GPU’s and Multicore Architectures

Exact Symbolic-Numeric Computation of Planar Algebraic Curves

Examining the Analytic Structure of Green’s Functions: Massive Parallel Complex Integration using GPUs

Example-based volume illustrations

ExaNBody: a HPC framework for N-Body applications

Exascale Deep Learning for Climate Analytics

Exascale Deep Learning for Scientific Inverse Problems

Executing Dynamic Data Rate Actor Networks on OpenCL Platforms

Executing Process Networks on Heterogeneous Platforms using OpenCL

Execution of Compound Multi-Kernel OpenCL Computations in Multi-CPU/Multi-GPU Environments

Exercising high-level parallel programming on streams: a systems biology use case

EXOCHI: architecture and programming environment for a heterogeneous multi-core multithreaded system

Expanding the boundaries of GPU computing

Expanding the VPE-qGM Environment Towards a Parallel Quantum Simulation of Quantum Processes Using GPUs

Titles: 100
open PDFs: 96
packages: 17
