Papers on hgpu.org (.txt-file)
Enabling CP2K Application for Exascale Computing with Accelerators using OpenACC and OpenCL

Enabling Data Movement and Computation Pipelining in Deep Learning Compiler

Enabling Development of OpenCL Applications on FPGA platforms

Enabling Efficient Online Profiling of Homogeneous and Heterogeneous Multicore Systems

Enabling Efficient Use of MPI and PGAS Programming Models on Heterogeneous Clusters with High Performance Interconnects

Enabling Energy-Efficient Analysis of Massive Neural Signals Using GPGPU

Enabling Energy-Efficient DNN Training on Hybrid GPU-FPGA Accelerators

Enabling Fast, Noncontiguous GPU Data Movement in Hybrid MPI+GPU Environments

Enabling full-speed random access to the entire memory on the A100 GPU

Enabling High Performance Computing in Cloud Infrastructure using rCUDA

Enabling High Performance Computing in Cloud Infrastructure using Virtualized GPUs

Enabling Inter-Machine Parallelism in High-Level Languages with SEJITS and MapReduce

Enabling multiple accelerator acceleration for Java/OpenMP

Enabling On-Device Smartphone GPU based Training: Lessons Learned

Enabling OpenCL on a Configurable, VLIW Chip-Multiprocessor

Enabling OpenMP Task Parallelism on Multi-FPGAs

Enabling OS Research by Inferring Interactions in the Black-Box GPU Stack

Enabling Profile Guided Optimizations (PGO) for Graphics

Enabling Quantum Computer Simulations on AMD GPUs: a HIP Backend for Google’s qsim

Enabling task-level scheduling on heterogeneous platforms

Enabling the use of Heterogeneous Computing for Bioinformatics

Enabling Traceability in an MDE Approach to Improve Performance of GPU Applications

Enabling Traceability in MDE to Improve Performance of GPU Applications

Encapsulated synchronization and load-balance in heterogeneous programming

Encrypting video and image streams using OpenCL code on-demand

Encrypting video streams using OpenCL code on-demand

End-to-end data reduction and hardware accelerated rendering techniques for visualizing time-varying non-uniform grid volume data

End-to-end Deep Learning of Optimization Heuristics

End-to-end Mapping in Heterogeneous Systems Using Graph Representation Learning

End-to-end Optimization of Machine Learning Prediction Queries

EnergonAI: An Inference System for 10-100 Billion Parameter Transformer Models

Energy Auto-tuning using the Polyhedral Approach

Energy conservation techniques for GPU computing

Energy Consumption of Algorithms for Solving the Compressible Navier-Stokes Equations on CPU’s, GPU’s and KNL’s

Energy consumption of Graphic Processing Units with respect to automotive use-cases

Energy Efficiency Analysis of GPUs

Energy Efficiency Benefits of Reducing the Voltage Guardband on the Kepler GPU Architecture

Energy efficiency of finite difference algorithms on multicore CPUs, GPUs, and Intel Xeon Phi processors

Energy efficiency of mixed precision iterative refinement methods using hybrid hardware platforms

Energy Efficiency Studies of Mont Blanc Applications

Energy efficiency vs. performance of the numerical solution of PDEs: an application study on a low-power ARM-based cluster

Energy efficient biomolecular simulations with FPGA-based reconfigurable computing

Energy Efficient Computing on Multi-core Processors: Vectorization and Compression Techniques

Energy Efficient Parallel K-Means Clustering for an Intel Hybrid Multi-Chip Package

Energy Evaluation for Applications with Different Thread Affinities on the Intel Xeon Phi

Energy Transfer Ray Tracing with OptiX

Energy- and cost-efficient Lattice-QCD computations using graphics processing units

Energy-aware metrics for benchmarking heterogeneous systems

Energy-aware Task Scheduling with Deadline Constraint in DVFS-enabled Heterogeneous Clusters

Energy-based Tuning of Convolutional Neural Networks on Multi-GPUs

Energy-Efficient Collective Reduce and Allreduce Operations on Distributed GPUs

Energy-efficient computing for extreme-scale science

Energy-efficient Computing on Distributed GPUs using Dynamic Parallelism and GPU-controlled Communication

Energy-Efficient Execution of Data-Parallel Applications on Heterogeneous Mobile Platforms

Energy-Efficient FPGA Implementation for Binomial Option Pricing Using OpenCL

Energy-efficient FPGA Implementation of the k-Nearest Neighbors Algorithm Using OpenCL

Energy-Efficient GPU Clusters Scheduling for Deep Learning

Energy-efficient mechanisms for managing thread context in throughput processors

Energy-optimized mapping of application to smartphone platform – A case study of mobile face recognition

Energy-saving techniques for low-power graphics processing unit

EngineCL: Usability and Performance in Heterogeneous Computing

Engineering a static verification tool for GPU kernels

Engineering Concurrent Software Guided by Statistical Performance Analysis

Engineering of Computer Vision Algorithms Using Evolutionary Algorithms

Engineering Supercomputing Platforms for Biomolecular Applications

Enhanced implementation of the NTRUEncrypt algorithm using graphics cards

Enhanced molecular dynamics performance with a programmable graphics processor

Enhanced Parallel ILU(p)-based Preconditioners for Multi-core CPUs and GPUs – The Power(g)-pattern Method

Enhanced Parallel NegaMax Tree Search Algorithm on GPU

Enhancing and Porting the HPC-Lab Snow Simulator to OpenCL on Mobile Platforms

Enhancing Code Portability, Problem Scale, and Storage Efficiency in Exascale Applications

Enhancing Data Locality for Dynamic Simulations through Asynchronous Data Transformations and Adaptive Control

Enhancing Data Parallelism for Ant Colony Optimisation on GPUs

Enhancing data parallelism for Ant Colony Optimization on GPUs

Enhancing Deployment-Time Predictive Model Robustness for Code Analysis and Optimization

Enhancing Depth-Perception with Flexible Volumetric Halos

Enhancing Efficiency of the RRTMG Radiation Code with GPU and MIC Approaches for Numerical Weather Prediction Models

Enhancing Fluid Modeling with Turbulence and Acceleration

Enhancing GPU Parallelism in Nature-Inspired Algorithms

Enhancing Performance for Solving Finite Element Mesh using Heterogeneous Platforms

Enhancing Performance of Meshfree Methods by Hybrid Computing

Enhancing Performance of Simulations using GPGPU

Enhancing Productivity and Performance Portability of General-Purpose Parallel Programming

Enhancing productivity and performance portability of OpenCL applications on heterogeneous systems using runtime optimizations

Enhancing R with Advanced Compilation Tools and Methods

Enhancing the Performance Portability of Heterogeneous Circuit Analysis Programs

Enhancing the simulation of P systems for the SAT problem on GPUs

Enhancing Transformer Performance and Portability through Auto-tuning Frameworks

Enhancing Ubiquitous Systems through System Call Mining

Ensemble K-Means on Modern Many Core Hardware

Ensemble K-means on multi-core architectures

Entropy-based High Performance Computation of Boolean SNP-SNP Interactions Using GPUs

Environment Lighting for Point Sampled Geometry

Environment Segmentation in Service Robotics

EPEM: A General and Validated Energy Complexity Model for Multithreaded Algorithms

EPSILOD: efficient parallel skeleton for generic iterative stencil computations in distributed GPUs

Equalizer 2.0 – Convergence of a Parallel Rendering Framework

Titles: 100
open PDFs: 96
packages: 13
