Papers on hgpu.org (.txt-file)

End-to-end Deep Learning of Optimization Heuristics

End-to-end Mapping in Heterogeneous Systems Using Graph Representation Learning

End-to-end Optimization of Machine Learning Prediction Queries

EnergonAI: An Inference System for 10-100 Billion Parameter Transformer Models

Energy Auto-tuning using the Polyhedral Approach

Energy conservation techniques for GPU computing

Energy Consumption of Algorithms for Solving the Compressible Navier-Stokes Equations on CPU’s, GPU’s and KNL’s

Energy consumption of Graphic Processing Units with respect to automotive use-cases

Energy Efficiency Analysis of GPUs

Energy Efficiency Benefits of Reducing the Voltage Guardband on the Kepler GPU Architecture

Energy efficiency of finite difference algorithms on multicore CPUs, GPUs, and Intel Xeon Phi processors

Energy efficiency of mixed precision iterative refinement methods using hybrid hardware platforms

Energy Efficiency Studies of Mont Blanc Applications

Energy efficiency vs. performance of the numerical solution of PDEs: an application study on a low-power ARM-based cluster

Energy efficient biomolecular simulations with FPGA-based reconfigurable computing

Energy Efficient Computing on Multi-core Processors: Vectorization and Compression Techniques

Energy Efficient Parallel K-Means Clustering for an Intel Hybrid Multi-Chip Package

Energy Evaluation for Applications with Different Thread Affinities on the Intel Xeon Phi

Energy Transfer Ray Tracing with OptiX

Energy- and cost-efficient Lattice-QCD computations using graphics processing units

Energy-aware metrics for benchmarking heterogeneous systems

Energy-aware Task Scheduling with Deadline Constraint in DVFS-enabled Heterogeneous Clusters

Energy-based Tuning of Convolutional Neural Networks on Multi-GPUs

Energy-Efficient Collective Reduce and Allreduce Operations on Distributed GPUs

Energy-efficient computing for extreme-scale science

Energy-efficient Computing on Distributed GPUs using Dynamic Parallelism and GPU-controlled Communication

Energy-Efficient Execution of Data-Parallel Applications on Heterogeneous Mobile Platforms

Energy-Efficient FPGA Implementation for Binomial Option Pricing Using OpenCL

Energy-efficient FPGA Implementation of the k-Nearest Neighbors Algorithm Using OpenCL

Energy-Efficient GPU Clusters Scheduling for Deep Learning

Energy-efficient mechanisms for managing thread context in throughput processors

Energy-optimized mapping of application to smartphone platform – A case study of mobile face recognition

Energy-saving techniques for low-power graphics processing unit

EngineCL: Usability and Performance in Heterogeneous Computing

Engineering a static verification tool for GPU kernels

Engineering Concurrent Software Guided by Statistical Performance Analysis

Engineering of Computer Vision Algorithms Using Evolutionary Algorithms

Engineering Supercomputing Platforms for Biomolecular Applications

Enhanced implementation of the NTRUEncrypt algorithm using graphics cards

Enhanced molecular dynamics performance with a programmable graphics processor

Enhanced Parallel ILU (p)-based Preconditioners for Multi-core CPUs and GPUs-The Power (g)-pattern Method

Enhanced Parallel NegaMax Tree Search Algorithm on GPU

Enhancing and Porting the HPC-Lab Snow Simulator to OpenCL on Mobile Platforms

Enhancing Code Portability, Problem Scale, and Storage Efficiency in Exascale Applications

Enhancing Data Locality for Dynamic Simulations through Asynchronous Data Transformations and Adaptive Control

Enhancing Data Parallelism for Ant Colony Optimisation on GPUs

Enhancing data parallelism for Ant Colony Optimization on GPUs

Enhancing Deployment-Time Predictive Model Robustness for Code Analysis and Optimization

Enhancing Depth-Perception with Flexible Volumetric Halos

Enhancing Efficiency of the RRTMG Radiation Code with GPU and MIC Approaches for Numerical Weather Prediction Models

Enhancing Fluid Modeling with Turbulence and Acceleration

Enhancing GPU Parallelism in Nature-Inspired Algorithms

Enhancing Performance for Solving Finite Element Mesh using Heterogeneous Platforms

Enhancing Performance of Meshfree Methods by Hybrid Computing

Enhancing Performance of Simulations using GPGPU

Enhancing Productivity and Performance Portability of General-Purpose Parallel Programming

Enhancing productivity and performance portability of OpenCL applications on heterogeneous systems using runtime optimizations

Enhancing R with Advanced Compilation Tools and Methods

Enhancing the Performance Portability of Heterogeneous Circuit Analysis Programs

Enhancing the simulation of P systems for the SAT problem on GPUs

Enhancing Transformer Performance and Portability through Auto-tuning Frameworks

Enhancing Ubiquitous Systems through System Call Mining

Ensemble K-Means on Modern Many Core Hardware

Ensemble K-means on multi-core architectures

Entropy-based High Performance Computation of Boolean SNP-SNP Interactions Using GPUs

Environment Lighting for Point Sampled Geometry

Environment Segmentation in Service Robotics

EPEM: A General and Validated Energy Complexity Model for Multithreaded Algorithms

EPSILOD: efficient parallel skeleton for generic iterative stencil computations in distributed GPUs

Equalizer 2.0 – Convergence of a Parallel Rendering Framework

Equalizer: A Scalable Parallel Rendering Framework

Equilibrium and Non-Equilibrium Ising Models by Means of PCA

EQUIPE: Parallel equivalence checking with GP-GPUs

Error Resilience Evaluation on GPGPU Applications

Error-bounded GPU-supported terrain visualisation

ESE: Efficient Speech Recognition Engine with Compressed LSTM on FPGA

Espresso: A Fast End-to-end Neural Speech Recognition Toolkit

Espresso: Efficient Forward Propagation for BCNNs

Estimating GPU Speedups for Programs Without Writing a Single Line of GPU Code

Estimating the WCET of GPU-Accelerated Applications using Hybrid Analysis

Estimation of numerical reproducibility on CPU and GPU

Estimation of Skin Optical Parameters for Real-Time Hyperspectral Imaging Applications using GPGPU Parallel Computing

Evacuation Route Modeling and Planning with General Purpose GPU Computing

Evaluating 3-D Stencil codes on Intel Xeon Phi: Limitations and Trade-offs

Evaluating CP2K on Exascale Hardware: Intel Xeon Phi

Evaluating different Java bindings for OpenCL

Evaluating force field accuracy with long-time simulations of a beta-hairpin tryptophan zipper peptide

Evaluating FPGA Accelerator Performance with a Parameterized OpenCL Adaptation of the HPCChallenge Benchmark Suite

Evaluating GPU Passthrough in Xen for High Performance Cloud Computing

Evaluating GPUs for network packet signature matching

Evaluating graph coloring on GPUs

Evaluating High-Level Synthesis Techniques for Scalable Hardware-Accelerated Computing

Evaluating kernels on Xeon Phi to accelerate Gysela application

Evaluating multi-core platforms for HPC data-intensive kernels

Evaluating one-sided programming models for GPU cluster computations

Evaluating Operators in Deep Neural Networks for Improving Performance Portability of SYCL

Evaluating performance and portability of OpenCL programs

Evaluating Performance Portability of Accelerator Programming Models using SPEC ACCEL 1.2 Benchmarks

Titles: 100
open PDFs: 96
packages: 20
