Papers on hgpu.org (.txt-file)
Electromagnetic effects in capacitively coupled plasma simulated with a PIC-MCC darwin code
Electromagnetic transient simulation of large-scale electrical power networks using graphics processing units
Elementary functions: towards automatically generated, efficient, and vectorizable implementations
Elevation-based MRF stereo implemented in real-time on a GPU
EM+TV for Reconstruction of Cone-beam CT with Curved Detectors using GPU
Embedded Ensemble Propagation for Improving Performance, Portability and Scalability of Uncertainty Quantification on Emerging Computational Architectures
Embedded real-time stereo estimation via Semi-Global Matching on the GPU
Embedded Software Synthesis using Heterogeneous Dataflow Models
Embedding GPU Computations in Hadoop
Embedding OpenCL in C++ for Expressive GPU Programming
Embedding OpenCL in GHC Haskell
Embracing Heterogeneity: Parallel Programming for Changing Hardware
Emerging technology about GPGPU
EMMA: an AMR cosmological simulation code with radiative transfer
EmoNets: Multimodal deep learning approaches for emotion recognition in video
Empirical analysis of a parallel data mining algorithm on a graphic processor
Empirical performance modeling of GPU kernels using active learning
Employ Bump Mapping to Enrich the 3D NPR Image
Employing Directive Based Compression Solutions on Accelerators Global Memory under OpenACC
Employing GPU Accelerators for Efficient Enforcement of Data Integrity in Outsourced Data
Employing OpenCL as a Standard Hardware Abstraction in a Distributed Embedded System: A Case Study
Empower Sequence Labeling with Task-Aware Neural Language Model
Empowering Visual Categorization With the GPU
Empty Space Skipping and Occlusion Clipping for Texture-based Volume Rendering
Enabling a High Throughput Real Time Data Pipeline for a Large Radio Telescope Array with GPUs
Enabling active storage on parallel I/O software stacks
Enabling and Scaling Matrix Computations on Heterogeneous Multi-Core and Multi-GPU Systems
Enabling Computational Dynamics in Distributed Computing Environments Using a Heterogeneous Computing Template
Enabling CP2K Application for Exascale Computing with Accelerators using OpenACC and OpenCL
Enabling Data Movement and Computation Pipelining in Deep Learning Compiler
Enabling Development of OpenCL Applications on FPGA platforms
Enabling Efficient Online Profiling of Homogeneous and Heterogeneous Multicore Systems
Enabling Efficient Use of MPI and PGAS Programming Models on Heterogeneous Clusters with High Performance Interconnects
Enabling Energy-Efficient Analysis of Massive Neural Signals Using GPGPU
Enabling Energy-Efficient DNN Training on Hybrid GPU-FPGA Accelerators
Enabling Fast, Noncontiguous GPU Data Movement in Hybrid MPI+GPU Environments
Enabling full-speed random access to the entire memory on the A100 GPU
Enabling High Performance Computing in Cloud Infrastructure using rCUDA
Enabling High Performance Computing in Cloud Infrastructure using Virtualized GPUs
Enabling Inter-Machine Parallelism in High-Level Languages with SEJITS and MapReduce
Enabling multiple accelerator acceleration for Java/OpenMP
Enabling On-Device Smartphone GPU based Training: Lessons Learned
Enabling OpenCL on a Configurable, VLIW Chip-Multiprocessor
Enabling OpenMP Task Parallelism on Multi-FPGAs
Enabling OS Research by Inferring Interactions in the Black-Box GPU Stack
Enabling Quantum Computer Simulations on AMD GPUs: a HIP Backend for Google’s qsim
Enabling task-level scheduling on heterogeneous platforms
Enabling the use of Heterogeneous Computing for Bioinformatics
Enabling Traceability in an MDE Approach to Improve Performance of GPU Applications
Enabling Traceability in MDE to Improve Performance of GPU Applications
Encapsulated synchronization and load-balance in heterogeneous programming
Encrypting video and image streams using OpenCL code on-demand
Encrypting video streams using OpenCL code on-demand
End-to-end data reduction and hardware accelerated rendering techniques for visualizing time-varying non-uniform grid volume data
End-to-end Deep Learning of Optimization Heuristics
End-to-end Mapping in Heterogeneous Systems Using Graph Representation Learning
End-to-end Optimization of Machine Learning Prediction Queries
EnergonAI: An Inference System for 10-100 Billion Parameter Transformer Models
Energy Auto-tuning using the Polyhedral Approach
Energy conservation techniques for GPU computing
Energy Consumption of Algorithms for Solving the Compressible Navier-Stokes Equations on CPU’s, GPU’s and KNL’s
Energy consumption of Graphic Processing Units with respect to automotive use-cases
Energy Efficiency Analysis of GPUs
Energy Efficiency Benefits of Reducing the Voltage Guardband on the Kepler GPU Architecture
Energy efficiency of finite difference algorithms on multicore CPUs, GPUs, and Intel Xeon Phi processors
Energy efficiency of mixed precision iterative refinement methods using hybrid hardware platforms
Energy Efficiency Studies of Mont Blanc Applications
Energy efficiency vs. performance of the numerical solution of PDEs: an application study on a low-power ARM-based cluster
Energy efficient biomolecular simulations with FPGA-based reconfigurable computing
Energy Efficient Computing on Multi-core Processors: Vectorization and Compression Techniques
Energy Efficient Parallel K-Means Clustering for an Intel Hybrid Multi-Chip Package
Energy Evaluation for Applications with Different Thread Affinities on the Intel Xeon Phi
Energy Transfer Ray Tracing with OptiX
Energy- and cost-efficient Lattice-QCD computations using graphics processing units
Energy-aware metrics for benchmarking heterogeneous systems
Energy-aware Task Scheduling with Deadline Constraint in DVFS-enabled Heterogeneous Clusters
Energy-based Tuning of Convolutional Neural Networks on Multi-GPUs
Energy-Efficient Collective Reduce and Allreduce Operations on Distributed GPUs
Energy-efficient computing for extreme-scale science
Energy-efficient Computing on Distributed GPUs using Dynamic Parallelism and GPU-controlled Communication
Energy-Efficient Execution of Data-Parallel Applications on Heterogeneous Mobile Platforms
Energy-Efficient FPGA Implementation for Binomial Option Pricing Using OpenCL
Energy-efficient FPGA Implementation of the k-Nearest Neighbors Algorithm Using OpenCL
Energy-Efficient GPU Clusters Scheduling for Deep Learning
Energy-efficient mechanisms for managing thread context in throughput processors
Energy-optimized mapping of application to smartphone platform – A case study of mobile face recognition
Energy-saving techniques for low-power graphics processing unit
EngineCL: Usability and Performance in Heterogeneous Computing
Engineering a static verification tool for GPU kernels
Engineering Concurrent Software Guided by Statistical Performance Analysis
Engineering of Computer Vision Algorithms Using Evolutionary Algorithms
Enhanced implementation of the NTRUEncrypt algorithm using graphics cards
Enhanced molecular dynamics performance with a programmable graphics processor
Enhanced Parallel ILU(p)-based Preconditioners for Multi-core CPUs and GPUs - The Power(g)-pattern Method
Enhanced Parallel NegaMax Tree Search Algorithm on GPU
Enhancing and Porting the HPC-Lab Snow Simulator to OpenCL on Mobile Platforms
Enhancing Code Portability, Problem Scale, and Storage Efficiency in Exascale Applications
Enhancing Data Locality for Dynamic Simulations through Asynchronous Data Transformations and Adaptive Control
Titles: 100
open PDFs: 94
packages: 10