Papers on hgpu.org (.txt-file)
Efficient reconfigurable design for pricing Asian options

Efficient reconstruction of biological networks via transitive reduction on general purpose graphics processors

Efficient Relational Algebra Algorithms and Data Structures for GPU

Efficient relational database management using graphics processors

Efficient Rendering of Scenes with Dynamic Lighting Using a Photons Queue and Incremental Update Algorithm

Efficient Resource Scheduling for Big Data Processing on Accelerator-based Heterogeneous Systems

Efficient Resource Sharing Through GPU Virtualization on Accelerated High Performance Computing Systems

Efficient scan-window based object detection using GPGPU

Efficient SDS Simulations on Multi-GPU Nodes of XSEDE High-end Clusters

Efficient Shadows for GPU-based Volume Raycasting

Efficient Shallow Water Simulations on GPUs

Efficient shallow water simulations on GPUs: Implementation, visualization, verification, and validation

Efficient SIMD Vectorization for Hashing in OpenCL

Efficient similarity search on multimedia databases

Efficient simulation of agent-based models on multi-GPU and multi-core clusters

Efficient Simulation of Fluid Flow and Transport in Heterogeneous Media Using Graphics Processing Units (GPUs)

Efficient simulation of large-scale spiking neural networks using CUDA graphics processors

Efficient Simulation of Ocean and Land Scenes Based on Digital Earth

Efficient Simulation Techniques for Large-Scale Applications

Efficient simulations of long wave propagation and runup using a LBM approach on GPGPU hardware

Efficient softmax approximation for GPUs

Efficient Sparse Matrix-Vector Multiplication on CUDA

Efficient Sparse Matrix-Vector Multiplication on GPUs using the CSR Storage Format

Efficient Sparse Matrix-Vector Multiplication on x86-Based Many-Core Processors

Efficient sparse voxel octrees

Efficient Sparse Voxel Octrees – Analysis, Extensions, and Implementation

Efficient Sparse-Dense Matrix-Matrix Multiplication on GPUs Using the Customized Sparse Storage Format

Efficient Spatial Anti-Aliasing Rendering for Line Joins on Vector Maps

Efficient Spatial Binning on the GPU

Efficient spectral and pseudospectral algorithms for 3D simulations of whistler-mode waves in a plasma

Efficient Stack-less BVH Traversal for Ray Tracing

Efficient Static and Dynamic Memory Management Techniques for Multi-GPU Systems

Efficient stream reduction on the GPU

Efficient Support for Matrix Computations on Heterogeneous Multi-core and Multi-GPU Architectures

Efficient Surface Reconstruction From Noisy Data Using Regularized Membrane Potentials

Efficient SVM Training Using Parallel Primal-Dual Interior Point Method on GPU

Efficient Synchronization Primitives for GPUs

Efficient Target and Application Specific Selection and Ordering of Compiler Passes

Efficient Triangle and Quadrilateral Clipping within Shaders

Efficient Two-Level Preconditioned Conjugate Gradient Method on the GPU

Efficient Use of In-Game Ray-Tracing Techniques

Efficient Video Compression via Content-Adaptive Super-Resolution

Efficient Virtual Shadow Maps for Many Lights

Efficient visual hull computation for real-time 3D reconstruction using CUDA

Efficient Volume Rendering in CUDA Path Tracer

Efficient Wave Propagation in Discontinuous Media and Complex Geometry for Many-core Architectures

Efficient Weighted Histogramming on GPUs with CUDA

Efficient Workload Balancing on Heterogeneous GPUs using Mixed-Integer Non-Linear Programming

Efficient XML Path Filtering Using GPUs

Efficient, High-Quality Bayer Demosaic Filtering on GPUs

EfficientBioAI: Making Bioimaging AI Models Efficient in Energy, Latency and Representation

Efficiently Computing Tensor Eigenvalues on a GPU

Efficiently GPU-accelerating long kernel convolutions in 3-D DIRECT TOF PET reconstruction via a kernel decomposition scheme

Efficiently Mapping the AES Encryption Algorithm on GPUs

Efficiently Processing Large Relational Joins on GPUs

Efficiently Training 7B LLM with 1 Million Sequence Length on 8 GPUs

Efficiently Using a CUDA-enabled GPU as Shared Resource

eGPU: A 750 MHz Class Soft GPGPU for FPGA

EIE: Efficient Inference Engine on Compressed Deep Neural Network

EigenCFA: accelerating flow analysis with GPUs

Eigentransport for efficient and accurate all-frequency relighting

Elastic deep learning in multi-tenant GPU cluster

Elastic pipeline: addressing GPU on-chip shared memory bank conflicts

Elastic stream cloud (ESC): A stream-oriented cloud computing platform for Rich Internet Application

Elastically Deformable Models based on the Finite Element Method Accelerated on Graphics Hardware using CUDA

ElastiFace: Matching and Blending Textured Faces

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

Electric polarizability of hadrons with overlap fermions on multi-GPUs

Electric potential and field calculation of charged BEM triangles and rectangles by Gaussian cubature

Electrical distribution grid visualization using programmable GPUs

Electrical-Level Attacks on CPUs, FPGAs, and GPUs: Survey and Implications in the Heterogeneous Era

Electromagnetic Computation and Visualization of Transmission Particle Model and its Simulation Based on GPU

Electromagnetic effects in capacitively coupled plasma simulated with a PIC-MCC Darwin code

Electromagnetic transient simulation of large-scale electrical power networks using graphics processing units

Elementary functions: towards automatically generated, efficient, and vectorizable implementations

Elevation-based MRF stereo implemented in real-time on a GPU

EM+TV for Reconstruction of Cone-beam CT with Curved Detectors using GPU

Embedded Ensemble Propagation for Improving Performance, Portability and Scalability of Uncertainty Quantification on Emerging Computational Architectures

Embedded real-time stereo estimation via Semi-Global Matching on the GPU

Embedded Software Synthesis using Heterogeneous Dataflow Models

Embedding GPU Computations in Hadoop

Embedding OpenCL in C++ for Expressive GPU Programming

Embedding OpenCL in GHC Haskell

Embracing Heterogeneity: Parallel Programming for Changing Hardware

Emerging technology about GPGPU

EMMA: an AMR cosmological simulation code with radiative transfer

EmoNets: Multimodal deep learning approaches for emotion recognition in video

Empirical analysis of a parallel data mining algorithm on a graphic processor

Empirical performance modeling of GPU kernels using active learning

Employ Bump Mapping to Enrich the 3D NPR Image

Employing Directive Based Compression Solutions on Accelerators Global Memory under OpenACC

Employing GPU Accelerators for Efficient Enforcement of Data Integrity in Outsourced Data

Employing OpenCL as a Standard Hardware Abstraction in a Distributed Embedded System: A Case Study

Empower Sequence Labeling with Task-Aware Neural Language Model

Empowering Visual Categorization With the GPU

Empty Space Skipping and Occlusion Clipping for Texture-based Volume Rendering

Enabling a High Throughput Real Time Data Pipeline for a Large Radio Telescope Array with GPUs

Enabling active storage on parallel I/O software stacks

Enabling and Scaling Matrix Computations on Heterogeneous Multi-Core and Multi-GPU Systems

Enabling Computational Dynamics in Distributed Computing Environments Using a Heterogeneous Computing Template

Titles: 100
open PDFs: 96
packages: 15
