Papers on hgpu.org (.txt-file)
Effective Dynamic Scheduling on Heterogeneous Multi/Manycore Desktop Platforms
Effective Extensible Programming: Unleashing Julia on GPUs
Effective GPU Sharing Under Compiler Guidance
Effective GPU Strategies for LU Decomposition
Effective Mapping of Grammatical Evolution to CUDA Hardware Model
Effective Multi-Modal Retrieval based on Stacked Auto-Encoders
Effective Parallelization of Non-bonded Interactions Kernel for Virtual Screening on GPUs
Effective Sparse Matrix Representation for the GPU Architectures
Effectiveness of GPGPU for Solving the Magnetohydrodynamics Equations Using the CIP-MOCCT Method
Effectiveness of program transformations and compilers for directive-based GPU programming models
Effects of Compiler Optimizations in OpenMP to CUDA Translation
Effects of compression on data intensive algorithms
Effects of Concurrency Techniques and Algorithm Performance: A Comparative Analysis of Single-Threaded, Multi-Threaded, and GPGPU Programming Techniques
Effects of Dynamic Voltage and Frequency Scaling on a K20 GPU
Effects of Easy Hybrid Parallelization with CUDA for Numerical-Atomic-Orbital Density Functional Theory Calculation
Effects of GPU and CPU Loads on Performance of CUDA Applications
Effects of OpenCL-Based Parallelization Methods on Explicit Numerical Methods to Solve the Heat Equation
EFFEX: an embedded processor for computer vision based feature extraction
Efficacy of Images Versus Data Buffers: Optimizing Interactive Applications Utilizing OpenCL for Scientific Visualization
Efficient multiple pass, multiple output algorithms on the GPU
Efficiency analysis of a physical problem: Different parallel computational approaches for a dynamical integrator evolution
Efficiency Considerations of Cauchy Reed-Solomon Implementations on Accelerator and Multi-Core Platforms
Efficiency of general Krylov methods on GPUs – An experimental study
Efficiency of Parallelization of Neural Network Algorithm on Graphic Cards
Efficiency of the energy transfer in the Fenna-Matthews-Olson complex using hierarchical equations on graphics processing units
Efficiency without Tears: Securing Multilingual Programs with TRINITY
Efficient 2D Software Rendering
Efficient 3D Isotropic Volume Reconstruction Based On 2D Localized Ultrasound Images
Efficient 3D reconstruction of large-scale urban environments from street-level video
Efficient Acceleration of Mutual Information Computation for Nonrigid Registration using CUDA
Efficient Algorithm for RSA Text Encryption Using CUDA-C
Efficient Algorithms for Sorting on GPUs
Efficient algorithms for the realistic simulation of fluids
Efficient all-against-all protein similarity matrix computation using OpenCL
Efficient allocation of image recognition and LLM tasks on multi-GPU system
Efficient and Accurate Sound Propagation Using Adaptive Rectangular Decomposition
Efficient and Cryptographically Secure Generation of Chaotic Pseudorandom Numbers on GPU
Efficient and Good Delaunay Meshes From Random Points
Efficient and High-quality Sparse Graph Coloring on the GPU
Efficient and portable acceleration of quantum chemical many-body methods in mixed floating point precision using OpenACC compiler directives
Efficient and portable multi-tasking for heterogeneous systems
Efficient and Quality Contouring Algorithms on the GPU
Efficient and Scalable k-Means on GPUs
Efficient and Scalable Parallel Zonal Statistics on Large-Scale Species Occurrence Data on GPUs
Efficient Approaches for GEMM Acceleration on Leading AI-Optimized FPGAs
Efficient Approximate Visibility of Point Sets on the GPU
Efficient Arbitrary Precision Acceleration for Large Language Models on GPU Tensor Cores
Efficient Bayesian inference in stochastic chemical kinetic models using graphical processing units
Efficient Bayesian multi-view deconvolution
Efficient Calculation of Pairwise Nonbonded Forces
Efficient Canny Edge Detection Using a GPU
Efficient code generation for hardware accelerators by refining partially specified implementation
Efficient Collision Detection and Physics-Based Deformation for Haptic Simulation with Local Spherical Hash
Efficient Communications in Training Large Scale Neural Networks
Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs
Efficient computation of condition estimates for linear least squares problems
Efficient computation of constrained parameterizations on parallel platforms
Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters
Efficient Computation of SOM for Outage Database
Efficient computation of sum-products on GPUs through software-managed cache
Efficient Computation of the Kleene Star in Max-Plus Algebra using a CUDA GPU
Efficient Computational Methods for Uncertainty Quantification of Large Systems
Efficient computational noise in GLSL
Efficient Configuration of Heterogeneous Resources and Task Scheduling Strategies in Deep Learning Auto-Tuning Systems
Efficient Convex Optimization Approaches to Variational Image Fusion
Efficient Convolutional Neural Networks for Pixelwise Classification on Heterogeneous Hardware Systems
Efficient Convolutional Patch Networks for Scene Understanding
Efficient Cross-Device Query Processing
Efficient CSR-Based Sparse Matrix-Vector Multiplication on GPU
Efficient Cubic B-spline Image Interpolation on a GPU
Efficient CUDA polynomial preconditioned Conjugate Gradient solver for Finite Element computation of elasticity problems
Efficient Data Management for GPU Databases
Efficient data structures for piecewise-smooth video processing
Efficient deconvolution methods for astronomical imaging: algorithms and IDL-GPU codes
Efficient deep learning inference on end devices
Efficient Deep Neural Network Inference for Embedded Systems: A Mixture of Experts Approach
Efficient design and implementation of visual computing algorithms on the GPU
Efficient Detection of Sunspots with GPU Acceleration Through CUDA
Efficient dictionary learning implementation on the GPU using OpenCL
Efficient Discrete Range Searching primitives on the GPU with applications
Efficient Dynamic Derived Field Generation on Many-Core Architectures Using Python
Efficient Dynamic Program Monitoring on Multi-Core Platforms
Efficient Embarrassingly Parallel on Graphics Processor Unit
Efficient Emission Computation in Hidden Semi-Markov Models on Diverse Hardware
Efficient Energy Minimization in Finite-Difference Micromagnetics: Speeding up Hysteresis Computations
Efficient evaluation methods of elementary functions suitable for SIMD computation
Efficient Exact Gradient Update for training Deep Networks with Very Large Sparse Targets
Efficient Execution of AMR Computations on GPU Systems
Efficient Execution of OpenMP on GPUs
Efficient Execution on GPUs of Field-Based Vehicular Mobility Models
Efficient Exploitation of Heterogeneous Platforms for Images Features Extraction
Efficient Exploitation of Heterogeneous Platforms for Vertebra Detection in X-Ray Images
Efficient fault simulation on many-core processors
Efficient FFT mapping on GPU for radar processing application: modeling and implementation
Efficient fine grained shared buffer management for multiple OpenCL devices
Efficient Finite Element Geometric Multigrid Solvers for Unstructured Grids on GPUs
Efficient floating-point texture decompression
Efficient fMRI Analysis and Clustering on GPUs
Efficient gather and scatter operations on graphics processors
Titles: 100
open PDFs: 93
packages: 22