Papers on hgpu.org (.txt-file)
Efficient Approaches for GEMM Acceleration on Leading AI-Optimized FPGAs
Efficient Approximate Visibility of Point Sets on the GPU
Efficient Arbitrary Precision Acceleration for Large Language Models on GPU Tensor Cores
Efficient Bayesian inference in stochastic chemical kinetic models using graphical processing units
Efficient Bayesian multi-view deconvolution
Efficient Calculation of Pairwise Nonbonded Forces
Efficient Canny Edge Detection Using a GPU
Efficient code generation for hardware accelerators by refining partially specified implementation
Efficient Collision Detection and Physics-Based Deformation for Haptic Simulation with Local Spherical Hash
Efficient Communications in Training Large Scale Neural Networks
Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs
Efficient computation of condition estimates for linear least squares problems
Efficient computation of constrained parameterizations on parallel platforms
Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters
Efficient Computation of SOM for Outage Database
Efficient computation of sum-products on GPUs through software-managed cache
Efficient Computation of the Kleene Star in Max-Plus Algebra using a CUDA GPU
Efficient Computational Methods for Uncertainty Quantification of Large Systems
Efficient computational noise in GLSL
Efficient Convex Optimization Approaches to Variational Image Fusion
Efficient Convolutional Neural Networks for Pixelwise Classification on Heterogeneous Hardware Systems
Efficient Convolutional Patch Networks for Scene Understanding
Efficient Cross-Device Query Processing
Efficient CSR-Based Sparse Matrix-Vector Multiplication on GPU
Efficient Cubic B-spline Image Interpolation on a GPU
Efficient CUDA polynomial preconditioned Conjugate Gradient solver for Finite Element computation of elasticity problems
Efficient Data Management for GPU Databases
Efficient data structures for piecewise-smooth video processing
Efficient deconvolution methods for astronomical imaging: algorithms and IDL-GPU codes
Efficient Deep Neural Network Inference for Embedded Systems: A Mixture of Experts Approach
Efficient design and implementation of visual computing algorithms on the GPU
Efficient Detection of Sunspots with GPU Acceleration Through CUDA
Efficient dictionary learning implementation on the GPU using OpenCL
Efficient Discrete Range Searching primitives on the GPU with applications
Efficient Dynamic Derived Field Generation on Many-Core Architectures Using Python
Efficient Dynamic Program Monitoring on Multi-Core Platforms
Efficient Embarrassingly Parallel on Graphics Processor Unit
Efficient Emission Computation in Hidden Semi-Markov Models on Diverse Hardware
Efficient Energy Minimization in Finite-Difference Micromagnetics: Speeding up Hysteresis Computations
Efficient evaluation methods of elementary functions suitable for SIMD computation
Efficient Exact Gradient Update for training Deep Networks with Very Large Sparse Targets
Efficient Execution of AMR Computations on GPU Systems
Efficient Execution of OpenMP on GPUs
Efficient Execution on GPUs of Field-Based Vehicular Mobility Models
Efficient Exploitation of Heterogeneous Platforms for Images Features Extraction
Efficient Exploitation of Heterogeneous Platforms for Vertebra Detection in X-Ray Images
Efficient fault simulation on many-core processors
Efficient FFT mapping on GPU for radar processing application: modeling and implementation
Efficient fine grained shared buffer management for multiple OpenCL devices
Efficient Finite Element Geometric Multigrid Solvers for Unstructured Grids on GPUs
Efficient floating-point texture decompression
Efficient fMRI Analysis and Clustering on GPUs
Efficient gather and scatter operations on graphics processors
Efficient Geometry Compression for GPU-based Decoding in Realtime Terrain Rendering
Efficient GPGPU-based parallel packet classification
Efficient GPU Implementation for Particle in Cell Algorithm
Efficient GPU Implementation for Single Block Orthogonal Dictionary Learning
Efficient GPU implementation of a class of array permutations
Efficient GPU implementation of a two waves WAF method for the two-dimensional one layer Shallow Water system on structured meshes
Efficient GPU implementation of parameter estimation of a statistical model for online advertisement optimization
Efficient GPU implementation of the integral histogram
Efficient GPU-Accelerated Elastic Image Registration
Efficient GPU-based Construction of Occupancy Grids Using several Laser Range-finders
Efficient GPU-based Graph Cuts for Stereo Matching
Efficient GPU-Based Texture Interpolation using Uniform B-Splines
Efficient GPU-based Training of Recurrent Neural Network Language Models Using Spliced Sentence Bunch
Efficient GPU-Implementation of Adaptive Mesh Refinement for the Shallow-Water Equations
Efficient gradient-domain compositing using quadtrees
Efficient Graph Comparison and Visualization Using GPU
Efficient Hardware Acceleration on SoC-FPGA with OpenCL
Efficient Hash Tables on the GPU
Efficient Heterogeneous Execution on Large Multicore and Accelerator Platforms: Case Study Using a Block Tridiagonal Solver
Efficient heterogeneous matrix profile on a CPU + High Performance FPGA with integrated HBM
Efficient hierarchical parallel genetic algorithms using grid computing
Efficient High-Quality Volume Rendering of SPH Data
Efficient High-Speed WPA2 Brute Force Attacks using Scalable Low-Cost FPGA Clustering
Efficient Hybrid Execution of C++ Applications using Intel(R) Xeon Phi(TM) Coprocessor
Efficient image reconstruction for point-based and line-based rendering
Efficient Implementation and Evaluation of Methods for the Estimation of Motion in Image Sequences
Efficient Implementation and Optimization of Geometric Multigrid Operations in the LIFT Framework
Efficient implementation for MD5-RC4 encryption using GPU with CUDA
Efficient implementation for QUAD stream cipher with GPUs
Efficient Implementation of Bi-directional Path Tracer on GPU
Efficient implementation of computationally intensive algorithms on parallel computing platforms
Efficient implementation of data flow graphs on multi-gpu clusters
Efficient implementation of GPGPU synchronization primitives on CPUs
Efficient Implementation of Hyperspectral Anomaly Detection Techniques on GPUs and Multicore Processors
Efficient Implementation of MrBayes on multi-GPU
Efficient implementation of multiuser precoding algorithms on GPU for MIMO-OFDM systems
Efficient Implementation of Optical Flow Algorithm Based on Directional Filters on a GPU Using CUDA
Efficient Implementation of RLS-Based Adaptive Filters on nVIDIA GeForce Graphics Processing Unit
Efficient Implementation of the CPR Formulation for the Navier-Stokes Equations on GPUs
Efficient Implementation of the eta_T Pairing on GPU
Efficient implementation of the overlap operator on multi-GPUs
Efficient Implementation of the Simplex Method on a CPU-GPU System
Efficient Incremental Text-to-Speech on GPUs
Efficient Independent Component Analysis on a GPU
Efficient Inference For Neural Machine Translation
Efficient Integral Image Computation on the GPU
Titles: 100
open PDFs: 91
packages: 19