Papers on hgpu.org (.txt-file)
Lattice Boltzmann Simulation of Binary Mixture Diffusion Using Modern Graphics Processors
Lattice Boltzmann Simulations of Multiphase Flows
Lattice Boltzmann simulations of the permeability and capillary adsorption of cement model microstructures
Lattice Boltzmann Simulations on a GPU: An optimization approach using C++ AMP
Lattice Group Models: GPU Acceleration and Numerics
Lattice QCD on new chips: a community summary
Lattice QCD simulations using the OpenACC platform
Lattice QCD with Domain Decomposition on Intel Xeon Phi Co-Processors
Lattice Quantum Chromodynamics on Intel Xeon Phi based supercomputers
Lattice Simulations using OpenACC compilers
Lattice-based flow field modeling
Lattice-Boltzmann Simulation of the Shallow-Water Equations with Fluid-Structure Interaction on Multi- and Manycore Processors
Lattice-Boltzmann simulation of the shallow-water equations with fluid-structure interaction on multi- and manycore processors
Launch-time Optimization of OpenCL Kernels
Layered Interpretation of Street View Images
LazyTensor: combining eager execution with domain-specific compilers
LBCL: multi-device automatic load balancing
LBM based flow simulation using GPU computing processor
LDetector: A Low Overhead Race Detector For GPU Programs
Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models
Learnergy: Energy-based Machine Learners
Learning a Metric Embedding for Face Recognition using the Multibatch Method
Learning Better Encoding for Approximate Nearest Neighbor Search with Dictionary Annealing
Learning Blood Management in Orthopedic Surgery through Gameplay
Learning hash codes for efficient content reuse detection
Learning Massive Graph Embeddings on a Single Machine
Learning Random Forests on the GPU
Learning Representation for Scene Understanding: Epitomes, CRFs, and CNNs
Learning Sparse Recurrent Neural Networks in Language Modeling
Learning Structured Sparsity in Deep Neural Networks
Learning to Detect Roads in High-Resolution Aerial Images
Learning to Optimize Tensor Programs
Learning Two-View Stereo Matching
Least Squares on GPUs in Multiple Double Precision
Lectures on Parallel Computing
LeFlow: Enabling Flexible FPGA High-Level Synthesis of Tensorflow Deep Neural Networks
LeftoverLocals: Listening to LLM Responses Through Leaked GPU Local Memory
Legion: Programming Distributed Heterogeneous Architectures with Logical Regions
Legolizer: A Real-Time System for Modeling and Rendering LEGO Representations of Boundary Models
Lensed: a code for the forward reconstruction of lenses and sources from strong lensing observations
Leo: A Profile-Driven Dynamic Optimization Framework for GPU Applications
Lessons learned from contrasting a BLAS kernel implementations
Lessons learned in a decade of research software engineering GPU applications
Lessons Learned Migrating CUDA to SYCL: A HEP Case Study with ROOT RDataFrame
Let’s sort this out: GPGPU Verification of Radix Sort
Lettuce: PyTorch-based Lattice Boltzmann Framework
Level Sets and Voronoi based Feature Extraction from any Imagery
Level-of-Detail Triangle Strips for Deforming Meshes
Leveraging Binary Translation for Heterogeneous Profiling
Leveraging Computation Sharing and Parallel Processing in Location-Based Services
Leveraging Data-Flow Information for Efficient Scheduling of Task-Parallel Programs on Heterogeneous Systems
Leveraging Memory Copy Overlap for Efficient Sparse Matrix-Vector Multiplication on GPUs
Leveraging on High-Performance Computing and Cloud Technologies in Digital Libraries: A Case Study
Leveraging Parallelism with CUDA and OpenCL
Levy Flights for Particle Swarm Optimisation Algorithms on Graphical Processing Units
LeXInt: GPU-accelerated Exponential Integrators package
libcloudph++ 0.1: single-moment bulk, double-moment bulk, and particle-based warm-rain microphysics library in C++
libCudaOptimize: an Open Source Library of GPU-based Metaheuristics
libhclooc: Software Library Facilitating Out-of-core Implementations of Accelerator Kernels on Hybrid Computing Platforms
libmolgrid: GPU Accelerated Molecular Gridding for Deep Learning Applications
libWater: Heterogeneous Distributed Computing Made Easy
Light Loss-Less Data Compression, with GPU Implementation
Light propagation for mixed polygonal and volumetric data
Light Propagation Maps on Parallel Graphics Architectures
Lighting Details Preserving Photon Density Estimation
LightNet: A Versatile, Standalone Matlab-based Environment for Deep Learning
Lightning: Scaling the GPU Programming Model Beyond a Single GPU
LightPlay: Efficient Replay with GPUs
LightRNN: Memory and Computation-Efficient Recurrent Neural Networks
LightScan: Faster Scan Primitive on CUDA Compatible Manycore Processors
Lightweight bleeding and smoke effect for surgical simulators
Lightweight Modular Staging and Embedded Compilers: Abstraction Without Regret for High-Level High-Performance Programming
Lightweight modular staging: a pragmatic approach to runtime code generation and compiled DSLs
Lina: a fast design optimisation tool for software-based FPGA programming
linalg: Matrix Computations in Apache Spark
Line-art Illustration of Dynamic and Specular Surfaces
Linear Algebra Algorithms for Hybrid Architectures with XKaapi
Linear algebra operators for GPU implementation of numerical algorithms
Linear Feature Detection on GPUs
Linear genetic programming GPGPU on Microsoft’s Xbox 360
Linear optimization on modern GPUs
Linear Performance-Breakdown Model: A Framework for GPU kernel programs performance analysis
Linear Solvers for Stable Fluids: GPU vs CPU
Linearised inversion with GPUs
Linpack evaluation on a supercomputer with heterogeneous accelerators
linus: Conveniently explore, share, and present large-scale biological trajectory data from a web browser
liquidSVM: A Fast and Versatile SVM package
Liszt: A Domain Specific Language for Building Portable Mesh-based PDE Solvers
Literature Review and Implementation Overview: High Performance Computing with Graphics Processing Units for Classroom and Research Use
Literature review: Build and Travel KD-Tree with CUDA
Literature Review: Parallel Computing on linear equations of linear elastic FEM stimulation with CUDA
Titles: 100
open PDFs: 96
packages: 28