Papers on hgpu.org (.txt-file)

Kokkos: Enabling performance portability across manycore architectures

Krylov Subspace Accelerated Algebraic Multigrid for Mimetic Finite Differences on GPUs

KUDA: GPU Accelerated Split Race Checker

LAMDA: Learning-Assisted Multi-Stage Autotuning for FPGA Design Closure

LAMMPS’ PPPM Long-Range Solver for the Second Generation Xeon Phi

LAMMPScuda – a new GPU accelerated Molecular Dynamics Simulations Package and its Application to Ion-Conducting Glasses

Landau Gauge Fixing on GPUs and String Tension

Langevin dynamics simulations of biomolecules on graphics processors

Language Modeling with Gated Convolutional Networks

Language virtualization for heterogeneous parallel computing

Large calculation of the flow over a hypersonic vehicle using a GPU

Large data visualization on distributed memory multi-GPU clusters

Large Integer Arithmetic in GPU for Cryptography

Large Language Model Powered C-to-CUDA Code Translation: A Novel Auto-Parallelization Framework

Large neighborhood local search optimization on graphics processing units

Large scale 3D shape retrieval by exploiting multi-core and GPU

Large Scale Artificial Neural Network Training Using Multi-GPUs

Large Scale Bioinformatics Data Mining with Parallel Genetic Programming on Graphics Processing Units

Large Scale DNA Sequence Alignment and Kernel Method Implemented with GPUs

Large Scale Finite Element Analysis Using GPU Parallel Computing

Large Scale GPU Accelerated PPMLR-MHD Simulations for Space Weather Forecast

Large Scale GPU Based Simulations of Turbulent Bubbly Flow in a Square Duct

Large Scale Language Modeling: Converging on 40GB of Text in Four Hours

Large Scale Monte Carlo Tree Search on GPU

Large scale parallel state space search utilizing graphics processing units and solid state disks

Large Scale Physical Modeling Sound Synthesis

Large Scale Plane Wave Pseudopotential Density Functional Theory Calculations on GPU Clusters

Large Scale Simulations of the Euler Equations on GPU Clusters

Large Speed Increase Using Novel GPU Based Algorithms to Simulate Cardiac Excitation Waves in a Rabbit Ventricle

Large steps in GPU-based deformable bodies simulation

Large-eddy simulations with ClimateMachine: a new open-source code for atmospheric simulations on GPUs and CPUs

Large-Scale Compute-Intensive Analysis via a Combined In-Situ and Co-Scheduling Workflow Approach

Large-Scale Data Computing Performance Comparisons on SYCL Heterogeneous Parallel Processing Layer Implementations

Large-Scale Deep Learning on the YFCC100M Dataset

Large-scale deep unsupervised learning using graphics processors

Large-Scale DNS of Gas-Solid Flow on Mole-8.5

Large-scale ferrofluid simulations on graphics processing units

Large-scale FFT on GPU clusters

Large-Scale Geospatial Processing on Multi-Core and Many-Core Processors: Evaluations on CPUs, GPUs and MICs

Large-Scale High-Lundquist Number Reduced MHD Simulations of the Solar Corona Using GPU Accelerated Machines

Large-scale image analysis using docker sandboxing

Large-scale mixer simulations using massively parallel GPU architectures

Large-scale Monte Carlo simulation of two-dimensional classical XY model using multiple GPUs

Large-Scale Motion Modelling using a Graphical Processing Unit

Large-scale multi-dimensional document clustering on GPU clusters

Large-scale Nanostructure Simulations from X-ray Scattering Data On Graphics Processor Clusters

Large-scale network simulation over heterogeneous computing architecture

Large-Scale Paralleled Sparse Principal Component Analysis

Large-Scale Physics-Based Terrain Editing Using Adaptive Tiles on the GPU

Large-Scale Sound Field Rendering in Rectangular Room with Specular Reflection

Large-Scale Stereo Display Wall Using Programmable Graphics Hardware

Large-Scale Stochastic Learning using GPUs

Large-scale transient stability simulation on graphics processing units

Large-scale Virtual Acoustics Simulation at Audio Rates Using Three Dimensional Finite Difference Time Domain and Multiple GPUs

Large, Pruned or Continuous Space Language Models on a GPU for Statistical Machine Translation

Larrabee: a many-core x86 architecture for visual computing

Latency considerations of depth-first GPU ray tracing

Lattice Based Volumetric Global Illumination

Lattice Boltzmann based PDE solver on the GPU

Lattice Boltzmann Method for Simulating Turbulent Flows

Lattice Boltzmann Simulation of Binary Mixture Diffusion Using Modern Graphics Processors

Lattice Boltzmann Simulations of Multiphase Flows

Lattice Boltzmann simulations of the permeability and capillary adsorption of cement model microstructures

Lattice Boltzmann Simulations on a GPU: An optimization approach using C++ AMP

Lattice Group Models: GPU Acceleration and Numerics

Lattice QCD on new chips: a community summary

Lattice QCD simulations using the OpenACC platform

Lattice QCD with Domain Decomposition on Intel Xeon Phi Co-Processors

Lattice Quantum Chromodynamics on Intel Xeon Phi based supercomputers

Lattice Simulations using OpenACC compilers

Lattice-based flow field modeling

Lattice-Boltzmann Simulation of the Shallow-Water Equations with Fluid-Structure Interaction on Multi- and Manycore Processors

Lattice-Boltzmann simulation of the shallow-water equations with fluid-structure interaction on multi- and manycore processors

Launch-time Optimization of OpenCL Kernels

Layered Interpretation of Street View Images

LazyTensor: combining eager execution with domain-specific compilers

LBCL: multi-device automatic load balancing

LBM based flow simulation using GPU computing processor

LDetector: A Low Overhead Race Detector For GPU Programs

Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models

Learnergy: Energy-based Machine Learners

Learning a Metric Embedding for Face Recognition using the Multibatch Method

Learning Better Encoding for Approximate Nearest Neighbor Search with Dictionary Annealing

Learning Blood Management in Orthopedic Surgery through Gameplay

Learning hash codes for efficient content reuse detection

Learning Massive Graph Embeddings on a Single Machine

Learning Random Forests on the GPU

Learning Representation for Scene Understanding: Epitomes, CRFs, and CNNs

Learning Sparse Recurrent Neural Networks in Language Modeling

Learning Structured Sparsity in Deep Neural Networks

Titles: 100
open PDFs: 95
packages: 17
