Papers on hgpu.org (.txt-file)

Adaptive Simulation of Large-Scale Ocean Surface

Adaptive SpMV/SpMSpV on GPUs for Input Vectors of Varied Sparsity

Adaptive Task Size Control on High Level Programming for GPU/CPU Work Sharing

Adaptive Treelet Meshes for Efficient Streak-Surface Visualization on the GPU

Adaptive Video Encoding Based on OpenCL Face Recognition

Adaptive Work-Efficient Connected Components on the GPU

Adaptive, real-time visual simultaneous localization and mapping

Adaptivity in AdaptiveCpp: Optimizing Performance by Leveraging Runtime Information During JIT-Compilation

Adding fault tolerance to OpenCL: Through redundant heterogeneous computing

Adding GPU Computing to Computer Organization Courses

Adding special-purpose processor support to the Erlang VM

Address Selection for Efficient Barriers on the Intel Xeon Phi

Addressing Challenges in Utilizing GPUs for Accelerating Privacy-Preserving Computation

ADHA: Automatic Data layout framework for Heterogeneous Architectures

Adhoc On-Demand Distance Vector Protocol For Energy Efficiency

Adjoint Algorithmic Differentiation of a GPU Accelerated Application

Adjoint Lattice Boltzmann for Topology Optimization on multi-GPU architecture

Adjustable GPU Acceleration for Hermitian Eigensystems

ADMM-NN: An Algorithm-Hardware Co-Design Framework of DNNs Using Alternating Direction Method of Multipliers

Advanced 2D Rasterization on Modern CPUs

Advanced Architectures for Astrophysical Supercomputing

Advanced CFD Modeling Using GeForce GPUs

Advanced Concurrency Control Algorithm Design and GPU System Support for High Performance In-Memory Data Management

Advanced illumination techniques for GPU volume raycasting

Advanced MRI reconstruction toolbox with accelerating on GPU

Advanced Multi-Frame Rate Rendering Techniques

Advanced Optimization Techniques for Sparse Grids on Modern Heterogeneous Systems

Advanced Optimizations of An Implicit Navier-Stokes Solver on GPGPU

Advanced Programming Platform for efficient use of Data Parallel Hardware

Advanced Simulation Library: Expanding software ecosystem for the DSP/FPGA/GPU market

Advanced Techniques for the Rendering and Visualization of Volumetric Seismic Data

Advanced Trends of Heterogeneous Computing with CPU-GPU Integration: Comparative Study

Advanced ultrasound beam forming using GPGPU technology

Advanced Video Coding on CPUs and GPUs: Parallelization and RD Analysis

Advances in Electron Microscopy with Deep Learning

Advances in Semantic Patching for HPC-oriented Refactorings with Coccinelle

Advancing Large Scale Many-Body QMC Simulations on GPU Accelerated Multicore Systems

Advancing the distributed Multi-GPU ChASE library through algorithm optimization and NCCL library

Advantages and GPU implementation of high-performance indexed DNA search based on suffix arrays

Adventures in the microlensing cloud: large datasets, eResearch tools, and GPUs

ADWPNAS: Architecture-Driven Weight Prediction for Neural Architecture Search

AeminiumGPU: A CPU-GPU Hybrid Runtime for the Aeminium Language

AeminiumGPU: An Intelligent Framework for GPU Programming

Aeolian Sand Movement and Interacting with Vegetation: A GPU Based Simulation and Visualization Method

AES Algorithm Adapted on GPU Using CUDA for Small Data and Large Data Volume Encryption

AES and DES Encryption with GPU

AES Encryption Algorithm Based on the High Performance Computing of GPU

AES Encryption and Decryption Using Direct3D 10 API

AES Encryption Implementation and Analysis on Commodity Graphics Processing Units

AES Encryption Implementation on CUDA GPU and Its Analysis

AES encryption on modern consumer architectures

AES finalists implementation for GPU and multi-core CPU based on OpenCL

AES on GPU: a CUDA Implementation

Affine Vector Cache for memory bandwidth savings

AFiD-GPU: a versatile Navier-Stokes Solver for Wall-Bounded Turbulent Flows on GPU Clusters

AFOCL: Portable OpenCL Programming of FPGAs via Automated Built-in Kernel Management

Age and Gender Classification using Convolutional Neural Networks

Ageing at the Spin-Glass/Ferromagnet Transition: Monte Carlo Simulation using GPUs

Agent-based crowd simulation using GPU computing

Agent-Based Modeling on High Performance Computing Architectures

AGFT: An Adaptive GPU Frequency Tuner for Real-Time LLM Inference Optimization

Aggregate Gaze Visualization with Real-time Heatmaps

Aging in the three-dimensional Random Field Ising Model

AI Benchmark: All About Deep Learning on Smartphones in 2019

AI Benchmark: Running Deep Neural Networks on Android Smartphones

AI Factories: It’s time to rethink the Cloud-HPC divide

AIPerf: Automated machine learning as an AI-HPC benchmark

Air pollution modelling using a graphics processing unit with CUDA

Airborne Downward Looking Sparse Linear Array 3-D SAR Heterogeneous Parallel Simulation

Airborne radar clutter simulation using GPU (CUDA)

AIvailable: A Software-Defined Architecture for LLM-as-a-Service on Heterogeneous and Legacy GPUs

Akid: A Library for Neural Network Research and Production from a Dataism Approach

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

Algebraic 3D Reconstruction of Planetary Nebulae

Algebraic Splats Representation for Point Based Models

Algorithm 9xx: Sparse QR Factorization on the GPU

Algorithm Acceleration from GPGPUs for the ATLAS Upgrade

Algorithm and implementation of multi-channel spike sorting using GPU in a home-care surveillance system

Algorithm Construction for GPGPU

Algorithm for Sparse Approximate Inverse Preconditioners in the Conjugate Gradient Method

Algorithmic and Software System Support to Accelerate Data Processing in CPU-GPU Hybrid Computing Environments

Algorithmic Contributions to the Theory of Regular Chains

Algorithmic Differentiation: Application to Variational Problems in Computer Vision

Algorithmic GPGPU Memory Optimization

Algorithmic performance studies on graphics processing units

Algorithmic Skeleton Framework for the Orchestration of GPU Computations

Algorithmic Trading: A brief, computational finance case study on data centre FPGAs

Algorithms acceleration of pattern-matching in multi-core architectures

Algorithms and Data Structures for Interactive Ray Tracing on Commodity Hardware

Algorithms and Heuristics for Scalable Betweenness Centrality Computation on Multi-GPU Systems

Algorithms for Compression on GPUs

Algorithms for Large-Scale Power Delivery Network Analysis on Massively Parallel Architectures

Algorithms for manipulating large geometric data

Algorithms for Rapid Characterization and Optimization of Aperture and Reflector Antennas

Algorithms for representation of 3D regions in radiotherapy planning software

Algorithms for Solving Non-Stationary Heat Conduction Problem for Design of a Technical Device

Algorithms for the mapping of genome sequences in GPGPU

Titles: 100
open PDFs: 93
packages: 19
