hgpu.org » Network communication
Michael Mandulak, Sayan Ghosh, S M Ferdous, Mahantesh Halappanavar, George Slota
October 19, 2025 by hgpu
Zhiyi Hu, Siyuan Shen, Tommaso Bonato, Sylvain Jeaugey, Cedell Alexander, Eric Spada, Jeff Hammond, Torsten Hoefler
July 13, 2025 by hgpu
* * *
Most viewed papers (last 30 days)
- The Anatomy of a Triton Attention Kernel
- Accurate Models of NVIDIA Tensor Cores
- Microbenchmarking NVIDIA's Blackwell Architecture: An in-depth Architectural Analysis
- KernelBand: Boosting LLM-based Kernel Optimization with a Hierarchical and Hardware-aware Multi-armed Bandit
- An MLIR pipeline for offloading Fortran to FPGAs via OpenMP
- QiMeng-Kernel: Macro-Thinking Micro-Coding Paradigm for LLM-Based High-Performance GPU Kernel Generation
- ProofWright: Towards Agentic Formal Verification of CUDA
- Inside VOLT: Designing an Open-Source GPU Compiler
- Iris: First-Class Multi-GPU Programming Experience in Triton
- AIvailable: A Software-Defined Architecture for LLM-as-a-Service on Heterogeneous and Legacy GPUs