hgpu.org » nVidia H100
Phuong Nguyen, Pratik Nayak, Hartwig Anzt
Tags: Computer science, CUDA, Intel, Intel Data Center GPU Max 1550, nVidia, nVidia A100, nVidia H100, Package, performance portability, Physics, SYCL
August 20, 2023 by hgpu
* * *
Most viewed papers (last 30 days)
- Accurate Models of NVIDIA Tensor Cores
- The Anatomy of a Triton Attention Kernel
- Microbenchmarking NVIDIA's Blackwell Architecture: An in-depth Architectural Analysis
- KernelBand: Boosting LLM-based Kernel Optimization with a Hierarchical and Hardware-aware Multi-armed Bandit
- ProofWright: Towards Agentic Formal Verification of CUDA
- QiMeng-Kernel: Macro-Thinking Micro-Coding Paradigm for LLM-Based High-Performance GPU Kernel Generation
- Inside VOLT: Designing an Open-Source GPU Compiler
- Iris: First-Class Multi-GPU Programming Experience in Triton
- TritonForge: Profiling-Guided Framework for Automated Triton Kernel Optimization
- AIvailable: A Software-Defined Architecture for LLM-as-a-Service on Heterogeneous and Legacy GPUs