hgpu.org » Triton
Jianghui Wang, Vinay Joshi, Saptarshi Majumder, Xu Chao, Bin Ding, Ziqiong Liu, Pratik Prabhanjan Brahma, Dong Li, Zicheng Liu, Emad Barsoum
Tags: AMD Radeon Instinct MI300X, ATI, Benchmarking, Code generation, Computer science, Deep learning, Package, Python, ROCm, Triton
August 3, 2025 by hgpu