hgpu.org » AMD Radeon Instinct MI200
Muhammad Usman Tariq, Abhinav Jangda, Angelica Moreira, Madan Musuvathi, Tyler Sorensen
Tags: AI, AMD, AMD Radeon Instinct MI200, ATI, Computer science, CUDA, HIP, HLSL, LLM, NLP, nVidia, nVidia RTX A6000
December 29, 2025 by hgpu
Most viewed papers (last 30 days)
- CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning
- Accurate Models of NVIDIA Tensor Cores
- Microbenchmarking NVIDIA's Blackwell Architecture: An in-depth Architectural Analysis
- KernelBand: Boosting LLM-based Kernel Optimization with a Hierarchical and Hardware-aware Multi-armed Bandit
- QiMeng-Kernel: Macro-Thinking Micro-Coding Paradigm for LLM-Based High-Performance GPU Kernel Generation
- TritonForge: Profiling-Guided Framework for Automated Triton Kernel Optimization
- cuPilot: A Strategy-Coordinated Multi-agent Framework for CUDA Kernel Evolution
- GPU-Initiated Networking for NCCL
- Decoupled Triton: A Block-Level Decoupled Language for Writing and Exploring Efficient Machine-Learning Kernels
- ParallelKittens: Systematic and Practical Simplification of Multi-GPU AI Kernels