hgpu.org » AMD Radeon Vega VII
Nigel Tan
Tags: AMD Radeon Vega VII, ATI, Computer science, CUDA, Heterogeneous systems, HPC, nVidia, nVidia A100, nVidia DGX-A100, nVidia Quadro RTX 5000, OpenMP, OpenMPI, Particle-in-cell methods, performance portability, Tesla V100, Thesis
September 15, 2024 by hgpu



