hgpu.org » ROCm
Radostin Stoyanov, Viktória Spišaková, Jesus Ramos, Steven Gurfinkel, Andrei Vagin, Adrian Reber, Wesley Armour, Rodrigo Bruno
Tags: AMD Radeon Instinct MI210, ATI, Computer science, CUDA, Deep learning, nVidia, nVidia A100, nVidia H100, nVidia RTX A6000, Package, ROCm
March 3, 2025 by hgpu
* * *
Most viewed papers (last 30 days)
- KernelBench: Can LLMs Write Efficient GPU Kernels?
- A Microbenchmark Framework for Performance Evaluation of OpenMP Target Offloading
- Seamless acceleration of Fortran intrinsics via AMD AI engines
- The AI CUDA Engineer: Agentic CUDA Kernel Discovery, Optimization and Composition
- pyATF: Constraint-Based Auto-Tuning in Python
- TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators
- WgPy: GPU-accelerated NumPy-like array library for web browsers
- Evaluating the Performance of the DeepSeek Model in Confidential Computing Environment
- Forecasting time series with constraints
- CRIUgpu: Transparent Checkpointing of GPU-Accelerated Workloads
* * *