hgpu.org » ROCm
Timothée David--Cléris, Guillaume Laibe, Yona Lapeyre
Tags: AMD, AMD Radeon Instinct MI250X, Astrophysics, CUDA, MPI, nVidia, nVidia A100, OpenMP, Package, Physics, PTX, ROCm, SYCL
March 23, 2025 by hgpu
Radostin Stoyanov, Viktória Spišaková, Jesus Ramos, Steven Gurfinkel, Andrei Vagin, Adrian Reber, Wesley Armour, Rodrigo Bruno
Tags: AMD Radeon Instinct MI210, ATI, Computer science, CUDA, Deep learning, nVidia, nVidia A100, nVidia H100, nVidia RTX A6000, Package, ROCm
March 3, 2025 by hgpu
* * *
Most viewed papers (last 30 days)
- A Microbenchmark Framework for Performance Evaluation of OpenMP Target Offloading
- KernelBench: Can LLMs Write Efficient GPU Kernels?
- The AI CUDA Engineer: Agentic CUDA Kernel Discovery, Optimization and Composition
- Seamless acceleration of Fortran intrinsics via AMD AI engines
- pyATF: Constraint-Based Auto-Tuning in Python
- TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators
- WgPy: GPU-accelerated NumPy-like array library for web browsers
- Evaluating the Performance of the DeepSeek Model in Confidential Computing Environment
- Forecasting time series with constraints
- CRIUgpu: Transparent Checkpointing of GPU-Accelerated Workloads
* * *