hgpu.org » Kokkos
Performant Unified GPU Kernels for Portable Singular Value Computation Across Hardware and Precision
Evelyne Ringoot, Rabab Alomairy, Valentin Churavy, Alan Edelman
Tags: AMD Radeon Instinct MI250, Apple M1 Pro, ATI, Computer science, HIP, Intel, Intel Ponte Vecchio Max 1100, Kokkos, Linear Algebra, Machine learning, nVidia, nVidia A100, nVidia GeForce RTX 4060, nVidia H100, OpenCL, SYCL
August 17, 2025 by hgpu
Kim Liegeois, Brian Kelley, Eric Phipps, Sivasankaran Rajamanickam, Vassil Vassilev
Tags: AMD Radeon Instinct MI300X, ATI, Computer science, CUDA, HIP, Kokkos, Machine learning, Mathematical Software, nVidia, nVidia H100
August 3, 2025 by hgpu
* * *
Most viewed papers (last 30 days)
- An HPC Benchmark Survey and Taxonomy for Characterization
- Home-made Diffusion Model from Scratch to Hatch
- High Performance Matrix Multiplication
- Towards Robust Agentic CUDA Kernel Benchmarking, Verification, and Optimization
- Dato: A Task-Based Programming Model for Dataflow Accelerators
- TRUST: the HPC open-source CFD platform – from CPU to GPU
- Mojo: MLIR-Based Performance-Portable HPC Science Kernels on GPUs for the Python Ecosystem
- Towards Calculating HPC CUDA Kernel Performance on Nvidia GPUs
- Combining Performance and Productivity: Accelerating the Network Sensing Graph Challenge with GPUs and Commodity Data Science Software
- Towards GPU Parallelism Abstractions in Rust: A Case Study with Linear Pipelines
* * *