https://hgpu.org/?p=22887
Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs