https://hgpu.org/?p=12422
Toward Auto-tuned Krylov Basis Computations with Minimized Communication on Clusters of Accelerators