https://hgpu.org/?p=7357
Enabling and Scaling Matrix Computations on Heterogeneous Multi-Core and Multi-GPU Systems