https://hgpu.org/?p=11787
Experiments with Massively Parallel Matrix Multiplication