https://hgpu.org/?p=9305
Improving Numerical Accuracy for Non-Negative Matrix Multiplication on GPUs using Recursive Algorithms