https://hgpu.org/?p=3667
Pretty Good Accuracy in Matrix Multiplication with GPUs