https://hgpu.org/?p=2413
Evaluation and tuning of the Level 3 CUBLAS for graphics processors