https://hgpu.org/?p=8965
Performance Upper Bound Analysis and Optimization of SGEMM on Fermi and Kepler GPUs