https://hgpu.org/?p=18435
Performance Evaluation and Tuning of An OpenCL based Matrix Multiplier