https://hgpu.org/?p=12398
Evaluation of DGEMM Implementation on Intel Xeon Phi Coprocessor