https://hgpu.org/?p=12735
Performance Portability Study of Linear Algebra Kernels in OpenCL