https://hgpu.org/?p=9079
Improving Performance Portability in OpenCL Programs