https://hgpu.org/?p=5803
Multi-core programming with OpenCL: performance and portability: OpenCL in a memory bound scenario