https://hgpu.org/?p=2055
Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures