https://hgpu.org/?p=28703
Performance portability evaluation of blocked stencil computations on GPUs