https://hgpu.org/?p=8842
Locality-Aware Work Stealing on Multi-CPU and Multi-GPU Architectures