https://hgpu.org/?p=13064
Locality-Aware Mapping of Nested Parallel Patterns on GPUs