https://hgpu.org/?p=7381
Nested Data-Parallelism on the GPU