https://hgpu.org/?p=13633
Converting Data-Parallelism to Task-Parallelism by Rewrites: Purely Functional Programs Across Multiple GPUs