https://hgpu.org/?p=12476
Combining Data Parallelism and Task Parallelism for Efficient Performance on Hybrid CPU and GPU Systems