https://hgpu.org/?p=12036
Fine-grain Task Aggregation and Coordination on GPUs