https://hgpu.org/?p=26974
Optimizing the Performance of Parallel and Concurrent Applications Based on Asynchronous Many-Task Runtimes