https://hgpu.org/?p=7394
A Programmable Processing Array Architecture Supporting Dynamic Task Scheduling and Module-Level Prefetching