https://hgpu.org/?p=3456
Many-Thread Aware Prefetching Mechanisms for GPGPU Applications