https://hgpu.org/?p=3649
Design and implementation of software-managed caches for multicores with local memory