https://hgpu.org/?p=17099
Pipelined MapReduce: A Decoupled MapReduce RunTime for Shared Memory Multi-Processors