https://hgpu.org/?p=11984
Multireduce and Multiscan on Modern GPUs