https://hgpu.org/?p=2337
Efficient gather and scatter operations on graphics processors