https://hgpu.org/?p=3374
Efficient Discrete Range Searching primitives on the GPU with applications