https://hgpu.org/?p=5677
A portable implementation of the radix sort algorithm in OpenCL