https://hgpu.org/?p=9011
Solutions For Optimizing The Radix Sort Algorithmic Function Using The Compute Unified Device Architecture