https://hgpu.org/?p=8085
An OpenCL Method of Parallel Sorting Algorithms for GPU Architecture