https://hgpu.org/?p=17025
An Efficient Multiway Mergesort for GPU Architectures