Faster Radix Sort via Virtual Memory and Write-Combining

Jan Wassenberg, Peter Sanders
Fraunhofer IOSB
arXiv:1008.2849 [cs.DS] (17 Aug 2010)


   title={Faster Radix Sort via Virtual Memory and Write-Combining},

   author={Wassenberg, J. and Sanders, P.},

   journal={Arxiv preprint arXiv:1008.2849},



Download Download (PDF)   View View   Source Source   



Sorting algorithms are the deciding factor for the performance of common operations such as removal of duplicates or database sort-merge joins. This work focuses on 32-bit integer keys, optionally paired with a 32-bit value. We present a fast radix sorting algorithm that builds upon a microarchitecture-aware variant of counting sort. Taking advantage of virtual memory and making use of write-combining yields a per-pass throughput corresponding to at least 88 % of the system’s peak memory bandwidth. Our implementation outperforms Intel’s recently published radix sort by a factor of 1.5. It also compares favorably to the reported performance of an algorithm for Fermi GPUs when data-transfer overhead is included. These results indicate that scalar, bandwidth-sensitive sorting algorithms remain competitive on current architectures. Various other memory-intensive applications can benefit from the techniques described herein.
No votes yet.
Please wait...

* * *

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: