AQsort: Scalable Multi-Array In-Place Sorting with OpenMP
Department of Computer Systems, Faculty of Information Technology, Czech Technical University in Prague, Thakurova 9, 160 00, Praha, Czech Republic
Scalable Computing: Practice and Experience, Volume 17, Number 4, pp. 369-391, 2016
@article{langr2016aqsort,
title={AQsort: Scalable Multi-Array In-Place Sorting with OpenMP},
author={Langr, Daniel and Tvrdik, Pavel and Simecek, Ivan},
journal={Scalable Computing: Practice and Experience},
volume={17},
number={4},
pages={369–391},
year={2016}
}
A new multi-threaded variant of the quicksort algorithm called AQsort and its C++/OpenMP implementation are presented. AQsort operates in place and was primarily designed for high-performance computing (HPC) runtime environments. It can work with multiple arrays at once; such a functionality is frequently required in HPC and cannot be accomplished with standard C pointer-based or C++ iterator-based approach. An extensive study is provided that evaluates AQsort experimentally and compares its performance with modern multi-threaded implementations of in-place and out-of-place sorting algorithms based on OpenMP, Cilk Plus, and Intel TBB. The measurements were conducted on several leading-edge HPC architectures, namely Cray XE6 nodes with AMD Bulldozer CPUs, Cray XC40 nodes with Intel Hasswell CPUs, IBM BlueGene/Q nodes, and Intel Xeon Phi coprocessors. The obtained results show that AQsort provides good scalability and sorting performance generally comparable to its competitors. In particular cases, the performance of AQsort may be slightly lower, which is the price for its universality and ability to work with substantially larger amounts of data.
October 29, 2016 by hgpu