Performance Evaluation of Blocking and NonBlocking Concurrent Queues on GPUs
University of Mississippi
University of Mississippi, 2019
@article{pourmeidani2019performance,
title={Performance Evaluation of Blocking and Non-Blocking Concurrent Queues on GPUs},
author={Pourmeidani, Hossein},
year={2019}
}
The efficiency of concurrent data structures is crucial to the performance of multithreaded programs in shared-memory systems. The arbitrary execution of concurrent threads, however, can result in an incorrect behavior of these data structures. Graphics Processing Units (GPUs) have appeared as a powerful platform for high-performance computing. As regular data-parallel computations are straightforward to implement on traditional CPU architectures, it is challenging to implement them in a SIMD environment in the presence of thousands of active threads on GPU architectures. In this thesis, we implement a concurrent queue data structure and evaluate its performance on GPUs to understand how it behaves in a massively-parallel GPU environment. We implement both blocking and non-blocking approaches and compare their performance and behavior using both micro-benchmark and real-world application. We provide a complete evaluation and analysis of our implementations on an AMD Radeon R7 GPU. Our experiment shows that non-blocking approach outperforms blocking approach by up to 15.1 times when sufficient thread-level parallelism is present.
October 13, 2019 by hgpu