Performance evaluation of GPU memory hierarchy using the FFT
Computer Architecture Group (GAC), University of A Coruña (UDC), Spain
Proceedings of the 11th International Conference on Computational and Mathematical Methods in Science and Engineering (CMMSE ’11), 2011
@article{lobeiras2011performance,
title={Performance evaluation of GPU memory hierarchy using the FFT},
author={Lobeiras, Jacobo and Amor, Margarita and Doallo, Ramon},
year={2011}
}
Modern GPUs (Graphics Processing Units) are becoming more relevant in the world of HPC (High Performance Computing) thanks to their large computing power and relatively low cost; however, their special architecture results in more complex programming. To take advantage of their computing resources and develop efficient implementations, it is essential to have some knowledge of the architecture and memory hierarchy. In this paper we use the FFT (Fast Fourier Transform) as a benchmark tool to analyze different aspects of GPU architectures, such as the influence of the memory access pattern or the impact of register pressure. The FFT is a good tool for performance analysis because it is used in many real applications that require digital signal processing and has a good balance between computational cost and memory bandwidth requirements. The work presents a comparison of two CUDA architectures to analyze the evolution of the memory hierarchy, studying which solutions are the most efficient in each case.
March 30, 2012 by hgpu