A Survey of Techniques for Architecting and Managing GPU Register File
Oak Ridge National Laboratory
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016
@article{ref77,
title={A Survey of Techniques for Architecting and Managing GPU Register File},
year={2016},
author={Sparsh Mittal},
journal={IEEE Transactions on Parallel and Distributed Systems (TPDS)},
keywords={Review, Classification, GPGPU, GPU, Register file, Reliability, Performance, Power management, Non-volatile memory, Embedded DRAM, eDRAM}
}
To support their massively multithreaded architecture, GPUs use a very large register file (RF), whose capacity exceeds even that of the L1 and L2 caches. In sharp contrast, traditional CPUs use a tiny RF and much larger caches to optimize latency. Because of these differences, and because the RF has a crucial impact on GPU performance, novel and intelligent techniques are required for managing the GPU RF. In this paper, we survey techniques for designing and managing the GPU RF. We discuss techniques related to the performance, energy, and reliability aspects of the RF. To emphasize the similarities and differences between the techniques, we classify them along several parameters. The aim of this paper is to synthesize state-of-the-art developments in RF management and to stimulate further research in this area.
March 22, 2016 by sparsh0mittal