Inter-Warp Instruction Temporal Locality in Deep-Multithreaded GPUs
School of Electrical and Computer Engineering, University College of Engineering, University of Tehran, Tehran, Iran
26th International Conference on Architecture of Computing Systems (ARCS 2013), 2013
GPUs employ thousands of threads per core to achieve high throughput. These threads exhibit locality in control flow, in instruction and data addresses, and in values. In this study we investigate inter-warp instruction temporal locality and show that, during short intervals, a significant share of instructions are fetched unnecessarily. This observation provides several opportunities to enhance GPUs. We discuss different possibilities and evaluate a filter cache as a case study. Moreover, we investigate how variations in microarchitectural parameters impact the potential benefits of a filter cache in GPUs.
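As a rough illustration of the locality described above, the following sketch estimates what share of fetches in a trace repeat a program counter that a different warp already fetched within the same short interval. It is not the authors' methodology: the trace format (a sequence of `(warp_id, pc)` pairs), the function name `inter_warp_redundant_share`, and measuring the interval in number of fetches rather than cycles are all assumptions made for the example.

```python
from typing import Dict, Sequence, Set, Tuple


def inter_warp_redundant_share(
    trace: Sequence[Tuple[int, int]], interval: int
) -> float:
    """Share of fetches whose PC was already fetched by another warp
    earlier in the same interval.

    trace    -- hypothetical fetch trace of (warp_id, pc) pairs, in fetch order
    interval -- interval length in fetches (an assumption; cycles would also work)
    """
    if interval <= 0:
        raise ValueError("interval must be positive")

    redundant = 0
    # Maps each PC to the set of warps that fetched it in the current interval.
    seen: Dict[int, Set[int]] = {}

    for i, (warp_id, pc) in enumerate(trace):
        if i % interval == 0:
            seen = {}  # a new interval starts: forget earlier fetches
        warps = seen.setdefault(pc, set())
        # Redundant only if some *other* warp fetched this PC already.
        if warps - {warp_id}:
            redundant += 1
        warps.add(warp_id)

    return redundant / len(trace) if trace else 0.0


# Three warps walking the same code: later warps repeat earlier fetches.
example_trace = [(0, 100), (1, 100), (2, 100), (0, 104), (1, 104), (0, 108)]
share_long = inter_warp_redundant_share(example_trace, interval=6)
share_short = inter_warp_redundant_share(example_trace, interval=2)
```

On this toy trace, shrinking the interval lowers the measured share, since fewer repeated fetches fall inside the same window. A small structure such as a filter cache could, in principle, serve the repeated fetches that such a measurement exposes.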
January 17, 2013 by hgpu