
The Sharing Tracker: Using Ideas from Cache Coherence Hardware to Reduce Off-Chip Memory Traffic with Non-Coherent Caches

David Tarjan, Kevin Skadron
Department of Computer Science, University of Virginia, Charlottesville, VA 22904, USA
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2010

@article{tarjan2010sharing,
   title={The Sharing Tracker: Using Ideas from Cache Coherence Hardware to Reduce Off-Chip Memory Traffic with Non-Coherent Caches},
   author={Tarjan, D. and Skadron, K.},
   journal={sc},
   pages={1--10},
   year={2010},
   publisher={IEEE Computer Society}
}


Graphics Processing Units (GPUs) have recently emerged as a new platform for high-performance, general-purpose computing. Because current GPUs employ deep multithreading to hide latency, they have only small, per-core caches to capture reuse and eliminate unnecessary off-chip accesses. This paper shows that for general-purpose workloads, the ability to copy cache lines between private caches captures inter-core temporal locality and substantially reduces off-chip bandwidth requirements. Unlike hardware cache coherence, a sharing tracker needs to track cache lines in the private caches only imprecisely, because it serves purely as a performance hint. This simplifies the implementation and is so effective at capturing inter-core reuse that the L2 can be eliminated entirely. The sharing tracker is motivated by, but not specific to, the GPU and could be used in other manycore organizations.
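The idea of an imprecise tracker used only as a hint can be illustrated with a toy model. The sketch below is not the paper's design: the direct-mapped table, the LRU private caches, the read-only access pattern, and every name (`PrivateCache`, `SharingTracker`, `Chip`) are assumptions made for illustration. It shows only the property the abstract relies on: a stale or missing tracker entry costs an extra off-chip access but never affects correctness.

```python
from collections import OrderedDict


class PrivateCache:
    """Toy per-core LRU cache holding line addresses only (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def lookup(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)  # refresh LRU position
            return True
        return False

    def fill(self, addr):
        self.lines[addr] = True
        self.lines.move_to_end(addr)
        if len(self.lines) > self.capacity:
            # Evict silently: the tracker is NOT notified, so it can go stale.
            self.lines.popitem(last=False)


class SharingTracker:
    """Assumed direct-mapped hint table: slot -> (addr, last core to fill)."""

    def __init__(self, slots):
        self.slots = [None] * slots

    def hint(self, addr):
        entry = self.slots[addr % len(self.slots)]
        if entry is not None and entry[0] == addr:
            return entry[1]
        return None

    def record(self, addr, core):
        # Overwrites whatever was in the slot; lost entries are acceptable.
        self.slots[addr % len(self.slots)] = (addr, core)


class Chip:
    """Read-only model: private caches plus a tracker, with no shared L2."""

    def __init__(self, cores, cache_lines, tracker_slots):
        self.caches = [PrivateCache(cache_lines) for _ in range(cores)]
        self.tracker = SharingTracker(tracker_slots)
        self.off_chip = 0
        self.on_chip_copies = 0

    def read(self, core, addr):
        if self.caches[core].lookup(addr):
            return "hit"
        owner = self.tracker.hint(addr)
        if owner is not None and owner != core and self.caches[owner].lookup(addr):
            self.on_chip_copies += 1  # line copied from another private cache
            source = "copy"
        else:
            self.off_chip += 1  # hint missing or stale: fall back to memory
            source = "memory"
        self.caches[core].fill(addr)
        self.tracker.record(addr, core)
        return source


chip = Chip(cores=2, cache_lines=2, tracker_slots=8)
print(chip.read(0, 10), chip.read(1, 10), chip.read(1, 10))
```

In this model the second core obtains line 10 from the first core's cache rather than from memory. Because evictions never update the tracker, a later lookup may point at a core that no longer holds the line, in which case the read simply goes off-chip.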
