Efficient implementation of GPGPU synchronization primitives on CPUs

Jayanth Gummaraju, Ben Sander, Laurent Morichetti, Benedict Gaster, Lee Howes
Advanced Micro Devices, Sunnyvale, CA, USA
Proceedings of the 7th ACM international conference on Computing frontiers, CF ’10, 2010

@inproceedings{gummaraju2010efficient,
   title={Efficient implementation of GPGPU synchronization primitives on CPUs},
   author={Gummaraju, J. and Sander, B. and Morichetti, L. and Gaster, B. and Howes, L.},
   booktitle={Proceedings of the 7th ACM international conference on Computing frontiers},
   pages={85--86},
   year={2010},
   organization={ACM}
}


The GPGPU model represents a style of execution where thousands of threads execute in a data-parallel fashion, with a large subset (typically tens to hundreds) needing frequent synchronization. As the GPGPU model evolves to treat both GPUs and CPUs as acceleration targets, thread synchronization becomes an important problem when running on CPUs. CPUs have little hardware support for this kind of synchronization, so it must be emulated in software, which reduces application performance. This paper presents software techniques to implement the GPGPU synchronization primitives on CPUs while maintaining application debuggability. Performing limit studies using real hardware, we evaluate the potential performance benefits of an efficient barrier primitive.
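
To make the synchronization problem concrete, here is a minimal sketch of the naive way to emulate a work-group barrier on a CPU: one OS thread per work-item and a software barrier between two phases of a kernel. It is not the technique the paper proposes; it only illustrates the kind of software emulation the abstract describes as costly. The names `run_work_group`, `rotate_kernel`, `local_mem`, and `out` are hypothetical and chosen for illustration.

```python
import threading


def run_work_group(kernel, group_size):
    """Run `kernel` once per work-item, one OS thread each (naive emulation)."""
    barrier = threading.Barrier(group_size)  # software stand-in for a GPU work-group barrier
    local_mem = [0] * group_size             # stand-in for work-group local memory
    out = [0] * group_size
    threads = [
        threading.Thread(target=kernel, args=(i, group_size, local_mem, out, barrier))
        for i in range(group_size)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


def rotate_kernel(local_id, group_size, local_mem, out, barrier):
    """Hypothetical kernel: write own slot, synchronize, then read a neighbour's slot."""
    local_mem[local_id] = local_id * 10      # phase 1: each work-item writes its own slot
    barrier.wait()                           # every write must complete before any read
    out[local_id] = local_mem[(local_id + 1) % group_size]  # phase 2: read the next slot


result = run_work_group(rotate_kernel, 8)
```

Without the `barrier.wait()` call, a work-item could read its neighbour's slot before the neighbour has written it. With it, every thread blocks in the operating system until the whole group arrives, and that per-barrier cost grows quickly when groups contain tens to hundreds of work-items.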
