Understanding and Modeling the Synchronization Cost in the GPU Architecture

James T. Letendre
Rochester Institute of Technology, Kate Gleason College of Engineering
Rochester Institute of Technology, 2013







Graphics Processing Units (GPUs) are increasingly used for general-purpose computation. GPUs are massively parallel processors, which makes them a better fit than the CPU for many algorithms. The drawback is that they are much less efficient at running algorithms with complex control flow. This has led to their use as part of a heterogeneous system, usually consisting of a CPU and a GPU, although other types of processors could be added. Models of GPUs are important for determining how well code will perform on different GPUs, especially those to which the programmer does not have access. GPU prices range from hundreds of dollars to over $2,000, so when designing a system with a particular performance target in mind, it is beneficial to be able to determine which GPU best meets that target without wasting money on unneeded performance. Current GPU models were developed for older generations of GPU architectures, ignore certain costs that are present in the GPU, or account for those costs inaccurately. The largest component ignored by most of the models investigated is the synchronization cost. This cost arises when threads within the GPU need to share data among themselves. To ensure that the shared data is correct, the threads must synchronize so that all of them have written to memory before any thread tries to read. The synchronization cost is also the cause of major inaccuracies in the most up-to-date GPU model found. This thesis aims to understand the factors that contribute to the synchronization cost through the use of microbenchmarks. With this understanding, the accuracy of the model can be improved.
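
The write-then-synchronize-then-read pattern described in the abstract can be illustrated with a minimal CUDA sketch. This is not code from the thesis: the kernel name `reverseInBlock`, the constant `kBlockSize`, and the choice of reversing an array are assumptions made purely for illustration. Each thread writes one value to shared memory, waits at the `__syncthreads()` barrier, and only then reads a value written by a different thread in the same block.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical block size chosen for illustration only.
constexpr int kBlockSize = 256;

// Each thread writes one element to shared memory, then reads the element
// written by a different thread. The barrier separates the two phases.
__global__ void reverseInBlock(const int *in, int *out) {
    __shared__ int tile[kBlockSize];
    int t = threadIdx.x;
    int base = blockIdx.x * blockDim.x;

    tile[t] = in[base + t];                    // phase 1: every thread writes
    __syncthreads();                           // wait for all writes in this block
    out[base + t] = tile[blockDim.x - 1 - t];  // phase 2: read another thread's value
}

int main() {
    const int n = kBlockSize;
    int h_in[n], h_out[n];
    for (int i = 0; i < n; ++i) { h_in[i] = i; h_out[i] = -1; }

    int *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, n * sizeof(int));
    cudaMemcpy(d_in, h_in, n * sizeof(int), cudaMemcpyHostToDevice);

    // One block, so every thread shares the same tile.
    reverseInBlock<<<1, kBlockSize>>>(d_in, d_out);
    cudaError_t err = cudaMemcpy(h_out, d_out, n * sizeof(int),
                                 cudaMemcpyDeviceToHost);

    // Check that element i now holds the value originally at n - 1 - i.
    bool ok = (err == cudaSuccess);
    for (int i = 0; i < n && ok; ++i) ok = (h_out[i] == n - 1 - i);
    printf("%s\n", ok ? "reversed correctly" : "mismatch or CUDA error");

    cudaFree(d_in);
    cudaFree(d_out);
    return ok ? 0 : 1;
}
```

Without the barrier, a thread could read `tile[blockDim.x - 1 - t]` before the thread that owns that slot has written it. The time threads spend waiting at the barrier is the kind of synchronization cost the abstract refers to. This sketch only demonstrates correctness and does not attempt to measure that cost.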
