Performance Drawbacks for Matrix Multiplication using Set Associative Cache in GPU devices

Leonid Djinevski, Sime Arsenovski, Sasko Ristov, Marjan Gusev
FON University, 1000 Skopje, Macedonia
36th International Convention MIPRO, 2013

@article{djinevski2013performance,
   title={Performance Drawbacks for Matrix Multiplication using Set Associative Cache in GPU devices},
   author={Djinevski, Leonid and Arsenovski, Sime and Ristov, Sasko and Gusev, Marjan},
   year={2013}
}


The performance of shared-memory processors shows negative impulses (drawbacks) in certain regions when executing the basic matrix multiplication algorithm. In this paper we continue our analysis of the GPU memory hierarchy and the corresponding cache memory organization. We give a theoretical analysis of why a negative performance impulse appears for specific problem sizes. The main reason is the cache storage organization: the negative performance peak is caused by matrix elements being mapped onto a single cache set instead of using the whole cache. The obtained experimental results confirm our theoretical analysis. We also propose a method to avoid situations where performance drawbacks appear.
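The set-mapping effect described in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical cache with 128-byte lines and 32 sets, 4-byte elements, and row-major storage; none of these parameters come from the paper, and the function name `sets_touched_by_column` is invented for illustration. The sketch counts how many distinct cache sets are touched when walking down one matrix column.

```python
def sets_touched_by_column(rows, leading_dim, elem_size=4,
                           line_size=128, num_sets=32):
    """Count distinct cache sets hit when reading one column of a
    row-major matrix. All cache parameters are illustrative assumptions."""
    touched = set()
    for i in range(rows):
        # Byte offset of element (i, 0); consecutive column elements
        # are separated by one full row (leading_dim elements).
        addr = i * leading_dim * elem_size
        # Standard set-associative mapping: line number modulo set count.
        touched.add((addr // line_size) % num_sets)
    return len(touched)


# Row stride of 1024 floats = 4096 bytes = exactly 32 lines, so every
# column element lands in the same set under these assumptions.
print(sets_touched_by_column(1024, 1024))  # 1
# A stride that is not a multiple of (line_size * num_sets) spreads out.
print(sets_touched_by_column(1000, 1000))  # 32
# Padding the leading dimension by one element also spreads the accesses.
print(sets_touched_by_column(1024, 1025))  # 32
```

Under these assumed parameters, a problem size whose row stride is a multiple of the line size times the number of sets confines a column to one set, so only the associativity of that single set is available instead of the whole cache. Padding the leading dimension is a generic workaround shown here only for contrast; it is not necessarily the avoidance method the authors propose.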