
Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories

Muthu Manikandan Baskaran, Uday Bondhugula, Sriram Krishnamoorthy, J. Ramanujam, Atanas Rountev, P. Sadayappan
Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Ave. Columbus, OH, USA
In Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (2008), pp. 1-10

@conference{baskaran2008automatic,
   title={Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories},
   author={Baskaran, M.M. and Bondhugula, U. and Krishnamoorthy, S. and Ramanujam, J. and Rountev, A. and Sadayappan, P.},
   booktitle={Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming},
   pages={1--10},
   year={2008},
   organization={ACM}
}


Several parallel architectures such as GPUs and the Cell processor have fast explicitly managed on-chip memories, in addition to slow off-chip memory. They also have very high computational power with multiple levels of parallelism. A significant challenge in programming these architectures is to effectively exploit the parallelism available in the architecture and manage the fast memories to maximize performance. In this paper we develop an approach to effective automatic data management for on-chip memories, including creation of buffers in on-chip (local) memories for holding portions of data accessed in a computational block, automatic determination of array access functions of local buffer references, and generation of code that moves data between slow off-chip memory and fast local memories. We also address the problem of mapping computation in regular programs to multi-level parallel architectures using a multi-level tiling approach, and study the impact of on-chip memory availability on the selection of tile sizes at various levels. Experimental results on a GPU demonstrate the effectiveness of the proposed approach.
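The three data-management steps named in the abstract (creating a local buffer for the data touched by a computational block, rewriting array accesses to index that buffer, and generating copy code between slow and fast memory) can be illustrated with a small sketch. This is not the code the paper's framework generates. It is a minimal Python illustration, assuming a tiled matrix multiplication in which plain lists stand in for off-chip arrays and on-chip buffers. The names `copy_in`, `tiled_matmul`, and `tile` are hypothetical and chosen only for this example.

```python
def copy_in(global_arr, row0, col0, rows, cols):
    """Copy a rows x cols tile starting at (row0, col0) into a new local buffer.

    The returned list plays the role of a buffer in fast on-chip memory.
    """
    return [[global_arr[row0 + r][col0 + c] for c in range(cols)]
            for r in range(rows)]


def tiled_matmul(A, B, tile=2):
    """Compute C = A * B one tile at a time, using explicit local buffers."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0] * m for _ in range(n)]

    # Outer tile loops: on a GPU these would typically map to thread blocks.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            ti = min(tile, n - i0)  # tile height (smaller at the boundary)
            tj = min(tile, m - j0)  # tile width (smaller at the boundary)

            # Local accumulator buffer for this tile of C.
            local_C = [[0] * tj for _ in range(ti)]

            for k0 in range(0, k, tile):
                tk = min(tile, k - k0)

                # Move-in code: slow "off-chip" arrays to fast local buffers.
                local_A = copy_in(A, i0, k0, ti, tk)
                local_B = copy_in(B, k0, j0, tk, tj)

                # Intra-tile loops: these would map to threads within a block.
                for i in range(i0, i0 + ti):
                    for j in range(j0, j0 + tj):
                        for kk in range(k0, k0 + tk):
                            # Local access function: the global reference
                            # A[i][kk] becomes local_A[i - i0][kk - k0],
                            # i.e. the index is shifted by the tile origin.
                            local_C[i - i0][j - j0] += (
                                local_A[i - i0][kk - k0]
                                * local_B[kk - k0][j - j0]
                            )

            # Move-out code: write the finished tile back to global memory.
            for r in range(ti):
                for c in range(tj):
                    C[i0 + r][j0 + c] = local_C[r][c]
    return C
```

In this sketch the value of `tile` determines how large the three local buffers are, so it is bounded by the on-chip memory available. That is one simple view of why tile-size selection depends on local memory capacity.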
