Compiler-Level Explicit Cache for a GPGPU Programming Framework

Tomoharu Kamiya, Takanori Maruyama, Kazuhiko Ohno, Masaki Matsumoto
Department of Information Engineering, Mie University, Tsu, Mie, Japan
The 2014 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA’14), 2014

@article{kamiya2014compiler,
   title={Compiler-Level Explicit Cache for a GPGPU Programming Framework},
   author={Kamiya, Tomoharu and Maruyama, Takanori and Ohno, Kazuhiko and Matsumoto, Masaki},
   year={2014}
}


GPUs are widely used for high-performance computing. However, standard programming frameworks such as CUDA and OpenCL require low-level specifications, so programming is difficult and performance is not portable. We are therefore developing a new framework named MESI-CUDA. By providing virtual shared variables accessible from both the CPU and the GPU, MESI-CUDA hides the complex memory architecture and eliminates low-level API function calls. However, the performance of the current implementation is insufficient because of the large memory access latency. We therefore propose a code-optimization scheme that uses the fast on-chip shared memories as a compiler-level explicit cache of the off-chip device memory. The compiler estimates the access count and access range of arrays using static analysis. For frequently reused variables, the code is modified to make a copy in shared memory and to access that copy, so that the small shared memories are used efficiently. In our evaluation, the scheme achieved a 13%-192% speedup in two of three programs.
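
The idea of staging a reused array range in a small fast buffer can be illustrated with a CPU-side model. The Python sketch below is not MESI-CUDA output and not the authors' algorithm: the stencil workload, the function names `stencil_direct` and `stencil_cached`, and the parameters `radius` and `block_size` are all assumptions chosen for illustration. A Python list stands in for off-chip device memory, and a per-block slice stands in for the on-chip shared-memory copy.

```python
def stencil_direct(a, radius):
    """Baseline: every neighbour access goes to `a` (models device memory)."""
    n = len(a)
    out = [0] * n
    device_reads = 0
    for i in range(n):
        total = 0
        for d in range(-radius, radius + 1):
            j = min(max(i + d, 0), n - 1)  # clamp indices at the array borders
            total += a[j]
            device_reads += 1
        out[i] = total
    return out, device_reads


def stencil_cached(a, radius, block_size):
    """Transformed version: each block first copies the range it will touch
    into a small local tile (models shared memory), then reads only the tile."""
    n = len(a)
    out = [0] * n
    device_reads = 0
    for start in range(0, n, block_size):
        end = min(start + block_size, n)
        # Access range of this block: its own elements plus a halo of `radius`.
        lo = max(start - radius, 0)
        hi = min(end + radius, n)
        tile = a[lo:hi]            # one device read per staged element
        device_reads += hi - lo
        for i in range(start, end):
            total = 0
            for d in range(-radius, radius + 1):
                j = min(max(i + d, 0), n - 1)
                total += tile[j - lo]  # served from the staged copy
            out[i] = total
    return out, device_reads


if __name__ == "__main__":
    data = list(range(20))
    direct, reads_direct = stencil_direct(data, radius=2)
    cached, reads_cached = stencil_cached(data, radius=2, block_size=8)
    print(direct == cached, reads_direct, reads_cached)
```

In this model both versions compute the same result, but the cached version touches the modelled device memory once per staged element instead of once per neighbour access. The benefit grows with the reuse count, which is why estimating access count and range before choosing which arrays to copy matters when the fast memory is small.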
