Decoupled Access/Execute Metaprogramming for GPU-Accelerated Systems
Department of Computing, Imperial College London
Symposium on Application Accelerators in High Performance Computing, 2009 (SAAHPC’09)
@inproceedings{howes2009decoupled,
title={Decoupled Access/Execute metaprogramming for GPU-accelerated systems},
author={Howes, L. and Lokhmotov, A. and Kelly, P.H.J. and Donaldson, A.F.},
booktitle={Application Accelerators in High Performance Computing, 2009 Symposium, Papers},
year={2009}
}
We describe the evaluation of several implementations of a simple image processing filter on an NVIDIA GTX 280 card. Our experimental results show that performance depends significantly on low-level details, such as data layout and iteration space mapping, which complicate code development and maintenance. We propose extending a CUDA- or OpenCL-like model with decoupled Access/Execute (“AEcute” [1]) metadata, describing a kernel's iteration space ordering and partitioning (execute metadata) and its memory access patterns (access metadata). We believe that using AEcute metadata will make software engineering for accelerated systems more disciplined and productive, by separating algorithm representation from low-level mapping and tuning.
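To make the separation concrete, here is a minimal Python sketch of the idea, not the paper's actual API. All names (`tiles`, `read_window`, `box_filter_point`, `run`) are hypothetical, and a box filter stands in for the unspecified image filter. The kernel body is written once; the iteration space partitioning (execute metadata) and the region each tile reads (access metadata) are described separately, so a runtime could stage data into fast local memory and retune tile sizes without touching the kernel.

```python
# Hypothetical illustration only: these names are assumptions, not AEcute's API.

def tiles(width, height, tile_w, tile_h):
    """Execute metadata: partition the iteration space into row-major tiles."""
    for y0 in range(0, height, tile_h):
        for x0 in range(0, width, tile_w):
            yield x0, y0, min(x0 + tile_w, width), min(y0 + tile_h, height)


def read_window(tile, radius, width, height):
    """Access metadata: input region a tile reads (tile plus halo, clamped)."""
    x0, y0, x1, y1 = tile
    return (max(x0 - radius, 0), max(y0 - radius, 0),
            min(x1 + radius, width), min(y1 + radius, height))


def box_filter_point(local, lx, ly, radius):
    """Kernel body: mean of the neighbourhood, reading only the staged buffer."""
    lh, lw = len(local), len(local[0])
    total, count = 0, 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            xx, yy = lx + dx, ly + dy
            if 0 <= xx < lw and 0 <= yy < lh:  # skip pixels outside the image
                total += local[yy][xx]
                count += 1
    return total / count


def run(image, radius, tile_w, tile_h):
    """Toy runtime: uses the metadata to stage data, then runs the kernel."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for tile in tiles(width, height, tile_w, tile_h):
        wx0, wy0, wx1, wy1 = read_window(tile, radius, width, height)
        # Staged copy, standing in for a transfer to on-chip local memory.
        local = [row[wx0:wx1] for row in image[wy0:wy1]]
        x0, y0, x1, y1 = tile
        for y in range(y0, y1):
            for x in range(x0, x1):
                out[y][x] = box_filter_point(local, x - wx0, y - wy0, radius)
    return out


image = [[(3 * x + 5 * y) % 7 for x in range(6)] for y in range(4)]
small_tiles = run(image, radius=1, tile_w=2, tile_h=2)
one_tile = run(image, radius=1, tile_w=6, tile_h=4)
```

In this sketch, changing the tile size alters only the mapping, so `small_tiles` and `one_tile` hold the same values; that is the property that would let tuning proceed independently of the algorithm.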
February 19, 2011 by hgpu