CuNesl: Compiling Nested Data-Parallel Languages for SIMT Architectures

Yongpeng Zhang, Frank Mueller
North Carolina State University, Raleigh, NC, USA
International Conference on Parallel Processing (ICPP’12), 2012


@inproceedings{Zhang2012cuNesl,
   title={CuNesl: Compiling Nested Data-Parallel Languages for SIMT Architectures},
   author={Zhang, Y. and Mueller, F.},
   booktitle={International Conference on Parallel Processing (ICPP'12)},
   year={2012}
}

Data-parallel languages feature fine-grained parallel primitives that compilers can map onto modern many-core architectures, where data parallelism must be exploited to fully utilize the hardware. Previous research has focused on compiling data-parallel languages to SIMD (single instruction, multiple data) architectures, but applying those techniques directly to today's SIMT (single instruction, multiple thread) architectures does not guarantee competitive performance. We propose cuNesl, a compiler framework that translates and optimizes NESL into parallel CUDA programs for SIMT architectures. By converting recursive calls into while loops, cuNesl ensures that the hierarchical execution model of GPUs can be exploited on the "flattened" code. This narrows the performance gap between auto-generated and hand-crafted CUDA code while greatly improving programmability. The generated code also outperforms handwritten parallel code running on CPUs in execution time while requiring far less programming effort.

* * *


HGPU group © 2010-2021 hgpu.org

All rights belong to the respective authors
