Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs

John A. Stratton, Vinod Grover, Jaydeep Marathe, Bastiaan Aarts, Mike Murphy, Ziang Hu, Wen-mei W. Hwu
NVIDIA Corporation / University of Illinois at Urbana-Champaign, Champaign, IL, USA
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization, CGO ’10, 2010










In this paper we describe techniques for compiling fine-grained SPMD-threaded programs, expressed in programming models such as OpenCL or CUDA, to multicore execution platforms. Programs developed for manycore processors typically express finer thread-level parallelism than is appropriate for multicore platforms. We describe options for implementing fine-grained threading in software, and find that reasonable restrictions on the synchronization model enable significant optimizations and performance improvements over a baseline approach. We evaluate these techniques in a production-level compiler and runtime for the CUDA programming model targeting modern CPUs. Applications tested with our tool often showed performance parity with the compiled C version of the application for single-thread performance. With modest coarse-grained multithreading typical of today’s CPU architectures, an average of 3.4x speedup on 4 processors was observed across the test applications.
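The abstract does not spell out how fine-grained threads are implemented in software, so the following is only a minimal sketch of one plausible approach: run the logical threads of a block as ordinary loop iterations on a single CPU thread, and split the loop wherever a barrier occurs. It assumes the restricted synchronization model is a block-wide barrier that every logical thread reaches at the same program point. The names `kernel_serialized` and `run_grid`, the toy computation, and the list standing in for shared memory are all hypothetical and are not taken from the paper or from the CUDA toolchain.

```python
def kernel_serialized(block_in):
    """Run all logical threads of one block on a single CPU thread.

    Hypothetical toy kernel: each logical thread doubles its input into
    shared storage, waits at a barrier, then adds its neighbour's value.
    """
    n = len(block_in)      # number of logical threads in the block
    shared = [0] * n       # stands in for per-block shared memory
    out = [0] * n

    # Code before the barrier: one loop iteration per logical thread.
    for tid in range(n):
        shared[tid] = 2 * block_in[tid]

    # The barrier becomes the boundary between the two loops. Every logical
    # thread has finished the first region before any starts the second,
    # which is valid only under the assumption that all threads reach the
    # same barrier.

    # Code after the barrier: reads values written by other logical threads.
    for tid in range(n):
        out[tid] = shared[tid] + shared[(tid + 1) % n]
    return out


def run_grid(data, block_size):
    """Process the blocks one after another.

    Blocks do not synchronize with each other in this sketch, so they could
    instead be distributed across OS threads or processes, which is the
    coarse-grained level of parallelism.
    """
    result = []
    for start in range(0, len(data), block_size):
        result.extend(kernel_serialized(data[start:start + block_size]))
    return result
```

Merging the two loops into one would let logical thread 0 read `shared[1]` before logical thread 1 has written it, which is why the split at the barrier matters in this sketch.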
