COX: Exposing CUDA Warp-Level Functions to CPUs

Ruobing Han, Jaewon Lee, Jaewoong Sim, Hyesoon Kim
Georgia Institute of Technology, USA
ACM Transactions on Architecture and Code Optimization, 2022

@article{han2022cox,
   title={COX: Exposing CUDA Warp-Level Functions to CPUs},
   author={Han, Ruobing and Lee, Jaewon and Sim, Jaewoong and Kim, Hyesoon},
   journal={ACM Transactions on Architecture and Code Optimization (TACO)},
   year={2022},
   publisher={ACM New York, NY}
}


As CUDA becomes the de facto programming language for data-parallel applications such as high-performance computing and machine learning, running CUDA on other platforms becomes a compelling option. Several efforts have attempted to support CUDA on devices other than NVIDIA GPUs, but because of the extra translation steps involved, that support is always a few years behind CUDA’s latest features. In particular, the new CUDA programming model exposes the warp concept in the programming language, which greatly changes how CUDA code should be mapped to CPU programs. This paper proposes hierarchical collapsing, a technique that correctly supports CUDA warp-level functions on CPUs. To verify hierarchical collapsing, we build a framework, COX, that supports executing CUDA source code on the CPU backend. With hierarchical collapsing, 90% of the kernels in the CUDA SDK samples can be executed on CPUs, compared with 68% for previous work. We also evaluate performance with benchmarks for real applications and show that, in general, hierarchical collapsing generates CPU programs with comparable or even higher performance than previous projects.
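To make the mapping problem concrete, the sketch below emulates a warp-level sum reduction on a CPU. On a GPU this is typically written as `val += __shfl_down_sync(0xffffffff, val, offset)` for offsets 16, 8, 4, 2, 1. The C++ version is only an illustration under stated assumptions and is not taken from the COX implementation: the names `warp_shfl_down_add` and `block_warp_sums` are hypothetical, the warp size is fixed at 32, all lanes are assumed active, and the input length is assumed to be a multiple of 32. It uses an outer loop over warps and inner loops over lanes, and it treats each shuffle as a warp-wide synchronization point by snapshotting every lane's value before any lane reads.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Assumption: a warp has 32 lanes, as on current NVIDIA GPUs.
constexpr int kWarpSize = 32;

// Hypothetical helper: emulates `val += __shfl_down_sync(full_mask, val, offset)`
// for all lanes of one warp. Lanes whose source lane would fall outside the
// warp receive their own value, matching the CUDA shuffle-down behavior.
void warp_shfl_down_add(std::array<int, kWarpSize>& vals, int offset) {
    // Snapshot first: every lane publishes its value before any lane reads,
    // which is what makes the shuffle act like a warp-level barrier.
    const std::array<int, kWarpSize> snapshot = vals;
    for (int lane = 0; lane < kWarpSize; ++lane) {
        const int src = lane + offset;
        const int received = (src < kWarpSize) ? snapshot[src] : snapshot[lane];
        vals[lane] = snapshot[lane] + received;
    }
}

// Hypothetical driver: computes one sum per warp for a "thread block" whose
// per-thread inputs are given in `input` (size assumed a multiple of 32).
std::vector<int> block_warp_sums(const std::vector<int>& input) {
    const std::size_t num_warps = input.size() / kWarpSize;
    std::vector<int> sums(num_warps);
    for (std::size_t w = 0; w < num_warps; ++w) {      // loop over warps
        std::array<int, kWarpSize> vals{};
        for (int lane = 0; lane < kWarpSize; ++lane) { // loop over lanes in a warp
            vals[lane] = input[w * kWarpSize + lane];
        }
        // Tree reduction: after the last step, lane 0 holds the warp's sum.
        for (int offset = kWarpSize / 2; offset > 0; offset /= 2) {
            warp_shfl_down_add(vals, offset);
        }
        sums[w] = vals[0];
    }
    return sums;
}
```

A translation that flattened all threads of a block into a single loop would have no natural place to exchange values between the lanes of one warp, which is why the two-level loop structure is used in this sketch.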