High-performance CUDA kernel execution on FPGAs

Alexandros Papakonstantinou, Karthik Gururaj, John A. Stratton, Deming Chen, Jason Cong, Wen-mei W. Hwu
Electrical & Computer Engineering Dept., University of Illinois, Urbana-Champaign, IL, USA
In ICS ’09: Proceedings of the 23rd international conference on Supercomputing (2009), pp. 515-516.


@inproceedings{papakonstantinou2009high,
   title={High-performance CUDA kernel execution on FPGAs},
   author={Papakonstantinou, A. and Gururaj, K. and Stratton, J.A. and Chen, D. and Cong, J. and Hwu, W.M.W.},
   booktitle={Proceedings of the 23rd international conference on Supercomputing},
   pages={515--516},
   year={2009}
}






In this work, we propose a new FPGA design flow that combines the CUDA programming model from Nvidia with AutoPilot, the state-of-the-art high-level synthesis tool from AutoESL, to efficiently map the parallelism exposed in CUDA kernels onto reconfigurable devices. Using the CUDA programming model provides a common programming interface for exploiting parallelism on two very different types of accelerators: FPGAs and GPUs. Moreover, by leveraging the advanced synthesis capabilities of AutoPilot, we enable efficient exploitation of FPGA configurability for application-specific acceleration. Our flow is based on a compilation process that transforms SPMD CUDA thread blocks into high-concurrency AutoPilot-C code. We provide an overview of our CUDA-to-FPGA flow and demonstrate the highly competitive performance of the generated multi-core accelerators.
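
The thread-block transformation mentioned in the abstract can be pictured with a small sketch in plain C. This is only an illustration of the general idea of rewriting an SPMD kernel as explicit loops over thread and block indices; it is not the authors' generated AutoPilot-C code. The kernel (`vec_add`), the block size `BLOCK_DIM`, and the function names are all hypothetical, and no AutoPilot-specific directives are shown.

```c
/* Hypothetical block size, chosen for illustration only. */
#define BLOCK_DIM 64

/*
 * CUDA kernel being modeled (shown for reference, not compiled here):
 *
 *   __global__ void vec_add(const float *a, const float *b, float *c, int n) {
 *       int i = blockIdx.x * blockDim.x + threadIdx.x;
 *       if (i < n) c[i] = a[i] + b[i];
 *   }
 */

/* One thread block rewritten as an explicit loop over thread indices.
 * The iterations are independent, so a high-level synthesis tool could,
 * in principle, unroll or pipeline this loop to recover concurrency. */
void vec_add_block(const float *a, const float *b, float *c,
                   int n, int block_idx)
{
    for (int tid = 0; tid < BLOCK_DIM; ++tid) {
        int i = block_idx * BLOCK_DIM + tid;  /* global thread index */
        if (i < n)                            /* same bounds guard as the kernel */
            c[i] = a[i] + b[i];
    }
}

/* The grid rewritten as a loop over blocks. Blocks do not depend on each
 * other, so they could be distributed across several accelerator cores. */
void vec_add_grid(const float *a, const float *b, float *c, int n)
{
    int num_blocks = (n + BLOCK_DIM - 1) / BLOCK_DIM;  /* ceiling division */
    for (int block_idx = 0; block_idx < num_blocks; ++block_idx)
        vec_add_block(a, b, c, n, block_idx);
}
```

Calling `vec_add_grid(a, b, c, n)` on three arrays of length `n` produces the same result as launching the reference kernel with enough blocks of `BLOCK_DIM` threads to cover `n` elements. Kernels that use `__syncthreads()` or shared memory would need a more involved rewrite than this sketch shows.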
