A Code Optimization Framework for Performance Portability of GPU Kernels onto Custom Accelerators

Alexandros Papakonstantinou, Deming Chen, Wen-mei Hwu
Electrical and Computer Engineering Department, University of Illinois at Urbana-Champaign
Univ. of Illinois/Urbana-Champaign, 2011








The shift toward parallel computing has resulted in growing interest in computing systems with heterogeneous processing modules. Reconfigurable devices are often employed in such heterogeneous systems because of their low-power and parallel-processing benefits. An important issue in the programmability of these systems is the need for a single programming interface. Recent works have combined parallel programming models with high-level synthesis (HLS) to enable high-abstraction parallel programming of FPGAs. Nevertheless, the efficiency of the generated custom hardware accelerators depends on the structure of the parallel input code. Code optimized for programmable multicore devices (e.g., GPUs or CPUs) may result in low-performance custom accelerators. In this work, the authors describe a code optimization framework that analyzes and restructures CUDA kernels optimized for GPU devices in order to facilitate synthesis of efficient custom accelerators on FPGAs. Their experimental results show that the proposed framework can achieve good performance portability.
