Tuning Stencil Codes in OpenCL for FPGAs

Qi Jia, Huiyang Zhou
Dept. of Electrical and Computer Engineering, North Carolina State University, Raleigh, North Carolina
34th IEEE International Conference on Computer Design (ICCD-2016), 2016

@article{jia2016tuning,
   title={Tuning Stencil Codes in OpenCL for FPGAs},
   author={Jia, Qi and Zhou, Huiyang},
   year={2016}
}

OpenCL is a parallel programming framework designed to support heterogeneous computing platforms. The implicit or explicit parallelism in OpenCL kernel code enables efficient FPGA implementation from a high-level programming abstraction. However, FPGA architecture differs fundamentally from GPU architecture, for which OpenCL is widely used. Tuning OpenCL code for high performance on FPGAs remains an open problem, and the existing OpenCL tools and optimizations proposed for CPUs and GPUs may not be directly applicable to FPGAs. In this paper, we explore OpenCL code optimizations for stencil computations on FPGAs. We propose tuning processes for stencil kernels in both the Single-Task and NDRange modes. Our optimized 1D convolution, 2D convolution, and 2D Jacobi iteration kernels achieve up to two orders of magnitude performance improvement over the naive kernels. Compared to the Altera design examples, our optimized kernels also achieve 7.1x and 3.5x speedups for the Sobel filter and the Time-Domain FIR Filter, respectively. This study also benchmarks the FPGA memory system, revealing how code patterns affect the performance of different types of memory on FPGAs.
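To make the 1D convolution stencil mentioned in the abstract concrete, here is a minimal sketch in plain C rather than OpenCL. It contrasts a naive form, in which every output re-reads its whole neighborhood from memory, with a sliding-window form, in which each input is read once and held in a small local window (the pattern that FPGA OpenCL compilers commonly map to a shift register in Single-Task kernels). The abstract does not say which optimizations the authors apply, so this illustrates the general idea only and is not the paper's kernels. The names `TAPS`, `conv1d_naive`, and `conv1d_window` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TAPS 3  /* hypothetical stencil width, chosen only for illustration */

/* Naive form: each output re-reads TAPS inputs from memory.
   In an NDRange kernel, one outer-loop iteration would correspond to one work-item. */
void conv1d_naive(const float *in, const float *coef, float *out, size_t n)
{
    assert(n >= TAPS);
    for (size_t i = 0; i + TAPS <= n; ++i) {
        float acc = 0.0f;
        for (size_t k = 0; k < TAPS; ++k)
            acc += coef[k] * in[i + k];
        out[i] = acc;
    }
}

/* Sliding-window form: each input is read from memory exactly once and
   kept in a small local window that shifts by one element per iteration. */
void conv1d_window(const float *in, const float *coef, float *out, size_t n)
{
    assert(n >= TAPS);
    float window[TAPS] = {0.0f};
    for (size_t i = 0; i < n; ++i) {
        /* Shift the window left and append the newest input. */
        for (size_t k = 0; k + 1 < TAPS; ++k)
            window[k] = window[k + 1];
        window[TAPS - 1] = in[i];

        /* Once the window is full, emit one output per iteration. */
        if (i + 1 >= TAPS) {
            float acc = 0.0f;
            for (size_t k = 0; k < TAPS; ++k)
                acc += coef[k] * window[k];
            out[i + 1 - TAPS] = acc;
        }
    }
}
```

Both functions write `n - TAPS + 1` outputs and accumulate in the same order, so they produce identical results. They differ only in memory access pattern: the naive form issues `TAPS` reads per output, while the windowed form issues one.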