
Design Space Exploration of an OpenCL Based SAXPY Kernel Implementation on FPGAs

Jannatun Naher, Clay Gloster, Shrikant S. Jadhav, Christopher C. Doss
Electrical and Computer Engineering Department, North Carolina A&T State University, Greensboro, NC, USA
North Carolina A&T State University, 2020

@article{gloster2020design,
   title={Design Space Exploration of an OpenCL Based SAXPY Kernel Implementation on FPGAs},
   author={Gloster, Clay and Naher, Jannatun and Doss, Christopher C and Jadhav, Shrikant S},
   year={2020}
}



High-performance computing researchers continue to look for new options and tools that satisfy the performance criteria of a hardware design. The FPGA (Field Programmable Gate Array) is an accelerator widely used for power-efficient applications because of its reconfigurability and high performance. Traditionally, FPGAs are programmed using a Hardware Description Language (HDL). With an HDL, the designer of any FPGA hardware architecture needs to be very knowledgeable about the hardware and about Register Transfer Level (RTL) programming. Reducing design complexity and development time has always been a goal in hardware architecture design. With High-Level Synthesis (HLS) tools such as OpenCL and Vivado, a software programmer can program an FPGA while avoiding that design complexity and reducing development time. OpenCL is an HLS tool in which a designer writes software-like code to design the hardware. An OpenCL design can use many data partitioning and task parallelism techniques. SAXPY, a level 1 Basic Linear Algebra Subprograms (BLAS) routine, is widely used in many scientific applications. This level 1 BLAS routine, which involves vector-vector operations, can be implemented in various ways: as an n-dimensional range (NDRange) kernel or a single work-item kernel, using global memory or local memory, with or without reuse of local storage, and with different settings of design knobs such as block size, work item, bank width, number of memory banks, and loop unrolling factor. This paper presents a Design Space Exploration (DSE) of OpenCL implementations of SAXPY on FPGAs. From our investigation, we have found that the NDRange kernel is more throughput efficient for SAXPY kernel operations.
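
For readers unfamiliar with the routine, SAXPY computes y[i] = a * x[i] + y[i] over two single-precision vectors. The following is a minimal host-side C sketch, not the authors' OpenCL kernel: it shows the plain loop that a single work-item kernel would resemble, alongside a version manually unrolled by a factor of 4 to illustrate what the loop unrolling knob changes. The function names `saxpy_ref` and `saxpy_unroll4` and the unroll factor are assumptions chosen for illustration; the paper's actual kernels and knob values are not given in the abstract.

```c
#include <stddef.h>

/* Reference SAXPY: y[i] = a * x[i] + y[i] for every i in [0, n).
 * Hypothetical helper; mirrors the loop body of a single work-item kernel. */
void saxpy_ref(size_t n, float a, const float *x, float *y) {
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Same computation with the loop manually unrolled by 4 (an assumed factor).
 * An HLS compiler would typically do this from an unroll directive; it is
 * written out here only to make the transformation visible. */
void saxpy_unroll4(size_t n, float a, const float *x, float *y) {
    size_t i = 0;
    /* Main loop: four independent multiply-adds per iteration. */
    for (; i + 4 <= n; i += 4) {
        y[i]     = a * x[i]     + y[i];
        y[i + 1] = a * x[i + 1] + y[i + 1];
        y[i + 2] = a * x[i + 2] + y[i + 2];
        y[i + 3] = a * x[i + 3] + y[i + 3];
    }
    /* Remainder loop: handles the last n % 4 elements. */
    for (; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Both functions produce the same result for any n. In an NDRange kernel, by contrast, the loop over i disappears and each work item would handle the element selected by its global ID.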
