
When HLS Meets FPGA HBM: Benchmarking and Bandwidth Optimization

Young-kyu Choi, Yuze Chi, Jie Wang, Licheng Guo, Jason Cong
Computer Science Department, University of California, Los Angeles, CA 90095
arXiv:2010.06075 [cs.AR] (12 Oct 2020)

@misc{choi2020hls,
   title={When HLS Meets FPGA HBM: Benchmarking and Bandwidth Optimization},
   author={Young-kyu Choi and Yuze Chi and Jie Wang and Licheng Guo and Jason Cong},
   year={2020},
   eprint={2010.06075},
   archivePrefix={arXiv},
   primaryClass={cs.AR}
}


With the recent release of High Bandwidth Memory (HBM) based FPGA boards, developers can now exploit unprecedented external memory bandwidth. This allows more memory-bound applications to benefit from FPGA acceleration. However, we found that it is not easy to fully utilize the available bandwidth when developing some applications with high-level synthesis (HLS) tools. This is due to the limitations of existing HLS tools when accessing an HBM board's large number of independent external memory channels. In this paper, we measure the performance of three recent representative HBM FPGA boards (Intel's Stratix 10 MX and Xilinx's Alveo U50/U280 boards) with microbenchmarks and analyze the HLS overhead. Next, we propose HLS-based optimization techniques to improve the effective bandwidth when a processing element (PE) accesses multiple HBM channels or multiple PEs access an HBM channel. Our experiment demonstrates that the effective bandwidth improves by 2.4X-3.8X. We also provide a list of insights for future improvement of the HBM FPGA HLS design flow.
