OpenCL FPGA Optimization guided by memory accesses and roofline model analysis applied to tomography acceleration
Université Paris-Saclay, CNRS, CentraleSupélec, L2S, 91190, Gif-sur-Yvette, France
hal-03226257 (29 June 2021)
@inproceedings{diakite2021opencl,
title={OpenCL FPGA Optimization guided by memory accesses and roofline model analysis applied to tomography acceleration},
author={Diakite, Daouda and Gac, Nicolas and Martelli, Maxime},
booktitle={31st International Conference on Field Programmable Logic and Applications (FPL)},
year={2021}
}
Backward projection is one of the most time-consuming steps in method-based iterative reconstruction computed tomography. The memory access pattern of 3D back-projection is potentially regular enough to exploit efficiently the computing power of acceleration boards based on GPU or FPGA. High-level tools such as HLS or OpenCL make it easier to take such particular memory accesses into account during the design flow without specific hardware IPs. This paper proposes an OpenCL acceleration of the voxel-driven 3D back-projection algorithm on an Arria 10 FPGA. The design flow starts with an offline memory access analysis, then proceeds iteratively with a performance analysis of each new implementation, plotted on a Berkeley Roofline model. By taking advantage of the FPGA’s local memory architecture, we have succeeded in designing an efficient pipeline that reaches maximum bandwidth with stall-free access, underlining this platform’s interest for memory optimization. Our design flow allowed for a significant improvement of our initial algorithm’s computational intensity, resulting in better performance on FPGA. It reaches performance comparable to an embedded GPU implementation and to other computed tomography algorithms on FPGAs.
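As a rough illustration of the Roofline analysis mentioned in the abstract, the sketch below computes the standard Roofline bound, i.e. attainable performance = min(peak compute, memory bandwidth × computational intensity). The function names and the platform figures (PEAK_GFLOPS, BANDWIDTH_GBS) are hypothetical placeholders chosen for illustration; they are not taken from the paper and do not describe the Arria 10 board.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """Roofline bound: the lower of the compute roof and the memory roof."""
    return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Computational intensity (FLOP/byte) at which the two roofs meet."""
    return peak_gflops / bandwidth_gbs


# Hypothetical platform figures, for illustration only (not from the paper).
PEAK_GFLOPS = 1000.0   # assumed peak compute throughput
BANDWIDTH_GBS = 20.0   # assumed peak external memory bandwidth

if __name__ == "__main__":
    ridge = ridge_point(PEAK_GFLOPS, BANDWIDTH_GBS)
    for intensity in (0.5, 5.0, 50.0, 500.0):
        bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, intensity)
        regime = "memory-bound" if intensity < ridge else "compute-bound"
        print(f"intensity={intensity:7.1f} FLOP/B  bound={bound:7.1f} GFLOP/s  ({regime})")
```

Under these assumed figures, a kernel whose computational intensity lies below the ridge point is limited by memory bandwidth, which is why raising the intensity of the initial algorithm (for example through local memory reuse) moves it toward the compute roof.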
July 18, 2021 by hgpu