Pulsar search acceleration using FPGAs and OpenCL templates

Julian Oppermann, Mitchell B. Mickaliger, Oliver Sinnen
University of Auckland, New Zealand
Experimental Astronomy, 2023







The Square Kilometre Array (SKA) is the world’s largest radio telescope currently under construction, and will employ elaborate signal processing to detect new pulsars, i.e. highly magnetised rotating neutron stars. This paper addresses the acceleration of demanding computations for this pulsar search on Field-Programmable Gate Arrays (FPGAs) using a new high-level design process based on OpenCL templates that is transferable to other scientific problems. Successful FPGA acceleration of large-scale scientific workloads requires custom architectures that fully exploit the parallel computing capabilities of modern reconfigurable hardware and are amenable to substantial design space exploration. OpenCL-based high-level synthesis toolchains, with their ability to express interconnected multi-kernel pipelines in a single source language, excel in this domain. However, the achievable performance depends strongly on how well the compiler can infer desirable hardware structures from the code. A key factor in high performance is usually the uninterrupted, high-bandwidth streaming of data into and through the design. This is difficult to achieve in complex designs when the data needs to be reordered, e.g. transposed. It is equally hard to pre-fetch and burst-load from DDR memory when reads occur in non-trivial patterns. In this paper, we propose new approaches to these two problems that use OpenCL-based code templates. We demonstrate the practical benefits of these approaches by accelerating a key component of the SKA’s pulsar search pipeline: the Fourier Domain Acceleration Search (FDAS) module. Using our proposed methodology, we are able to develop a more scalable FDAS accelerator architecture than was previously possible. By exploring its design space, we ultimately achieve a 10x throughput improvement over a prior, thoroughly optimised implementation in plain OpenCL.
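To make the transposition problem mentioned in the abstract concrete, here is a minimal sketch of a tiled transpose in plain C. It is not the paper's OpenCL template and is not taken from the paper; it only illustrates the general idea of staging data in a small buffer so that both the reads and the writes touch contiguous runs of memory, which is the access pattern that burst transfers favour. The function name `transpose_tiled` and the tile size `TILE` are hypothetical choices made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile edge length; a real design would tune this value. */
#define TILE 4

/* Transpose a rows x cols row-major matrix `in` into a cols x rows
 * row-major matrix `out`, one TILE x TILE block at a time.
 * `in` and `out` must not alias. */
void transpose_tiled(const float *in, float *out, size_t rows, size_t cols)
{
    float tile[TILE][TILE]; /* stands in for a small on-chip buffer */

    assert(in != NULL && out != NULL && in != out);

    for (size_t r0 = 0; r0 < rows; r0 += TILE) {
        for (size_t c0 = 0; c0 < cols; c0 += TILE) {
            /* Clip the tile at the matrix edges. */
            size_t h = (rows - r0 < TILE) ? rows - r0 : TILE;
            size_t w = (cols - c0 < TILE) ? cols - c0 : TILE;

            /* Load: each tile row is a contiguous run of the input. */
            for (size_t r = 0; r < h; ++r)
                for (size_t c = 0; c < w; ++c)
                    tile[r][c] = in[(r0 + r) * cols + (c0 + c)];

            /* Store: each tile column is a contiguous run of the output. */
            for (size_t c = 0; c < w; ++c)
                for (size_t r = 0; r < h; ++r)
                    out[(c0 + c) * rows + (r0 + r)] = tile[r][c];
        }
    }
}
```

Calling `transpose_tiled(in, out, 3, 5)` on a 3 x 5 input fills `out` as a 5 x 3 matrix with `out[c * 3 + r] == in[r * 5 + c]`. Without the staging buffer, either the reads or the writes would stride through memory one element at a time, which is the kind of pattern the abstract describes as hard to stream efficiently.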