https://hgpu.org/?p=4337
A Tuned and Scalable Fast Multipole Method as a Preeminent Algorithm for Exascale Systems