Accelerating incoherent dedispersion

Benjamin R. Barsdell, Matthew Bailes, David G. Barnes, Christopher J. Fluke
Centre for Astrophysics and Supercomputing, Swinburne University of Technology, PO Box 218, Hawthorn, Australia, 3122
arXiv:1201.5380v1 [astro-ph.IM] (25 Jan 2012)

@article{2012arXiv1201.5380B,
   author={Barsdell, Benjamin R. and Bailes, Matthew and Barnes, David G. and Fluke, Christopher J.},
   title={{Accelerating incoherent dedispersion}},
   journal={ArXiv e-prints},
   archivePrefix={arXiv},
   eprint={1201.5380},
   primaryClass={astro-ph.IM},
   keywords={Instrumentation and Methods for Astrophysics},
   year={2012},
   month={jan}
}

Incoherent dedispersion is a computationally intensive problem that appears frequently in pulsar and transient astronomy. For current and future transient pipelines, dedispersion can dominate the total execution time, meaning its computational speed acts as a constraint on the quality and quantity of science results. It is thus critical that the algorithm be able to take advantage of trends in commodity computing hardware. With this goal in mind, we present analysis of the ‘direct’, ‘tree’ and ‘sub-band’ dedispersion algorithms with respect to their potential for efficient execution on modern graphics processing units (GPUs). We find all three to be excellent candidates, and proceed to describe implementations in C for CUDA using insight gained from the analysis. Using recent CPU and GPU hardware, the transition to the GPU provides a speed-up of 9x for the direct algorithm when compared to an optimised quad-core CPU code. For realistic recent survey parameters, these speeds are high enough that further optimisation is unnecessary to achieve real-time processing. Where further speed-ups are desirable, we find that the tree and sub-band algorithms are able to provide 3-7x better performance at the cost of certain smearing, memory consumption and development time trade-offs. We finish with a discussion of the implications of these results for future transient surveys. Our GPU dedispersion code is publicly available as a C library at: this http URL