Rethinking resampling in the particle filter on graphics processing units
CSIRO Mathematics, Informatics & Statistics
arXiv:1301.4019 [stat.CO], (17 Jan 2013)
@article{2013arXiv1301.4019M,
author={Murray}, L.~M. and {Lee}, A. and {Jacob}, P.~E.},
title={"{Rethinking resampling in the particle filter on graphics processing units}"},
journal={ArXiv e-prints},
archivePrefix={"arXiv"},
eprint={1301.4019},
primaryClass={"stat.CO"},
keywords={Statistics – Computation, Computer Science – Distributed, Parallel, and Cluster Computing},
year={2013},
month={jan},
adsurl={http://adsabs.harvard.edu/abs/2013arXiv1301.4019M},
adsnote={Provided by the SAO/NASA Astrophysics Data System}
}
Modern parallel computing devices such as the graphics processing unit (GPU) have gained significant traction in scientific computing, and are particularly well-suited to data-parallel algorithms such as the particle filter. Of the components of the particle filter, the resampling step is the most difficult to implement well on such devices, as it often requires a collective operation, such as a sum, across weights. We present and compare a number of resampling algorithms in this work, including rarely-used alternatives based on Metropolis and rejection sampling. We find that these alternative approaches perform significantly faster on the GPU than more common approaches such as the multinomial, stratified and systematic resamplers, a speedup attributable to the absence of collective operations. Moreover, in single-precision (particularly relevant on GPUs due to its faster performance), the common approaches are numerically unstable for plausibly large numbers of particles, while these alternative approaches are not. Finally, we provide a number of auxiliary functions of practical use in resampling, such as for the permutation of ancestry vectors to enable in-place propagation of particles.
January 18, 2013 by hgpu