Speed and Portability issues for Random Number Generation on Graphical Processing Units with CUDA and other Processing Accelerators
Computer Science, Institute of Information and Mathematical Sciences, Massey University, Albany, North Shore 102-904, Auckland, New Zealand
Technical Report CSTN-103, Massey University, 2010
@article{hawickspeed,
title={Speed and Portability issues for Random Number Generation on Graphical Processing Units with CUDA and other Processing Accelerators},
author={Hawick, KA and Leist, A. and Playne, DP and Johnson, MJ}
}
Generating high-quality random numbers is a performance-critical task for many scientific simulations. Modern processing acceleration techniques, such as graphical co-processing units (GPUs), multi-core conventional CPUs, special-purpose multi-core CPUs, and parallel computing approaches such as multi-threading on shared memory or message passing on clusters, all offer ways to speed up random number generation (RNG). Providing fast generators that are also portable across hardware and software platforms is non-trivial, however, particularly since many of the powerful devices currently available do not yet support the full 64-bit operations on which many good RNG algorithms rely. We report performance data for a range of common RNG algorithms on devices including GPUs, CellBE, multi-core CPUs, and hybrids, and discuss algorithmic and implementation issues.
February 13, 2011 by hgpu