
A High-Performance Brownian Bridge for GPUs: Lessons for Bandwidth Bound Applications

Jacques Du Toit
NAG Ltd, Manchester
NAG Technical Report, TR2/12, 2012

@article{toit2012high,

   title={A High-Performance Brownian Bridge for GPUs: Lessons for Bandwidth Bound Applications},

   author={Du Toit, Jacques},

   year={2012}

}

We present a very flexible Brownian bridge generator together with a GPU implementation which achieves close to peak performance on an NVIDIA C2050. The performance is compared with an OpenMP implementation run on several high-performance x86-64 systems. The GPU shows a performance gain of at least 10x. Full comparative results are given in Section 8: in particular, we observe that the Brownian bridge algorithm does not scale well on multicore CPUs since it is memory bandwidth bound. The evolution of the GPU algorithm is discussed. Achieving peak performance required challenging the "conventional wisdom" regarding GPU programming, in particular the importance of occupancy, the speed of shared memory and the impact of branching.
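
For readers unfamiliar with the construction, the following is a minimal CPU sketch of a textbook Brownian bridge by repeated bisection. It is not the report's GPU or OpenMP code: the function name `brownian_bridge_path`, the breadth-first bisection order and the plain Python lists are all assumptions made for illustration. Each new point needs only a handful of arithmetic operations on two previously stored values, which is consistent with the abstract's remark that the algorithm is memory bandwidth bound.

```python
import math
from collections import deque


def brownian_bridge_path(times, normals):
    """Return W(t) at each entry of `times` (strictly increasing, > 0).

    `normals` holds len(times) standard normal draws. The first draw fixes
    the terminal value; the rest fill in midpoints, coarse levels first.
    Hypothetical helper for illustration only.
    """
    n = len(times)
    if len(normals) != n:
        raise ValueError("need exactly one normal draw per time point")

    t = [0.0] + list(times)      # prepend t_0 = 0, where W(0) = 0
    w = [0.0] * (n + 1)

    # Terminal point: W(T) ~ N(0, T).
    w[n] = math.sqrt(t[n]) * normals[0]
    k = 1                        # index of the next unused normal draw

    # Breadth-first bisection over index intervals (left, right).
    pending = deque([(0, n)])
    while pending:
        left, right = pending.popleft()
        if right - left < 2:
            continue             # no interior point left to fill
        mid = (left + right) // 2
        span = t[right] - t[left]
        # Conditional mean: linear interpolation between the two endpoints.
        mean = ((t[right] - t[mid]) * w[left]
                + (t[mid] - t[left]) * w[right]) / span
        # Conditional standard deviation of the bridge at t[mid].
        std = math.sqrt((t[mid] - t[left]) * (t[right] - t[mid]) / span)
        w[mid] = mean + std * normals[k]
        k += 1
        pending.append((left, mid))
        pending.append((mid, right))

    return w[1:]                 # drop the implicit W(0) = 0
```

Calling `brownian_bridge_path([0.25, 0.5, 0.75, 1.0], z)` with four standard normal draws `z` returns one path sampled at those four times. Because the first draws determine the coarsest features of the path, this ordering is the one usually paired with quasi-random number generators.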