Efficient simulation of agent-based models on multi-GPU and multi-core clusters

Brandon G. Aaby, Kalyan S. Perumalla, Sudip K. Seal
Oak Ridge National Laboratory, Oak Ridge, Tennessee
SIMUTools ’10 Proceedings of the 3rd International ICST Conference on Simulation Tools and Techniques


   title={Efficient simulation of agent-based models on multi-GPU and multi-core clusters},
   author={Aaby, B.G. and Perumalla, K.S. and Seal, S.K.},
   booktitle={Proceedings of the 3rd International ICST Conference on Simulation Tools and Techniques},
   organization={ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering)}





An effective latency-hiding mechanism is presented for the parallelization of agent-based model simulations (ABMS) with millions of agents. The mechanism is designed to accommodate the hierarchical organization as well as the heterogeneity of current state-of-the-art parallel computing platforms. We use it to explore the computation vs. communication trade-off continuum available with the deep computational and memory hierarchies of extant platforms, and we present a novel analytical model of the trade-off. We describe our implementation and report preliminary performance results on two distinct parallel platforms suitable for ABMS: CUDA threads on multiple, networked graphics processing units (GPUs), and pthreads on multi-core processors. Message Passing Interface (MPI) is used for inter-GPU as well as inter-socket communication on a cluster of multiple GPUs and multi-core processors. Results indicate the benefits of our latency-hiding scheme, which delivers a more than 100-fold improvement in runtime for certain benchmark ABMS application scenarios with several million agents. This improvement is obtained on a system that is already two to three orders of magnitude faster on one GPU than an equivalent CPU-based execution in a popular Java simulator. Overall, our current work therefore runs over four orders of magnitude faster when executed on multiple GPUs.
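The abstract does not spell out the mechanism itself. As a rough illustration of the computation vs. communication trade-off it refers to, the Python sketch below uses widened ghost (halo) regions, a common latency-hiding approach that may differ from the authors' scheme. Each partition receives `halo` boundary cells from its neighbours once, then advances `halo` steps locally without communicating, at the cost of some redundant updates. The 1D parity rule, the function names (`step`, `reference`, `partitioned`) and all parameters are assumptions made for this example and do not come from the paper.

```python
def step(cells):
    """Advance a padded 1D block by one step; the two end cells are consumed."""
    return [(cells[i - 1] + cells[i] + cells[i + 1]) % 2
            for i in range(1, len(cells) - 1)]


def reference(cells, steps):
    """Single-domain baseline with periodic boundaries."""
    n = len(cells)
    for _ in range(steps):
        cells = [(cells[(i - 1) % n] + cells[i] + cells[(i + 1) % n]) % 2
                 for i in range(n)]
    return cells


def partitioned(cells, steps, parts, halo):
    """Split the domain into `parts` blocks that exchange `halo` ghost cells
    once every `halo` steps (hypothetical scheme, for illustration only).

    Returns (final_cells, exchanges, cell_updates).
    """
    n = len(cells)
    size = n // parts
    assert n % parts == 0 and steps % halo == 0 and 1 <= halo <= size
    blocks = [cells[p * size:(p + 1) * size] for p in range(parts)]
    exchanges = 0
    updates = 0
    for _ in range(steps // halo):
        # One communication round: copy `halo` cells from each neighbour.
        padded = [blocks[(p - 1) % parts][-halo:] + blocks[p]
                  + blocks[(p + 1) % parts][:halo]
                  for p in range(parts)]
        exchanges += 1
        # `halo` purely local steps; the valid region shrinks by one cell
        # per side per step and ends exactly at the block's own cells.
        for p in range(parts):
            local = padded[p]
            for _ in range(halo):
                local = step(local)
                updates += len(local)
            blocks[p] = local
    return [c for b in blocks for c in b], exchanges, updates


if __name__ == "__main__":
    start = [(i * i + 3 * i) % 7 % 2 for i in range(64)]
    for halo in (1, 2, 4):
        _, exchanges, updates = partitioned(start, 12, parts=4, halo=halo)
        print(f"halo={halo}: exchanges={exchanges}, cell updates={updates}")
```

In this sketch a larger `halo` means fewer exchanges but more redundant cell updates per partition, which is the kind of continuum an analytical cost model would aim to optimize for a given platform's latency and compute throughput.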
