High Performance Multi-agent System based Simulations

Prashant Sethia
Center for Data Engineering, International Institute of Information Technology, Hyderabad – 500 032, India
International Institute of Information Technology, 2011









Real-life city-traffic simulation is a good example of a multi-agent simulation involving a large number of agents, with each human modelled as an individual agent. Analysis of emergent behaviors in social simulations depends largely on the number of agents involved (more than 100,000 agents). Because of the large number of agents, it takes several seconds or even minutes to simulate a single second of real life. Hence, we resort to distributed computing to speed up the simulations. We used Hadoop to manage and execute a multi-agent application on a small cloud of computers, and used the CUDA framework to develop an agent-simulation model on GPUs. Hadoop is known for efficiently supporting massive workloads on a large number of systems with the help of its novel task distribution and the MapReduce model. It provides a fault-tolerant and failure-resilient underlying framework, and these properties propagate to the multi-agent simulation solution developed on top of it. Further, when some of the systems fail, Hadoop dynamically rebalances the workload on the remaining systems, and it allows new systems to be added on the fly while the simulation is running. In our solution, agents are executed as MapReduce tasks, and a one-time execution of all the agents in the simulation becomes a MapReduce job. Agents need to communicate with each other to make autonomous decisions. Agent communication may involve inter-node exchange of data, causing network I/O to increase and become the bottleneck in multi-agent simulations. Further, Hadoop does not process small files efficiently. We present an algorithm for grouping agents on the cluster of machines such that frequently communicating agents are brought together in a single file (as much as possible), avoiding inter-node communication.
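The abstract does not spell out the grouping algorithm, so the following is only a minimal sketch of one plausible approach: a greedy heuristic that merges the most heavily communicating agent pairs first, subject to a group-size limit. The function names, the `message_counts` input, and the `capacity` parameter are all assumptions made for illustration, not the thesis's actual method.

```python
def group_agents(message_counts, capacity):
    """Greedily place heavily communicating agents in the same group.

    message_counts: dict mapping (agent_a, agent_b) -> messages exchanged
    capacity: assumed maximum number of agents per group (one group = one file)
    Returns a dict mapping each agent to the representative of its group.
    """
    parent = {}
    size = {}

    def find(agent):
        # Follow parent links to the group representative (with path halving)
        while parent[agent] != agent:
            parent[agent] = parent[parent[agent]]
            agent = parent[agent]
        return agent

    # Every agent starts in its own group
    for a, b in message_counts:
        for agent in (a, b):
            if agent not in parent:
                parent[agent] = agent
                size[agent] = 1

    # Consider the heaviest communication links first
    for (a, b), _count in sorted(message_counts.items(), key=lambda kv: -kv[1]):
        root_a, root_b = find(a), find(b)
        if root_a != root_b and size[root_a] + size[root_b] <= capacity:
            parent[root_b] = root_a
            size[root_a] += size[root_b]

    return {agent: find(agent) for agent in parent}


def cross_group_messages(message_counts, groups):
    """Count messages that would still cross group (file or node) boundaries."""
    return sum(count for (a, b), count in message_counts.items()
               if groups[a] != groups[b])


# Hypothetical traffic between four agents
counts = {("a", "b"): 100, ("c", "d"): 80, ("b", "c"): 5, ("a", "d"): 1}
groups = group_agents(counts, capacity=2)
remaining = cross_group_messages(counts, groups)
```

With these made-up counts, the two heavy pairs end up co-located and only the light links remain as cross-group traffic. A real deployment would also have to re-run or update the grouping as communication patterns change, which this sketch does not address.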
The speed-up achieved using Hadoop was limited by the overhead of the large number of MapReduce tasks needed to execute the agents and by the steps Hadoop takes to provide failure resilience. Moreover, HDFS-based files are slow to access and hard to append to. Hence, we use a Lucene-index-based mechanism to store agent data and implement agent messaging. With our implementation, we are able to achieve better performance and scalability than currently available simulation frameworks. We also experimented with GPUs for speeding up massive complex simulations, with the GPUs managed through the CUDA framework. CUDA provides the capability to create and manage a large number (10^12) of lightweight GPU threads. It follows the Single Instruction Multiple Threads architecture, with the single instruction modelled as a kernel function. Agent strategies are hence developed as kernel functions, with each agent running as a separate GPU thread. Consistency of agent states is maintained using a four-state agent execution model. The model decentralizes the problem of consistency and helps avoid the bottlenecks that a single centralized system might cause. We tested the performance of the two frameworks and compared them with existing solutions such as NetLogo, MASON and DMASF. The experiments involved both computation-heavy simulations and simulations with a large number of messages among agents. The framework developed on Hadoop provided a linear scale-up in the number of running agents as the number of machines in the cluster increased. The time taken to execute a simulation for a fixed number of agents decreased in inverse proportion to the number of machines. Hadoop rebalanced the simulation, with very little time overhead, when some machines failed and new ones were added dynamically. However, the execution time was up to 6-8 times longer than with DMASF under the same experimental settings.
But with the help of Lucene indexing along with Hadoop, the execution time improved, and for larger numbers of agents our system outperformed DMASF. The GPU solution, on the other hand, outperformed all the existing CPU solutions in terms of execution time. The speed-up achieved was as much as 10,000 times for some simulations when compared with DMASF running on 2 CPUs for the same experiments.
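The idea of writing an agent strategy as a kernel function, with one thread per agent, can be illustrated with a small CPU emulation. This is a hedged sketch only: `launch` and `move_agent` are hypothetical names, the loop runs sequentially rather than on a GPU, and writing results to a separate output buffer is an assumption made here to keep reads and writes apart. It is not the four-state execution model, whose states the abstract does not describe.

```python
def launch(kernel, n_threads, *args):
    """Emulate a kernel launch: run the same function once per thread id."""
    for thread_id in range(n_threads):
        kernel(thread_id, *args)


def move_agent(thread_id, positions, velocities, next_positions):
    """Hypothetical agent strategy: advance one agent by its velocity."""
    # Bounds guard, since more threads may be launched than there are agents
    if thread_id >= len(positions):
        return
    # Read the current state; write only this agent's slot in the next state
    next_positions[thread_id] = positions[thread_id] + velocities[thread_id]


# Three agents, four emulated threads (the extra thread exits at the guard)
positions = [0, 10, 20]
velocities = [1, -2, 3]
next_positions = [0] * len(positions)
launch(move_agent, 4, positions, velocities, next_positions)
```

Because each emulated thread touches only its own output slot, the order in which thread ids run does not affect the result, which is the property that lets the same function run across many GPU threads at once.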
