Warp-Level Parallelism: Enabling Multiple Replications In Parallel on GPU
ISIMA – Institut Superieur d’Informatique, de Modelisation et de leurs Applications, F-63173 AUBIERE
25th annual European Simulation and Modelling Conference (ESM 2011), 2011
@article{passerat2011warp,
title={Warp-Level Parallelism: Enabling Multiple Replications In Parallel on GPU},
author={Passerat-Palmbach, J. and Caux, J. and Siregar, P. and Mazel, C. and Hill, DRC},
year={2011}
}
Stochastic simulations need multiple replications in order to build confidence intervals for their results. Even when a large number of replications is not needed, it is good practice to reduce the overall simulation time using the Multiple Replications In Parallel (MRIP) approach. This approach usually assumes access to a parallel computer such as a symmetric multiprocessing machine (with many cores), a computing cluster or a computing grid. In this paper, we propose Warp-Level Parallelism (WLP), a solution to compute MRIP on GPGPUs (General-Purpose Graphics Processing Units). These devices offer a great amount of parallel computational power at low cost, but are tuned to efficiently apply the same operation to several data items through different threads. This paradigm is called Single Instruction, Multiple Threads (SIMT). Our approach relies on small groups of threads, called warps, to perform independent computations such as replications. We have benchmarked WLP with three different models: it allows MRIP to be computed up to six times faster than with the SIMT computing paradigm.
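The idea of mapping one replication to one warp can be sketched on the CPU. The Python below is only an illustrative emulation, not the authors' implementation: it assumes a warp size of 32 threads, assumes that only the first lane of each warp does useful work, and uses a hypothetical toy model (a Monte Carlo estimate of pi) seeded naively with the warp index. The names `warp_and_lane`, `replication`, `run_wlp` and `confidence_interval_95` are invented for this sketch. The last function shows why several replications are wanted in the first place: their spread gives a confidence interval for the result.

```python
import math
import random
import statistics

WARP_SIZE = 32  # assumed warp width (32 threads on NVIDIA GPUs)


def warp_and_lane(global_thread_id, warp_size=WARP_SIZE):
    """Split a flat thread index into (warp index, lane inside the warp)."""
    return divmod(global_thread_id, warp_size)


def replication(seed, n_draws=10_000):
    """Hypothetical toy model: one seeded Monte Carlo estimate of pi."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / n_draws


def run_wlp(n_replications, warp_size=WARP_SIZE):
    """Emulate launching one warp per replication.

    Assumption of this sketch: only lane 0 of each warp runs the model,
    so each warp follows its own control flow independently of the others.
    """
    results = {}
    for tid in range(n_replications * warp_size):
        warp, lane = warp_and_lane(tid, warp_size)
        if lane == 0:  # the remaining lanes stay idle
            # Naive seeding by warp index; a real study would need
            # properly independent random streams.
            results[warp] = replication(seed=warp)
    return [results[w] for w in range(n_replications)]


def confidence_interval_95(samples):
    """Normal-approximation 95% confidence interval for the mean."""
    mean = statistics.mean(samples)
    half_width = 1.96 * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean - half_width, mean + half_width


if __name__ == "__main__":
    estimates = run_wlp(30)
    low, high = confidence_interval_95(estimates)
    print(f"{len(estimates)} replications, 95% CI: [{low:.4f}, {high:.4f}]")
```

With few replications, a Student t quantile would be more appropriate than the fixed 1.96 used here; the normal approximation is kept only to keep the sketch short.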
December 21, 2011 by hgpu