Scalable heterogeneous parallelism for atmospheric modeling and simulation
Department of Computer Science, Virginia Polytechnic Institute and State University, 2202 Kraft Dr., Blacksburg, VA 24061
The Journal of Supercomputing, Volume 56, Number 3, 300-327, 2011
@article{linford2011scalable,
title={Scalable heterogeneous parallelism for atmospheric modeling and simulation},
author={Linford, J.C. and Sandu, A.},
journal={The Journal of Supercomputing},
volume={56},
number={3},
pages={300--327},
year={2011},
publisher={Springer}
}
Heterogeneous multicore chipsets with many levels of parallelism are becoming increasingly common in high-performance computing systems. Effective use of parallelism in these new chipsets constitutes the challenge facing a new generation of large-scale scientific computing applications. This study examines methods for improving the performance of two-dimensional and three-dimensional atmospheric constituent transport simulation on the Cell Broadband Engine Architecture (CBEA). A function offloading approach is used in a 2D transport module, and a vector stream processing approach is used in a 3D transport module. Two methods for transferring non-contiguous data between main memory and accelerator local storage are compared. By leveraging the heterogeneous parallelism of the CBEA, the 3D transport module achieves performance comparable to two nodes of an IBM BlueGene/P, or eight Intel Xeon cores, on a single PowerXCell 8i chip. Module performance on two CBEA systems, an IBM BlueGene/P, and an eight-core shared-memory Intel Xeon workstation is given.
September 20, 2011 by hgpu