Parallelization and Performance of the NIM Weather Model on CPU, GPU and MIC Processors
NOAA Earth System Research Laboratory, Global Systems Division, 325 Broadway, Boulder, Colorado 80305
Bulletin of the American Meteorological Society, 2017
@article{govett2017parallelization,
title={Parallelization and Performance of the NIM Weather Model on CPU, GPU and MIC Processors},
author={Govett, Mark and Rosinski, Jim and Middlecoff, Jacques and Henderson, Tom and Lee, Jin and MacDonald, Alexander and Wang, Ning and Madden, Paul and Schramm, Julie and Duarte, Antonio},
journal={Bulletin of the American Meteorological Society},
number={2017},
year={2017}
}
Next-generation supercomputers containing millions of processors will require that weather prediction models be designed and developed by teams of scientists, software engineers, and parallelization experts so that they are portable and run efficiently on increasingly diverse HPC systems. The design and performance of the NIM global weather prediction model are described. NIM is a dynamical core designed to run on CPU, GPU and MIC processors. It demonstrates efficient parallel performance and scalability to tens of thousands of compute nodes, and has been an effective way to make comparisons between traditional CPU and emerging fine-grain processors. The design of NIM also serves as a useful guide in the fine-grain parallelization of the FV3 model recently chosen by the NWS to become its next operational global weather prediction model. This paper describes the code structure and parallelization of NIM using standards-compliant OpenMP and OpenACC directives. NIM uses the directives to support a single, performance-portable code that runs on CPU, GPU and MIC systems. Performance results are compared for five generations of computer chips, including the recently released Intel Knights Landing and NVIDIA Pascal chips. Single-node and multi-node performance and scalability are also shown, along with a cost-benefit comparison based on vendor list prices.
July 19, 2017 by hgpu