UPC on MIC: Early Experiences with Native and Symmetric Modes
Department of Computer Science and Engineering, The Ohio State University
@article{luoupc,
title={UPC on MIC: Early Experiences with Native and Symmetric Modes},
author={Luo, Miao and Li, Mingzhe and Venkatesh, Akshay and Lu, Xiaoyi and Panda, Dhabaleswar K.}
}
The Intel Many Integrated Core (MIC) architecture is steadily being adopted in clusters owing to its high compute throughput and power efficiency. The current-generation MIC coprocessor, Xeon Phi, provides a highly multi-threaded environment with support for multiple programming models. While regular programming models such as MPI/OpenMP have started utilizing systems with MIC coprocessors, it is still not clear whether PGAS models can easily adopt and fully utilize such systems. In this paper, we discuss several ways of running UPC applications on the MIC architecture in native and symmetric programming modes. These include the choice of a process-based or thread-based UPC runtime for native mode, and different communication channels between the MIC and the host for symmetric mode. We propose a thread-based UPC runtime with an improved “leader-to-all” connection scheme over InfiniBand and SCIF [3] through multi-endpoint support. For native mode, we evaluate point-to-point and collective micro-benchmarks, Global Array Random Access, UTS, and NAS benchmarks. For symmetric mode, we evaluate the communication performance between the host and the MIC within a single node. Through our evaluations, we explore the effects of scaling UPC threads on the MIC and highlight the bottlenecks (up to 10X degradation) in UPC communication routines that arise from the per-core processing and memory limitations of the MIC. To the best of our knowledge, this is the first paper that evaluates the UPC programming model on MIC systems.
October 9, 2013 by hgpu