The OpenMP Cluster Programming Model

Hervé Yviquel, Marcio Pereira, Emílio Francesquini, Guilherme Valarini, Gustavo Leite, Pedro Rosso, Rodrigo Ceccato, Carla Cusihualpa, Vitoria Dias, Sandro Rigo, Alan Sousa, Guido Araujo
Institute of Computing, University of Campinas, UNICAMP, Brazil
arXiv:2207.05677 [cs.DC], (12 Jul 2022)

@article{yviquel2022openmp,
   title={The OpenMP Cluster Programming Model},
   author={Yviquel, Hervé and Pereira, Marcio and Francesquini, Emílio and Valarini, Guilherme and Leite, Gustavo and Rosso, Pedro and Ceccato, Rodrigo and Cusihualpa, Carla and Dias, Vitoria and Rigo, Sandro and Sousa, Alan and Araujo, Guido},
   year={2022}
}

Despite various research initiatives and proposed programming models, efficient solutions for parallel programming on HPC clusters still rely on a complex combination of different programming models (e.g., OpenMP and MPI), languages (e.g., C++ and CUDA), and specialized runtimes (e.g., Charm++ and Legion). Task parallelism, on the other hand, has been shown to be an efficient and seamless programming model for clusters. This paper introduces OpenMP Cluster (OMPC), a task-parallel model that extends OpenMP for cluster programming. OMPC leverages OpenMP’s offloading standard to distribute annotated regions of code across the nodes of a distributed system. To achieve this, it hides MPI-based data distribution and load-balancing mechanisms behind OpenMP task dependencies. Because it complies with OpenMP, OMPC allows applications to use the same programming model to exploit both intra- and inter-node parallelism, thus simplifying development and maintenance. We evaluated OMPC using Task Bench, a synthetic benchmark focused on task parallelism, comparing its performance against other distributed runtimes. Experimental results show that OMPC can deliver up to 1.53x and 2.43x better performance than Charm++ on CCR and scalability experiments, respectively. Experiments also show that OMPC performance weakly scales for both Task Bench and a real-world seismic imaging application.
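
The annotation style the abstract describes can be sketched with standard OpenMP directives alone. The example below is a minimal illustration, not code from the paper: the function name `pipeline_sum` and the three-stage pipeline are hypothetical. It uses `target nowait` to turn each region into an asynchronous offloaded task, `map` to declare which data the region reads and writes, and `depend` to order the tasks. Under a runtime such as OMPC, these are presumably the annotations from which data movement and scheduling across nodes would be derived; with an ordinary compiler and no offload device, the same code falls back to the host and produces the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration only; not part of OMPC or the paper.
 * Runs a three-stage task pipeline over data[0..n) and returns the sum. */
long pipeline_sum(int *data, size_t n)
{
    long sum = 0;

    assert(data != NULL || n == 0);

    #pragma omp parallel
    #pragma omp single
    {
        /* Stage 1: produce the array inside an offloaded task. */
        #pragma omp target nowait map(from: data[0:n]) depend(out: data[0:n])
        {
            for (size_t i = 0; i < n; i++)
                data[i] = (int)i;
        }

        /* Stage 2: double every element; ordered after stage 1 by the
         * inout dependence on the same array section. */
        #pragma omp target nowait map(tofrom: data[0:n]) depend(inout: data[0:n])
        {
            for (size_t i = 0; i < n; i++)
                data[i] *= 2;
        }

        /* Stage 3: reduce; reads the array, so it waits for stage 2.
         * sum is mapped explicitly because scalars are firstprivate
         * in target regions by default. */
        #pragma omp target nowait map(to: data[0:n]) map(tofrom: sum) depend(in: data[0:n])
        {
            for (size_t i = 0; i < n; i++)
                sum += data[i];
        }

        /* Wait for all three deferred target tasks before leaving. */
        #pragma omp taskwait
    }
    return sum;
}
```

Calling `pipeline_sum(buf, 4)` on a four-element buffer leaves `buf` holding 0, 2, 4, 6 and returns their sum. The code can be built with any OpenMP-capable C compiler (for example with `-fopenmp` on GCC or Clang); how to build and launch it with the OMPC toolchain is not covered here.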

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
