Portability of Fortran’s ‘do concurrent’ on GPUs

Ronald M. Caplan, Miko M. Stulajter, Jon A. Linker, Jeff Larkin, Henry A. Gabb, Shiquan Su, Ivan Rodriguez, Zachary Tschirhart, Nicholas Malaya
Predictive Science Inc., San Diego, CA USA
arXiv:2408.07843 [cs.PL] (14 Aug 2024)

@misc{caplan2024portabilityfortransdoconcurrent,
   title={Portability of Fortran's `do concurrent' on GPUs},
   author={Ronald M. Caplan and Miko M. Stulajter and Jon A. Linker and Jeff Larkin and Henry A. Gabb and Shiquan Su and Ivan Rodriguez and Zachary Tschirhart and Nicholas Malaya},
   year={2024},
   eprint={2408.07843},
   archivePrefix={arXiv},
   primaryClass={cs.PL},
   url={https://arxiv.org/abs/2408.07843}
}

There is continuing interest in using standard language constructs for accelerated computing in order to avoid (sometimes vendor-specific) external APIs. For Fortran codes, the ‘do concurrent’ (DC) loop has been successfully demonstrated on the NVIDIA platform. However, support for DC on other platforms has taken longer to implement. Recently, Intel has added DC GPU offload support to its compiler, as has HPE for AMD GPUs. In this paper, we explore the current portability of DC across GPU vendors using the in-production solar surface flux evolution code, HipFT. We discuss implementation and compilation details, including when and where directive APIs for data movement are needed or desired compared to relying on a unified memory system. We also show the performance achieved on both data center and consumer platforms.
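For readers unfamiliar with the construct, the following is a minimal sketch of a ‘do concurrent’ loop combined with optional directive-based data movement. It is not taken from HipFT or from the paper; the program name, array names, and problem size are hypothetical, and OpenACC is assumed here as the directive API. When directives are not enabled at compile time, the `!$acc` lines are treated as ordinary comments and the loop relies on whatever memory model the compiler provides (for example, unified or managed memory with a flag such as `nvfortran -stdpar=gpu`).

```fortran
program dc_axpy_sketch
  ! Illustrative only: names and sizes are assumptions, not from the paper.
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 1000000
  integer :: i
  real(dp), allocatable :: x(:), y(:)
  real(dp) :: a, err

  allocate(x(n), y(n))
  a = 2.0_dp
  x = 1.0_dp
  y = 3.0_dp

  ! Optional manual data movement (assumed OpenACC). Without OpenACC
  ! enabled, this line is a plain comment and has no effect.
  !$acc enter data copyin(x, y)

  ! Standard Fortran: iterations are declared independent, so the
  ! compiler is free to run them in parallel, including on a GPU.
  do concurrent (i = 1:n)
    y(i) = a*x(i) + y(i)
  end do

  ! Copy the result back and release device memory (again optional).
  !$acc exit data copyout(y) delete(x)

  ! Self-check: every element should equal 2*1 + 3 = 5.
  err = maxval(abs(y - 5.0_dp))
  if (err > 1.0e-12_dp) error stop "unexpected result"
  print *, "max error:", err

  deallocate(x, y)
end program dc_axpy_sketch
```

The loop itself contains no vendor-specific syntax; only the two directive lines would need to change (or be dropped) when moving between compilers, which is the trade-off the abstract refers to.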