
Mojo: MLIR-Based Performance-Portable HPC Science Kernels on GPUs for the Python Ecosystem

William F. Godoy, Tatiana Melnichenko, Pedro Valero-Lara, Wael Elwasif, Philip Fackler, Rafael Ferreira Da Silva, Keita Teranishi, Jeffrey S. Vetter
Oak Ridge National Laboratory, Oak Ridge, TN, USA
arXiv:2509.21039 [cs.DC], 25 Sep 2025

@misc{godoy2025mojo,
   title={Mojo: MLIR-Based Performance-Portable HPC Science Kernels on GPUs for the Python Ecosystem},
   author={William F. Godoy and Tatiana Melnichenko and Pedro Valero-Lara and Wael Elwasif and Philip Fackler and Rafael Ferreira Da Silva and Keita Teranishi and Jeffrey S. Vetter},
   year={2025},
   eprint={2509.21039},
   archivePrefix={arXiv},
   primaryClass={cs.DC}
}

We explore the performance and portability of the novel Mojo language for scientific computing workloads on GPUs. As the first language built on LLVM’s Multi-Level Intermediate Representation (MLIR) compiler infrastructure, Mojo aims to close performance and productivity gaps by combining Python interoperability with CUDA-like syntax for compile-time portable GPU programming. We target four scientific workloads: a seven-point stencil (memory-bound), BabelStream (memory-bound), miniBUDE (compute-bound), and Hartree-Fock (compute-bound with atomic operations); and compare their performance against vendor baselines on NVIDIA H100 and AMD MI300A GPUs. We show that Mojo’s performance is competitive with CUDA and HIP for memory-bound kernels, whereas gaps exist on AMD GPUs for atomic operations and for fast-math compute-bound kernels on both AMD and NVIDIA GPUs. Although the learning curve is still significant and the programming model remains fairly low-level, Mojo can close significant gaps in the fragmented Python ecosystem at the convergence of scientific computing and AI.
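For readers unfamiliar with the first workload: a seven-point stencil updates each interior point of a 3D grid from its value and its six axis-aligned neighbors, which makes it bandwidth- rather than compute-limited. The following pure-Python sketch illustrates the kernel class being benchmarked; the averaging coefficients, flat-array layout, and boundary handling are assumptions for illustration, not the paper's exact Mojo/CUDA/HIP kernel.

```python
def stencil7(u, n):
    """One Jacobi-style sweep of a 7-point stencil on an n*n*n grid.

    `u` is a flat list of length n**3 in row-major (i, j, k) order.
    Boundary points are copied unchanged; interior points become the
    average of themselves and their six axis neighbors (an assumed
    coefficient choice, not necessarily the paper's).
    """
    def idx(i, j, k):
        return (i * n + j) * n + k

    v = list(u)  # start from a copy so boundaries carry over
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            for k in range(1, n - 1):
                v[idx(i, j, k)] = (
                    u[idx(i - 1, j, k)] + u[idx(i + 1, j, k)] +
                    u[idx(i, j - 1, k)] + u[idx(i, j + 1, k)] +
                    u[idx(i, j, k - 1)] + u[idx(i, j, k + 1)] +
                    u[idx(i, j, k)]
                ) / 7.0
    return v

# A uniform field is a fixed point of the averaging sweep.
n = 6
out = stencil7([1.0] * n**3, n)
print(max(abs(x - 1.0) for x in out))  # 0.0
```

On a GPU, the three nested loops map to one thread per interior grid point; the arithmetic per point is trivial (seven loads, one store), which is why the paper classifies it as memory-bound.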
