Tutoring LLM into a Better CUDA Optimizer

Matyáš Brabec, Jiří Klepl, Michal Töpfer, Martin Kruliš
Charles University, Malostranské náměstí 25, 118 00 Praha 1, Czech Republic
arXiv:2510.16933 [cs.DC] (19 Oct 2025)

@inbook{Brabec_2025,
   title={Tutoring LLM into a Better CUDA Optimizer},
   ISBN={9783031998577},
   ISSN={1611-3349},
   url={http://dx.doi.org/10.1007/978-3-031-99857-7_18},
   DOI={10.1007/978-3-031-99857-7_18},
   booktitle={Euro-Par 2025: Parallel Processing},
   publisher={Springer Nature Switzerland},
   author={Brabec, Matyáš and Klepl, Jiří and Töpfer, Michal and Kruliš, Martin},
   year={2025},
   month={aug},
   pages={250--263}
}

Recent leaps in large language models (LLMs) have caused a revolution in programming tools (such as GitHub Copilot) that can help with code generation, debugging, and even performance optimization. In this paper, we focus on the capabilities of the most recent reasoning models to generate optimized CUDA code for predefined, well-known tasks. Our objective is to determine which types of code optimizations and parallel patterns the LLMs can perform by themselves and whether they can be improved by tutoring (providing more detailed hints and guidelines in the prompt). The generated solutions were evaluated both automatically (for correctness and speedup) and manually (code reviews) to provide a more detailed perspective. We also tried an interactive approach in which the LLM can fix its previous mistakes within a session. The results indicate that LLMs are quite skilled coders; however, they require tutoring to match the optimized solutions produced by parallel computing experts.
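
For illustration only (this sketch is not taken from the paper), the fragment below shows the kind of transformation an expert "tutoring" hint might steer a model toward: replacing a naive CUDA matrix-multiplication kernel with a shared-memory tiled variant. The kernel names, the TILE size, and the square-matrix assumption are all hypothetical.

// Illustrative sketch, not the paper's code: a textbook CUDA optimization
// of the kind a tutoring prompt might request.
#define TILE 16

// Naive version: each thread reads A and B directly from global memory,
// so every element is fetched repeatedly across the thread block.
__global__ void matmul_naive(const float* A, const float* B, float* C, int N) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < N && col < N) {
        float acc = 0.0f;
        for (int k = 0; k < N; ++k)
            acc += A[row * N + k] * B[k * N + col];
        C[row * N + col] = acc;
    }
}

// Tiled version: each block stages TILE x TILE sub-matrices of A and B in
// shared memory, reducing global-memory traffic by a factor of TILE.
__global__ void matmul_tiled(const float* A, const float* B, float* C, int N) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];
    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;
    for (int t = 0; t < (N + TILE - 1) / TILE; ++t) {
        int aCol = t * TILE + threadIdx.x;
        int bRow = t * TILE + threadIdx.y;
        // Zero-pad out-of-range elements so the inner loop needs no branches.
        As[threadIdx.y][threadIdx.x] = (row < N && aCol < N) ? A[row * N + aCol] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] = (bRow < N && col < N) ? B[bRow * N + col] : 0.0f;
        __syncthreads();
        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();
    }
    if (row < N && col < N)
        C[row * N + col] = acc;
}

Shared-memory tiling of this sort is one of the well-known optimizations whose effect would show up directly in the automatic speedup evaluation the abstract describes; whether a model applies it unprompted or only after tutoring is exactly the distinction the paper studies.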