Towards GPU Parallelism Abstractions in Rust: A Case Study with Linear Pipelines

Leonardo Gibrowski Faé, Dalvan Griebler
School of Technology, Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, Brazil
Simpósio Brasileiro de Linguagens de Programação (SBLP), 29, 2025

@inproceedings{fae2025towards,
   title={Towards GPU Parallelism Abstractions in Rust: A Case Study with Linear Pipelines},
   author={Fa{\'e}, Leonardo Gibrowski and Griebler, Dalvan},
   booktitle={Simp{\'o}sio Brasileiro de Linguagens de Programa{\c{c}}{\~a}o (SBLP)},
   pages={75--83},
   year={2025},
   organization={SBC}
}

Programming Graphics Processing Units (GPUs) for general-purpose computation remains a daunting task, often requiring specialized knowledge of low-level APIs like CUDA or OpenCL. While Rust has emerged as a modern, safe, and performant systems programming language, its adoption in the GPU computing domain is still nascent. Existing approaches often involve intricate compiler modifications or complex static analysis to adapt CPU-centric Rust code for GPU execution. This paper presents a novel high-level abstraction in Rust, leveraging procedural macros to automatically generate GPU-executable code from constrained Rust functions. Our approach simplifies the code generation process by imposing specific limitations on how these functions can be written, thereby avoiding the need for complex static analysis. We demonstrate the feasibility and effectiveness of our abstraction through a case study involving linear pipeline parallel patterns, a common structure in data-parallel applications. By transforming Rust functions annotated as source, stage, or sink in a pipeline, we enable straightforward execution on the GPU. We evaluate our abstraction's performance and programmability using two benchmark applications: sobel (image filtering) and latbol (fluid simulation), comparing them against manual OpenCL implementations. Our results indicate that, while incurring a small performance overhead in some cases, our approach significantly reduces development effort and, in certain scenarios, achieves comparable or even superior throughput relative to CPU-based parallelism.
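The paper's actual procedural-macro API is not reproduced here, but the source/stage/sink structure it targets can be illustrated with a minimal, CPU-only sketch in plain Rust: a source produces elements, each stage applies a pure element-wise transformation (the kind of constrained function amenable to GPU kernel generation), and a sink consumes the results. All function names below are illustrative, not taken from the paper.

```rust
// Hypothetical sketch of a linear pipeline (source -> stage -> sink),
// the parallel pattern the paper's abstraction maps onto the GPU.
// Expressed here with ordinary Rust iterators for illustration only.

/// Source: produces a stream of values.
fn source(n: usize) -> impl Iterator<Item = f32> {
    (0..n).map(|i| i as f32)
}

/// Stage: a pure, element-wise transformation. Constraining stages to
/// side-effect-free functions like this is what makes straightforward
/// kernel generation possible without deep static analysis.
fn stage(x: f32) -> f32 {
    x * x + 1.0
}

/// Sink: consumes the stream and produces a final result.
fn sink(values: impl Iterator<Item = f32>) -> f32 {
    values.sum()
}

fn main() {
    // 0^2+1 + 1^2+1 + 2^2+1 + 3^2+1 = 1 + 2 + 5 + 10 = 18
    let result = sink(source(4).map(stage));
    println!("{result}"); // prints 18
}
```

In the paper's approach, annotations on such functions would drive macro expansion into GPU-executable code; here the composition runs sequentially on the CPU purely to show the shape of the pattern.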

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
