Analysis of High Level implementations for Recursive Methods on GPUs

Cheng Cao, Justin Kalloor
University of California, Berkeley, CA
CS262a: Fall 2021 Final Projects, 2021








Higher-level DSLs have enabled performant computation on GPUs while providing enough abstraction to spare the user significant deployment overhead. However, the SIMD/SIMT programming model can still suffer unexpected performance drops when CPU code is translated naively. One example is branch divergence, which recursive methods exacerbate because recursion depth can vary greatly between threads. This paper investigates ways to enable recursion and task-oriented programming using the Taichi DSL. We first present different methods of accomplishing this and benchmark each. Using Taichi’s multiple back-end code-generation targets, we investigate the performance of recursive tasks on different backends. We compile these results into a final model that automatically chooses the best implementation for a given user program. In our benchmarks, we see a large improvement in throughput over the naive implementation, up to 500%.
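To make the divergence problem concrete, here is a minimal sketch in plain Python. It is not Taichi code and not one of the paper's implementations. It replaces recursive calls with an explicit stack (one common way to express recursion where native recursion is unavailable) and then estimates how much work is wasted if a group of lanes runs in lockstep until the slowest one finishes. The function names `fib_with_stack` and `lockstep_utilization`, the Fibonacci workload, and the lockstep cost model are all illustrative assumptions.

```python
# Illustrative only: a simplified model of lockstep (SIMT-style) execution.
# Assumption: every lane in a group stays occupied until the slowest lane ends.

def fib_with_stack(n):
    """Compute Fibonacci(n) using an explicit stack instead of recursive calls.

    Returns (value, steps), where steps counts loop iterations, which equals
    the number of calls the naive recursive version would make.
    """
    stack = [n]
    total = 0
    steps = 0
    while stack:
        k = stack.pop()
        steps += 1
        if k < 2:
            total += k            # base case contributes its own value
        else:
            stack.append(k - 1)   # defer the two "recursive calls"
            stack.append(k - 2)
    return total, steps


def lockstep_utilization(inputs):
    """Fraction of lane-steps doing useful work under the lockstep assumption."""
    steps = [fib_with_stack(n)[1] for n in inputs]
    return sum(steps) / (max(steps) * len(steps))


if __name__ == "__main__":
    print(lockstep_utilization([5, 5, 5, 5]))  # uniform depth across lanes
    print(lockstep_utilization([1, 5]))        # uneven depth across lanes
```

In this model, lanes with equal recursion depth keep the group fully busy, while a mix of shallow and deep inputs leaves the shallow lanes idle for most of the run. Real GPU behaviour depends on the hardware and the compiler back end, so this should be read only as intuition for why depth variation hurts.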
