Using modern C++ to improve CUDA programs
University of California, Davis
University of California, Davis, 2024
@mastersthesis{kuricheti2024using,
title={Using modern C++ to improve CUDA programs},
author={Kuricheti, Mythreya},
year={2024},
school={University of California, Davis}
}
The classic style of writing and porting HPC applications to the GPU passes pointers to buffers or data structures as kernel parameters. This style discards type information: CPU-side data structures must be “flattened” before they can be used as kernel parameters and then reconstructed in GPU code to retain flexibility. In this thesis, we identify several major problems in the porting process, including the lack of vectors or views into a GPU buffer, bounds checking, iterator support, macro-dependent function specialization on the GPU, and GPU allocators for arbitrary types. CUDA already supports all of these features in kernel code, but programmers are generally unable to use them because data structures decay to pointers in kernel invocations. We demonstrate these problems and present techniques to overcome them in an implementation in C++ and CUDA. We use modern C++ features to make CPU-side features (such as iterators, ranged-for loops, and bounds checking) first-class citizens in GPU kernel code while maintaining interoperability with existing libraries. The result is a new ability to use CPU-style coding patterns in GPU kernel code. We demonstrate that our abstractions generate assembly that is as good as that of the classical implementations. As a case study, we use the library to simplify porting the shallow-water simulation framework “HEC-RAS” to the GPU.
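To make the idea of a typed view concrete, here is a minimal sketch of a non-owning view over a contiguous buffer that keeps its length, offers bounds-checked indexing, and supports ranged-for loops. It is not the thesis's library: the names `BufferView`, `sum_view`, `scale_view`, and the `VIEW_HD` macro are hypothetical and chosen only for illustration. The sketch assumes that, under `nvcc`, the members would be marked `__host__ __device__`; with an ordinary C++ compiler the macro expands to nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: under nvcc, mark functions callable from host and device.
// With a plain C++ compiler the macro expands to nothing.
#if defined(__CUDACC__)
#define VIEW_HD __host__ __device__
#else
#define VIEW_HD
#endif

// Hypothetical non-owning view: a pointer plus a length, so the size
// travels with the data instead of being passed as a separate argument.
template <typename T>
class BufferView {
public:
    VIEW_HD BufferView(T* data, std::size_t size) : data_(data), size_(size) {}

    VIEW_HD std::size_t size() const { return size_; }

    // Bounds-checked access; the check is compiled out when NDEBUG is defined.
    VIEW_HD T& operator[](std::size_t i) const {
        assert(i < size_ && "BufferView index out of range");
        return data_[i];
    }

    // Raw pointers act as iterators, which is enough for ranged-for loops.
    VIEW_HD T* begin() const { return data_; }
    VIEW_HD T* end() const { return data_ + size_; }

private:
    T* data_;
    std::size_t size_;
};

// Sum all elements using a ranged-for loop over the view.
template <typename T>
VIEW_HD T sum_view(BufferView<T> view) {
    T total{};
    for (const T& x : view) total += x;
    return total;
}

// Multiply every element in place through the view.
template <typename T>
VIEW_HD void scale_view(BufferView<T> view, T factor) {
    for (T& x : view) x *= factor;
}
```

On the host, a view could be built from a `std::vector` with `BufferView<int>(v.data(), v.size())`. In a CUDA build, the same small object could be passed by value as a kernel parameter wrapping a device pointer, so the kernel receives the length and element type together with the data rather than a bare pointer.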
October 27, 2024 by hgpu