cuda-kat: The CUDA Kernel Author’s Toolkit

Eyal Rozenberg



An install-less, header-only library which is a loosely-coupled collection of utility functions and classes for writing device-side CUDA code (kernels and non-kernel functions). These let us:

* Write templated device-side code without constantly coming up against not-trivially-templatable bits.
* Use standard-library(-like) containers in device-side code (but not have to use them).
* Not repeat ourselves as much (the DRY principle).
* Use fewer magic numbers.
* Make our device-side code less cryptic and idiosyncratic, with clearer naming and semantics.

… while not committing to any particular framework, paradigm or class hierarchy – and not compromising performance.
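
To make the first and fourth points concrete, here is a minimal sketch of the kind of problem being described. It is not taken from cuda-kat: the namespace `sketch` and the names `square_root`, `hypotenuse`, `warp_size` and `lane_index_of` are hypothetical, chosen only for illustration. CUDA's C-style math API uses a different function name per type (`sqrtf` for `float`, `sqrt` for `double`), so a template cannot select the right one from its type parameter without a thin wrapper. The `HD` macro is an assumption that lets the same file build with either nvcc or a plain C++ compiler.

```cpp
// Hypothetical illustration only; none of these names are cuda-kat's API.
#include <math.h>

#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD  // plain C++ compiler: no execution-space qualifiers
#endif

namespace sketch {

// Primary template is declared but not defined: only the types
// specialized below are supported.
template <typename T> HD T square_root(T x);

// One specialization per type, each forwarding to the type-specific function.
template <> HD inline float  square_root<float>(float x)   { return sqrtf(x); }
template <> HD inline double square_root<double>(double x) { return sqrt(x); }

// Generic code is now written once and works for both precisions.
template <typename T>
HD T hypotenuse(T a, T b) { return square_root<T>(a * a + b * b); }

// A named compile-time constant instead of a literal 32 scattered
// through the code (32 is the warp size on current NVIDIA GPUs).
constexpr unsigned warp_size = 32;

// Position of a thread within its warp, given its index in the block.
HD inline unsigned lane_index_of(unsigned thread_index)
{
    return thread_index % warp_size;
}

} // namespace sketch
```

With wrappers of this kind, a kernel templated on its element type can call `sketch::hypotenuse(a, b)` without branching on the type, and index arithmetic reads as `lane_index_of(threadIdx.x)` rather than `threadIdx.x % 32`.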

Library facilities include:

* Templated versions of math functions
* GPU-enabled versions of std::array, std::span and std::tuple
* Wrapper functions for non-exposed PTX instructions
* Templated versions of PTX intrinsics
* Warp-, block- and grid-level sequence operations
* Warp-, block- and grid-level atomic mechanisms
* Effective access to shared memory
* On-device string streams and ostream-like classes
* etc.

