29240

More Bang For Your Buck(et): Fast and Space-efficient Hardware-accelerated Coarse-granular Indexing on GPUs

Justus Henneberg, Felix Schuhknecht, Rosina Kharal, Trevor Brown
Johannes Gutenberg University Mainz, Germany
arXiv:2406.03965 [cs.DB], (6 Jun 2024)

@misc{henneberg2024bang,

   title={More Bang For Your Buck(et): Fast and Space-efficient Hardware-accelerated Coarse-granular Indexing on GPUs},

   author={Justus Henneberg and Felix Schuhknecht and Rosina Kharal and Trevor Brown},

   year={2024},

   eprint={2406.03965},

   archivePrefix={arXiv},

   primaryClass={cs.DB}

}

Download Download (PDF)   View View   Source Source   

697

views

In recent work, we have shown that NVIDIA’s raytracing cores on RTX video cards can be exploited to realize hardware-accelerated lookups for GPU-resident database indexes. On a high level, the concept materializes all keys as triangles in a 3D scene and indexes them. Lookups are performed by firing rays into the scene and utilizing the index structure to detect hits in a hardware-accelerated fashion. While this approach called RTIndeX (or short RX) is indeed promising, it currently suffers from three limitations: (1) significant memory overhead per key, (2) slow range-lookups, and (3) poor updateability. In this work, we show that all three problems can be tackled by a single design change: Generalizing RX to become a coarse-granular index cgRX. Instead of indexing individual keys, cgRX indexes buckets of keys which are post-filtered after retrieval. This drastically reduces the memory overhead, leads to the generation of a smaller and more efficient index structure, and enables fast range-lookups as well as updates. We will see that representing the buckets in the 3D space such that the lookup of a key is performed both correctly and efficiently requires the careful orchestration of firing rays in a specific sequence. Our experimental evaluation shows that cgRX offers the most bang for the buck(et) by providing a throughput in relation to the memory footprint that is 1.5-3x higher than for the comparable range-lookup supporting baselines. At the same time, cgRX improves the range-lookup performance over RX by up to 2x and offers practical updateability that is up to 5.5x faster than rebuilding from scratch.
No votes yet.
Please wait...

* * *

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: