GPU Acceleration of SQL Analytics on Compressed Data

Zezhou Huang, Krystian Sakowski, Hans Lehnert, Wei Cui, Carlo Curino, Matteo Interlandi, Marius Dumitru, Rathijit Sen
Microsoft
arXiv:2506.10092 [cs.DB], (11 Jun 2025)

@misc{huang2025gpuaccelerationsqlanalytics,
   title={GPU Acceleration of SQL Analytics on Compressed Data},
   author={Zezhou Huang and Krystian Sakowski and Hans Lehnert and Wei Cui and Carlo Curino and Matteo Interlandi and Marius Dumitru and Rathijit Sen},
   year={2025},
   eprint={2506.10092},
   archivePrefix={arXiv},
   primaryClass={cs.DB},
   url={https://arxiv.org/abs/2506.10092}
}


GPUs are uniquely suited to accelerate (SQL) analytics workloads thanks to their massive compute parallelism and High Bandwidth Memory (HBM); when datasets fit in GPU HBM, performance is unparalleled. Unfortunately, GPU HBM capacity typically remains small compared with lower-bandwidth CPU main memory. Besides brute-force scaling across many GPUs, current solutions for accelerating queries on large datasets include partitioning the data and loading smaller batches into GPU HBM, and hybrid execution with a connected device (e.g., CPUs). Unfortunately, these approaches are exposed to the lower bandwidths of main memory and host-to-device interconnects, introduce additional I/O overheads, or incur higher costs. This is a substantial obstacle to scaling GPU adoption to larger datasets. Data compression can alleviate this bottleneck, but to avoid paying for costly decompression/decoding, an ideal solution must include computation primitives that operate directly on data in compressed form.
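To make the last point concrete, here is a minimal CPU-side sketch of aggregating over a column without decompressing it. It assumes run-length encoding as the compression scheme, which is an illustrative choice and not something this excerpt specifies; the function names (`rle_encode`, `rle_sum`, `rle_count_greater`) are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# A run is (value, run_length). RLE is an assumed scheme for illustration only.
Run = Tuple[int, int]


def rle_encode(values: List[int]) -> List[Run]:
    """Collapse consecutive equal values into (value, run_length) pairs."""
    runs: List[Run] = []
    for v in values:
        if runs and runs[-1][0] == v:
            # Extend the current run by one element.
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            # Start a new run.
            runs.append((v, 1))
    return runs


def rle_sum(runs: List[Run]) -> int:
    """SUM(col) computed per run: each run contributes value * length."""
    return sum(value * length for value, length in runs)


def rle_count_greater(runs: List[Run], threshold: int) -> int:
    """COUNT(*) WHERE col > threshold: the predicate is tested once per run."""
    return sum(length for value, length in runs if value > threshold)


column = [5, 5, 5, 5, 2, 2, 9, 9, 9]
runs = rle_encode(column)                 # [(5, 4), (2, 2), (9, 3)]
total = rle_sum(runs)                     # same result as sum(column)
matching = rle_count_greater(runs, 4)     # rows with value > 4
```

The work here scales with the number of runs instead of the number of rows, which is the reason for operating on the compressed form; a GPU implementation would distribute the runs across threads, which this sketch does not attempt.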

