Column-Oriented Datalog on the GPU

Yihao Sun, Sidharth Kumar, Thomas Gilray, Kristopher Micinski

Syracuse University

arXiv:2501.13051 [cs.DB], (22 Jan 2025)

DOI:10.48550/arXiv.2501.13051

@misc{sun2025columnorienteddataloggpu,
  title={Column-Oriented Datalog on the GPU},
  author={Yihao Sun and Sidharth Kumar and Thomas Gilray and Kristopher Micinski},
  year={2025},
  eprint={2501.13051},
  archivePrefix={arXiv},
  primaryClass={cs.DB},
  url={https://arxiv.org/abs/2501.13051}
}

Source code package: VFLog (vertical + GPU + free join + Datalog)

Datalog is a logic programming language widely used in knowledge representation and reasoning (KRR), program analysis, and social media mining because of its expressiveness and high performance. Datalog engines have traditionally used either row-oriented or column-oriented storage. Engines such as VLog and Nemo favor column-oriented storage for efficiency on resource-limited machines, while row-oriented engines such as Souffle use advanced data structures with locking to perform better on multi-core CPUs. Modern datacenter GPUs, such as the NVIDIA H100 with its high memory bandwidth and its ability to run over 16k threads simultaneously, have reopened the debate over which storage layout is more effective. This paper presents the first column-oriented Datalog engine tailored to the strengths of modern GPUs. We present VFLog, a CUDA-based Datalog runtime library with a column-oriented GPU data structure that supports all necessary relational algebra operations. Our results demonstrate over 200x performance gains over state-of-the-art CPU-based column-oriented Datalog engines and a 2.5x speedup over GPU Datalog engines on various workloads, including KRR.
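To make the storage distinction concrete, the following is a minimal CPU-side sketch in Python of a column-oriented binary relation and a semi-naive evaluation of transitive closure (path(x, y) :- edge(x, y). path(x, z) :- path(x, y), edge(y, z).). It is not VFLog's implementation or API: the names `make_relation`, `rows_of`, `join_path_edge`, and `transitive_closure` are hypothetical, the hash index stands in for whatever GPU data structure the paper uses, and the loop runs sequentially rather than across GPU threads.

```python
from collections import defaultdict


def make_relation(rows):
    """Store a binary relation column-wise: one list per attribute (assumed layout)."""
    rows = sorted(set(rows))  # deduplicate and sort for a deterministic order
    return {"c0": [r[0] for r in rows], "c1": [r[1] for r in rows]}


def rows_of(rel):
    """Reassemble tuples from the two columns."""
    return list(zip(rel["c0"], rel["c1"]))


def join_path_edge(delta, edge_index):
    """Join delta(x, y) with edge(y, z) on y and project (x, z)."""
    out = []
    # Walk the two columns in lockstep; each position is one logical tuple.
    for x, y in zip(delta["c0"], delta["c1"]):
        for z in edge_index.get(y, ()):
            out.append((x, z))
    return out


def transitive_closure(edges):
    """Semi-naive evaluation: only newly derived tuples are joined each round."""
    edge = make_relation(edges)
    # Index edge on its first column so the join can look up matches by key.
    edge_index = defaultdict(list)
    for y, z in zip(edge["c0"], edge["c1"]):
        edge_index[y].append(z)
    full = set(rows_of(edge))  # every path tuple derived so far
    delta = edge               # tuples that were new in the previous round
    while delta["c0"]:
        new = set(join_path_edge(delta, edge_index)) - full
        full |= new
        delta = make_relation(new)
    return make_relation(full)


closure = transitive_closure([(1, 2), (2, 3), (3, 4)])
print(rows_of(closure))
# prints [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

Keeping each attribute in its own contiguous array is what lets a join read only the columns it needs. On a GPU, the per-tuple loop in `join_path_edge` is the kind of work that would be spread across threads, but that mapping is an assumption here and is not taken from the paper.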

Tags: Computer science, CUDA, Databases, nVidia, nVidia H100, Package

January 27, 2025 by hgpu
