An experimental study of group-by and aggregation on CPU-GPU processors
School of Artificial Intelligence, Beijing Normal University, Beijing, China
Journal of Engineering and Applied Science, volume 69, 2022
@article{luan2022experimental,
title={An experimental study of group-by and aggregation on CPU-GPU processors},
author={Luan, Hua and Chang, Lei},
journal={Journal of Engineering and Applied Science},
volume={69},
number={1},
pages={1--27},
year={2022},
publisher={SpringerOpen}
}
Hash-based group-by and aggregation is a fundamental operator in database systems. Modern discrete GPUs (graphics processing units) have been considered for accelerating it. However, transferring data over the PCIe (peripheral component interconnect express) bus reduces the gains. On recent architectures, the GPU and the CPU (central processing unit) are built into the same chip, which removes this data transfer and offers new performance opportunities. Yet there has been no systematic analysis of grouping and aggregation algorithms on such architectures. In this paper, we study the behavior of various hash-based grouping and aggregation methods on coupled architectures to provide meaningful guidelines. We conduct an extensive experimental study and analysis on the CPU alone, the coupled GPU alone, and both processors together. Six dimensions are considered in analyzing the hashing methods: (1) hashing scheme, (2) hash function, (3) data size, (4) group cardinality, (5) load factor, and (6) data distribution. Two additional dimensions are also explored: (7) shared versus independent hash tables and (8) running on a single processor versus co-processing. We hope the results of our study can help database researchers choose the right direction in algorithm design and system optimization.
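For readers unfamiliar with the operator, the following is a minimal single-threaded sketch of hash-based group-by with a SUM aggregate. It is not taken from the paper: the function names `hash_group_sum` and `multiplicative_hash`, the choice of linear probing, and the particular hash constant are all assumptions made for illustration. It only shows how a hashing scheme, a hash function, the group cardinality, and the load factor enter such an algorithm.

```python
def multiplicative_hash(key):
    # Illustrative 32-bit multiplicative hash for integer keys
    # (an assumed choice, not necessarily one evaluated in the paper).
    return (key * 2654435761) & 0xFFFFFFFF


def hash_group_sum(keys, values, expected_groups, load_factor=0.5):
    """Return {key: sum of values} using an open-addressing hash table."""
    if not 0 < load_factor <= 1:
        raise ValueError("load_factor must be in (0, 1]")

    # Size the table as a power of two so that
    # expected_groups / capacity <= load_factor.
    capacity = 1
    while capacity * load_factor < max(1, expected_groups):
        capacity *= 2
    mask = capacity - 1

    slot_keys = [None] * capacity   # None marks an empty slot
    slot_sums = [0] * capacity

    for k, v in zip(keys, values):
        i = multiplicative_hash(k) & mask
        probes = 0
        # Linear probing: step to the next slot until we find this key
        # or an empty slot.
        while slot_keys[i] is not None and slot_keys[i] != k:
            i = (i + 1) & mask
            probes += 1
            if probes == capacity:
                raise RuntimeError("hash table full; increase expected_groups")
        if slot_keys[i] is None:
            slot_keys[i] = k        # first time this group is seen
        slot_sums[i] += v           # update the running aggregate

    return {k: s for k, s in zip(slot_keys, slot_sums) if k is not None}


result = hash_group_sum([3, 1, 3, 2, 1, 3], [10, 20, 30, 40, 50, 60],
                        expected_groups=3)
```

A lower `load_factor` gives a larger, sparser table with shorter probe sequences at the cost of memory. In a parallel setting, threads could either update one shared table like this with atomic operations or build independent tables that are merged afterwards; the sketch above covers neither case.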
June 26, 2022 by hgpu