Benchmarking optimization algorithms for auto-tuning GPU kernels
Computational Imaging Group, Centrum Wiskunde & Informatica, Amsterdam, Netherlands
arXiv:2210.01465 [cs.DC], (4 Oct 2022)
@article{Schoonhoven_2022,
doi={10.1109/tevc.2022.3210654},
url={https://doi.org/10.1109%2Ftevc.2022.3210654},
year={2022},
publisher={Institute of Electrical and Electronics Engineers ({IEEE})},
pages={1--1},
author={Richard Schoonhoven and Ben van Werkhoven and K. Joost Batenburg},
title={Benchmarking optimization algorithms for auto-tuning {GPU} kernels},
journal={IEEE Transactions on Evolutionary Computation}
}
Recent years have witnessed phenomenal growth in the application and capabilities of Graphics Processing Units (GPUs), owing to their high parallel computation power at relatively low cost. However, writing a computationally efficient GPU program (kernel) is challenging, and generally only certain specific kernel configurations lead to significant increases in performance. Auto-tuning is the process of automatically optimizing software for highly efficient execution on a target hardware platform. It is particularly useful for GPU programming, as a single kernel requires re-tuning after code changes, for different input data, and for different architectures. However, the discrete and non-convex nature of the search space creates a challenging optimization problem. In this work, we investigate which algorithm produces the fastest kernels when the time budget for the tuning task is varied. We conduct a survey by performing experiments on 26 different kernel spaces, from 9 different GPUs, for 16 different evolutionary black-box optimization algorithms. We then analyze these results and introduce a novel metric based on the PageRank centrality concept as a tool for gaining insight into the difficulty of the optimization problem. We demonstrate that our metric correlates strongly with observed tuning performance.
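To make the PageRank idea concrete, the following is a minimal sketch and not the authors' definition of their metric, which the abstract does not spell out. It assumes a toy discrete search space with two hypothetical tuning parameters (a block size and a tile size), a synthetic runtime table, and a graph with a directed edge from each configuration to every strictly faster configuration that differs in exactly one parameter. PageRank computed on such a graph indicates how much "flow" ends up in each local minimum, which gives a rough sense of how easily a local search could get trapped.

```python
from itertools import product

# Hypothetical tuning parameters: block size and tile size (illustrative only).
SPACE = [(32, 64, 128), (1, 2, 4)]

# Synthetic runtimes in milliseconds; these are made-up numbers, not measurements.
TOY_RUNTIME_MS = {
    (32, 1): 5.0, (32, 2): 4.0, (32, 4): 1.5,
    (64, 1): 3.0, (64, 2): 6.0, (64, 4): 4.5,
    (128, 1): 1.0, (128, 2): 5.5, (128, 4): 7.0,
}


def neighbours(config, space):
    """Yield configurations that differ from `config` in exactly one parameter."""
    for i, values in enumerate(space):
        for v in values:
            if v != config[i]:
                yield config[:i] + (v,) + config[i + 1:]


def build_descent_graph(space, runtime):
    """Map each configuration to the list of its strictly faster neighbours."""
    return {
        c: [n for n in neighbours(c, space) if runtime(n) < runtime(c)]
        for c in product(*space)
    }


def pagerank(graph, damping=0.85, iterations=200):
    """Plain power-iteration PageRank on a dict-of-lists directed graph."""
    nodes = list(graph)
    n = len(nodes)
    rank = {c: 1.0 / n for c in nodes}
    for _ in range(iterations):
        # Nodes without outgoing edges (local minima) spread their mass uniformly.
        dangling = sum(rank[c] for c in nodes if not graph[c])
        new_rank = {c: (1.0 - damping) / n + damping * dangling / n for c in nodes}
        for c in nodes:
            for target in graph[c]:
                new_rank[target] += damping * rank[c] / len(graph[c])
        rank = new_rank
    return rank


graph = build_descent_graph(SPACE, TOY_RUNTIME_MS.get)
rank = pagerank(graph)

# Local minima are configurations with no strictly faster neighbour.
minima = [c for c, faster in graph.items() if not faster]
global_minimum = min(TOY_RUNTIME_MS, key=TOY_RUNTIME_MS.get)

# Share of total PageRank mass that sits on local minima, and on the best one.
share_on_minima = sum(rank[c] for c in minima)
share_on_global = rank[global_minimum]

print("local minima:", minima)
print("PageRank mass on local minima:", round(share_on_minima, 3))
print("PageRank mass on global minimum:", round(share_on_global, 3))
```

In this toy space, a larger share of mass on the global minimum relative to the other local minima would suggest an easier tuning problem. How the paper actually aggregates centrality into a difficulty score should be taken from the article itself.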
October 9, 2022 by hgpu