One-shot tuner for deep learning compilers
Department of Computer Science and Engineering, POSTECH, Pohang, South Korea
31st ACM SIGPLAN International Conference on Compiler Construction, 2022
@inproceedings{ryu2022one,
title={One-shot tuner for deep learning compilers},
author={Ryu, Jaehun and Park, Eunhyeok and Sung, Hyojin},
booktitle={Proceedings of the 31st ACM SIGPLAN International Conference on Compiler Construction},
pages={89--103},
year={2022}
}
Auto-tuning DL compilers are gaining ground as optimizing back-ends for DL frameworks. While existing work can generate code for deep learning models that exceeds the performance of hand-tuned libraries, it still suffers from prohibitively long auto-tuning times due to repeated hardware measurements over large search spaces. In this paper, we take a neural-predictor-inspired approach to reduce the auto-tuning overhead and show that a performance predictor model trained prior to compilation can produce optimized tensor operation code without repeated search and hardware measurements. To build a sample-efficient training dataset, we extend the input representation to include task-specific information and guide the data sampling methods to focus on learning high-performing codes. We evaluated the resulting predictor model, One-Shot Tuner, against AutoTVM and other prior work; the results show that One-Shot Tuner speeds up compilation by 2.81x to 67.7x compared to prior work while providing comparable or improved inference time for CNN and Transformer models.
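The sketch below is not the paper's code; it only illustrates the general idea the abstract describes: a performance predictor trained before compilation that scores candidate tensor-code configurations so that schedules can be chosen without repeated hardware measurements, with task-specific information appended to the per-candidate features. All names (`PerfPredictor`, `rank_candidates`) and feature dimensions are hypothetical placeholders.

```python
# Minimal sketch of an offline-trained performance predictor used as a
# compile-time cost model. Not the One-Shot Tuner implementation;
# names and feature sizes are assumptions for illustration only.
import torch
import torch.nn as nn


class PerfPredictor(nn.Module):
    """MLP mapping (schedule features, task features) -> predicted throughput."""

    def __init__(self, sched_dim: int = 64, task_dim: int = 16, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(sched_dim + task_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, sched_feats: torch.Tensor, task_feats: torch.Tensor) -> torch.Tensor:
        # Task-specific information is concatenated with per-candidate schedule
        # features, mirroring the "extended input representation" in the abstract.
        return self.net(torch.cat([sched_feats, task_feats], dim=-1)).squeeze(-1)


def rank_candidates(model: PerfPredictor,
                    sched_feats: torch.Tensor,   # [num_candidates, sched_dim]
                    task_feats: torch.Tensor,    # [task_dim]
                    top_k: int = 8) -> torch.Tensor:
    """Return indices of the top-k candidates by predicted performance.

    Only the selected candidates (or just the best one) would ever need to be
    compiled and run, avoiding exhaustive on-device measurement.
    """
    with torch.no_grad():
        scores = model(sched_feats, task_feats.expand(sched_feats.size(0), -1))
    return torch.topk(scores, k=min(top_k, scores.numel())).indices
```

In such a setup the predictor would be trained once, ahead of time, on measured (features, runtime) pairs; at compile time each new operator is reduced to a single forward pass over its candidate configurations rather than an iterative search loop.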
March 27, 2022 by hgpu