Kernel-Centric Optimizations for Deep Neural Networks on GPGPU
University of California, Santa Barbara
University of California, 2024
@phdthesis{chen2024kernel,
  title={Kernel-Centric Optimizations for Deep Neural Networks on GPGPU},
  author={Chen, Zhaodong},
  year={2024},
  school={UC Santa Barbara}
}
Deep learning has achieved remarkable success across various domains, ranging from computer vision to healthcare. The General-Purpose Graphics Processing Unit (GPGPU) is one of the major driving forces behind this revolution. GPGPUs offer massive parallel computational power, enabling the training and deployment of large-scale neural networks within practical time and resource constraints. Their programmability also lets them adapt to emerging network architectures. However, in the post-Moore’s Law era, the scaling of computational power offered by GPGPUs struggles to meet the demands of novel neural networks. At the same time, existing GPGPUs are often under-utilized despite this shortage of computation power. This dissertation addresses the shortage by improving the utilization of GPGPUs when running deep learning workloads. It presents a kernel-centric optimization approach that focuses on mapping neural networks to a more efficient set of kernels (parallel functions executed on GPGPUs) to achieve better utilization. This involves optimizations at multiple levels: the algorithm level, which leverages more hardware-friendly formulations; the operator level, which harnesses the high on-chip bandwidth of GPGPUs; and the kernel implementation level, which maximizes the utilization of computational resources.
May 26, 2024 by hgpu