General-Purpose Computing on Tensor Processors
University of California, Riverside
University of California, Riverside, 2024
@phdthesis{hsu2024general,
  title  = {General-Purpose Computing on Tensor Processors},
  author = {Hsu, Kuan-Chieh},
  year   = {2024},
  school = {University of California, Riverside}
}
Modern computer systems have become heterogeneous, incorporating many emerging kinds of hardware accelerators as Dennard scaling ends. Such domain-specific hardware accelerators also fulfill the rapidly growing computing demands of applications including artificial intelligence (AI) and machine learning (ML). Beyond conventional components such as central processing units (CPUs) and memory, modern computers typically contain accelerators such as graphics processing units (GPUs), tensor processing units (TPUs), and neural processing units (NPUs). Although accelerators expose various programming interfaces and execution models, one group of them, tensor processors, improves system performance for any problem that takes matrices or tensors as inputs and/or outputs. Despite differences in their microarchitectural designs, tensor processors are essentially hardware accelerators that provide efficient matrix-based computation. In this dissertation, we envision a new programming paradigm that leverages tensor processors for general-purpose computing beyond their original AI and ML application domains. Such a framework should have the following characteristics. First, the programming interface for a heterogeneous system with tensor processors must be simple, easy to use, and able to maintain compatibility and portability across various systems. Second, the execution model of the framework should intelligently explore and exploit opportunities to use tensor processors, delivering better performance and extending the spectrum of application domains. Finally, the framework must be cost-effective and energy-efficient, and must accommodate the algorithm redesign and transformation needed to support broader usage. I proposed three research projects in response to this vision.
First, I proposed GPTPU, an open-source, open-architecture framework that allows users to explore opportunities to use tensor processors for general applications. Second, I proposed SHMT, a new programming and execution model that enables simultaneous parallel processing of the same function across heterogeneous processing units. Lastly, I proposed GSLD, a matrix computing library that accommodates either dense or sparse matrix inputs and more intelligently uses dense matrix processors and scalar cores.
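As a minimal sketch of the "matrix-based computation" idea the abstract describes (this example is illustrative and not taken from the dissertation): a classically scalar, non-ML problem such as counting triangles in a graph reduces entirely to the dense matrix multiplications that a tensor processor accelerates, using the identity that the trace of the cubed adjacency matrix counts each triangle six times.

```python
import numpy as np

# Hypothetical illustration (not code from the dissertation): a non-ML
# problem recast as dense matrix multiplication, the primitive that
# tensor processors accelerate. Triangle counting in an undirected graph
# uses the identity  triangles = trace(A @ A @ A) / 6.

def count_triangles(adj: np.ndarray) -> int:
    """Count triangles in an undirected graph given its 0/1 adjacency matrix."""
    a3 = adj @ adj @ adj            # three matmuls -> offloadable to a tensor processor
    return int(np.trace(a3) // 6)   # each triangle yields 6 closed length-3 walks

# A 4-node graph: one triangle (0,1,2) plus a pendant edge (2,3).
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
])
print(count_triangles(A))  # → 1
```

On a CPU this is ordinary NumPy; the point of the paradigm envisioned above is that the same matrix-product formulation could instead be dispatched to a TPU-like accelerator without changing the algorithm.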
October 27, 2024 by hgpu