An End-to-End Programming Model for AI Engine Architectures
Maksim Levental
The University of Chicago, 2024
@phdthesis{levental2024end,
  title={An End-to-End Programming Model for AI Engine Architectures},
  author={Levental, Maksim},
  year={2024},
  school={The University of Chicago}
}
The proliferation of deep learning in various domains has led to remarkable advancements in artificial intelligence applications, such as large language models for scientific use cases. However, the concomitant exponential growth in computational demands, driven by the development of ever-larger deep learning models, presents significant challenges in terms of resource consumption and sustainability. This dissertation addresses these escalating computational costs by investigating how the complexity of deep learning frameworks and their abstractions can significantly impact resource usage. In order to mitigate these growing costs, this dissertation presents novel insights into memory planning, high-level synthesis, lightweight frontend development, and end-to-end programming models for accelerator architectures. These insights culminate in the design and implementation of an embedded domain-specific language (eDSL) tailored to deep learning development on a novel accelerator, specifically the AMD AI Engine architecture. By prioritizing access to low-level APIs, leveraging compiler technologies, and employing rigorous mathematical models, the eDSL demonstrates the feasibility of achieving performant deep learning implementations while maintaining productivity in the design and exploration of deep learning methods.
June 23, 2024 by hgpu