An End-to-End Programming Model for AI Engine Architectures
Maksim Levental
The University of Chicago, 2024
@phdthesis{levental2024end,
  title={An End-to-End Programming Model for AI Engine Architectures},
  author={Levental, Maksim},
  year={2024},
  school={The University of Chicago}
}
The proliferation of deep learning in various domains has led to remarkable advancements in artificial intelligence applications, such as large language models for scientific use cases. However, the concomitant exponential growth in computational demands, driven by the development of ever-larger deep learning models, presents significant challenges in terms of resource consumption and sustainability. This dissertation addresses these escalating computational costs by investigating how the complexity of deep learning frameworks and their abstractions can significantly impact resource usage. In order to mitigate these growing costs, this dissertation presents novel insights into memory planning, high-level synthesis, lightweight frontend development, and end-to-end programming models for accelerator architectures. These insights culminate in the design and implementation of an embedded domain-specific language (eDSL) tailored to deep learning development on a novel accelerator, specifically the AMD AI Engine architecture. By prioritizing access to low-level APIs, leveraging compiler technologies, and employing rigorous mathematical models, the eDSL demonstrates the feasibility of achieving performant deep learning implementations while maintaining productivity in the design and exploration of deep learning methods.
June 23, 2024 by hgpu