high performance computing on graphics processing units: hgpu.org

Mystique: Enabling Accurate and Scalable Generation of Production AI Benchmarks

Mingyu Liang, Wenyin Fu, Louis Feng, Zhongyi Lin, Pavani Panakanti, Shengbao Zheng, Srinivas Sridharan, Christina Delimitrou

Cornell University, Ithaca, New York, USA

Proceedings of the 50th Annual International Symposium on Computer Architecture (ISCA ’23), 2023

DOI:10.1145/3579371.3589072

@inproceedings{liang2023mystique,
  title={Mystique: Enabling Accurate and Scalable Generation of Production AI Benchmarks},
  author={Liang, Mingyu and Fu, Wenyin and Feng, Louis and Lin, Zhongyi and Panakanti, Pavani and Zheng, Shengbao and Sridharan, Srinivas and Delimitrou, Christina},
  booktitle={Proceedings of the 50th Annual International Symposium on Computer Architecture},
  pages={1--13},
  year={2023}
}

Package: PARAM (PArametrized Recommendation and Ai Model benchmark), a repository for developing numerous microbenchmarks as well as end-to-end networks for evaluating training and inference platforms.

Building large AI fleets to support the rapidly growing DL workloads is an active research topic for modern cloud providers. Generating accurate benchmarks plays an essential role in designing the fast-paced software and hardware solutions in this space. Two fundamental challenges to make this scalable are (i) workload representativeness and (ii) the ability to quickly incorporate changes to the fleet into the benchmarks. To overcome these issues, we propose Mystique, an accurate and scalable framework for production AI benchmark generation. It leverages the PyTorch execution trace (ET), a new feature that captures the runtime information of AI models at the granularity of operators, in a graph format, together with their metadata. By sourcing fleet ETs, we can build AI benchmarks that are portable and representative. Mystique is scalable, due to its lightweight data collection, in terms of runtime overhead and instrumentation effort. It is also adaptive because ET composability allows flexible control on benchmark creation. We evaluate our methodology on several production AI models, and show that benchmarks generated with Mystique closely resemble original AI models, both in execution time and system-level metrics. We also showcase the portability of the generated benchmarks across platforms, and demonstrate several use cases enabled by the fine-grained composability of the execution trace.
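
To make the idea of an operator-level trace in graph form more concrete, here is a minimal sketch in plain Python. It is not the PyTorch execution trace format and not Mystique's implementation: the `OpNode` schema, the helper names, and the toy `TRACE` data are all hypothetical, chosen only to illustrate how a parent/child graph of operators with metadata could be replayed in recorded order, and how a sub-graph could be selected to compose a smaller benchmark.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class OpNode:
    """One node in a toy trace (hypothetical schema, not PyTorch's ET format)."""
    node_id: int          # assumed to increase in recorded execution order
    name: str             # operator or scope name
    parent: int           # id of the enclosing node; 0 means the trace root
    input_shapes: List[List[int]] = field(default_factory=list)  # metadata


def children_by_parent(nodes: List[OpNode]) -> Dict[int, List[OpNode]]:
    """Index nodes by the id of their parent."""
    tree: Dict[int, List[OpNode]] = {}
    for node in nodes:
        tree.setdefault(node.parent, []).append(node)
    return tree


def replay_order(nodes: List[OpNode], root: int = 0) -> List[str]:
    """Depth-first walk that returns leaf operator names in recorded order."""
    tree = children_by_parent(nodes)
    order: List[str] = []

    def visit(parent_id: int) -> None:
        for child in sorted(tree.get(parent_id, []), key=lambda n: n.node_id):
            if child.node_id in tree:   # a scope that wraps other operators
                visit(child.node_id)
            else:                       # a leaf operator to be replayed
                order.append(child.name)

    visit(root)
    return order


def select_subtree(nodes: List[OpNode], root_id: int) -> List[OpNode]:
    """Keep only the descendants of root_id (composing a partial benchmark)."""
    tree = children_by_parent(nodes)
    kept: List[OpNode] = []
    stack = [root_id]
    while stack:
        parent_id = stack.pop()
        for child in tree.get(parent_id, []):
            kept.append(child)
            stack.append(child.node_id)
    return sorted(kept, key=lambda n: n.node_id)


# Toy trace: two scopes, each wrapping one or two operators (made-up data).
TRACE = [
    OpNode(1, "forward", 0),
    OpNode(2, "aten::linear", 1, [[8, 16], [4, 16]]),
    OpNode(3, "aten::relu", 1, [[8, 4]]),
    OpNode(4, "backward", 0),
    OpNode(5, "aten::mm", 4, [[8, 4], [4, 16]]),
]

if __name__ == "__main__":
    print(replay_order(TRACE))
    print(replay_order(select_subtree(TRACE, 1), root=1))
```

In this sketch, replaying the whole trace visits every leaf operator, while `select_subtree(TRACE, 1)` keeps only the operators under the hypothetical `forward` scope. A real replay would additionally need to allocate tensors from the recorded metadata and invoke the actual operators.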

Tags: AI, Benchmarking, Code generation, Computer science, CUDA, nVidia, Package, Performance, PyTorch, Tesla A100, Tesla V100

July 16, 2023 by hgpu
