high performance computing on graphics processing units: hgpu.org

Toward Accurate Platform-Aware Performance Modeling for Deep Neural Networks

Chuan-Chi Wang, Ying-Chiao Liao, Ming-Chang Kao, Wen-Yew Liang, Shih-Hao Hung

ADLINK Technology Inc., New Taipei, Taiwan

arXiv:2012.00211 [cs.LG] (1 Dec 2020)

@misc{wang2020accurate,
  title={Toward Accurate Platform-Aware Performance Modeling for Deep Neural Networks},
  author={Chuan-Chi Wang and Ying-Chiao Liao and Ming-Chang Kao and Wen-Yew Liang and Shih-Hao Hung},
  year={2020},
  eprint={2012.00211},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

In this paper, we present PerfNetV2, a fine-grained, machine learning-based method that improves on the accuracy of our previous work in modeling neural network performance on a variety of GPU accelerators. Given an application, the proposed method predicts the inference time and training time of the convolutional neural networks it uses, so that the system developer can optimize performance by choosing suitable neural networks and/or incorporating hardware accelerators that deliver satisfactory results in time. Furthermore, the proposed method can predict the performance of an unseen or non-existent device, e.g., a new GPU with a higher operating frequency and fewer processor cores but more memory capacity. This allows a system developer to quickly search the hardware design space and/or fine-tune the system configuration. Compared with previous works, PerfNetV2 delivers more accurate results by modeling detailed host-accelerator interactions during the execution of full neural networks and by improving the architecture of the machine learning model used in the predictor. Our case studies show that PerfNetV2 yields a mean absolute percentage error within 13.1% on LeNet, AlexNet, and VGG16 on an NVIDIA GTX-1080Ti, whereas the error rate of a previous work published at ICBD 2018 could be as large as 200%.
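The accuracy figures above are reported as mean absolute percentage error (MAPE). As a rough illustration of how that metric is computed, here is a minimal Python sketch. The function name `mape` and the timing values are hypothetical, chosen only for illustration; they are not taken from the paper and do not reflect how PerfNetV2 itself is implemented or evaluated.

```python
def mape(measured, predicted):
    """Return the mean absolute percentage error, in percent.

    measured  -- ground-truth execution times (must be non-zero)
    predicted -- model-predicted execution times, same length as measured
    """
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average the relative error of each prediction, then scale to percent.
    total = sum(abs(p - m) / abs(m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


# Hypothetical per-network timings in milliseconds (illustration only,
# not results from the paper).
measured_ms = [10.0, 20.0, 40.0]
predicted_ms = [11.0, 18.0, 44.0]

error = mape(measured_ms, predicted_ms)
print(f"MAPE: {error:.1f}%")
```

Because each error is normalized by the measured value, a network that runs for 10 ms and one that runs for 40 ms contribute equally when both predictions are off by the same fraction.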

Tags: Benchmarking, Computer science, CUDA, Deep learning, Machine learning, Neural networks, nVidia, nVidia GeForce GTX 1080 Ti, Performance

December 6, 2020 by hgpu


