Reveal training performance mystery between TensorFlow and PyTorch in the single GPU environment

Hulin Dai, Xuan Peng, Xuanhua Shi, Ligang He, Qian Xiong, Hai Jin
National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
Sci China Inf Sci, 65(1): 112103, 2022

@article{dai2022reveal,
   title={Reveal training performance mystery between TensorFlow and PyTorch in the single GPU environment},
   author={Dai, Hulin and Peng, Xuan and Shi, Xuanhua and He, Ligang and Xiong, Qian and Jin, Hai},
   journal={Science China Information Sciences},
   volume={65},
   pages={1--17},
   year={2022},
   publisher={Springer}
}


Deep learning has achieved tremendous success in various fields, but training deep neural networks (DNNs) is very compute-intensive, which has given rise to numerous deep learning frameworks that aim to offer better usability and higher performance to deep learning practitioners. TensorFlow and PyTorch are the two most popular frameworks: TensorFlow is favored in industry, while PyTorch is more appealing in academia. However, the two frameworks differ substantially owing to their opposite design philosophies: static versus dynamic computation graphs. TensorFlow is generally regarded as more performance-friendly, since a full view of the computation graph gives it more opportunities to apply optimizations. Yet there are also claims that PyTorch is sometimes faster than TensorFlow, which confuses end-users when choosing between them. In this paper, we carry out an analytical and experimental analysis to unravel the mystery of the single-GPU training-speed comparison between TensorFlow and PyTorch. To make the investigation as comprehensive as possible, we carefully select seven popular neural networks covering computer vision, speech recognition, and natural language processing (NLP). The contributions of this work are two-fold. First, we conduct detailed benchmarking experiments on TensorFlow and PyTorch and analyze the reasons for their performance differences, providing guidance for end-users choosing between the two frameworks. Second, we identify key factors that affect performance, which can help end-users write their models more efficiently.
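The static-versus-dynamic distinction the abstract draws can be made concrete with a minimal sketch (our illustration, not code from the paper; it assumes TensorFlow 2.x, where tf.function traces Python code into a static graph, and PyTorch's default eager execution):

# Illustration only (not from the paper): contrasts the two execution models.
import tensorflow as tf   # assumes TensorFlow 2.x
import torch

# TensorFlow: @tf.function traces the Python function once into a static
# graph; subsequent calls run the optimized graph rather than Python code.
@tf.function
def tf_step(x, w):
    return tf.reduce_sum(tf.matmul(x, w))

# PyTorch: operators execute eagerly; the autograd graph is rebuilt
# dynamically on every forward pass and freed after backward().
def torch_step(x, w):
    return (x @ w).sum()

x, w = tf.ones((4, 8)), tf.ones((8, 2))
print(tf_step(x, w).numpy())        # first call triggers graph tracing

xp = torch.ones(4, 8)
wp = torch.ones(8, 2, requires_grad=True)
loss = torch_step(xp, wp)
loss.backward()                     # dynamic graph is consumed here
print(loss.item())

The trade-off mirrors the paper's premise: the traced TensorFlow graph can be optimized globally before execution, while PyTorch pays per-call Python and graph-construction overhead in exchange for flexibility.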

