Performance benchmarking of deep learning framework on Intel Xeon Phi
Department of Computer Science, Tunghai University, Taichung 40704, Taiwan, ROC
The Journal of Supercomputing (2020)
@article{yang2020performance,
title={Performance benchmarking of deep learning framework on Intel Xeon Phi},
author={Yang, Chao-Tung and Liu, Jung-Chun and Chan, Yu-Wei and Kristiani, Endah and Kuo, Chan-Fu},
journal={The Journal of Supercomputing},
pages={1--25},
year={2020},
publisher={Springer}
}
With the success of deep learning (DL) methods across diverse application domains, several DL software frameworks have been proposed to make these methods easier to use. Understanding how these frameworks behave on big data workloads allows analyses to be carried out more efficiently in terms of both time and accuracy, so benchmarking DL software frameworks is in high demand. This paper presents a comparative study of two deep learning frameworks, Caffe and TensorFlow, on two performance metrics: runtime and accuracy. The study uses several workloads, including the LeNet model on the MNIST classification dataset, the CIFAR-10 image recognition dataset, and a message passing interface (MPI) parallel matrix-vector multiplication benchmark. We evaluate the performance of these frameworks on machines equipped with the Intel Xeon Phi 7210 processor and examine how vectorization, OpenMP parallel processing, and MPI can be used to improve their performance. The experimental results compare accuracy against the number of training iterations and report training times on the different machines before and after optimization. In addition, an experiment on a two-node Xeon Phi cluster is performed. The results show that optimization for the Xeon Phi benefits both the Caffe and TensorFlow frameworks.
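As a rough illustration of the kind of hybrid kernel the MPI matrix-vector multiplication benchmark refers to, the minimal sketch below shows how rows of the matrix can be split across MPI ranks while OpenMP threads and compiler vectorization handle the per-rank work. The paper's actual benchmark code is not reproduced here; the matrix size, row-block distribution, and pragma choices are assumptions for illustration only.

// Minimal sketch of an MPI + OpenMP parallel matrix-vector multiplication.
// Matrix size, data layout, and distribution are assumptions, not the
// authors' benchmark code.
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 4096  /* assumed square matrix dimension, divisible by the rank count */

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int rows = N / size;                           /* rows owned by this rank */
    double *A = malloc((size_t)rows * N * sizeof(double));
    double *x = malloc((size_t)N * sizeof(double));
    double *y = malloc((size_t)rows * sizeof(double));

    /* Fill the local block and the shared vector with simple values. */
    for (int i = 0; i < rows * N; i++) A[i] = 1.0;
    for (int j = 0; j < N; j++) x[j] = 1.0;

    double t0 = MPI_Wtime();

    /* Each rank computes its block of y = A * x; OpenMP threads split the
       rows, and the inner loop is a candidate for SIMD vectorization. */
    #pragma omp parallel for
    for (int i = 0; i < rows; i++) {
        double sum = 0.0;
        #pragma omp simd reduction(+:sum)
        for (int j = 0; j < N; j++)
            sum += A[(size_t)i * N + j] * x[j];
        y[i] = sum;
    }

    /* Gather the distributed result vector on rank 0. */
    double *y_full = NULL;
    if (rank == 0) y_full = malloc((size_t)N * sizeof(double));
    MPI_Gather(y, rows, MPI_DOUBLE, y_full, rows, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    double t1 = MPI_Wtime();
    if (rank == 0)
        printf("y[0] = %.1f, elapsed = %.4f s\n", y_full[0], t1 - t0);

    free(A); free(x); free(y); free(y_full);
    MPI_Finalize();
    return 0;
}

A sketch like this would typically be built with an MPI compiler wrapper and OpenMP enabled (for example, mpicc -O3 -fopenmp) and launched with mpirun across the Xeon Phi nodes; thread count and vectorization behavior are what the optimization study above varies.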
June 28, 2020 by hgpu