Comparative Study of Caffe, Neon, Theano, and Torch for Deep Learning
Bosch Research and Technology Center, North America
arXiv:1511.06435 [cs.LG], 19 Nov 2015
@article{bahrampour2015comparative,
title={Comparative Study of Caffe, Neon, Theano, and Torch for Deep Learning},
author={Bahrampour, Soheil and Ramakrishnan, Naveen and Schott, Lukas and Shah, Mohak},
year={2015},
month={nov},
eprint={1511.06435},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
Deep learning methods have yielded significant performance improvements in several application domains, and a number of software frameworks have been developed to facilitate their implementation. This paper presents a comparative study of four deep learning frameworks, namely Caffe, Neon, Theano, and Torch, on three aspects: extensibility, hardware utilization, and speed. The study is performed on several types of deep learning architectures, and we evaluate the performance of the above frameworks when employed on a single machine in both (multi-threaded) CPU and GPU (Nvidia Titan X) settings. The speed metrics used here are the gradient computation time, which is important during the training phase of deep networks, and the forward time, which is important from the deployment perspective of trained networks. For convolutional networks, we also report how each of these frameworks supports various convolutional algorithms and the corresponding performance. From our experiments, we observe that Theano and Torch are the most easily extensible frameworks. Torch is best suited for any of the tested deep architectures on CPU, followed by Theano; it also achieves the best performance on GPU for large convolutional and fully connected networks, followed closely by Neon. Theano achieves the best performance on GPU for the training and deployment of LSTM networks. Finally, Caffe is the easiest framework for evaluating the performance of standard deep architectures.
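As a rough illustration of the two speed metrics described above, the sketch below times the forward pass and the gradient computation in Theano, one of the four benchmarked frameworks. It is not taken from the paper: the toy single-hidden-layer network, batch size, and iteration count are illustrative assumptions only.

# Minimal sketch (not from the paper) of measuring forward and
# gradient computation time in Theano; network shape, batch size,
# and iteration count are illustrative assumptions.
import time
import numpy as np
import theano
import theano.tensor as T

floatX = theano.config.floatX
rng = np.random.RandomState(0)

X = T.matrix("X")
y = T.ivector("y")

# Hypothetical single-hidden-layer network standing in for the
# benchmarked architectures.
W1 = theano.shared(rng.randn(784, 512).astype(floatX), name="W1")
W2 = theano.shared(rng.randn(512, 10).astype(floatX), name="W2")
h = T.tanh(T.dot(X, W1))
p = T.nnet.softmax(T.dot(h, W2))
loss = T.nnet.categorical_crossentropy(p, y).mean()

forward = theano.function([X, y], loss)
backward = theano.function([X, y], T.grad(loss, [W1, W2]))

xb = rng.randn(64, 784).astype(floatX)
yb = rng.randint(0, 10, size=64).astype("int32")

for name, fn in [("forward", forward), ("gradient", backward)]:
    fn(xb, yb)  # warm-up: triggers compilation and any GPU transfer
    t0 = time.time()
    for _ in range(100):
        fn(xb, yb)
    print("%s time: %.3f ms/batch" % (name, (time.time() - t0) * 10.0))

The warm-up call matters because Theano compiles the function graph on first use (and, in GPU settings, moves data to the device); the same warm-up-then-average pattern applies when timing the other frameworks.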