Array Languages Make Neural Networks Fast
Heriot-Watt University, Edinburgh, Scotland, UK
arXiv:1912.05234 [cs.PL], 11 Dec 2019
@misc{sinkarovs2019array,
  title={Array Languages Make Neural Networks Fast},
  author={Artjoms Šinkarovs and Hans-Nikolai Vießmann and Sven-Bodo Scholz},
  year={2019},
  eprint={1912.05234},
  archivePrefix={arXiv},
  primaryClass={cs.PL}
}
Modern machine learning frameworks are complex: they are typically organised in multiple layers, each of which is written in a different language, and they depend on a number of external libraries; yet at their core they mainly consist of tensor operations. As array-oriented languages provide perfect abstractions for implementing tensor operations, we consider a minimalistic machine learning framework that is shallowly embedded in an array-oriented language, and we study its productivity and performance. We do this by implementing a state-of-the-art Convolutional Neural Network (CNN) and comparing it against implementations in TensorFlow and PyTorch, two state-of-the-art industrial-strength frameworks. It turns out that our implementation is 2 and 3 times faster than TensorFlow and PyTorch, respectively, even after fine-tuning the TensorFlow and PyTorch implementations for our hardware, a 64-core GPU-accelerated machine. The size of all three CNN specifications is about the same: roughly 150 lines of code. Our mini framework is 150 lines of highly reusable, hardware-agnostic code that does not depend on external libraries; the compiler for the host array language automatically generates parallel code for a chosen architecture. The key to this balance between performance and portability lies in the design of the array language, in particular the ability to express rank-polymorphic operations concisely while still enabling optimisations across them. This design builds on very few assumptions and is readily transferable to other contexts, offering a clean approach to high-performance machine learning.
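To give a flavour of the rank polymorphism the abstract refers to, here is a minimal sketch in Python/NumPy (not the paper's host array language, and not code from the paper): a single "valid" convolution whose rank is derived from the shapes of its arguments, so the same definition handles 1-D signals, 2-D images, and higher-rank tensors without mentioning a rank anywhere.

# Hypothetical NumPy illustration of a rank-polymorphic convolution;
# the paper's framework is written in an array language, not Python.
import numpy as np
from itertools import product

def conv(inp, w):
    # Output extent along each axis d is inp.shape[d] - w.shape[d] + 1.
    out_shape = tuple(i - k + 1 for i, k in zip(inp.shape, w.shape))
    out = np.zeros(out_shape, dtype=inp.dtype)
    # Accumulate weighted, shifted views of the input. The loop ranges
    # over the weight's index space, so the rank never appears explicitly.
    for idx in product(*(range(k) for k in w.shape)):
        window = tuple(slice(i, i + o) for i, o in zip(idx, out_shape))
        out += w[idx] * inp[window]
    return out

print(conv(np.arange(5.0), np.ones(3)))              # 1-D: [3. 6. 9.]
print(conv(np.ones((4, 4)), np.ones((2, 2))))        # 2-D: 3x3 array of 4.0
print(conv(np.ones((3, 3, 3)), np.ones((2, 2, 2))))  # 3-D: 2x2x2 array of 8.0

In the setting the abstract describes, the pay-off of writing operations this way is that an array-language compiler can see through such shape-generic definitions and optimise across them when generating parallel code.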
December 15, 2019 by hgpu