BIDMach: Large-scale Learning with Zero Memory Allocation
Computer Science Division, UC Berkeley, Berkeley, CA 94720
Big Learning: Advances in Algorithms and Data Management, 2013
@article{canny2013bidmach,
title={BIDMach: Large-scale Learning with Zero Memory Allocation},
author={Canny, John and Zhao, Huasha},
year={2013}
}
This paper describes recent work on the BIDMach toolkit for large-scale machine learning. BIDMach has demonstrated single-node performance that exceeds that of published cluster systems for many common machine-learning tasks. BIDMach makes full use of both CPU and GPU acceleration (through a sister library, BIDMat), and requires only modest hardware (commodity GPUs). One of the challenges of reaching this level of performance is the allocation barrier. While it is simple and expedient to allocate and recycle matrix (or graph) objects in expressions, this approach is too slow to match the arithmetic throughput possible on either GPUs or CPUs. In this paper we describe a caching approach that allows code with complex matrix (or graph) expressions to run at massive scale, i.e., on multi-terabyte data, with zero memory allocation after initial start-up. We present a number of new benchmarks that leverage this approach.
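The abstract does not spell out how the cache works, so the following is only a minimal sketch of the general idea: an output buffer is allocated the first time an expression is evaluated and reused on every later pass. It is written in plain Python for illustration and is not BIDMach's or BIDMat's API; `BufferCache`, `add_into`, `scale_into`, `run`, and the string cache keys are all hypothetical names, and keying buffers by an expression label is an assumption.

```python
class BufferCache:
    """Hypothetical cache that hands out reusable output buffers by key."""

    def __init__(self):
        self._buffers = {}
        self.allocations = 0  # number of buffers ever created

    def get(self, key, size):
        """Return the buffer stored under key, allocating only on a miss."""
        buf = self._buffers.get(key)
        if buf is None or len(buf) != size:
            buf = [0.0] * size  # the only place a new buffer is created
            self._buffers[key] = buf
            self.allocations += 1
        return buf


def add_into(cache, a, b, key):
    """Elementwise a + b, written into a cached output buffer."""
    out = cache.get(("add", key), len(a))
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out


def scale_into(cache, a, factor, key):
    """Elementwise a * factor, written into a cached output buffer."""
    out = cache.get(("scale", key), len(a))
    for i in range(len(a)):
        out[i] = a[i] * factor
    return out


def run(num_batches=5, n=4):
    """Evaluate 2 * (a + b) once per batch, tracking total allocations."""
    cache = BufferCache()
    a = [0.0] * n  # input buffers are created once, at start-up
    b = [1.0] * n
    history = []
    scaled = None
    for step in range(num_batches):
        for i in range(n):
            a[i] = float(step)  # refill the input in place for each batch
        total = add_into(cache, a, b, key="a+b")
        scaled = scale_into(cache, total, 2.0, key="2*(a+b)")
        history.append(cache.allocations)
    return scaled, history


if __name__ == "__main__":
    result, history = run()
    print(result)
    print(history)
```

In this sketch the allocation counter rises only during the first batch, when the two intermediate buffers are created, and stays flat afterwards. That flat counter is the behavior the abstract refers to as zero memory allocation after initial start-up.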
December 25, 2013 by hgpu