Towards Efficient and Scalable Acceleration of Online Decision Tree Learning on FPGA

hgpu.org » Applications » Computer science » Towards Efficient and Scalable Acceleration of Online Decision Tree Learning on FPGA

Towards Efficient and Scalable Acceleration of Online Decision Tree Learning on FPGA

Zhe Lin, Sharad Sinha, Wei Zhang

Hong Kong University of Science and Technology, Hong Kong

arXiv:2009.01431 [cs.LG], (3 Sep 2020)

DOI:10.1109/FCCM.2019.00032

@article{Lin_2019,

title={Towards Efficient and Scalable Acceleration of Online Decision Tree Learning on FPGA},

ISBN={9781728111315},

url={http://dx.doi.org/10.1109/FCCM.2019.00032},

DOI={10.1109/fccm.2019.00032},

journal={2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)},

publisher={IEEE},

author={Lin, Zhe and Sinha, Sharad and Zhang, Wei},

year={2019},

month={Apr}

}

Download (PDF)

View

Source

1500

views

Decision trees are machine learning models commonly used in various application scenarios. In the era of big data, traditional decision tree induction algorithms are not suitable for learning large-scale datasets due to their stringent data storage requirement. Online decision tree learning algorithms have been devised to tackle this problem by concurrently training with incoming samples and providing inference results. However, even the most up-to-date online tree learning algorithms still suffer from either high memory usage or high computational intensity with dependency and long latency, making them challenging to implement in hardware. To overcome these difficulties, we introduce a new quantile-based algorithm to improve the induction of the Hoeffding tree, one of the state-of-the-art online learning models. The proposed algorithm is light-weight in terms of both memory and computational demand, while still maintaining high generalization ability. A series of optimization techniques dedicated to the proposed algorithm have been investigated from the hardware perspective, including coarse-grained and fine-grained parallelism, dynamic and memory-based resource sharing, pipelining with data forwarding. We further present a high-performance, hardware-efficient and scalable online decision tree learning system on a field-programmable gate array (FPGA) with system-level optimization techniques. Experimental results show that our proposed algorithm outperforms the state-of-the-art Hoeffding tree learning method, leading to 0.05% to 12.3% improvement in inference accuracy. Real implementation of the complete learning system on the FPGA demonstrates a 384x to 1581x speedup in execution time over the state-of-the-art design.

Tags: Computer science, FPGA, Machine learning

September 6, 2020 by hgpu

No votes yet.

Please wait...

Your response

You must be logged in to post a comment.

* * *

high performance computing on graphics processing units: hgpu.org