Dragon-Alpha&cu32: A Java-based Tensor Computing Framework With its High-Performance CUDA Library
Hefei Institutes of Physical Science, Chinese Academy of Sciences
arXiv:2305.08819 [cs.LG], (15 May 2023)
@misc{zhang2023dragonalphacu32,
  title={Dragon-Alpha&cu32: A Java-based Tensor Computing Framework With its High-Performance CUDA Library},
  author={Zhiyi Zhang and Pengfei Zhang and Qi Wang},
  year={2023},
  eprint={2305.08819},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
Java is very powerful, but in the deep learning field its capabilities have probably not been sufficiently exploited. Compared to the Java-based deep learning frameworks, the Python-based ones (PyTorch, TensorFlow, etc.) are undoubtedly the mainstream, due to their ease of use, flexibility, and better ecosystem. Dragon-Alpha is a Java-based tensor computing framework that aims for ease of use, high scalability, and high performance, trying to break Java’s dilemma in the deep learning field and make it more effective. Dragon-Alpha supports different levels of APIs and can be used as a deep learning framework through its user-friendly high-level APIs. Based on its multi-layer architecture and Java’s big-data ecosystem, Dragon-Alpha has the potential to aggregate computing power across heterogeneous platforms and devices. Dragon-Alpha provides asynchronous APIs to improve parallelism, and a highly optimized CUDA library, cu32, which adopts unique convolution/deconvolution operators for small feature maps. The experiments show that, compared to PyTorch&cuDNN, Dragon-Alpha&cu32 costs less time and memory (75.38% to 97.32% and 29.2% to 66.4%, respectively) to train some typical neural networks (AlexNet, VGG, GoogleNet, ResNet) on Cifar-10.
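The general idea behind asynchronous APIs can be illustrated in plain Java. The following is a minimal sketch that uses only the standard `java.util.concurrent.CompletableFuture` class. It is not Dragon-Alpha's API: the class name `AsyncSketch` and the methods `scale`, `add`, and `run` are hypothetical names chosen for illustration, and plain `float[]` arrays stand in for tensors. Two independent operations are started without blocking the caller, and the caller waits only when the combined result is needed.

```java
import java.util.concurrent.CompletableFuture;

public class AsyncSketch {
    // Hypothetical stand-in for a tensor operation: multiply every element by k.
    static float[] scale(float[] x, float k) {
        float[] y = new float[x.length];
        for (int i = 0; i < x.length; i++) y[i] = x[i] * k;
        return y;
    }

    // Hypothetical stand-in for element-wise addition of two equal-length arrays.
    static float[] add(float[] a, float[] b) {
        float[] y = new float[a.length];
        for (int i = 0; i < a.length; i++) y[i] = a[i] + b[i];
        return y;
    }

    // Starts two independent operations asynchronously, then combines their results.
    static float[] run(float[] x) {
        CompletableFuture<float[]> doubled = CompletableFuture.supplyAsync(() -> scale(x, 2f));
        CompletableFuture<float[]> tripled = CompletableFuture.supplyAsync(() -> scale(x, 3f));
        // The caller blocks only here, when the combined result is actually needed.
        return doubled.thenCombine(tripled, AsyncSketch::add).join();
    }

    public static void main(String[] args) {
        float[] out = run(new float[] {1f, 2f, 3f});
        System.out.println(java.util.Arrays.toString(out));
    }
}
```

In this sketch the two `scale` calls may run concurrently on the common thread pool, which is the kind of overlap an asynchronous API is meant to expose. How Dragon-Alpha schedules its own operations is not described in the abstract.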
May 21, 2023 by hgpu