Dragon-Alpha&cu32: A Java-based Tensor Computing Framework With its High-Performance CUDA Library

Zhiyi Zhang, Pengfei Zhang, Qi Wang
Hefei Institutes of Physical Science, Chinese Academy of Sciences
arXiv:2305.08819 [cs.LG], (15 May 2023)


   title={Dragon-Alpha&cu32: A Java-based Tensor Computing Framework With its High-Performance CUDA Library},

   author={Zhiyi Zhang and Pengfei Zhang and Qi Wang},






Java is very powerful, but in Deep Learning field, its capabilities probably has not been sufficiently exploited. Compared to the Java-based deep-learning-frameworks, the Python-based (PyTorch, TensorFlow, etc) are undoubtedly the mainstream, due to their easy-to-use, flexibility and better ecosystem. Dragon-Alpha is a Java-based Tensor Computing Framework, with easy-to-use, high-scalability and high-performance, trying to break Java’s dilemma in deep learning field and make it more effective. Dragon-Alpha supports different levels of APIs, and can be used as a deep-learning-framework through its user-friendly high-level APIs. Dragon-Alpha has potential to aggregate computing-power across heterogeneous platforms and devices, based on its multi-layer architecture and Java’s big-data ecosystem. Dragon-Alpha has its asynchronized APIs to improve parallelism, and highly-optimized CUDA library cu32 which adopts unique convolutiondeconvolution operators for small feature maps. The experiments show that, compared to PyTorch&cuDNN, Dragon-Alpha&cu32 costs less time and memory (75.38% to 97.32%, 29.2% to 66.4%), to train some typical neural networks (AlexNet, VGG, GoogleNet, ResNet) on Cifar-10.
No votes yet.
Please wait...

* * *

* * *

HGPU group © 2010-2023 hgpu.org

All rights belong to the respective authors

Contact us: