
TorchOpt: An Efficient Library for Differentiable Optimization

Jie Ren, Xidong Feng, Bo Liu, Xuehai Pan, Yao Fu, Luo Mai, Yaodong Yang
University of Edinburgh, Edinburgh, United Kingdom
arXiv:2211.06934 [cs.MS], 13 Nov 2022

@misc{ren2022torchopt,
   doi       = {10.48550/ARXIV.2211.06934},
   url       = {https://arxiv.org/abs/2211.06934},
   author    = {Ren, Jie and Feng, Xidong and Liu, Bo and Pan, Xuehai and Fu, Yao and Mai, Luo and Yang, Yaodong},
   keywords  = {Mathematical Software (cs.MS), Artificial Intelligence (cs.AI), Distributed, Parallel, and Cluster Computing (cs.DC), Machine Learning (cs.LG), Optimization and Control (math.OC), FOS: Computer and information sciences, FOS: Mathematics},
   title     = {TorchOpt: An Efficient Library for Differentiable Optimization},
   publisher = {arXiv},
   year      = {2022},
   copyright = {arXiv.org perpetual, non-exclusive license}
}

Recent years have witnessed a boom in differentiable optimization algorithms. These algorithms exhibit different execution patterns, and their execution demands massive computational resources that go beyond a single CPU or GPU. Existing differentiable optimization libraries, however, cannot support efficient algorithm development and multi-CPU/GPU execution, making the development of differentiable optimization algorithms cumbersome and expensive. This paper introduces TorchOpt, an efficient PyTorch-based library for differentiable optimization. TorchOpt provides a unified and expressive programming abstraction for differentiable optimization. This abstraction allows users to efficiently declare and analyze differentiable optimization programs with explicit gradients, implicit gradients, and zero-order gradients. TorchOpt further provides a high-performance distributed execution runtime. This runtime can fully parallelize computation-intensive differentiation operations (e.g., tensor tree flattening) on CPUs/GPUs and automatically distribute computation across devices. Experimental results show that TorchOpt achieves a 5.2× training time speedup on an 8-GPU server. TorchOpt is publicly available.
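To make the abstract's notion of "explicit gradients" concrete, the sketch below shows the pattern TorchOpt's differentiable optimizers target, using only plain PyTorch rather than TorchOpt's own API: an inner SGD loop is unrolled with `create_graph=True` so the outer loss can backpropagate through the adapted parameters into the meta parameter (the MAML-style setup). The toy regression task, learning rates, and step counts are illustrative assumptions, not taken from the paper.

```python
import torch

# Toy explicit-gradient (unrolled) differentiable optimization.
torch.manual_seed(0)

meta_param = torch.zeros(1, requires_grad=True)   # outer (meta) parameter
x = torch.randn(32, 1)                            # toy regression data
y = 3.0 * x + 0.5

inner_lr, outer_lr, inner_steps = 0.1, 0.01, 5
meta_opt = torch.optim.Adam([meta_param], lr=outer_lr)

for outer_step in range(100):
    # Inner loop: start from the meta parameter and take differentiable SGD steps.
    w = meta_param * torch.ones(1)  # functional copy that stays in the autograd graph
    for _ in range(inner_steps):
        inner_loss = ((x * w - y) ** 2).mean()
        (grad,) = torch.autograd.grad(inner_loss, w, create_graph=True)
        w = w - inner_lr * grad     # unrolled update, kept differentiable

    # Outer loss at the adapted parameter; backprop traverses the whole
    # unrolled inner loop back to meta_param.
    outer_loss = ((x * w - y) ** 2).mean()
    meta_opt.zero_grad()
    outer_loss.backward()
    meta_opt.step()

print(meta_param.item())
```

Implicit and zero-order gradients, also mentioned in the abstract, avoid unrolling: implicit differentiation differentiates the optimality conditions of the inner problem, and zero-order methods estimate gradients from function evaluations alone.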