Memory Efficient Mixed-Precision Optimizers

Basile Lewandowski, Atli Kosson
Machine Learning Optimization laboratory, Ecole Polytechnique Federale de Lausanne
arXiv:2309.12381 [cs.LG], (21 Sep 2023)

@misc{lewandowski2023memory,
   title={Memory Efficient Mixed-Precision Optimizers},
   author={Basile Lewandowski and Atli Kosson},
   year={2023},
   eprint={2309.12381},
   archivePrefix={arXiv},
   primaryClass={cs.LG}
}


Traditional optimization methods rely on the use of single-precision floating point arithmetic, which can be costly in terms of memory size and computing power. However, mixed precision optimization techniques leverage the use of both single and half-precision floating point arithmetic to reduce memory requirements while maintaining model accuracy. We provide here an algorithm to further reduce memory usage during the training of a model by getting rid of the floating point copy of the parameters, virtually keeping only half-precision numbers. We also explore the benefits of getting rid of the gradient’s value by executing the optimizer step during the back-propagation. In practice, we achieve up to 25% lower peak memory use and 15% faster training while maintaining the same level of accuracy.
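
The second idea in the abstract, running the optimizer step during back-propagation so that gradients need not be kept, can be pictured with a toy example. The sketch below is not the authors' implementation: it assumes plain SGD on a chain of scalar "layers", and the function names (`forward`, `step_standard`, `step_fused`) are hypothetical. It only counts how many gradient values are alive at once in each variant.

```python
# Toy illustration only: y = w[n-1] * ... * w[0] * x, trained with plain SGD
# on loss = 0.5 * (y - target) ** 2. Names are hypothetical, not from the paper.

def forward(weights, x):
    """Return the activations [x, a1, ..., an] of the scalar chain."""
    acts = [x]
    for w in weights:
        acts.append(w * acts[-1])
    return acts


def step_standard(weights, x, target, lr):
    """Store every gradient during the backward pass, then update in place.

    Returns the peak number of gradient values held at the same time.
    """
    acts = forward(weights, x)
    upstream = acts[-1] - target          # dL/dy
    grads = {}
    peak_live = 0
    for i in reversed(range(len(weights))):
        grads[i] = upstream * acts[i]     # dL/dw[i]
        upstream = upstream * weights[i]  # dL/d(input of layer i)
        peak_live = max(peak_live, len(grads))
    for i, g in grads.items():            # separate optimizer step
        weights[i] = weights[i] - lr * g
    return peak_live


def step_fused(weights, x, target, lr):
    """Update each weight as soon as its gradient exists, then drop it.

    Returns the peak number of gradient values held at the same time.
    """
    acts = forward(weights, x)
    upstream = acts[-1] - target          # dL/dy
    peak_live = 0
    for i in reversed(range(len(weights))):
        grad = upstream * acts[i]         # the only gradient alive right now
        peak_live = max(peak_live, 1)
        upstream = upstream * weights[i]  # must use the pre-update weight
        weights[i] = weights[i] - lr * grad
    return peak_live


w_a = [0.5, -1.5, 2.0, 0.25]
w_b = list(w_a)
peak_a = step_standard(w_a, x=1.0, target=1.0, lr=0.1)
peak_b = step_fused(w_b, x=1.0, target=1.0, lr=0.1)
print(w_a == w_b, peak_a, peak_b)
```

Both variants perform the same arithmetic in the same order, so the updated weights match; the fused variant simply never holds more than one gradient. This equivalence relies on the update for a parameter depending only on that parameter's own gradient, which would not hold for techniques that need all gradients first, such as global gradient-norm clipping.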
