FastFold: Reducing AlphaFold Training Time from 11 Days to 67 Hours
National University of Singapore
arXiv:2203.00854 [cs.LG], 2 Mar 2022
@article{cheng2022fastfold,
  title={FastFold: Reducing AlphaFold Training Time from 11 Days to 67 Hours},
  author={Cheng, Shenggan and Wu, Ruidong and Yu, Zhongming and Li, Binrui and Zhang, Xiwen and Peng, Jian and You, Yang},
  journal={arXiv preprint arXiv:2203.00854},
  year={2022}
}
Protein structure prediction is an important method for understanding gene translation and protein function in structural biology. AlphaFold brought the Transformer model to the field of protein structure prediction, achieving atomic accuracy. However, training and inference of the AlphaFold model are time-consuming and expensive because of the model's unusual performance characteristics and huge memory consumption. In this paper, we propose FastFold, a highly efficient implementation of the protein structure prediction model for both training and inference. FastFold includes a series of GPU optimizations based on a thorough analysis of AlphaFold's performance. In addition, with Dynamic Axial Parallelism and Duality Async Operation, FastFold achieves high model parallelism scaling efficiency, surpassing existing popular model parallelism techniques. Experimental results show that FastFold reduces overall training time from 11 days to 67 hours and achieves a 7.5-9.5x speedup for long-sequence inference. Furthermore, we scaled FastFold to 512 GPUs and achieved an aggregate 6.02 PetaFLOP/s with 90.1% parallel efficiency. The implementation is publicly available.
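For reference, 11 days is 264 hours, so the reported reduction to 67 hours corresponds to roughly a 3.9x end-to-end training speedup. The abstract does not spell out how Dynamic Axial Parallelism works; a plausible reading is that the Evoformer's MSA activation tensor, with a sequence axis and a residue axis, is partitioned along one of those two axes across devices, and an all-to-all exchange swaps the partitioned axis between row-wise and column-wise attention. The NumPy sketch below simulates that data layout on a single process; the names (scatter, all_to_all, n_dev) and shapes are illustrative assumptions, not FastFold's actual API.

import numpy as np

n_dev = 4                        # simulated devices
s, r, c = 128, 256, 64           # sequences, residues, channels
msa = np.random.randn(s, r, c)   # MSA representation [s, r, c]

def scatter(x, axis):
    # Partition a tensor along `axis` across the simulated devices.
    return np.array_split(x, n_dev, axis=axis)

def all_to_all(shards, concat_axis, split_axis):
    # Swap the partitioned axis: gather along concat_axis, re-split
    # along split_axis. A real implementation would exchange blocks
    # between devices (e.g. torch.distributed.all_to_all) instead of
    # materializing the full tensor; this is only a layout simulation.
    full = np.concatenate(shards, axis=concat_axis)
    return np.array_split(full, n_dev, axis=split_axis)

# Row-wise attention attends over residues within each sequence, so we
# split along the sequence axis: every device keeps the full residue axis.
shards = scatter(msa, axis=0)            # each shard: [s/n_dev, r, c]
# ... per-device row-wise attention would run here ...

# Column-wise attention attends over sequences at each residue, so an
# all-to-all re-partitions the tensor along the residue axis instead.
shards = all_to_all(shards, concat_axis=0, split_axis=1)
# ... per-device column-wise attention would run here ...

# Sanity check: re-partitioning preserves the tensor's contents.
assert np.allclose(np.concatenate(shards, axis=1), msa)

Because only the partitioned axis changes, each device holds roughly 1/n_dev of the full activation at all times, which is what allows attention over long sequences to scale across GPUs; the Duality Async Operation presumably overlaps these all-to-all exchanges with computation, though the abstract gives no further detail.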
March 6, 2022 by hgpu