https://hgpu.org/?p=23902
Out-of-core Training for Extremely Large-Scale Neural Networks With Adaptive Window-Based Scheduling