https://hgpu.org/?p=18617
SuperNeurons: FFT-based Gradient Sparsification in the Distributed Training of Deep Neural Networks