https://hgpu.org/?p=25024
CoCoNet: Co-Optimizing Computation and Communication for Distributed Machine Learning