[1]

C.-Y. Chen, J. Choi, D. Brand, A. Agrawal, W. Zhang, and K. Gopalakrishnan, “AdaComp: Adaptive Residual Gradient Compression for Data-Parallel Distributed Training”, AAAI, vol. 32, no. 1, Apr. 2018.