Exploiting Invariance in Training Deep Neural Networks

Authors

  • Chengxi Ye, Amazon Web Services
  • Xiong Zhou, Amazon Web Services
  • Tristan McKinney, Amazon Web Services
  • Yanfeng Liu, Amazon Web Services
  • Qinggang Zhou, Amazon Web Services
  • Fedor Zhdanov, Amazon Web Services

DOI:

https://doi.org/10.1609/aaai.v36i8.20866

Keywords:

Machine Learning (ML), Computer Vision (CV), Search And Optimization (SO), Constraint Satisfaction And Optimization (CSO)

Abstract

Inspired by two basic mechanisms in animal visual systems, we introduce a feature transform technique that imposes invariance properties in the training of deep neural networks. The resulting algorithm requires less parameter tuning, trains well with an initial learning rate of 1.0, and easily generalizes to different tasks. We enforce scale invariance with local statistics in the data to align similar samples at diverse scales. To accelerate convergence, we enforce a GL(n)-invariance property with global statistics extracted from a batch, so that the gradient descent solution remains invariant under a change of basis. Profiling analysis shows that our proposed modifications take 5% of the computation of the underlying convolution layer. Tested on convolutional networks and transformer networks, our proposed technique requires fewer iterations to train, surpasses all baselines by a large margin, works seamlessly with both small- and large-batch-size training, and applies to different computer vision and language tasks.
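The abstract describes a feature transform that combines per-sample scale normalization (local statistics) with batch-level whitening (global statistics) for GL(n) invariance. The following is a minimal PyTorch sketch written only from that description; the module name InvariantFeatureTransform, the RMS-based local scaling, and the eigendecomposition-based whitening are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class InvariantFeatureTransform(nn.Module):
    """Hypothetical sketch of the transform described in the abstract:
    (1) per-sample scale normalization using local statistics,
    (2) batch whitening using global statistics, so the output is
        invariant to an invertible (GL(n)) change of basis of the features."""

    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.num_features = num_features
        self.eps = eps

    def forward(self, x):
        # x: (batch, features) activations entering a linear/conv layer.
        # Scale invariance: divide each sample by its own RMS statistic.
        local_scale = x.pow(2).mean(dim=1, keepdim=True).sqrt()
        x = x / (local_scale + self.eps)

        # GL(n) invariance: whiten with batch statistics so that an
        # invertible linear re-parameterization of the inputs is undone.
        mean = x.mean(dim=0, keepdim=True)
        xc = x - mean
        cov = xc.t() @ xc / x.shape[0]
        cov = cov + self.eps * torch.eye(self.num_features, device=x.device)
        # Inverse square root of the covariance via eigendecomposition.
        eigvals, eigvecs = torch.linalg.eigh(cov)
        whitening = eigvecs @ torch.diag(eigvals.clamp_min(self.eps).rsqrt()) @ eigvecs.t()
        return xc @ whitening

# Example: transform a batch of 128 feature vectors of dimension 64.
x = torch.randn(128, 64)
y = InvariantFeatureTransform(64)(x)
```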

Published

2022-06-28

How to Cite

Ye, C., Zhou, X., McKinney, T., Liu, Y., Zhou, Q., & Zhdanov, F. (2022). Exploiting Invariance in Training Deep Neural Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 36(8), 8849-8856. https://doi.org/10.1609/aaai.v36i8.20866

Issue

Vol. 36 No. 8 (2022)

Section

AAAI Technical Track on Machine Learning III