Lottery Pools: Winning More by Interpolating Tickets without Increasing Training or Inference Cost

Authors

  • Lu Yin Eindhoven University of Technology
  • Shiwei Liu Eindhoven University of Technology and University of Texas at Austin
  • Meng Fang University of Liverpool
  • Tianjin Huang Eindhoven University of Technology
  • Vlado Menkovski Eindhoven University of Technology
  • Mykola Pechenizkiy Eindhoven University of Technology

DOI:

https://doi.org/10.1609/aaai.v37i9.26297

Keywords:

ML: Learning on the Edge & Model Compression, ML: Deep Neural Network Algorithms, ML: Ensemble Methods

Abstract

Lottery tickets (LTs) are able to discover accurate and sparse subnetworks that can be trained in isolation to match the performance of dense networks. Ensembling, in parallel, is one of the oldest time-proven tricks in machine learning to improve performance by combining the outputs of multiple independent models. However, the benefits of ensembling are diluted in the context of LTs, since ensembling does not directly produce stronger sparse subnetworks but merely leverages their predictions for a better decision. In this work, we first observe that directly averaging the weights of adjacent learned subnetworks significantly boosts the performance of LTs. Encouraged by this observation, we further propose an alternative way to perform an "ensemble" over the subnetworks identified by iterative magnitude pruning via a simple interpolation strategy. We call our method Lottery Pools. In contrast to the naive ensemble, which brings no performance gains to each single subnetwork, Lottery Pools yields much stronger sparse subnetworks than the original LTs without requiring any extra training or inference cost. Across various modern architectures on CIFAR-10/100 and ImageNet, we show that our method achieves significant performance gains in both in-distribution and out-of-distribution scenarios. Impressively, evaluated with VGG-16 and ResNet-18, the produced sparse subnetworks outperform the original LTs by up to 1.88% on CIFAR-100 and 2.36% on CIFAR-100-C; the resulting dense network surpasses the pre-trained dense model by up to 2.22% on CIFAR-100 and 2.38% on CIFAR-100-C. Our source code can be found at https://github.com/luuyin/Lottery-pools.
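
To make the interpolation idea concrete, below is a minimal PyTorch sketch of averaging two lottery tickets in weight space while keeping the result sparse, so inference cost does not grow. It assumes each ticket is stored as a state dict of dense tensors with pruned weights set to zero; the function name interpolate_tickets, the coefficient alpha, and the choice to reuse the first ticket's mask are illustrative assumptions, not the exact Lottery Pools procedure from the paper.

```python
import torch

def interpolate_tickets(ticket_a, ticket_b, alpha=0.5):
    """Linearly interpolate two subnetworks of the same architecture in weight space.

    ticket_a, ticket_b: state dicts of dense tensors whose pruned entries are zero.
    The result reuses ticket_a's sparsity pattern (an assumption of this sketch).
    """
    merged = {}
    for name, w_a in ticket_a.items():
        w_b = ticket_b[name]
        if not torch.is_floating_point(w_a):
            merged[name] = w_a.clone()              # keep integer buffers untouched
            continue
        w = (1.0 - alpha) * w_a + alpha * w_b       # weight-space interpolation
        mask = (w_a != 0).to(w.dtype)               # reuse ticket_a's sparse support
        merged[name] = w * mask                     # zero out weights outside the mask
    return merged

# Hypothetical usage: pool two adjacent IMP tickets, then load the result back.
# pooled = interpolate_tickets(sparse_model_a.state_dict(), sparse_model_b.state_dict())
# sparse_model_a.load_state_dict(pooled)
```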

Published

2023-06-26

How to Cite

Yin, L., Liu, S., Fang, M., Huang, T., Menkovski, V., & Pechenizkiy, M. (2023). Lottery Pools: Winning More by Interpolating Tickets without Increasing Training or Inference Cost. Proceedings of the AAAI Conference on Artificial Intelligence, 37(9), 10945-10953. https://doi.org/10.1609/aaai.v37i9.26297

Section

AAAI Technical Track on Machine Learning IV