EffConv: Efficient Learning of Kernel Sizes for Convolution Layers of CNNs

Alireza Ganjdanesh; Shangqian Gao; Heng Huang

doi:10.1609/aaai.v37i6.25923

Authors

Alireza Ganjdanesh, University of Pittsburgh
Shangqian Gao, University of Pittsburgh
Heng Huang, University of Pittsburgh

DOI:

https://doi.org/10.1609/aaai.v37i6.25923

Keywords:

ML: Deep Neural Architectures, ML: Deep Neural Network Algorithms

Abstract

Determining the kernel sizes of a CNN is a crucial, non-trivial design choice that significantly impacts the model's performance. Most kernel size design methods rely on complex heuristics or on neural architecture search, which requires extensive computational resources. Learning kernel sizes jointly with the model weights, for example by modeling kernels as a combination of basis functions, has therefore been proposed as a workaround. However, previous methods either fail to achieve satisfactory results or are inefficient on large-scale datasets. To fill this gap, we design a novel, efficient kernel size learning method in which a size predictor model learns to predict optimal kernel sizes for a classifier given a desired number of parameters. It does so in collaboration with a kernel predictor model that, given the kernel sizes predicted by the size predictor, predicts the weights of the kernels to minimize the training objective; both models are trained end-to-end. Our method needs only a small fraction of the training epochs of the original CNN to train these two models and find proper kernel sizes for it, and thus offers an efficient and effective solution to the kernel size learning problem. Our extensive experiments on MNIST, CIFAR-10, STL-10, and ImageNet-32 demonstrate that our method achieves the best training time vs. accuracy trade-off compared to previous kernel size learning methods and significantly outperforms them on challenging datasets such as STL-10 and ImageNet-32. Our implementation is available at https://github.com/Alii-Ganjj/EffConv.
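
The abstract conditions the size predictor on "a desired number of parameters" but does not say how that budget is counted. As a rough illustration only, the sketch below counts the weights of square 2D convolution layers for a given choice of per-layer kernel sizes and checks the total against a budget. The function names, the bias-free counting, and the three-layer channel configuration are assumptions made for this example; they are not taken from the paper or its repository.

```python
from typing import Sequence, Tuple


def conv_params(in_ch: int, out_ch: int, k: int, bias: bool = False) -> int:
    """Number of weights in one square k x k 2D convolution layer."""
    return in_ch * out_ch * k * k + (out_ch if bias else 0)


def total_conv_params(
    channels: Sequence[Tuple[int, int]], kernel_sizes: Sequence[int]
) -> int:
    """Sum the convolution weights over all layers (biases ignored here)."""
    if len(channels) != len(kernel_sizes):
        raise ValueError("need exactly one kernel size per layer")
    return sum(conv_params(i, o, k) for (i, o), k in zip(channels, kernel_sizes))


def within_budget(
    channels: Sequence[Tuple[int, int]], kernel_sizes: Sequence[int], budget: int
) -> bool:
    """True if the chosen kernel sizes fit the parameter budget."""
    return total_conv_params(channels, kernel_sizes) <= budget


# Hypothetical 3-layer network: (input channels, output channels) per layer.
channels = [(3, 16), (16, 32), (32, 64)]
uniform = [3, 3, 3]  # the same 3x3 kernel in every layer
mixed = [5, 3, 1]    # larger kernels early, smaller kernels late

print(total_conv_params(channels, uniform))  # 23472
print(total_conv_params(channels, mixed))    # 7856
print(within_budget(channels, mixed, budget=10_000))    # True
print(within_budget(channels, uniform, budget=10_000))  # False
```

Because the weight count grows with the square of the kernel size and with the product of the channel widths, the same budget can be met by quite different per-layer size assignments. That trade-off is the kind of choice the abstract assigns to the size predictor.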
