CAKES: Channel-wise Automatic KErnel Shrinking for Efficient 3D Networks
DOI:
https://doi.org/10.1609/aaai.v35i4.16433
Keywords:
Applications, Healthcare, Medicine & Wellness, Video Understanding & Activity Analysis, Segmentation
Abstract
3D Convolutional Neural Networks (CNNs) have been widely applied to 3D scene understanding tasks such as video analysis and volumetric image recognition. However, 3D networks are easily over-parameterized, which incurs expensive computation costs. In this paper, we propose Channel-wise Automatic KErnel Shrinking (CAKES) to enable efficient 3D learning by shrinking standard 3D convolutions into a set of economical operations (e.g., 1D and 2D convolutions). Unlike previous methods, CAKES performs channel-wise kernel shrinkage, which has the following benefits: 1) the operations deployed in every layer can be heterogeneous, so that they extract diverse and complementary information to benefit the learning process; and 2) it allows for an efficient and flexible replacement design, which generalizes to both spatial-temporal and volumetric data. Further, we propose a new search space based on CAKES, so that the configuration for simplifying 3D networks can be determined automatically. CAKES outperforms other methods of similar model size, and it achieves performance comparable to state-of-the-art methods with far fewer parameters and lower computational cost on tasks including 3D medical image segmentation and video action recognition. Code and models are available at https://github.com/yucornetto/CAKES
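To make the parameter savings from channel-wise shrinkage concrete, the following is a minimal sketch that counts convolution weights only. It assumes a hypothetical layer with 64 input and 64 output channels, and a hand-picked split in which half of the output channels use a 2D kernel (1x3x3) and half use a 1D kernel (3x1x1). The helper names `conv_params` and `mixed_params`, the channel counts, and the split are illustrative assumptions, not the configuration found by the CAKES search.

```python
from math import prod


def conv_params(in_ch, out_ch, kernel):
    """Number of weights in a dense convolution (bias terms ignored)."""
    return in_ch * out_ch * prod(kernel)


def mixed_params(in_ch, assignment):
    """Weights of a layer whose output channels use different kernels.

    `assignment` is a list of (num_output_channels, kernel_shape) pairs;
    each group of output channels sees all input channels.
    """
    return sum(conv_params(in_ch, n, k) for n, k in assignment)


# Hypothetical layer sizes, chosen only for illustration.
in_ch, out_ch = 64, 64

# Standard 3D convolution: every output channel uses a 3x3x3 kernel.
full = conv_params(in_ch, out_ch, (3, 3, 3))

# Assumed channel-wise mix: 32 channels with a 2D kernel (1x3x3),
# 32 channels with a 1D kernel along the remaining axis (3x1x1).
assignment = [(32, (1, 3, 3)), (32, (3, 1, 1))]
mixed = mixed_params(in_ch, assignment)

print(f"standard 3D conv weights: {full}")
print(f"channel-wise mixed weights: {mixed}")
print(f"fraction of original: {mixed / full:.3f}")
```

Under these assumed sizes the mixed layer keeps roughly a fifth of the original weights. The actual ratio depends on which operation is assigned to each channel, which is what the abstract says the search determines automatically.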
Published
2021-05-18
How to Cite
Yu, Q., Li, Y., Mei, J., Zhou, Y., & Yuille, A. (2021). CAKES: Channel-wise Automatic KErnel Shrinking for Efficient 3D Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 35(4), 3225-3233. https://doi.org/10.1609/aaai.v35i4.16433
Issue
Section
AAAI Technical Track on Computer Vision III