TY  - JOUR
AU  - Kuo, I-Yuan
AU  - Wei, Wen-Li
AU  - Lin, Jen-Chun
PY  - 2021/05/18
Y2  - 2024/03/28
TI  - Positions, Channels, and Layers: Fully Generalized Non-Local Network for Singer Identification
JF  - Proceedings of the AAAI Conference on Artificial Intelligence
JA  - AAAI
VL  - 35
IS  - 9
SE  - AAAI Technical Track on Machine Learning II
DO  - 10.1609/aaai.v35i9.17000
UR  - https://ojs.aaai.org/index.php/AAAI/article/view/17000
SP  - 8217-8225
AB  - Recently, a non-local (NL) operation has been designed as the central building block for deep-net models to capture long-range dependencies (Wang et al. 2018). Despite its excellent performance, it does not consider the interaction between positions across channels and layers, which is crucial in fine-grained classification tasks. To address the limitation, we target at singer identification (SID) task and present a fully generalized non-local (FGNL) module to help identify fine-grained vocals. Specifically, we first propose a FGNL operation, which extends the NL operation to explore the correlations between positions across channels and layers. Secondly, we further apply a depth-wise convolution with Gaussian kernel in the FGNL operation to smooth feature maps for better generalization. More, we modify the squeeze-and-excitation (SE) scheme into the FGNL module to adaptively emphasize correlated feature channels to help uncover relevant feature responses and eventually the target singer. Evaluating results on the benchmark artist20 dataset shows that the FGNL module significantly improves the accuracy of the deep-net models in SID. Codes are available at https://github.com/ian-k-1217/Fully-Generalized-Non-Local-Network.
ER  - 