Enhanced Audio Tagging via Multi- to Single-Modal Teacher-Student Mutual Learning

Yifang Yin; Harsh Shrivastava; Ying Zhang; Zhenguang Liu; Rajiv Ratn Shah; Roger Zimmermann

doi:10.1609/aaai.v35i12.17280

Authors

Yifang Yin, National University of Singapore
Harsh Shrivastava, National University of Singapore
Ying Zhang, National University of Singapore; Northwestern Polytechnical University, China
Zhenguang Liu, Zhejiang Gongshang University
Rajiv Ratn Shah, IIIT Delhi
Roger Zimmermann, National University of Singapore

DOI:

https://doi.org/10.1609/aaai.v35i12.17280

Keywords:

Applications

Abstract

Recognizing ongoing events from acoustic cues is a critical yet challenging problem that has attracted significant research attention in recent years. Joint audio-visual analysis can improve event detection accuracy, but it is not always feasible because in many real-world scenarios only audio recordings are available. To address this challenge, we present a novel visual-assisted teacher-student mutual learning framework for robust sound event detection from audio recordings. Our model adopts a multi-modal teacher network based on both acoustic and visual cues, and a single-modal student network based on acoustic cues only. Conventional teacher-student learning performs unsatisfactorily for knowledge transfer from a multi-modality network to a single-modality network. We therefore present a mutual learning framework that introduces a single-modal transfer loss and a cross-modal transfer loss to collaboratively learn the audio-visual correlations between the two networks. Our proposed solution takes advantage of joint audio-visual analysis during training while keeping the model usable at inference time, when only audio may be available. Our extensive experiments on the DCASE17 and DCASE18 sound event detection datasets show that our proposed method outperforms state-of-the-art audio tagging approaches.
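
The abstract does not define the single-modal and cross-modal transfer losses, so the following is only a minimal sketch of the general idea of mutual learning between two multi-label taggers, not the paper's formulation. It assumes each network outputs one logit per sound class, that each network is trained on the ground-truth tags plus a term pulling it toward the other network's predictions, and that a hypothetical weight `alpha` balances the two terms. All function and parameter names are illustrative.

```python
import math


def sigmoid(x):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def soft_bce(targets, probs, eps=1e-7):
    """Mean binary cross-entropy of probs against targets in [0, 1].

    Targets may be hard labels (0 or 1) or soft predictions from the
    other network.
    """
    total = 0.0
    for t, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(targets)


def mutual_learning_losses(labels, teacher_logits, student_logits, alpha=0.5):
    """Return (teacher_loss, student_loss) for one clip.

    Assumption: each loss is a supervised term plus alpha times a
    mimicry term toward the other network's probabilities, which are
    treated as fixed targets. This is a generic mutual-learning
    objective, not the losses defined in the paper.
    """
    teacher_probs = [sigmoid(z) for z in teacher_logits]  # audio + visual input
    student_probs = [sigmoid(z) for z in student_logits]  # audio-only input
    teacher_loss = (soft_bce(labels, teacher_probs)
                    + alpha * soft_bce(student_probs, teacher_probs))
    student_loss = (soft_bce(labels, student_probs)
                    + alpha * soft_bce(teacher_probs, student_probs))
    return teacher_loss, student_loss


# Toy example: three sound classes, made-up logits.
labels = [1.0, 0.0, 1.0]
teacher_logits = [2.0, -1.5, 0.5]
student_logits = [0.3, -0.2, -0.4]
teacher_loss, student_loss = mutual_learning_losses(
    labels, teacher_logits, student_logits)
```

In a real training loop the two losses would be minimized with respect to their own network's parameters, and only the audio-only student would be kept for inference.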
