WingBeats and Snapshots: Fusing Sound and Vision for Mosquito Monitoring (Student Abstract)
DOI:
https://doi.org/10.1609/aaai.v40i48.42196
Abstract
Accurate identification of mosquito species is crucial for controlling vector-borne diseases, yet visual or acoustic methods alone are often insufficient. We propose a multimodal deep-learning framework that combines high-resolution images with wingbeat audio using a SwinV2 vision transformer and an Audio Spectrogram Transformer, thereby capturing complementary cues. On a six-species dataset, it achieves 97% accuracy, comparable to the best single-modality baseline, and is designed to improve robustness under noise or environmental variation, demonstrating the value of integrating multiple data sources for reliable mosquito surveillance.
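The abstract does not state how the image and audio representations are combined, so the following is only a minimal sketch of one common option, late fusion by concatenation, and should not be read as the authors' method. The embedding sizes, the randomly initialised linear head, and the names `make_head` and `fuse_and_classify` are all hypothetical; in a real pipeline the two input vectors would come from the SwinV2 and Audio Spectrogram Transformer encoders rather than from a random number generator.

```python
import math
import random

NUM_SPECIES = 6  # the abstract reports a six-species dataset


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_head(in_dim, out_dim, seed=0):
    """Hypothetical linear classifier head with small random weights."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def fuse_and_classify(image_emb, audio_emb, weights, bias):
    """Concatenate the two embeddings (late fusion) and return class probabilities."""
    fused = list(image_emb) + list(audio_emb)
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Placeholder embeddings standing in for encoder outputs (assumed sizes 8 and 4).
rng = random.Random(1)
image_emb = [rng.gauss(0.0, 1.0) for _ in range(8)]
audio_emb = [rng.gauss(0.0, 1.0) for _ in range(4)]

weights, bias = make_head(len(image_emb) + len(audio_emb), NUM_SPECIES)
probs = fuse_and_classify(image_emb, audio_emb, weights, bias)
predicted = max(range(NUM_SPECIES), key=probs.__getitem__)
```

Concatenation keeps each encoder independent, which is one way a system could still produce a prediction when one modality is degraded by noise; whether the published framework works this way is not stated in the abstract.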
Published
2026-03-14
How to Cite
Chanda, A., & Agarwal, A. (2026). WingBeats and Snapshots: Fusing Sound and Vision for Mosquito Monitoring (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 40(48), 41154–41156. https://doi.org/10.1609/aaai.v40i48.42196
Issue
Section
AAAI Student Abstract and Poster Program