Towards Explainable Video Camouflaged Object Detection: SAM2 with Eventstream-Inspired Data
DOI:
https://doi.org/10.1609/aaai.v40i15.38245
Abstract
Video Camouflaged Object Detection (VCOD) poses significant challenges due to the subtle appearance of camouflaged objects, especially under dynamic motion and occlusion. Existing methods predominantly rely on optical flow or black-box features for motion modeling, which often entail substantial computational costs and suffer from limited interpretability. Inspired by the human strategy of identifying abnormal movements between frames and the principle of event camera image formation, we propose an eventstream-inspired dual-branch framework for VCOD. Specifically, we design an eventstream-like data extraction module to capture pixel-level motion variations, effectively distinguishing object motion from background dynamics. This event-based representation is integrated into SAM2 through a dual-branch memory-augmented framework, consisting of Time Bridge Attention and Visual Bridge Attention, enabling joint modeling of motion and appearance cues. In addition, we introduce a Prompt Embedding Generator to eliminate the need for human-provided interactive prompts, facilitating fully automatic VCOD. Extensive experiments on MoCA-Mask and CAD2016 demonstrate that our approach significantly outperforms state-of-the-art methods, achieving both superior segmentation accuracy and interpretable motion modeling. To the best of our knowledge, this is the first work to incorporate eventstream-inspired representations into the VCOD task.
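The event camera principle mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' eventstream-like data extraction module, whose details are not given here. It only shows the general idea behind event camera image formation: a pixel emits an event when its log-intensity change between two frames exceeds a threshold. The function name `eventlike_frame`, the `threshold` value, and the `eps` offset are all hypothetical choices made for this illustration.

```python
import numpy as np


def eventlike_frame(prev_gray, curr_gray, threshold=0.15, eps=1e-3):
    """Build a polarity map in {-1, 0, +1} from two grayscale frames in [0, 1].

    Hypothetical helper for illustration only, not the paper's module.
    +1 marks a pixel that became brighter, -1 one that became darker,
    and 0 a pixel whose log-intensity change stayed below the threshold.
    """
    # Event cameras respond to changes in log intensity; eps avoids log(0).
    prev_log = np.log(prev_gray.astype(np.float64) + eps)
    curr_log = np.log(curr_gray.astype(np.float64) + eps)
    delta = curr_log - prev_log

    events = np.zeros(delta.shape, dtype=np.int8)
    events[delta >= threshold] = 1    # ON events (brightness increase)
    events[delta <= -threshold] = -1  # OFF events (brightness decrease)
    return events


# Toy example: a static gray background with two pixels that change.
prev = np.full((4, 4), 0.5)
curr = prev.copy()
curr[1, 1] = 0.9  # brighter pixel
curr[2, 2] = 0.2  # darker pixel
events = eventlike_frame(prev, curr)
```

In this toy case only the two modified pixels produce events, while the unchanged background stays at zero. Real video would also contain camera-induced background changes, which the abstract says the proposed module is designed to separate from object motion; this sketch does not attempt that.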
Published
2026-03-14
How to Cite
Zhang, H., Lyu, Y., Liu, H., Song, J., Yuan, D., & Yang, Y. (2026). Towards Explainable Video Camouflaged Object Detection: SAM2 with Eventstream-Inspired Data. Proceedings of the AAAI Conference on Artificial Intelligence, 40(15), 12511-12519. https://doi.org/10.1609/aaai.v40i15.38245
Section
AAAI Technical Track on Computer Vision XII