Federated Weakly Supervised Video Anomaly Detection with Multimodal Prompt

Authors

  • Benfeng Wang, Sun Yat-Sen University
  • Chao Huang, Sun Yat-Sen University
  • Jie Wen, Harbin Institute of Technology
  • Wei Wang, Sun Yat-Sen University
  • Yabo Liu, Harbin Institute of Technology
  • Yong Xu, Harbin Institute of Technology

DOI:

https://doi.org/10.1609/aaai.v39i20.35398

Abstract

Video anomaly detection (VAD) aims to locate abnormal events in videos. Weakly supervised VAD (WSVAD), which requires only video-level annotations during training, has recently made great progress. In practical applications, different institutions may hold different types of abnormal videos, but these videos cannot be shared over the internet because of privacy protection. To train a more generalized anomaly detector that can identify various anomalies, it is reasonable to introduce federated learning into WSVAD. In this paper, we propose Global and Local Context-driven Federated Learning, a new paradigm for privacy-protected weakly supervised video anomaly detection. Specifically, we use the vision-language association of CLIP to detect whether a video frame is abnormal. Instead of relying on handcrafted text prompts for CLIP, we propose a text prompt generator whose output is influenced by both textual and visual information. On the one hand, the text provides global context related to anomalies, which improves the model's generalization ability. On the other hand, the visual input provides personalized local context, because different clients may have videos with different types of anomalies or scenes. The generated prompt thus preserves global generalization while handling personalized data from different clients. Extensive experiments show that the proposed method achieves remarkable performance.
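
The two ingredients named in the abstract, CLIP-style frame scoring against text prompts and federated aggregation across clients, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the scoring function or the aggregation rule, so the sketch assumes a two-class softmax over cosine similarities and a FedAvg-style weighted average. All function names, the temperature value, and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def anomaly_score(frame_emb, normal_emb, abnormal_emb, temperature=0.07):
    """Probability that a frame matches the 'abnormal' prompt.

    Assumption: a two-class softmax over temperature-scaled cosine
    similarities, in the style of CLIP zero-shot classification.
    """
    s_normal = cosine(frame_emb, normal_emb) / temperature
    s_abnormal = cosine(frame_emb, abnormal_emb) / temperature
    # A two-class softmax reduces to a sigmoid of the logit difference.
    return 1.0 / (1.0 + math.exp(s_normal - s_abnormal))


def fedavg(client_params, client_sizes):
    """Average per-client parameter vectors, weighted by local dataset size.

    Assumption: FedAvg-style aggregation; the paper may use another rule.
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[i] * size for params, size in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for CLIP image and text features.
    normal_prompt = [0.0, 1.0]
    abnormal_prompt = [1.0, 0.0]
    print(anomaly_score([0.9, 0.1], normal_prompt, abnormal_prompt))

    # Two clients with different amounts of local data share parameters.
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In a setting like the one the abstract describes, the prompt embeddings would presumably come from the learned prompt generator rather than fixed vectors, and only the globally shared parameters would be sent for aggregation while client-specific visual context stays local. Both points are assumptions drawn from the abstract's wording.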

Published

2025-04-11

How to Cite

Wang, B., Huang, C., Wen, J., Wang, W., Liu, Y., & Xu, Y. (2025). Federated Weakly Supervised Video Anomaly Detection with Multimodal Prompt. Proceedings of the AAAI Conference on Artificial Intelligence, 39(20), 21017–21025. https://doi.org/10.1609/aaai.v39i20.35398

Section

AAAI Technical Track on Machine Learning VI