Self-Supervised Learning for Multilevel Skeleton-Based Forgery Detection via Temporal-Causal Consistency of Actions
DOI:
https://doi.org/10.1609/aaai.v37i1.25163
Keywords:
CV: Motion & Tracking, CV: Adversarial Attacks & Robustness, CV: Biometrics, Face, Gesture & Pose, DMKM: Anomaly/Outlier Detection, KRR: Action, Change, and Causality, ML: Unsupervised & Self-Supervised Learning
Abstract
Skeleton-based human action recognition and analysis have become increasingly attainable in many areas, such as security surveillance and anomaly detection. Given the prevalence of skeleton-based applications, tampering attacks on human skeletal features have emerged very recently. In particular, checking the temporal inconsistency and/or incoherence (TII) in the skeletal sequence of human action is a principle of forgery detection. To this end, we propose an approach to self-supervised learning of the temporal causality behind human action, which can effectively check TII in skeletal sequences. Especially, we design a multilevel skeleton-based forgery detection framework to recognize the forgery on frame level, clip level, and action level in terms of learning the corresponding temporal-causal skeleton representations for each level. Specifically, a hierarchical graph convolution network architecture is designed to learn low-level skeleton representations based on physical skeleton connections and high-level action representations based on temporal-causal dependencies for specific actions. Extensive experiments consistently show state-of-the-art results on multilevel forgery detection tasks and superior performance of our framework compared to current competing methods.
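To make the idea of learning low-level representations from physical skeleton connections more concrete, the following is a minimal sketch of one generic graph-convolution step over a skeleton graph. It is not the authors' architecture: the 5-joint toy skeleton, the edge list, the feature sizes, the random weights, and the function names `normalized_adjacency` and `graph_conv` are all assumptions made for illustration, and the layer shown is the standard symmetric-normalized graph convolution rather than anything confirmed by the abstract.

```python
import numpy as np


def normalized_adjacency(edges, num_joints):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected skeleton graph."""
    a = np.eye(num_joints)  # identity adds a self-loop for every joint
    for i, j in edges:
        a[i, j] = 1.0
        a[j, i] = 1.0
    deg = a.sum(axis=1)  # every degree is >= 1 because of the self-loops
    d_inv_sqrt = np.diag(deg ** -0.5)
    return d_inv_sqrt @ a @ d_inv_sqrt


def graph_conv(x, a_hat, weight):
    """Apply one graph-convolution layer with ReLU to every frame.

    x:      (frames, joints, in_features) joint features per frame
    a_hat:  (joints, joints) normalized adjacency
    weight: (in_features, out_features) projection (learned in practice)
    """
    return np.maximum(a_hat @ x @ weight, 0.0)


# Hypothetical 5-joint skeleton: joint 1 is a hub connected to 0, 2, 3, 4.
edges = [(0, 1), (1, 2), (1, 3), (1, 4)]
a_hat = normalized_adjacency(edges, num_joints=5)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5, 3))  # 8 frames, 5 joints, 3D coordinates
w = rng.normal(size=(3, 4))     # random stand-in for trained weights

h = graph_conv(x, a_hat, w)     # per-frame, per-joint features
print(h.shape)
```

Each joint's output feature mixes its own coordinates with those of its physically connected neighbours, which is the sense in which such a layer follows the skeleton's structure. Temporal modelling across frames, which the abstract attributes to the higher levels of the framework, is deliberately left out of this sketch.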
Published
2023-06-26
How to Cite
Hu, L., Liu, D. D., Zhang, Q., Naseem, U., & Lai, Z. Y. (2023). Self-Supervised Learning for Multilevel Skeleton-Based Forgery Detection via Temporal-Causal Consistency of Actions. Proceedings of the AAAI Conference on Artificial Intelligence, 37(1), 844-853. https://doi.org/10.1609/aaai.v37i1.25163
Section
AAAI Technical Track on Computer Vision I