No More Shortcuts: Realizing the Potential of Temporal Self-Supervision

Authors

  • Ishan Rajendrakumar Dave University of Central Florida
  • Simon Jenni Adobe Research
  • Mubarak Shah University of Central Florida

DOI

https://doi.org/10.1609/aaai.v38i2.27913

Keywords

CV: Video Understanding & Activity Analysis, CV: Image and Video Retrieval, CV: Representation Learning for Vision, ML: Unsupervised & Self-Supervised Learning

Abstract

Self-supervised approaches for video have shown impressive results in video understanding tasks. However, unlike early works that leverage temporal self-supervision, current state-of-the-art methods primarily rely on tasks from the image domain (e.g., contrastive learning) that do not explicitly promote the learning of temporal features. We identify two factors that limit existing temporal self-supervision: 1) tasks are too simple, resulting in saturated training performance, and 2) we uncover shortcuts based on local appearance statistics that hinder the learning of high-level features. To address these issues, we propose 1) a more challenging reformulation of temporal self-supervision as frame-level (rather than clip-level) recognition tasks and 2) an effective augmentation strategy to mitigate shortcuts. Our model extends a representation of single video frames, pre-trained through contrastive learning, with a transformer that we train through temporal self-supervision. We demonstrate experimentally that our more challenging frame-level task formulations and the removal of shortcuts drastically improve the quality of features learned through temporal self-supervision. Our extensive experiments show state-of-the-art performance across 10 video understanding datasets, illustrating the generalization ability and robustness of our learned video representations. Project Page: https://daveishan.github.io/nms-webpage.

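To make the abstract's core reformulation concrete, here is a minimal sketch of what a frame-level (rather than clip-level) temporal task could look like: the frames of a clip are shuffled and the model must predict, for every frame, its original temporal position, yielding one target per frame instead of a single clip-level label. An independently sampled per-frame horizontal flip stands in for a shortcut-mitigating augmentation. All names and design details below are illustrative assumptions, not taken from the authors' code.

```python
import numpy as np


def make_frame_ordering_batch(clip, rng):
    """Build inputs/targets for a toy frame-level ordering task.

    clip: (T, H, W, C) array of video frames.
    Returns (shuffled_clip, targets), where targets[i] is the original
    temporal index of shuffled_clip[i] -- a per-frame classification
    target. NOTE: this is a hypothetical sketch, not the paper's method.
    """
    num_frames = clip.shape[0]
    perm = rng.permutation(num_frames)      # order in which frames are shown
    shuffled = clip[perm].copy()
    for t in range(num_frames):             # independent per-frame flip, so
        if rng.random() < 0.5:              # appearance statistics alone
            shuffled[t] = shuffled[t, :, ::-1]  # cannot reveal the order
    targets = perm                          # one position label per frame
    return shuffled, targets


# Usage with a tiny synthetic clip of 4 frames.
rng = np.random.default_rng(0)
clip = np.arange(4 * 2 * 2 * 3).reshape(4, 2, 2, 3).astype(np.float32)
shuffled, targets = make_frame_ordering_batch(clip, rng)
print(targets.shape)  # one temporal target per frame
```

A clip-level variant would collapse `targets` to a single label (e.g., the permutation's identity), which is the easier formulation the abstract argues leads to saturated training performance.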
Published

2024-03-24

How to Cite

Dave, I. R., Jenni, S., & Shah, M. (2024). No More Shortcuts: Realizing the Potential of Temporal Self-Supervision. Proceedings of the AAAI Conference on Artificial Intelligence, 38(2), 1481–1491. https://doi.org/10.1609/aaai.v38i2.27913

Section

AAAI Technical Track on Computer Vision I