DASZL: Dynamic Action Signatures for Zero-shot Learning

Authors

  • Tae Soo Kim, Johns Hopkins University
  • Jonathan Jones, Johns Hopkins University
  • Michael Peven, Johns Hopkins University
  • Zihao Xiao, Johns Hopkins University
  • Jin Bai, Johns Hopkins University
  • Yi Zhang, Johns Hopkins University
  • Weichao Qiu, Johns Hopkins University
  • Alan Yuille, Johns Hopkins University
  • Gregory D. Hager, Johns Hopkins University

DOI:

https://doi.org/10.1609/aaai.v35i3.16276

Keywords:

Video Understanding & Activity Analysis

Abstract

There are many realistic applications of activity recognition where the set of potential activity descriptions is combinatorially large. This makes end-to-end supervised training of a recognition system impractical as no training set is practically able to encompass the entire label set. In this paper, we present an approach to fine-grained recognition that models activities as compositions of dynamic action signatures. This compositional approach allows us to reframe fine-grained recognition as zero-shot activity recognition, where a detector is composed "on the fly" from simple first-principles state machines supported by deep-learned components. We evaluate our method on the Olympic Sports and UCF101 datasets, where our model establishes a new state of the art under multiple experimental paradigms. We also extend this method to form a unique framework for zero-shot joint segmentation and classification of activities in video and demonstrate the first results in zero-shot decoding of complex action sequences on a widely-used surgical dataset. Lastly, we show that we can use off-the-shelf object detectors to recognize activities in completely de-novo settings with no additional training.

Published

2021-05-18

How to Cite

Kim, T. S., Jones, J., Peven, M., Xiao, Z., Bai, J., Zhang, Y., Qiu, W., Yuille, A., & Hager, G. D. (2021). DASZL: Dynamic Action Signatures for Zero-shot Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 35(3), 1817-1826. https://doi.org/10.1609/aaai.v35i3.16276

Section

AAAI Technical Track on Computer Vision II