Extreme Low Resolution Activity Recognition With Multi-Siamese Embedding Learning


  • Michael Ryoo EgoVid Inc.
  • Kiyoon Kim EgoVid Inc.
  • Hyun Yang EgoVid Inc.




Keywords: human activity recognition, extreme low resolution videos, privacy-preserving recognition


This paper presents an approach for recognizing human activities from extreme low resolution (e.g., 16x12) videos. Extreme low resolution recognition is not only necessary for analyzing actions at a distance but also crucial for enabling privacy-preserving recognition of human activities. We design a new two-stream multi-Siamese convolutional neural network. The idea is to explicitly capture an inherent property of low resolution (LR) videos: two images originating from the exact same scene often have totally different pixel values depending on their LR transformations. Our approach learns a shared embedding space that maps LR videos with the same content to the same location regardless of their transformations. We experimentally confirm that jointly learning this transformation-robust LR video representation and the classifier outperforms the previous state-of-the-art low resolution recognition approaches on two public standard datasets by a meaningful margin.
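The property described in the abstract can be illustrated with a minimal sketch. This is not the paper's network: the shift-then-average-pool LR transformation, the helper names `downsample` and `contrastive_loss`, the 160x120 source frame, and the generic contrastive loss are all assumptions chosen for illustration, since the abstract does not specify these details. The sketch shows that two LR images derived from the same frame can differ pixel by pixel, and how a Siamese-style objective would penalize embeddings of same-scene pairs that lie far apart.

```python
import numpy as np


def downsample(hr, factor, dy=0, dx=0):
    """Hypothetical LR transformation: shift the frame, then average-pool it."""
    shifted = np.roll(hr, shift=(dy, dx), axis=(0, 1))
    h, w = shifted.shape
    h, w = h - h % factor, w - w % factor  # crop to a multiple of the factor
    blocks = shifted[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))


def contrastive_loss(emb_a, emb_b, same_scene, margin=1.0):
    """Generic contrastive loss (an assumption, not the paper's stated loss).

    Same-scene pairs are pulled together; other pairs are pushed at least
    `margin` apart in the embedding space.
    """
    d = np.linalg.norm(emb_a - emb_b)
    if same_scene:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


rng = np.random.default_rng(0)
hr_frame = rng.random((120, 160))              # stand-in for a 160x120 frame

# Two LR versions (16x12) of the same scene under different shifts.
lr_a = downsample(hr_frame, 10, dy=0, dx=0)
lr_b = downsample(hr_frame, 10, dy=3, dx=5)

# Same content, yet the LR pixel values do not match.
pixel_gap = np.abs(lr_a - lr_b).mean()
print("mean absolute pixel difference:", pixel_gap)

# With an identity "embedding" (raw pixels), the same-scene pair is penalized;
# a learned embedding would aim to drive this loss toward zero.
print("same-scene loss on raw pixels:",
      contrastive_loss(lr_a.ravel(), lr_b.ravel(), same_scene=True))
```

In a trained model, the flattened pixels passed to `contrastive_loss` would be replaced by the outputs of a shared-weight embedding network applied to each LR input.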




How to Cite

Ryoo, M., Kim, K., & Yang, H. (2018). Extreme Low Resolution Activity Recognition With Multi-Siamese Embedding Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). https://doi.org/10.1609/aaai.v32i1.12299