Beyond Pixel and Object: Part Feature as Reference for Few-Shot Video Object Segmentation

Authors

  • Naisong Luo MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China
  • Guoxin Xiong MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China
  • Tianzhu Zhang MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v39i6.32626

Abstract

Few-Shot Video Object Segmentation (FSVOS) aims to accurately segment video sequences given only a limited number of annotated support images. In this work, we analyze the deficiencies inherent in using object prototypes and pixel features as references in previous methods. We then show that part features, which can adapt to appearance variations and resist noise, are advantageous as representative reference features for aligning support images and query videos. Therefore, we propose a Part Agent Learning Network (PALN) that leverages part features in two ways. First, we employ the Optimal Transport algorithm with an equal-partition constraint so that part agents can adaptively divide support objects into diverse parts. Second, we design a dedicated cache mechanism that learns temporal part agents as a lightweight historical target representation to exploit temporal consistency. With the aid of these learned part agents, PALN effectively achieves support-query alignment and temporal alignment for accurate segmentation of query videos. Extensive experimental results on two challenging benchmarks demonstrate that our method performs favorably against state-of-the-art FSVOS methods.
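As a rough illustration of the equal-partition idea described above, the sketch below uses entropic Optimal Transport (Sinkhorn iterations) with uniform marginals to softly assign N support pixel features to K part agents, so each part receives an equal share of the mass. This is a minimal, assumed implementation for intuition only; the function name, cost choice (negative cosine similarity), and hyperparameters are illustrative and are not taken from the paper.

```python
import numpy as np

def sinkhorn_equal_partition(features, agents, eps=0.05, n_iters=200):
    """Softly assign N pixel features to K part agents via entropic OT.

    Uniform row/column marginals enforce the equal-partition constraint:
    every pixel distributes one unit of mass, and every part agent
    receives (approximately) N / K units in total.
    Illustrative sketch; not the paper's exact formulation.
    """
    # Cost matrix: negative cosine similarity between features and agents.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    a = agents / np.linalg.norm(agents, axis=1, keepdims=True)
    cost = -f @ a.T                          # shape (N, K)

    G = np.exp(-cost / eps)                  # Gibbs kernel
    N, K = G.shape
    r = np.full(N, 1.0 / N)                  # each pixel carries equal mass
    c = np.full(K, 1.0 / K)                  # each part receives equal mass

    u = np.ones(N)
    for _ in range(n_iters):                 # alternating marginal scaling
        v = c / (G.T @ u)
        u = r / (G @ v)

    T = u[:, None] * G * v[None, :]          # transport plan
    return T * N                             # rescale so each row sums to 1
```

Reading the rows of the returned plan as soft part-assignment weights, each part agent can then be updated as the weighted average of its assigned pixel features.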

Published

2025-04-11

How to Cite

Luo, N., Xiong, G., & Zhang, T. (2025). Beyond Pixel and Object: Part Feature as Reference for Few-Shot Video Object Segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 39(6), 5865-5873. https://doi.org/10.1609/aaai.v39i6.32626

Section

AAAI Technical Track on Computer Vision V