Deep Object Co-Segmentation via Spatial-Semantic Network Modulation

Kaihua Zhang; Jin Chen; Bo Liu; Qingshan Liu

doi:10.1609/aaai.v34i07.6977

Authors

Kaihua Zhang, Nanjing University of Information Science and Technology
Jin Chen, Nanjing University of Information Science and Technology
Bo Liu, JD Finance America Corporation
Qingshan Liu, Nanjing University of Information Science and Technology

DOI: https://doi.org/10.1609/aaai.v34i07.6977

Abstract

Object co-segmentation aims to segment the shared objects in multiple relevant images and has numerous applications in computer vision. This paper presents a spatially and semantically modulated deep network framework for object co-segmentation. A backbone network extracts multi-resolution image features. Taking the multi-resolution features of the relevant images as input, a spatial modulator learns a mask for each image. The spatial modulator captures the correlations of image feature descriptors via unsupervised learning, and the learned mask roughly localizes the shared foreground object while suppressing the background. The semantic modulator is modeled as a supervised image classification task, for which we propose a hierarchical second-order pooling module that transforms the image features for classification. The outputs of the two modulators manipulate the multi-resolution features through a shift-and-scale operation so that the features focus on segmenting co-object regions. The proposed model is trained end-to-end without any intricate post-processing. Extensive experiments on four image co-segmentation benchmark datasets demonstrate the superior accuracy of the proposed method compared to state-of-the-art methods. The code is available at http://kaihuazhang.net/.
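
To make the shift-and-scale operation and the second-order pooling idea more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the semantic modulator supplies one scale per channel and the spatial modulator supplies one shift per pixel; the abstract does not specify this assignment, and the function names, tensor shapes, and the single-level covariance pooling (rather than the hierarchical module the paper proposes) are all illustrative assumptions.

```python
import numpy as np


def second_order_pool(features):
    """Covariance-style second-order pooling (illustrative, single level).

    features: array of shape (C, H, W).
    Returns a (C, C) matrix of channel co-activation statistics.
    """
    c, h, w = features.shape
    x = features.reshape(c, h * w)
    # Center each channel over the spatial positions.
    x = x - x.mean(axis=1, keepdims=True)
    return x @ x.T / (h * w)


def shift_and_scale(features, channel_scale, spatial_shift):
    """Modulate a feature map with a per-channel scale and a per-pixel shift.

    features:      (C, H, W) feature map from the backbone.
    channel_scale: (C,) vector, assumed here to come from the semantic modulator.
    spatial_shift: (H, W) map, assumed here to come from the spatial modulator.
    """
    # Broadcasting applies the scale across space and the shift across channels.
    return features * channel_scale[:, None, None] + spatial_shift[None, :, :]


# Toy usage with hypothetical values.
feats = np.arange(8, dtype=float).reshape(2, 2, 2)
scale = np.array([2.0, 0.5])
shift = np.array([[1.0, 0.0], [0.0, 1.0]])
modulated = shift_and_scale(feats, scale, shift)
pooled = second_order_pool(feats)
```

In this sketch the modulated feature map keeps the shape of the input, so the operation could in principle be applied independently at each resolution of a multi-resolution feature pyramid.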
