Learning Saliency-Free Model with Generic Features for Weakly-Supervised Semantic Segmentation
Current weakly-supervised semantic segmentation methods often estimate initial supervision from class activation maps (CAM), which produce sparse, discriminative object seeds and rely on image saliency to provide background cues when only class labels are available. To eliminate the need for extra data to train a saliency detector, we propose to discover class patterns inherent in the lower-layer convolution features, which previous CAM methods have scarcely explored. Specifically, we first project the convolution features into a low-dimensional space and then determine a decision boundary to generate a class-agnostic map for each semantic category present in the image. Because lower-layer features are more generic, they can generate proxy ground truth with more accurate and complete objects. Experiments on the PASCAL VOC 2012 dataset show that the proposed saliency-free method outperforms previous approaches under the same weakly-supervised setting and achieves superior segmentation results: 64.5% mIoU on the validation set and 64.6% on the test set.
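The projection-then-boundary idea can be sketched minimally as follows. This is an illustrative toy, not the paper's exact method: it assumes PCA (via SVD) as the low-dimensional projection and a simple mean-split as the decision boundary, applied to a synthetic feature map.

```python
import numpy as np


def class_agnostic_map(features, n_components=1):
    """Sketch: project C-dim conv features at each spatial location into a
    low-dimensional space, then threshold to split foreground from background.

    `features`: array of shape (H, W, C). PCA via SVD and a mean-based
    decision boundary are illustrative assumptions, not the paper's design.
    """
    h, w, c = features.shape
    x = features.reshape(-1, c).astype(np.float64)
    x -= x.mean(axis=0)                       # center features per channel
    # project onto the top principal component(s) via SVD
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    score = (x @ vt[:n_components].T)[:, 0]   # 1-D projection per pixel
    # resolve the arbitrary sign of the PC: align score with feature magnitude
    if np.corrcoef(score, x.sum(axis=1))[0, 1] < 0:
        score = -score
    # simple decision boundary: split at the mean projection value
    return (score > score.mean()).reshape(h, w)


# toy usage: weak noise plus a strong "object" square in all channels
np.random.seed(0)
feat = np.random.randn(8, 8, 16) * 0.1
feat[2:6, 2:6, :] += 1.0                      # simulated object region
m = class_agnostic_map(feat)
print(m.shape)  # (8, 8)
```

With the strong square signal, the recovered mask separates the object region from the background; in practice the paper operates on real lower-layer convolution features rather than synthetic arrays.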