Boost Supervised Pretraining for Visual Transfer Learning: Implications of Self-Supervised Contrastive Representation Learning
Keywords: Computer Vision (CV)
Abstract
Unsupervised pretraining based on contrastive learning has made significant progress recently, showing transfer learning performance comparable or even superior to traditional supervised pretraining on various tasks. In this work, we first empirically investigate when and why unsupervised pretraining surpasses its supervised counterpart on image classification tasks through a series of controlled experiments. Beyond the commonly used accuracy, we further analyze the results qualitatively with class activation maps and assess the learned representations quantitatively with representation entropy and uniformity. Our core finding is that it is the amount of information effectively perceived by the learning model, rather than the absolute size of the dataset, that is crucial to transfer learning. Based on this finding, we propose Classification Activation Map guided contrastive (CAMtrast) learning, which better utilizes label supervision to strengthen supervised pretraining by making the network perceive more information from the training images. CAMtrast is evaluated on three fundamental visual learning tasks: image recognition, object detection, and semantic segmentation, across various public datasets. Experimental results show that CAMtrast effectively improves the performance of supervised pretraining, and that it outperforms both unsupervised counterparts and a recent related work that similarly attempted to improve supervised pretraining.
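For readers unfamiliar with the uniformity metric mentioned in the abstract, the standard formulation (the log of the mean pairwise Gaussian potential over L2-normalized embeddings, as proposed by Wang & Isola, 2020) can be sketched as follows. This is a minimal NumPy illustration, not the paper's own code; the function name and toy data are assumptions for demonstration:

```python
import numpy as np

def uniformity(x, t=2.0):
    """Uniformity of L2-normalized embeddings x of shape (N, D).

    Computes log of the mean pairwise Gaussian potential exp(-t * ||xi - xj||^2)
    over distinct pairs; lower (more negative) values indicate features spread
    more uniformly over the unit hypersphere.
    """
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    i, j = np.triu_indices(len(x), k=1)  # distinct pairs only
    return np.log(np.mean(np.exp(-t * sq_dists[i, j])))

# Toy usage: random features normalized to the unit sphere.
rng = np.random.default_rng(0)
feats = rng.standard_normal((128, 64))
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
print(uniformity(feats))  # well-spread features give a clearly negative value
```

A fully collapsed representation (all embeddings identical) attains the maximum value 0, so the metric distinguishes degenerate from well-spread feature distributions.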
How to Cite
Sun, J., Wei, D., Ma, K., Wang, L., & Zheng, Y. (2022). Boost Supervised Pretraining for Visual Transfer Learning: Implications of Self-Supervised Contrastive Representation Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 36(2), 2307-2315. https://doi.org/10.1609/aaai.v36i2.20129
AAAI Technical Track on Computer Vision II