Exploiting Unlabeled Data via Partial Label Assignment for Multi-Class Semi-Supervised Learning

Zhen-Ru Zhang; Qian-Wen Zhang; Yunbo Cao; Min-Ling Zhang

doi:10.1609/aaai.v35i12.17310

Authors

Zhen-Ru Zhang: Southeast University; Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education; Tencent Cloud Xiaowei
Qian-Wen Zhang: Tencent Cloud Xiaowei
Yunbo Cao: Tencent Cloud Xiaowei
Min-Ling Zhang: Southeast University; Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education; Collaborative Innovation Center of Wireless Communications Technology

DOI:

https://doi.org/10.1609/aaai.v35i12.17310

Keywords:

Classification and Regression, Semi-Supervised Learning, Multi-class/Multi-label Learning & Extreme Classification

Abstract

In semi-supervised learning, one key strategy for exploiting unlabeled data is to estimate pseudo-labels with the current predictive model; the pseudo-labeled examples are then used to enlarge the labeled data set for model update. Nonetheless, the supervision information conveyed by a pseudo-label is prone to error, especially when the initial predictive model performs poorly because labeled data are limited. In this paper, an intermediate strategy for exploiting unlabeled data is investigated via partial label assignment, i.e., a set of candidate labels, rather than a single pseudo-label, is assigned to each unlabeled example. We only assume that the ground-truth label of an unlabeled example resides in its assigned candidate label set, which is less error-prone than trying to identify the single ground-truth label via pseudo-labeling. Specifically, a multi-class classifier is induced from the partial label examples and their candidate labels to facilitate model induction with labeled examples. An iterative procedure is designed to enable the exchange of labeling information between the classifiers induced from partial label examples and from labeled examples, and their classification outputs are integrated to yield the final prediction. Comparative studies against state-of-the-art approaches show the effectiveness of the proposed unlabeled data exploitation strategy for multi-class semi-supervised learning.
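
To make the contrast between pseudo-labeling and partial label assignment concrete, the following is a minimal sketch and not the authors' procedure. It assumes a current model that outputs class probabilities for each unlabeled example, and it builds a candidate label set by adding classes in order of decreasing probability until a cumulative mass threshold is reached. The function names `pseudo_label` and `assign_candidate_labels`, the `mass` parameter, and the thresholding rule itself are hypothetical choices made for illustration.

```python
def pseudo_label(row):
    """Single pseudo-label: the class with the highest predicted probability."""
    return max(range(len(row)), key=lambda k: row[k])


def assign_candidate_labels(proba, mass=0.9):
    """Return one candidate label set per unlabeled example.

    Classes are added in order of decreasing predicted probability until
    the accumulated probability reaches `mass`. This rule is an assumption
    for illustration only; the abstract does not specify how candidate
    sets are constructed.
    """
    candidates = []
    for row in proba:
        order = sorted(range(len(row)), key=lambda k: row[k], reverse=True)
        chosen, total = set(), 0.0
        for k in order:
            if total >= mass:
                break
            chosen.add(k)
            total += row[k]
        candidates.append(chosen)
    return candidates


# Hypothetical predicted probabilities for two unlabeled examples, three classes.
proba = [
    [0.50, 0.45, 0.05],  # ambiguous between class 0 and class 1
    [0.97, 0.02, 0.01],  # confident in class 0
]

pseudo = [pseudo_label(row) for row in proba]
candidate_sets = assign_candidate_labels(proba, mass=0.9)

print(pseudo)          # [0, 0]
print(candidate_sets)  # [{0, 1}, {0}]
```

If the true label of the first example were class 1, the pseudo-label would be wrong, while the candidate set {0, 1} would still contain the ground truth. That is the sense in which the weaker assumption is less error-prone. A confident prediction, as in the second example, reduces to an ordinary pseudo-label.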
