Improving Semi-Supervised Support Vector Machines Through Unlabeled Instances Selection

Authors

  • Yu-Feng Li, Nanjing University, China
  • Zhi-Hua Zhou, Nanjing University, China

DOI:

https://doi.org/10.1609/aaai.v25i1.7920

Abstract

Semi-supervised support vector machines (S3VMs) are a popular class of approaches that try to improve learning performance by exploiting unlabeled data. Although S3VMs have been found helpful in many situations, they may degrade performance, and the resulting generalization ability may be even worse than that obtained using the labeled data only. In this paper, we try to reduce the chance of performance degeneration of S3VMs. Our basic idea is that, rather than exploiting all unlabeled data, unlabeled instances should be selected so that only those very likely to be helpful are exploited, while highly risky unlabeled instances are avoided. We propose the S3VM-us method, which uses hierarchical clustering to select the unlabeled instances. Experiments on a broad range of data sets over eighty-eight different settings show that the chance of performance degeneration of S3VM-us is much smaller than that of existing S3VMs.
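
The selection idea in the abstract can be illustrated with a small sketch. This is not the S3VM-us algorithm itself, whose selection criterion is not given on this page. It assumes a simplified rule: run single-linkage agglomerative clustering over labeled and unlabeled points together, keep an unlabeled point only if its cluster contains labeled points of exactly one class, and treat every other unlabeled point as risky. The function names, the purity rule, and the toy data are all hypothetical.

```python
import math


def single_linkage_clusters(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = dist(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


def select_unlabeled(labeled, labels, unlabeled, n_clusters=2):
    """Return {unlabeled_index: label} for points in label-pure clusters.

    Assumed rule (not from the paper): a cluster is trusted only if its
    labeled members all share one label.
    """
    points = list(labeled) + list(unlabeled)
    n_lab = len(labeled)
    selected = {}
    for cluster in single_linkage_clusters(points, n_clusters):
        cluster_labels = {labels[i] for i in cluster if i < n_lab}
        if len(cluster_labels) != 1:
            continue  # no labeled point, or conflicting labels: risky
        (label,) = cluster_labels
        for i in cluster:
            if i >= n_lab:
                selected[i - n_lab] = label
    return selected


# Toy data: two labeled points and three unlabeled ones.
labeled = [(0.0, 0.0), (10.0, 10.0)]
labels = [-1, +1]
unlabeled = [(0.5, 0.2), (9.5, 10.1), (5.0, 5.0)]
selected = select_unlabeled(labeled, labels, unlabeled, n_clusters=3)
print(selected)
```

In this toy run the two unlabeled points near a labeled example are selected with that example's label, while the point midway between the classes ends up in a cluster with no labeled member and is left out. A downstream learner could then use only the selected points, falling back to a purely supervised prediction for the rest.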

Published

2011-08-04

How to Cite

Li, Y.-F., & Zhou, Z.-H. (2011). Improving Semi-Supervised Support Vector Machines Through Unlabeled Instances Selection. Proceedings of the AAAI Conference on Artificial Intelligence, 25(1), 386-391. https://doi.org/10.1609/aaai.v25i1.7920

Section

AAAI Technical Track: Machine Learning