A General Formulation for Safely Exploiting Weakly Supervised Data
Keywords: weakly supervised learning
Weakly supervised data is an important resource in machine learning because it can help improve learning performance. However, recent results indicate that learning techniques that use weakly supervised data may sometimes degrade performance. How to safely leverage weakly supervised data has therefore become an important issue, yet only very limited effort has been devoted to it, especially to a general formulation that provides insight into safe weakly supervised learning. In this paper we present a scheme that builds the final prediction by integrating several weakly supervised learners. The resulting formulation has two implications: i) it has safeness guarantees for the commonly used convex loss functions in both regression and classification tasks of weakly supervised learning; ii) it can flexibly embed uncertain prior knowledge about the importance of the base learners. Moreover, the formulation can be solved globally and efficiently as a simple convex quadratic program or linear program. Experiments on multiple weakly supervised learning tasks, such as label noise learning, domain adaptation, and semi-supervised learning, validate the effectiveness of the proposed algorithms.
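To make the idea of integrating several base learners through a convex quadratic program concrete, the following is a minimal sketch under assumptions that the abstract does not state. It assumes a square loss, a baseline prediction `baseline` (for example, from a model trained only on the labeled data), and a final prediction formed as a simplex-weighted combination of the base learners' predictions that lies closest to the baseline. The names `project_to_simplex` and `safe_combine` are hypothetical, and projected gradient descent stands in for a generic QP solver; this is an illustration, not the paper's algorithm.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    u = np.sort(v)[::-1]                      # sort in descending order
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def safe_combine(base_preds, baseline, n_iter=2000):
    """Assumed square-loss formulation (not taken from the abstract):
        min_alpha ||F @ alpha - f0||^2  s.t.  alpha in the simplex,
    solved by projected gradient descent.
    base_preds: (n_samples, n_learners), baseline: (n_samples,)."""
    F = np.asarray(base_preds, dtype=float)
    f0 = np.asarray(baseline, dtype=float)
    b = F.shape[1]
    alpha = np.full(b, 1.0 / b)               # start from uniform weights
    # Lipschitz constant of the gradient: 2 * (spectral norm of F)^2
    lipschitz = max(2.0 * np.linalg.norm(F, 2) ** 2, 1e-12)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        grad = 2.0 * F.T @ (F @ alpha - f0)
        alpha = project_to_simplex(alpha - step * grad)
    return alpha, F @ alpha
```

Under this assumed formulation, the returned prediction is the projection of the baseline onto the convex hull of the base predictions. If the ground truth happens to lie in that hull, the projection is never farther from it than the baseline in squared error, which is one way a safeness property of this kind can arise. A prior on learner importance could be expressed by restricting `alpha` to a smaller convex set than the full simplex.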