Exploratory Machine Learning with Unknown Unknowns

Authors

  • Peng Zhao, Nanjing University
  • Yu-Jie Zhang, Nanjing University
  • Zhi-Hua Zhou, Nanjing University

Keywords:

Classification and Regression, Other Foundations of Machine Learning

Abstract

In conventional supervised learning, a training dataset is given with ground-truth labels from a known label set, and the learned model classifies unseen instances into the known labels. In practice, when a learned model does not work well, users generally attribute the failure to an inadequate choice of learning algorithm or to a lack of labeled training samples. In this paper, we point out an important category of failure that arises because the training data contain unknown classes misperceived as other labels, so their existence cannot be inferred from the given supervision. Such problems of unknown unknown classes can hardly be addressed by the usual re-selection of algorithms or accumulation of training samples. To address them, we propose exploratory machine learning, a paradigm in which a user who encounters unsatisfactory learning performance can examine whether unknown unknowns exist and, if they do, deploy the optimal strategy of feature space augmentation to make the unknown classes observable and learnable. Theoretical analysis and an empirical study on both synthetic and real datasets validate the efficacy of our proposal.
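
The failure mode described above can be illustrated with a toy sketch. This is not the authors' algorithm: the data, the feature names `x1` and `x2`, and the helper `largest_gap` are all hypothetical, chosen only to show how a class hidden under a known label can become observable once a feature is added.

```python
# Hypothetical toy data: each row is (x1, x2, true_class, given_label).
# The hidden class "C" was recorded as "A" in the training labels.
data = [
    (0.0, 0.1, "A", "A"), (0.2, 0.0, "A", "A"), (0.1, 0.2, "A", "A"),
    (0.1, 5.0, "C", "A"), (0.0, 5.2, "C", "A"), (0.2, 4.9, "C", "A"),
    (4.0, 0.1, "B", "B"), (4.1, 0.0, "B", "B"), (3.9, 0.2, "B", "B"),
]


def largest_gap(values):
    """Return (gap, midpoint) for the widest gap between sorted neighbours."""
    s = sorted(values)
    gap, lo, hi = max((b - a, a, b) for a, b in zip(s, s[1:]))
    return gap, (lo + hi) / 2


# Look only at the samples that the supervision calls "A".
labeled_a = [row for row in data if row[3] == "A"]

# With the original feature x1 alone, these samples form one tight group.
gap_x1, _ = largest_gap([row[0] for row in labeled_a])

# After augmenting with the (assumed) feature x2, they split into two groups.
gap_x2, threshold = largest_gap([row[1] for row in labeled_a])

# Samples on the far side of the gap are candidates for an unknown class.
suspected = [row for row in labeled_a if row[1] > threshold]

print(f"widest gap in x1: {gap_x1:.2f}")
print(f"widest gap in x2: {gap_x2:.2f}")
print("true classes of suspected samples:", {row[2] for row in suspected})
```

In this sketch no choice of classifier over `x1` could separate the hidden class from class A, since the two overlap completely in that feature, whereas the augmented feature exposes the split with a simple gap check.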

Published

2021-05-18

How to Cite

Zhao, P., Zhang, Y.-J., & Zhou, Z.-H. (2021). Exploratory Machine Learning with Unknown Unknowns. Proceedings of the AAAI Conference on Artificial Intelligence, 35(12), 10999-11006. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/17313

Section

AAAI Technical Track on Machine Learning V