Exploratory Machine Learning with Unknown Unknowns


  • Peng Zhao, Nanjing University
  • Yu-Jie Zhang, Nanjing University
  • Zhi-Hua Zhou, Nanjing University




Classification and Regression, Other Foundations of Machine Learning


In conventional supervised learning, a training dataset is given with ground-truth labels from a known label set, and the learned model classifies unseen instances into these known labels. In practice, when a learned model performs poorly, users generally attribute the failure to an inadequate choice of learning algorithm or to a lack of labeled training samples. In this paper, we point out an important category of failure: the training data contain unknown classes that have been misperceived as other labels, so their existence cannot be inferred from the given supervision. Such unknown unknown classes can hardly be addressed by the usual remedies of re-selecting algorithms or accumulating more training samples. To this end, we propose exploratory machine learning, a paradigm in which a user who encounters unsatisfactory learning performance can examine whether unknown unknowns are present and, if they are, deploy the optimal strategy of feature space augmentation so that the unknown classes become observable and learnable. Theoretical analysis and an empirical study on both synthetic and real datasets validate the efficacy of our proposal.
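The role of feature space augmentation can be illustrated with a small toy sketch. It is not the authors' algorithm; the data, the candidate feature x2, and the helper names (largest_gap, values_for_label) are all hypothetical and chosen only for illustration. A hidden class "C" is labeled "A" in the supervision and overlaps "A" in the original feature x1, so nothing in the given data hints at it; once the candidate feature x2 is added, the instances labeled "A" split into two well-separated groups.

```python
# Hypothetical toy data: each row is (x1, x2, observed_label, true_label).
# x1 is the original feature; x2 is a candidate feature absent from the
# original training data. Instances of the hidden class "C" carry the
# observed label "A", as in the unknown-unknowns setting.
DATA = [
    (0.1, 0.2, "A", "A"), (0.3, -0.1, "A", "A"),
    (-0.2, 0.0, "A", "A"), (0.0, 0.3, "A", "A"),
    (0.2, 5.1, "A", "C"), (-0.1, 4.8, "A", "C"),
    (0.1, 5.3, "A", "C"), (0.0, 4.9, "A", "C"),
    (4.0, 0.1, "B", "B"), (4.2, -0.2, "B", "B"),
    (3.9, 0.2, "B", "B"), (4.1, 0.0, "B", "B"),
]


def largest_gap(values):
    """Return (gap, midpoint) of the widest gap between consecutive sorted values."""
    s = sorted(values)
    gap, lo, hi = max((b - a, a, b) for a, b in zip(s, s[1:]))
    return gap, (lo + hi) / 2


def values_for_label(data, label, feature_index):
    """Collect one feature's values over all rows with the given observed label."""
    return [row[feature_index] for row in data if row[2] == label]


# Within the instances labeled "A", the original feature shows one tight group ...
gap_original, _ = largest_gap(values_for_label(DATA, "A", 0))
# ... while the candidate feature exposes a wide gap, a crude sign of two groups.
gap_augmented, threshold = largest_gap(values_for_label(DATA, "A", 1))

# Rows labeled "A" that fall above the gap are flagged as a suspected hidden class.
flagged = [row for row in DATA if row[2] == "A" and row[1] > threshold]

print("largest gap within label A on x1:", round(gap_original, 3))
print("largest gap within label A on x2:", round(gap_augmented, 3))
print("true labels of flagged rows:", sorted({row[3] for row in flagged}))
```

The gap heuristic here is only a stand-in for a proper test of whether a label hides more than one class; it is used because it keeps the example short and dependency-free.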




How to Cite

Zhao, P., Zhang, Y.-J., & Zhou, Z.-H. (2021). Exploratory Machine Learning with Unknown Unknowns. Proceedings of the AAAI Conference on Artificial Intelligence, 35(12), 10999-11006. https://doi.org/10.1609/aaai.v35i12.17313



AAAI Technical Track on Machine Learning V