Post-hoc Uncertainty Learning Using a Dirichlet Meta-Model

Authors

  • Maohao Shen MIT
  • Yuheng Bu University of Florida
  • Prasanna Sattigeri IBM Research
  • Soumya Ghosh IBM Research
  • Subhro Das MIT-IBM Watson AI Lab, IBM Research
  • Gregory Wornell MIT

DOI:

https://doi.org/10.1609/aaai.v37i8.26167

Keywords:

ML: Calibration & Uncertainty Quantification

Abstract

Neural networks are known to be over-confident when their output label distribution is used directly to generate uncertainty measures. Existing methods mainly address this issue by retraining the entire model to add uncertainty quantification capability, so that the learned model achieves the desired performance in accuracy and uncertainty prediction simultaneously. However, training the model from scratch is computationally expensive, and there may be a trade-off between prediction accuracy and uncertainty quantification. We therefore consider a more practical post-hoc uncertainty learning setting, in which a well-trained base model is given and we focus on the uncertainty quantification task in a second stage of training. We propose a novel Bayesian uncertainty learning approach using a Dirichlet meta-model, which is effective and computationally efficient. The proposed method requires no additional training data and is flexible enough to quantify different uncertainties and to adapt easily to different application settings, including out-of-domain data detection, misclassification detection, and trustworthy transfer learning. Finally, we demonstrate the flexibility and superior empirical performance of the proposed meta-model approach on these applications over multiple representative image classification benchmarks.
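
To make the phrase "quantify different uncertainties" concrete, the following is a minimal sketch of how several uncertainty measures can be read off the concentration parameters of a Dirichlet distribution over class probabilities. It uses the standard decomposition of total uncertainty into expected data uncertainty and distributional uncertainty (mutual information). The abstract does not state which measures the paper uses, so this choice is an assumption. The function names `digamma` and `dirichlet_uncertainties` and the example concentration vectors are hypothetical and are not taken from the paper.

```python
import math


def digamma(x):
    """Digamma function psi(x) for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    # Shift x upward with psi(x) = psi(x + 1) - 1/x until the series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 / x
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def dirichlet_uncertainties(alpha):
    """Uncertainty measures for a Dirichlet with concentration vector alpha.

    Assumed (standard) decomposition, not confirmed as the paper's choice:
      total          = entropy of the expected class distribution
      data           = expected entropy of the class distribution
      distributional = total - data (mutual information)
    """
    alpha0 = sum(alpha)
    mean = [a / alpha0 for a in alpha]
    total = -sum(p * math.log(p) for p in mean if p > 0.0)
    data = -sum(
        p * (digamma(a + 1.0) - digamma(alpha0 + 1.0))
        for p, a in zip(mean, alpha)
    )
    return {
        "max_prob": max(mean),
        "total": total,
        "data": data,
        "distributional": total - data,
    }


# Hypothetical concentration vectors, chosen only for illustration.
flat = dirichlet_uncertainties([1.0, 1.0, 1.0])        # little evidence
sharp = dirichlet_uncertainties([100.0, 100.0, 100.0])  # strong evidence, ambiguous class
```

In this sketch both vectors give the same expected class distribution, so their total uncertainty is identical, but the flat Dirichlet has higher distributional uncertainty. A score of this kind is one plausible signal for out-of-domain detection, while `max_prob` or `total` is a more natural score for misclassification detection.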

Published

2023-06-26

How to Cite

Shen, M., Bu, Y., Sattigeri, P., Ghosh, S., Das, S., & Wornell, G. (2023). Post-hoc Uncertainty Learning Using a Dirichlet Meta-Model. Proceedings of the AAAI Conference on Artificial Intelligence, 37(8), 9772-9781. https://doi.org/10.1609/aaai.v37i8.26167

Section

AAAI Technical Track on Machine Learning III