Test Time Augmentation Meets Post-hoc Calibration: Uncertainty Quantification under Real-World Conditions

Authors

  • Achim Hekler, German Cancer Research Center (DKFZ), Heidelberg, Germany; Goethe University Frankfurt, Germany
  • Titus J. Brinker, German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Florian Buettner, German Cancer Research Center (DKFZ), Heidelberg, Germany; German Cancer Consortium (DKTK), Germany; Goethe University Frankfurt, Germany

DOI:

https://doi.org/10.1609/aaai.v37i12.26735

Keywords:

General

Abstract

Communicating the predictive uncertainty of deep neural networks transparently and reliably is important in many safety-critical applications such as medicine. However, modern neural networks tend to be poorly calibrated, resulting in wrong predictions made with high confidence. While existing post-hoc calibration methods such as temperature scaling or isotonic regression yield strongly calibrated predictions in artificial experimental settings, their effectiveness can degrade significantly in real-world applications, where scarcity of labeled data or domain drift is common. In this paper, we first investigate the impact of these characteristics on post-hoc calibration and then introduce an easy-to-implement extension of common post-hoc calibration methods based on test time augmentation. In extensive experiments, we demonstrate that our approach results in substantially better calibration across various architectures. We demonstrate the robustness of our proposed approach on a real-world application for skin cancer classification and show that it facilitates safe decision-making under real-world uncertainties.
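
To make the two ingredients named in the abstract concrete, the following is a minimal sketch of temperature scaling fitted on logits that have been averaged over augmented views of each input. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not say how test time augmentation is combined with the calibrator, so averaging logits before fitting is only one plausible reading. All function names (softmax, nll, fit_temperature, tta_logits) and the grid-search range are hypothetical choices made for this example.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def nll(logits, labels, temperature):
    """Mean negative log-likelihood of integer labels under softmax(logits / T)."""
    probs = softmax(logits / temperature)
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(true_class_probs + 1e-12)))


def fit_temperature(logits, labels, grid=None):
    """Choose a single temperature T that minimises NLL on a held-out set.

    Assumption: a plain grid search stands in for the gradient-based
    optimisers usually used for temperature scaling. T = 1.0 is always
    included, so the fitted NLL is never worse than the uncalibrated one
    on the calibration set.
    """
    if grid is None:
        grid = np.concatenate(([1.0], np.linspace(0.05, 10.0, 400)))
    losses = [nll(logits, labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])


def tta_logits(predict_logits, x, augmentations):
    """Average the logits of a model over several augmented views of x.

    Assumption: `predict_logits` maps a batch to an (n, classes) array and
    each entry of `augmentations` is a callable that returns a transformed
    copy of the batch.
    """
    return np.mean([predict_logits(aug(x)) for aug in augmentations], axis=0)
```

In use, one would compute tta_logits on a labeled calibration set, pass the result to fit_temperature, and then apply softmax(tta_logits(...) / T) at prediction time. Isotonic regression or another post-hoc calibrator could take the place of the temperature in the same pipeline.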

Published

2023-06-26

How to Cite

Hekler, A., Brinker, T. J., & Buettner, F. (2023). Test Time Augmentation Meets Post-hoc Calibration: Uncertainty Quantification under Real-World Conditions. Proceedings of the AAAI Conference on Artificial Intelligence, 37(12), 14856-14864. https://doi.org/10.1609/aaai.v37i12.26735

Section

AAAI Special Track on Safe and Robust AI