Confidence Calibration in Large Language Models for Uncertainty Quantification: Affecting Calibration with Conditional Weight Updates
DOI:
https://doi.org/10.1609/aaaiss.v7i1.36937
Abstract
In medical applications of Large Language Models (LLMs), accurate uncertainty quantification is critical, as is control over the model's over- and under-confidence. Current fine-tuning (FT) methods lack this control, partly because they fail to account for the fact that repeated exposure to a fact does not make it more correct. We propose a revised FT method that updates model weights only when the model does not sufficiently "know" an answer. We fine-tuned Meta's Llama-3.2 1B-parameter model on the MMLU multiple-choice dataset, using traditional FT for a Control Model and Conditional Update FT for an Experimental Model. The tuned models showed different results: compared to the Base Model, the Control Model showed greater overconfidence and the Experimental Model greater under-confidence. Additionally, the Experimental Model showed a more even distribution of confidence scores, which is advantageous for post-hoc calibration. This method for affecting confidence calibration while fine-tuning LLMs may help with the broader challenge of creating reliable and trustworthy LLMs.
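The conditional-update idea described in the abstract can be sketched in a few lines. The sketch below assumes a simple probability-threshold test for whether the model "knows" the correct multiple-choice answer; the threshold value, prompt format, learning rate, and optimizer are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of conditional-update fine-tuning: apply a gradient
# step only when the model's confidence in the gold answer is low.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-3.2-1B"  # model family named in the abstract
KNOW_THRESHOLD = 0.5  # assumed confidence cutoff; the paper's criterion may differ

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def conditional_update_step(prompt: str, answer_letter: str) -> None:
    """One training step: update weights only when the model's probability
    on the correct choice letter falls below the threshold."""
    inputs = tokenizer(prompt, return_tensors="pt")
    answer_id = tokenizer(answer_letter, add_special_tokens=False).input_ids[0]

    logits = model(**inputs).logits[0, -1]             # next-token logits
    p_correct = torch.softmax(logits, dim=-1)[answer_id]

    if p_correct.item() >= KNOW_THRESHOLD:
        return  # model already "knows" this answer; skip the weight update

    loss = torch.nn.functional.cross_entropy(
        logits.unsqueeze(0), torch.tensor([answer_id])
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The gate before the update is what distinguishes this from traditional FT, where every training example triggers a gradient step regardless of whether the model already answers it confidently.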
Published
2025-11-23
How to Cite
Somers, S., & Kim, E. (2025). Confidence Calibration in Large Language Models for Uncertainty Quantification: Affecting Calibration with Conditional Weight Updates. Proceedings of the AAAI Symposium Series, 7(1), 590-593. https://doi.org/10.1609/aaaiss.v7i1.36937
Section
Safe, Ethical, Certified, Uncertainty-aware, Robust, and Explainable AI for Health (SECURE-AI4H)