LLM-CAS: Dynamic Neuron Perturbation for Real-Time Hallucination Correction

Authors

  • Jusheng Zhang, Sun Yat-sen University
  • Ningyuan Liu, Sun Yat-sen University
  • Yijia Fan, Sun Yat-sen University
  • Zihao Huang, Sun Yat-sen University
  • Qinglin Zeng, Sun Yat-sen University
  • Kaitong Cai, Sun Yat-sen University
  • Jian Wang, Snap Inc.
  • Keze Wang, Sun Yat-sen University

DOI:

https://doi.org/10.1609/aaai.v40i41.40776

Abstract

Large language models (LLMs) often generate hallucinated content lacking factual or contextual grounding, hindering their reliability in critical applications. Traditional methods like supervised fine-tuning and reinforcement learning from human feedback are data-intensive and computationally expensive, while static parameter editing struggles with context-dependent errors and catastrophic forgetting. To overcome these limitations, we introduce LLM-CAS, a framework that formulates real-time hallucination correction as a hierarchical reinforcement learning (HRL) problem. LLM-CAS trains an agent to learn a sophisticated policy, dynamically selecting optimal, temporary neuron perturbations during inference based on the immediate context. This learned, policy-driven approach provides greater adaptability than prior dynamic methods that rely on heuristic or pre-defined adjustments. As a result, LLM-CAS achieves significant performance gains across various LLMs, improving accuracy by 10.98 percentage points on StoryCloze, 2.71 points on TriviaQA, and 2.06 points on TruthfulQA's MC1 score, thereby outperforming static methods like ITI and CAA, as well as the dynamic SADI framework. This context-aware, efficient approach promises enhanced reliability for LLMs in high-stakes domains, with future potential for multimodal extensions.

Published

2026-03-14

How to Cite

Zhang, J., Liu, N., Fan, Y., Huang, Z., Zeng, Q., Cai, K., Wang, J., & Wang, K. (2026). LLM-CAS: Dynamic Neuron Perturbation for Real-Time Hallucination Correction. Proceedings of the AAAI Conference on Artificial Intelligence, 40(41), 34746-34754. https://doi.org/10.1609/aaai.v40i41.40776

Section

AAAI Technical Track on Natural Language Processing VI