Mitigating Entity Hallucinations in 3D Radiology Report Generation via Dual-Stream Alignment

Authors

  • Lingyu Zhou, Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, China
  • Yue Yu, Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, China
  • Zhang Yi, Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, China
  • Xiuyuan Xu, Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, China

DOI:

https://doi.org/10.1609/aaai.v40i16.38379

Abstract

Entity hallucination poses a major challenge in radiology report generation (RRG), particularly for 3D CT scans, where complex spatial contexts amplify factual errors. Medical entity phrases help address this by serving as key carriers for multi-modal prompting, integrating expert knowledge into the vision-language model. However, current methods use unified cross-attention for volume–phrase alignment and so fail to account for anatomical specificity during alignment. In this work, we introduce the Dual-stream Entity Alignment Reporting network (DEAR), which models organ and lesion entities separately to resolve anatomical bias. Specifically, the dual-stream entity aligner partitions medical entity phrases into organ and lesion streams and feeds them into separate cross-attention blocks in parallel to achieve fine-grained volume–phrase alignment. For structurally regular and spatially stable organ entities, an organ-guided cross-attention (OGCA) block enforces structural consistency by retrieving the top-k voxel tokens via volume–phrase similarity and preserving spatial connectivity through morphological dilation. For structurally irregular and spatially variable lesion entities, a lesion-guided cross-attention (LGCA) block enhances anomaly sensitivity through phrase-weighted attention and refines discriminative boundaries via 3D residual Laplacian filtering. Experiments demonstrate that DEAR significantly reduces entity hallucinations and improves clinical factuality on 3D RRG benchmarks.
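
The abstract names three voxel-level operations without giving their details: top-k voxel retrieval by volume–phrase similarity, morphological dilation, and 3D residual Laplacian filtering. The NumPy sketch below shows one plausible reading of each and is not the authors' implementation. All function names, the use of cosine similarity, the 6-connected structuring element, the edge-replicated padding, and the `alpha` sharpening weight are assumptions made for illustration; the cross-attention blocks themselves are omitted.

```python
import numpy as np


def cosine_similarity_map(tokens, phrase, eps=1e-8):
    """Cosine similarity between each voxel token and one phrase embedding.

    tokens: (D, H, W, C) voxel features; phrase: (C,) phrase embedding.
    Cosine similarity is an assumed choice of volume-phrase similarity.
    """
    dots = tokens @ phrase
    norms = np.linalg.norm(tokens, axis=-1) * np.linalg.norm(phrase)
    return dots / (norms + eps)


def topk_voxel_mask(sim, k):
    """Boolean mask marking the k voxels with the highest similarity."""
    flat = sim.ravel()
    idx = np.argpartition(flat, -k)[-k:]  # indices of the k largest values
    mask = np.zeros(flat.shape, dtype=bool)
    mask[idx] = True
    return mask.reshape(sim.shape)


def dilate6(mask, steps=1):
    """Binary dilation with a 6-connected (face-neighbour) element.

    Each step grows the mask by one voxel along every axis, which tends
    to reconnect selected voxels separated by small gaps.
    """
    out = mask.copy()
    for _ in range(steps):
        p = np.pad(out, 1, constant_values=False)
        out = (
            p[1:-1, 1:-1, 1:-1]
            | p[:-2, 1:-1, 1:-1] | p[2:, 1:-1, 1:-1]
            | p[1:-1, :-2, 1:-1] | p[1:-1, 2:, 1:-1]
            | p[1:-1, 1:-1, :-2] | p[1:-1, 1:-1, 2:]
        )
    return out


def laplacian_residual(feat, alpha=1.0):
    """Return feat - alpha * Laplacian(feat) for a (D, H, W) volume.

    Uses the 6-neighbour discrete Laplacian with edge-replicated padding.
    Subtracting the Laplacian sharpens boundaries; the sign and alpha
    are assumptions, not values taken from the paper.
    """
    p = np.pad(feat, 1, mode="edge")
    lap = (
        p[:-2, 1:-1, 1:-1] + p[2:, 1:-1, 1:-1]
        + p[1:-1, :-2, 1:-1] + p[1:-1, 2:, 1:-1]
        + p[1:-1, 1:-1, :-2] + p[1:-1, 1:-1, 2:]
        - 6.0 * feat
    )
    return feat - alpha * lap
```

Under these assumptions, an organ-stream mask could be formed as `dilate6(topk_voxel_mask(cosine_similarity_map(tokens, phrase), k))`, while `laplacian_residual` would be applied to a lesion-stream feature or attention volume. A constant volume passes through `laplacian_residual` unchanged, since its Laplacian is zero everywhere.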

Published

2026-03-14

How to Cite

Zhou, L., Yu, Y., Yi, Z., & Xu, X. (2026). Mitigating Entity Hallucinations in 3D Radiology Report Generation via Dual-Stream Alignment. Proceedings of the AAAI Conference on Artificial Intelligence, 40(16), 13719–13727. https://doi.org/10.1609/aaai.v40i16.38379

Section

AAAI Technical Track on Computer Vision XIII