PriorRG: Prior-Guided Contrastive Pre-training and Coarse-to-Fine Decoding for Chest X-ray Report Generation

Authors

  • Kang Liu: School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China; Xi'an Key Laboratory of Big Data and Intelligent Vision, Xi'an, Shaanxi 710071, China; Key Laboratory of Collaborative Intelligence Systems, Ministry of Education, Xidian University, Xi'an 710071, China
  • Zhuoqi Ma: School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China; Xi'an Key Laboratory of Big Data and Intelligent Vision, Xi'an, Shaanxi 710071, China; Key Laboratory of Collaborative Intelligence Systems, Ministry of Education, Xidian University, Xi'an 710071, China
  • Zikang Fang: School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
  • Yunan Li: School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China; Xi'an Key Laboratory of Big Data and Intelligent Vision, Xi'an, Shaanxi 710071, China; Key Laboratory of Collaborative Intelligence Systems, Ministry of Education, Xidian University, Xi'an 710071, China
  • Kun Xie: School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China; Xi'an Key Laboratory of Big Data and Intelligent Vision, Xi'an, Shaanxi 710071, China; Key Laboratory of Collaborative Intelligence Systems, Ministry of Education, Xidian University, Xi'an 710071, China
  • Qiguang Miao: School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China; Xi'an Key Laboratory of Big Data and Intelligent Vision, Xi'an, Shaanxi 710071, China; Key Laboratory of Collaborative Intelligence Systems, Ministry of Education, Xidian University, Xi'an 710071, China

DOI:

https://doi.org/10.1609/aaai.v40i9.37657

Abstract

Chest X-ray report generation aims to reduce radiologists' workload by automatically producing high-quality preliminary reports. A critical yet underexplored aspect of this task is the effective use of patient-specific prior knowledge, including clinical context (e.g., symptoms, medical history) and the most recent prior image, which radiologists routinely rely on for diagnostic reasoning. Most existing methods generate reports from single images, neglecting this essential prior information and thus failing to capture diagnostic intent or disease progression. To bridge this gap, we propose PriorRG, a novel chest X-ray report generation framework that emulates real-world clinical workflows via a two-stage training pipeline. In Stage 1, we introduce a prior-guided contrastive pre-training scheme that leverages clinical context to guide spatiotemporal feature extraction, allowing the model to align more closely with the intrinsic spatiotemporal semantics in radiology reports. In Stage 2, we present a prior-aware coarse-to-fine decoding scheme for report generation that progressively integrates patient-specific prior knowledge with the vision encoder's hidden states. This decoding allows the model to align with diagnostic focus and track disease progression, thereby enhancing the clinical accuracy and fluency of the generated reports. Extensive experiments on the MIMIC-CXR and MIMIC-ABN datasets demonstrate that PriorRG outperforms state-of-the-art methods, achieving a 3.6% BLEU-4 and 3.8% F1 score improvement on MIMIC-CXR, and a 5.9% BLEU-1 gain on MIMIC-ABN.

Published

2026-03-14

How to Cite

Liu, K., Ma, Z., Fang, Z., Li, Y., Xie, K., & Miao, Q. (2026). PriorRG: Prior-Guided Contrastive Pre-training and Coarse-to-Fine Decoding for Chest X-ray Report Generation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(9), 7206–7214. https://doi.org/10.1609/aaai.v40i9.37657

Section

AAAI Technical Track on Computer Vision VI