PromptMRG: Diagnosis-Driven Prompts for Medical Report Generation

Authors

  • Haibo Jin, Department of Computer Science and Engineering, Hong Kong University of Science and Technology
  • Haoxuan Che, Department of Computer Science and Engineering, Hong Kong University of Science and Technology
  • Yi Lin, Department of Computer Science and Engineering, Hong Kong University of Science and Technology
  • Hao Chen, Department of Computer Science and Engineering, Hong Kong University of Science and Technology; Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v38i3.28038

Keywords:

CV: Medical and Biological Imaging, CV: Language and Vision

Abstract

Automatic medical report generation (MRG) is of great research value because it has the potential to relieve radiologists of the heavy burden of report writing. Despite recent advances, accurate MRG remains challenging because it requires precise clinical understanding and disease identification. The imbalanced distribution of diseases compounds the challenge: rare diseases are underrepresented in training data, which makes their diagnosis unreliable. To address these challenges, we propose diagnosis-driven prompts for medical report generation (PromptMRG), a novel framework that aims to improve the diagnostic accuracy of MRG with the guidance of diagnosis-aware prompts. Specifically, PromptMRG is based on an encoder-decoder architecture with an extra disease classification branch. When generating reports, the diagnostic results from the classification branch are converted into token prompts that explicitly guide the generation process. To further improve diagnostic accuracy, we design cross-modal feature enhancement, which retrieves similar reports from the database to assist the diagnosis of a query image by leveraging the knowledge of a pre-trained CLIP. Moreover, the disease imbalance issue is addressed by applying an adaptive logit-adjusted loss to the classification branch based on the individual learning status of each disease, which overcomes the text decoder's inability to manipulate disease distributions. Experiments on two MRG benchmarks show the effectiveness of the proposed method, which obtains state-of-the-art clinical efficacy performance on both datasets.
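The step in which diagnostic results are converted into token prompts can be illustrated with a minimal sketch. The abstract does not specify the label set, the token vocabulary, or the decision rule, so the disease names, the `[POS]`/`[NEG]`/`[BOS]` tokens, the 0.5 threshold, and both function names below are assumptions made purely for illustration, not the authors' implementation.

```python
from typing import List, Sequence

# Hypothetical label set and token names; the abstract does not specify them.
DISEASES = ["cardiomegaly", "edema", "pneumonia"]
POS_TOKEN = "[POS]"
NEG_TOKEN = "[NEG]"


def diagnosis_to_prompt(probs: Sequence[float], threshold: float = 0.5) -> List[str]:
    """Map per-disease probabilities to one prompt token per disease.

    The fixed threshold is an assumed decision rule for this sketch.
    """
    if len(probs) != len(DISEASES):
        raise ValueError("expected one probability per disease")
    return [POS_TOKEN if p >= threshold else NEG_TOKEN for p in probs]


def build_decoder_input(prompt: List[str], bos: str = "[BOS]") -> List[str]:
    """Place the prompt tokens before the start token so that a text
    decoder would condition on the diagnosis when generating the report."""
    return prompt + [bos]


if __name__ == "__main__":
    prompt = diagnosis_to_prompt([0.9, 0.1, 0.5])
    print(build_decoder_input(prompt))
```

In this sketch the prompt has a fixed length of one token per disease, so the decoder always sees the diagnosis in the same positions regardless of the image.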

Published

2024-03-24

How to Cite

Jin, H., Che, H., Lin, Y., & Chen, H. (2024). PromptMRG: Diagnosis-Driven Prompts for Medical Report Generation. Proceedings of the AAAI Conference on Artificial Intelligence, 38(3), 2607-2615. https://doi.org/10.1609/aaai.v38i3.28038

Section

AAAI Technical Track on Computer Vision II