Organ-Aware Routing Mixture-of-Retrieval Augmented Generation for Fetal Ultrasound Reporting
DOI:
https://doi.org/10.1609/aaai.v40i10.37795
Abstract
Fetal ultrasound screening is a uniquely complex diagnostic task involving the simultaneous assessment of multiple fetal organs, each with its own anatomical and clinical context, within a single examination. Automating report generation for such cases poses a significant challenge: unlike existing methods that focus on single-organ radiology tasks (e.g., chest X-rays), fetal ultrasound requires reasoning over a structured, multiple-to-multiple setting, i.e., multi-organ images corresponding to a multi-section report. In this paper, we introduce FetusR, the first large-scale dataset for multi-organ fetal ultrasound reporting, containing 15,594 real-world cases with rich organ-wise annotations. To address the intrinsic image-report alignment problem, we propose Organ-Aware Routing Mixture-of-Retrieval Augmented Generation (ORM-RAG), inspired by the Mixture-of-Experts paradigm. Our method decomposes the complex alignment problem into multiple one-to-one sub-retrieval tasks. Specifically, ORM-RAG integrates (1) an organ-aware mixture-of-retrieval module that partitions the retrieval space into organ-specific corpora for independent retrieval, and (2) a dynamic routing mechanism that selectively aggregates high-confidence organ-specific reports while filtering out uncertain ones. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art baselines on both textual similarity and clinical accuracy metrics. Our work opens a new direction for long-form, structured report generation in real-world, multi-organ medical imaging scenarios.
Published
2026-03-14
How to Cite
Pu, B., Wang, S., Li, R., Ding, X., Zhao, L., Chen, C., … Li, K. (2026). Organ-Aware Routing Mixture-of-Retrieval Augmented Generation for Fetal Ultrasound Reporting. Proceedings of the AAAI Conference on Artificial Intelligence, 40(10), 8448–8456. https://doi.org/10.1609/aaai.v40i10.37795
Issue
Vol. 40 No. 10
Section
AAAI Technical Track on Computer Vision VII