A Robust Mutual-Reinforcing Framework for 3D Multi-Modal Medical Image Fusion Based on Visual-Semantic Consistency

Authors

  • Hao Zhang, Wuhan University
  • Xuhui Zuo, Wuhan University
  • Huabing Zhou, Wuhan Institute of Technology
  • Tao Lu, Wuhan Institute of Technology
  • Jiayi Ma, Wuhan University

DOI:

https://doi.org/10.1609/aaai.v38i7.28536

Keywords:

CV: Multi-modal Vision, CV: Medical and Biological Imaging

Abstract

This work proposes a robust 3D medical image fusion framework that establishes a mutual-reinforcing mechanism between visual fusion and lesion segmentation, improving both tasks. Specifically, we explore the consistency between vision and semantics by sharing feature fusion modules. Through the coupled optimization of the visual fusion loss and the lesion segmentation loss, visual-related and semantic-related features are pulled into the same domain, so that each task improves the accuracy of the other. Further, we establish robustness guarantees by constructing a two-level refinement constraint in the process of feature extraction and reconstruction. Because it fully accounts for common degradations in medical images, our framework not only provides clear visual fusion results for doctors to inspect, but also makes lesion segmentation more robust to these degradations. Extensive evaluations in visual fusion and lesion segmentation scenarios demonstrate the advantages of our method in terms of accuracy and robustness. Moreover, the proposed framework is generic: it is compatible with existing lesion segmentation algorithms and can improve their performance. The code is publicly available at https://github.com/HaoZhang1018/RMR-Fusion.
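The coupled optimization mentioned in the abstract can be illustrated with a deliberately small sketch. This is not the authors' network or their loss definitions: the scalar fusion weight `w`, the toy losses, the balancing factor `lam`, and every function name below are assumptions made for illustration. The sketch only shows the general idea that one shared fusion parameter receives gradients from both a visual loss and a segmentation loss.

```python
import math

def fuse(feat_a, feat_b, w):
    """Hypothetical shared fusion module: per-voxel blend of two modalities."""
    return [w * a + (1.0 - w) * b for a, b in zip(feat_a, feat_b)]

def fusion_loss(fused, feat_a, feat_b):
    """Toy visual loss (assumed): stay close to the brighter source voxel."""
    n = len(fused)
    return sum((f - max(a, b)) ** 2 for f, a, b in zip(fused, feat_a, feat_b)) / n

def segmentation_loss(fused, mask, threshold=0.5, sharpness=10.0):
    """Toy semantic loss (assumed): binary cross-entropy of a fixed sigmoid head."""
    eps = 1e-7
    total = 0.0
    for f, m in zip(fused, mask):
        p = 1.0 / (1.0 + math.exp(-sharpness * (f - threshold)))
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= m * math.log(p) + (1 - m) * math.log(1.0 - p)
    return total / len(fused)

def coupled_loss(w, feat_a, feat_b, mask, lam=1.0):
    """Both losses are evaluated on the output of the same fusion module."""
    fused = fuse(feat_a, feat_b, w)
    return fusion_loss(fused, feat_a, feat_b) + lam * segmentation_loss(fused, mask)

def train(feat_a, feat_b, mask, w=0.5, lr=0.05, steps=200, h=1e-4):
    """Gradient descent on the shared weight, using a central-difference gradient."""
    for _ in range(steps):
        grad = (coupled_loss(w + h, feat_a, feat_b, mask)
                - coupled_loss(w - h, feat_a, feat_b, mask)) / (2.0 * h)
        w = min(max(w - lr * grad, 0.0), 1.0)  # keep the blend weight in [0, 1]
    return w

# Flattened toy "volume" of four voxels; the first two belong to the lesion.
feat_a = [0.9, 0.8, 0.1, 0.2]  # modality in which the lesion is bright
feat_b = [0.2, 0.1, 0.3, 0.2]  # modality in which the lesion is faint
mask = [1, 1, 0, 0]

w_init = 0.5
w_final = train(feat_a, feat_b, mask, w=w_init)
print("coupled loss before:", coupled_loss(w_init, feat_a, feat_b, mask))
print("coupled loss after: ", coupled_loss(w_final, feat_a, feat_b, mask))
```

In this toy setting both losses favor the modality that shows the lesion, so the shared weight moves in a direction that lowers the visual and the semantic terms together. A real 3D framework would replace the scalar weight with learned fusion modules and use automatic differentiation.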

Published

2024-03-24

How to Cite

Zhang, H., Zuo, X., Zhou, H., Lu, T., & Ma, J. (2024). A Robust Mutual-Reinforcing Framework for 3D Multi-Modal Medical Image Fusion Based on Visual-Semantic Consistency. Proceedings of the AAAI Conference on Artificial Intelligence, 38(7), 7087-7095. https://doi.org/10.1609/aaai.v38i7.28536

Section

AAAI Technical Track on Computer Vision VI