MAISI-v2: Accelerated 3D High-Resolution Medical Image Synthesis with Rectified Flow and Region-specific Contrastive Loss

Authors

  • Can Zhao, NVIDIA
  • Pengfei Guo, NVIDIA
  • Dong Yang, NVIDIA
  • Yufan He, NVIDIA
  • Yucheng Tang, NVIDIA
  • Benjamin Simon, National Institutes of Health; University of Oxford
  • Mason Belue, University of Arkansas for Medical Sciences
  • Stephanie Harmon, National Institutes of Health
  • Baris Turkbey, National Institutes of Health
  • Daguang Xu, NVIDIA

DOI:

https://doi.org/10.1609/aaai.v40i15.38309

Abstract

Medical image synthesis is an important topic for both clinical and research applications. Recently, diffusion models have become a leading approach in this area. Despite their strengths, many existing methods struggle with (1) limited generalizability, only working for specific body regions or voxel spacings, (2) slow inference, which is a common issue for diffusion models, and (3) weak alignment with input conditions, which is a critical issue for medical imaging. MAISI, a previously proposed framework, addresses generalizability issues but still suffers from slow inference and limited condition consistency. In this work, we present MAISI-v2, the first accelerated 3D medical image synthesis framework that integrates rectified flow to enable fast and high-quality generation. To further enhance condition fidelity, we introduce a novel region-specific contrastive loss to improve sensitivity to the region of interest. Our experiments show that MAISI-v2 can achieve state-of-the-art image quality with 33× acceleration for latent diffusion models. We also conducted a downstream segmentation experiment to show that the synthetic images can be used for data augmentation. We release our code, training details, model weights, and a GUI demo to facilitate reproducibility and promote further development within the community.
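
For readers unfamiliar with rectified flow, the following is a minimal, generic sketch of the idea and not the MAISI-v2 implementation. It assumes the common convention in which noise sits at t = 0 and data at t = 1, uses plain Python lists in place of 3D latent volumes, and replaces the learned network with an idealized velocity field; the function names and the step counts are illustrative assumptions only.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Regression target for a velocity model: constant along the line."""
    return [b - a for a, b in zip(x0, x1)]


def euler_sample(velocity_fn, x0, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy setup: one noise sample and one "data" sample (hypothetical values).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
data = [0.5, -1.0, 2.0, 0.0]

# Idealized stand-in for a trained model: the exact straight-line velocity.
ideal_v = target_velocity(noise, data)

# With a perfectly straight path, one Euler step lands on the data sample,
# and using more steps (30 is an arbitrary choice) gives the same endpoint.
one_step = euler_sample(lambda x, t: ideal_v, noise, num_steps=1)
many_steps = euler_sample(lambda x, t: ideal_v, noise, num_steps=30)
```

In this idealized case the trajectory is exactly straight, so the number of integration steps does not change the endpoint. A trained model only approximates such a field, which is why a real sampler still uses more than one step, but straighter paths are the general reason this family of methods can sample with fewer steps.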

Published

2026-03-14

How to Cite

Zhao, C., Guo, P., Yang, D., He, Y., Tang, Y., Simon, B., … Xu, D. (2026). MAISI-v2: Accelerated 3D High-Resolution Medical Image Synthesis with Rectified Flow and Region-specific Contrastive Loss. Proceedings of the AAAI Conference on Artificial Intelligence, 40(15), 13088–13098. https://doi.org/10.1609/aaai.v40i15.38309

Section

AAAI Technical Track on Computer Vision XII