Medverse: A Universal Model for Full-Resolution 3D Medical Image Segmentation, Transformation and Enhancement

Authors

  • Jiesi Hu, Harbin Institute of Technology (Shenzhen); Pengcheng Laboratory
  • Jianfeng Cao, Harbin Institute of Technology (Shenzhen)
  • Yanwu Yang, University Hospital Tübingen; German Center for Mental Health
  • Chenfei Ye, Harbin Institute of Technology (Shenzhen)
  • Yixuan Zhang, Harbin Institute of Technology (Shenzhen)
  • Hanyang Peng, Pengcheng Laboratory
  • Ting Ma, Harbin Institute of Technology (Shenzhen); Pengcheng Laboratory

DOI:

https://doi.org/10.1609/aaai.v40i6.42490

Abstract

In-context learning (ICL) offers a promising paradigm for universal medical image analysis, enabling models to perform diverse image processing tasks without retraining. However, current ICL models for medical imaging remain limited in two critical aspects: they cannot simultaneously achieve high-fidelity predictions and global anatomical understanding, and there is no unified model trained across diverse medical imaging tasks (e.g., segmentation and enhancement) and anatomical regions. As a result, the full potential of ICL in medical imaging remains underexplored. Thus, we present Medverse, a universal ICL model for 3D medical imaging, trained on 22 datasets covering diverse tasks in universal image segmentation, transformation, and enhancement across multiple organs, imaging modalities, and clinical centers. Medverse employs a next-scale autoregressive in-context learning framework that progressively refines predictions from coarse to fine, generating consistent, full-resolution volumetric outputs and enabling multi-scale anatomical awareness. We further propose a blockwise cross-attention module that facilitates long-range interactions between context and target inputs while preserving computational efficiency through spatial sparsity. Medverse is extensively evaluated on a broad collection of held-out datasets covering previously unseen clinical centers, organs, species, and imaging modalities. Results demonstrate that Medverse substantially outperforms existing ICL baselines and establishes a novel paradigm for in-context learning.

Published

2026-03-14

How to Cite

Hu, J., Cao, J., Yang, Y., Ye, C., Zhang, Y., Peng, H., & Ma, T. (2026). Medverse: A Universal Model for Full-Resolution 3D Medical Image Segmentation, Transformation and Enhancement. Proceedings of the AAAI Conference on Artificial Intelligence, 40(6), 4869–4877. https://doi.org/10.1609/aaai.v40i6.42490

Issue

Vol. 40 No. 6 (2026)

Section

AAAI Technical Track on Computer Vision III