ClassFormer: Exploring Class-Aware Dependency with Transformer for Medical Image Segmentation


  • Huimin Huang Zhejiang University
  • Shiao Xie Zhejiang University
  • Lanfen Lin Zhejiang University
  • Ruofeng Tong Zhejiang University Zhejiang Lab
  • Yen-Wei Chen Ritsumeikan University
  • Hong Wang Tencent Jarvis Lab
  • Yuexiang Li Tencent Jarvis Lab
  • Yawen Huang Tencent Jarvis Lab
  • Yefeng Zheng Tencent Jarvis Lab



CV: Segmentation, CV: Medical and Biological Imaging


Vision Transformers have recently shown impressive performance on medical image segmentation. Despite their strong capability of modeling long-range dependencies, the current methods still give rise to two main concerns from a class-level perspective: (1) intra-class problem: existing methods fail to extract class-specific correspondences among different pixels, which may lead to poor object coverage and/or boundary prediction; (2) inter-class problem: existing methods fail to model explicit category dependencies among various objects, which may result in inaccurate localization. In light of these two issues, we propose a novel transformer, called ClassFormer, powered by two appealing components, i.e., an intra-class dynamic transformer and an inter-class interactive transformer, to fully explore intra-class compactness and inter-class discrepancy. Technically, the intra-class dynamic transformer is first designed to decouple representations of different categories with an adaptive selection mechanism for compact learning, which optimally highlights the informative features to reflect the salient keys/values from multiple scales. We further introduce the inter-class interactive transformer to capture the category dependency among different objects, modeling class tokens as representative class centers to guide global semantic reasoning. As a consequence, feature consistency is ensured through intra-class penalization, while the inter-class constraint strengthens the feature discriminability between different categories. Extensive empirical evidence shows that ClassFormer can be easily plugged into any architecture, and yields improvements over state-of-the-art methods on three public benchmarks.
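The inter-class interactive transformer described above treats learnable class tokens as class centers that attend over pixel features for global semantic reasoning. The following is a minimal illustrative sketch of such class-token cross-attention; all names, shapes, and the single-head formulation are our own assumptions for exposition, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def class_token_cross_attention(pixel_feats, class_tokens):
    """Hypothetical sketch: K class tokens act as queries that attend over
    N pixel embeddings (keys/values), yielding updated per-class centers."""
    d = pixel_feats.shape[-1]
    # (K, N) attention map: how strongly each class center attends to each pixel
    attn = softmax(class_tokens @ pixel_feats.T / np.sqrt(d), axis=-1)
    # (K, d) attention-weighted class centers
    return attn @ pixel_feats

rng = np.random.default_rng(0)
pixels = rng.standard_normal((64, 32))  # N=64 pixel embeddings, d=32 channels
tokens = rng.standard_normal((4, 32))   # K=4 class tokens (one per category)
centers = class_token_cross_attention(pixels, tokens)
print(centers.shape)  # one refined center per class
```

In the full model, such refined centers would be fed back to sharpen per-pixel class assignments; here the sketch only shows the query/key/value roles played by class tokens and pixel features.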




How to Cite

Huang, H., Xie, S., Lin, L., Tong, R., Chen, Y.-W., Wang, H., Li, Y., Huang, Y., & Zheng, Y. (2023). ClassFormer: Exploring Class-Aware Dependency with Transformer for Medical Image Segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 37(1), 917-925.



AAAI Technical Track on Computer Vision I