Boosting Multiple Instance Learning Models for Whole Slide Image Classification: A Model-Agnostic Framework Based on Counterfactual Inference

Authors

  • Weiping Lin, Xiamen University
  • Zhenfeng Zhuang, Xiamen University
  • Lequan Yu, The University of Hong Kong
  • Liansheng Wang, Xiamen University

DOI:

https://doi.org/10.1609/aaai.v38i4.28135

Keywords:

CV: Medical and Biological Imaging, CV: Applications

Abstract

Multiple instance learning (MIL) is an effective paradigm for whole slide image (WSI) classification, where labels are provided only at the bag level. However, instance-level prediction is also crucial, as it offers insight into fine-grained regions of interest. Existing MIL methods either focus solely on training a bag classifier or have limited capability for instance prediction. In this work, we propose a novel model-agnostic framework that boosts existing MIL models, improving WSI classification performance at both the bag and instance levels. Specifically, we propose a counterfactual inference-based sub-bag assessment method and a hierarchical instance searching strategy that help find reliable instances and obtain accurate pseudo labels for them. Furthermore, an instance classifier is trained to produce accurate instance predictions, and the instance embedding it generates is treated as a prompt to refine the instance feature for bag prediction. The framework is model-agnostic and can be adapted to existing MIL models, including those without specific mechanisms such as attention. Extensive experiments on three datasets demonstrate the competitive performance of our method. Code will be available at https://github.com/centurion-crawler/CIMIL.
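
The abstract gives no implementation detail, so the following is only an illustrative sketch of what a counterfactual sub-bag assessment combined with a hierarchical instance search could look like; it is not the authors' method or code. It assumes each instance is reduced to a single scalar feature, uses a hypothetical mean-pooling bag model (`toy_bag_model`) as a stand-in for any trained multiple instance learning model, and defines the counterfactual effect of a sub-bag as the change in the bag prediction when that sub-bag is removed. All function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence

Bag = Sequence[float]  # assumption: one scalar feature per instance


def toy_bag_model(bag: Bag) -> float:
    """Hypothetical stand-in for a trained bag classifier: mean pooling + sigmoid."""
    if not bag:
        return 0.5  # no evidence either way for an empty bag
    mean = sum(bag) / len(bag)
    return 1.0 / (1.0 + math.exp(-mean))


def counterfactual_effect(
    model: Callable[[Bag], float], bag: Bag, sub_bag: Sequence[int]
) -> float:
    """Bag prediction with the sub-bag minus the prediction without it."""
    removed = set(sub_bag)
    remaining = [x for i, x in enumerate(bag) if i not in removed]
    return model(bag) - model(remaining)


def hierarchical_search(
    model: Callable[[Bag], float], bag: Bag, min_size: int = 1
) -> List[int]:
    """Repeatedly halve the candidate set, keeping the half whose removal
    lowers the bag prediction the most, until min_size indices remain."""
    candidates = list(range(len(bag)))
    while len(candidates) > min_size:
        mid = len(candidates) // 2
        left, right = candidates[:mid], candidates[mid:]
        left_effect = counterfactual_effect(model, bag, left)
        right_effect = counterfactual_effect(model, bag, right)
        candidates = left if left_effect >= right_effect else right
    return candidates


if __name__ == "__main__":
    # One clearly positive instance (index 2) among negative ones.
    example_bag = [-2.0, -1.5, 3.0, -1.0, -2.5, -1.8, -2.2, -1.2]
    print(hierarchical_search(toy_bag_model, example_bag))
```

Because the sketch only calls the bag model as a black box, it needs no attention weights, which matches the model-agnostic framing in the abstract. How the actual framework forms sub-bags, scores them, and assigns pseudo labels is not stated here and should be taken from the paper or the linked repository.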

Published

2024-03-24

How to Cite

Lin, W., Zhuang, Z., Yu, L., & Wang, L. (2024). Boosting Multiple Instance Learning Models for Whole Slide Image Classification: A Model-Agnostic Framework Based on Counterfactual Inference. Proceedings of the AAAI Conference on Artificial Intelligence, 38(4), 3477-3485. https://doi.org/10.1609/aaai.v38i4.28135

Section

AAAI Technical Track on Computer Vision III