S5: Scalable Semi-Supervised Semantic Segmentation in Remote Sensing

Authors

  • Liang Lv, National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University
  • Di Wang, National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University; Zhongguancun Academy
  • Jing Zhang, National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University
  • Lefei Zhang, National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University

DOI:

https://doi.org/10.1609/aaai.v40i10.37715

Abstract

Semi-supervised semantic segmentation (S4) has advanced remote sensing (RS) analysis by leveraging unlabeled data through pseudo-labeling and consistency learning. However, existing S4 studies often rely on small-scale datasets and models, limiting their practical applicability. To address this, we propose S5, the first scalable framework for semi-supervised semantic segmentation in RS, which unlocks the potential of vast unlabeled Earth observation data typically underutilized due to costly pixel-level annotations. Built upon existing large-scale RS datasets, S5 introduces a data selection strategy that integrates entropy-based filtering and diversity expansion, resulting in the RS4P-1M dataset. Using this dataset, we systematically scale up S4 into a new pretraining paradigm, S4 pre-training (S4P), to pretrain RS foundation models (RSFMs) of varying sizes on this extensive corpus, significantly boosting their performance on land cover segmentation and object detection tasks. Furthermore, during fine-tuning, we incorporate a Mixture-of-Experts (MoE)-based multi-dataset fine-tuning approach, which enables efficient adaptation to multiple RS benchmarks with fewer parameters. This approach improves the generalization and versatility of RSFMs across diverse RS benchmarks. The resulting RSFMs achieve state-of-the-art performance across all benchmarks, underscoring the viability of scaling semi-supervised learning for RS applications.
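
To make the entropy-based filtering step mentioned in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each unlabeled image already has a softmax probability map of shape (C, H, W) from some segmentation model, scores each image by its mean per-pixel Shannon entropy, and keeps the lowest-entropy fraction. The function names, the `keep_ratio` parameter, and the choice to keep low-entropy (confident) images are all illustrative assumptions; the abstract does not specify the selection rule, and the diversity expansion step is not modeled here.

```python
import numpy as np


def mean_pixel_entropy(probs: np.ndarray) -> float:
    """Mean per-pixel Shannon entropy (in nats) of a (C, H, W) probability map."""
    # Clip to avoid log(0); the class axis is assumed to be axis 0.
    p = np.clip(probs, 1e-12, 1.0)
    per_pixel = -(p * np.log(p)).sum(axis=0)
    return float(per_pixel.mean())


def select_low_entropy(prob_maps, keep_ratio=0.5):
    """Return sorted indices of the images with the lowest mean entropy.

    Hypothetical selection rule: keep the `keep_ratio` fraction of images
    whose predictions are most confident. The paper may use another rule.
    """
    scores = np.array([mean_pixel_entropy(p) for p in prob_maps])
    k = max(1, int(round(keep_ratio * len(scores))))
    return sorted(np.argsort(scores)[:k].tolist())


if __name__ == "__main__":
    # Two toy 3-class maps: one maximally uncertain, one confident.
    uniform = np.full((3, 4, 4), 1.0 / 3.0)
    confident = np.zeros((3, 4, 4))
    confident[0] = 0.98
    confident[1:] = 0.01
    print(select_low_entropy([uniform, confident], keep_ratio=0.5))
```

In this toy run the confident map is the one retained, since a uniform prediction has the maximum possible entropy for three classes.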

Published

2026-03-14

How to Cite

Lv, L., Wang, D., Zhang, J., & Zhang, L. (2026). S5: Scalable Semi-Supervised Semantic Segmentation in Remote Sensing. Proceedings of the AAAI Conference on Artificial Intelligence, 40(10), 7726-7734. https://doi.org/10.1609/aaai.v40i10.37715

Section

AAAI Technical Track on Computer Vision VII