Graph Masked Autoencoder for Multi-view Remote Sensing Data Clustering

Authors

  • Renxiang Guan, National University of Defense Technology
  • Junhong Li, Northwest Polytechnical University
  • Siwei Wang, Intelligent Game and Decision Lab
  • Tianrui Liu, National University of Defense Technology
  • Dayu Hu, Northeastern University
  • Miaomiao Li, Changsha University
  • Xinwang Liu, National University of Defense Technology

DOI

https://doi.org/10.1609/aaai.v40i26.39286

Abstract

Multi-view graph clustering (MVGC) for remote sensing data has gained increasing attention due to its ability to integrate complementary information across modalities while capturing spatial dependencies in heterogeneous data. Although current methods based on graph contrastive learning achieve strong performance, they often misidentify intra-cluster samples as negatives, leading to class conflicts and reduced clustering accuracy. Graph masked autoencoders have recently shown promising potential in learning robust representations through masked reconstruction, but their application to remote sensing data remains underexplored. This challenge is especially notable in the multi-view remote sensing setting, where high heterogeneity and complex spatial structures increase the difficulty of effective representation learning. To address these issues, we propose Clustering-Guided graph Mask AutoEncoder (CG-MAE), the first framework to extend graph masked autoencoders to multi-view remote sensing clustering. We introduce a clustering-guided masking strategy that selectively masks nodes near cluster centers and intra-cluster edges, which are crucial for capturing key structural information. By reconstructing these masked components, the model is encouraged to focus on learning features that are highly relevant to clustering. To further improve training stability and efficiency, we design an easy-to-hard node masking strategy that enables the model to gradually learn from increasingly challenging patterns. Additionally, we propose a dual self-adaptive learning mechanism that encourages the model to align more closely with the underlying semantic distributions. Extensive experiments on four widely used multi-view remote sensing datasets demonstrate that CG-MAE consistently outperforms state-of-the-art methods in both clustering accuracy and representation quality.
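
The clustering-guided masking idea described in the abstract (masking nodes near cluster centers and intra-cluster edges) can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `nearest_center` and `clustering_guided_mask`, the `node_ratio` parameter, the use of Euclidean distance, and the rule of masking every intra-cluster edge are all assumptions made for illustration, and the toy embeddings and centers are made up.

```python
import math

def nearest_center(point, centers):
    """Return the index of the closest center and the Euclidean distance to it."""
    dists = [math.dist(point, c) for c in centers]
    idx = min(range(len(centers)), key=dists.__getitem__)
    return idx, dists[idx]

def clustering_guided_mask(embeddings, edges, centers, node_ratio=0.5):
    """Select nodes closest to their cluster center and all intra-cluster edges.

    Hypothetical helper: the selection rule and `node_ratio` are assumptions,
    not details taken from the paper.
    """
    assigned = [nearest_center(p, centers) for p in embeddings]
    labels = [a[0] for a in assigned]
    # Rank nodes by distance to their assigned center, closest first.
    order = sorted(range(len(embeddings)), key=lambda i: assigned[i][1])
    n_mask = int(len(embeddings) * node_ratio)
    masked_nodes = sorted(order[:n_mask])
    # An edge counts as intra-cluster when both endpoints share a label.
    masked_edges = [(u, v) for u, v in edges if labels[u] == labels[v]]
    return masked_nodes, masked_edges, labels

# Toy data (made up): six 2-D node embeddings, two cluster centers, a path graph.
embeddings = [[0.0, 0.0], [0.1, 0.0], [1.0, 0.0],
              [5.0, 5.0], [5.0, 5.2], [6.0, 5.0]]
centers = [[0.0, 0.0], [5.0, 5.0]]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

masked_nodes, masked_edges, labels = clustering_guided_mask(
    embeddings, edges, centers
)
print(masked_nodes)  # nodes chosen for masking
print(masked_edges)  # edges chosen for masking
```

In a masked autoencoder setup, the selected nodes and edges would be hidden from the encoder and used as reconstruction targets. The easy-to-hard schedule and the dual self-adaptive mechanism mentioned in the abstract are not sketched here, since the abstract does not specify how they are defined.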

Published

2026-03-14

How to Cite

Guan, R., Li, J., Wang, S., Liu, T., Hu, D., Li, M., & Liu, X. (2026). Graph Masked Autoencoder for Multi-view Remote Sensing Data Clustering. Proceedings of the AAAI Conference on Artificial Intelligence, 40(26), 21396-21404. https://doi.org/10.1609/aaai.v40i26.39286

Section

AAAI Technical Track on Machine Learning III