LDS2AE: Local Diffusion Shared-Specific Autoencoder for Multimodal Remote Sensing Image Classification with Arbitrary Missing Modalities
DOI:
https://doi.org/10.1609/aaai.v38i13.29391Keywords:
ML: Classification and Regression, ML: Deep Generative Models & AutoencodersAbstract
Recent research on the joint classification of multimodal remote sensing data has achieved great success. However, due to the limitations imposed by imaging conditions, the case of missing modalities often occurs in practice. Most previous researchers regard the classification in case of different missing modalities as independent tasks. They train a specific classification model for each fixed missing modality by extracting multimodal joint representation, which cannot handle the classification of arbitrary (including multiple and random) missing modalities. In this work, we propose a local diffusion shared-specific autoencoder (LDS2AE), which solves the classification of arbitrary missing modalities with a single model. The LDS2AE captures the data distribution of different modalities to learn multimodal shared feature for classification by designing a novel local diffusion autoencoder which consists of a modality-shared encoder and several modality-specific decoders. The modality-shared encoder is designed to extract multimodal shared feature by employing the same parameters to map multimodal data into a shared subspace. The modality-specific decoders put the multimodal shared feature to reconstruct the image of each modality, which facilitates the shared feature to learn unique information of different modalities. In addition, we incorporate masked training to the diffusion autoencoder to achieve local diffusion, which significantly reduces the training cost of model. The approach is tested on widely-used multimodal remote sensing datasets, demonstrating the effectiveness of the proposed LDS2AE in addressing the classification of arbitrary missing modalities. The code is available at https://github.com/Jiahuiqu/LDS2AE.Downloads
Published
2024-03-24
How to Cite
Qu, J., Yang, Y., Dong, W., & Yang, Y. (2024). LDS2AE: Local Diffusion Shared-Specific Autoencoder for Multimodal Remote Sensing Image Classification with Arbitrary Missing Modalities. Proceedings of the AAAI Conference on Artificial Intelligence, 38(13), 14731-14739. https://doi.org/10.1609/aaai.v38i13.29391
Issue
Section
AAAI Technical Track on Machine Learning IV