Cross-Modal Subspace Clustering via Deep Canonical Correlation Analysis


  • Quanxue Gao, Xidian University
  • Huanhuan Lian, Xidian University
  • Qianqian Wang, Xidian University
  • Gan Sun, Chinese Academy of Sciences



In cross-modal subspace clustering, the key question is how to exploit the correlation information between modalities. However, much of the hierarchical and structural correlation information among cross-modal data is hard to exploit because the data are high-dimensional and non-linear. To tackle this problem, we propose an unsupervised framework named Cross-Modal Subspace Clustering via Deep Canonical Correlation Analysis (CMSC-DCCA), which combines a correlation constraint with a self-expressive layer to make full use of both inter-modal and intra-modal information. More specifically, the proposed model consists of three components: 1) a deep canonical correlation analysis (Deep CCA) model; 2) a self-expressive layer; 3) Deep CCA decoders. The Deep CCA model consists of convolutional encoders and a correlation constraint. The convolutional encoders produce latent representations of the cross-modal data, and the correlation constraint imposed on these latent representations exploits the inter-modal information. The self-expressive layer operates on the latent representations and constrains them to satisfy the self-expression property, so that the shared coefficient matrix can capture the hierarchical intra-modal correlations of each modality. The Deep CCA decoders then reconstruct the data to ensure that the encoded features preserve the structure of the original data. Experimental results on several real-world datasets demonstrate that the proposed method outperforms state-of-the-art methods.
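
The two quantities the abstract relies on, a correlation score between the latent representations of two modalities and a coefficient matrix shared across modalities, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract describes convolutional encoders and decoders trained end to end, whereas the sketch below assumes the latent features are already given and swaps in a ridge-regularized closed-form solution for the self-expressive layer. The function names, the regularization weights `lam` and `reg`, and the choice to drop the zero-diagonal constraint during the solve are all assumptions made for illustration.

```python
import numpy as np


def inv_sqrt(S):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T


def total_canonical_correlation(X, Y, reg=1e-4):
    """Sum of canonical correlations between two views.

    X: (n, dx) and Y: (n, dy) hold one sample per row. `reg` is a small
    ridge term (an assumption of this sketch) that keeps the covariance
    matrices invertible.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    T = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    # The singular values of T are the canonical correlations.
    return np.linalg.svd(T, compute_uv=False).sum()


def shared_self_expression(latents, lam=0.1):
    """One coefficient matrix C shared by all modalities.

    Minimizes sum_v ||Z_v - C Z_v||_F^2 + lam * ||C||_F^2 in closed form,
    giving C = (G + lam * I)^{-1} G with G = sum_v Z_v Z_v^T.
    The zero-diagonal constraint commonly used in self-expressive models
    is NOT enforced here (a simplification of this sketch).
    """
    n = latents[0].shape[0]
    G = sum(Z @ Z.T for Z in latents)
    return np.linalg.solve(G + lam * np.eye(n), G)


def affinity_from_coefficients(C):
    """Symmetric, non-negative affinity with a zeroed diagonal."""
    W = np.abs(C) + np.abs(C.T)
    np.fill_diagonal(W, 0.0)
    return W


# Toy latent features standing in for encoder outputs (hypothetical data).
rng = np.random.default_rng(0)
Z_a = rng.standard_normal((40, 5))          # modality A, 40 samples
Z_b = Z_a @ rng.standard_normal((5, 5))     # modality B, linearly related

rho = total_canonical_correlation(Z_a, Z_b)
C = shared_self_expression([Z_a, Z_b], lam=0.1)
W = affinity_from_coefficients(C)
print("total canonical correlation:", rho)
print("affinity matrix shape:", W.shape)
```

In a pipeline of this kind, an affinity matrix such as `W` would typically be passed to a spectral clustering routine to obtain the final cluster assignments; that step is omitted here.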




How to Cite

Gao, Q., Lian, H., Wang, Q., & Sun, G. (2020). Cross-Modal Subspace Clustering via Deep Canonical Correlation Analysis. Proceedings of the AAAI Conference on Artificial Intelligence, 34(04), 3938-3945.



AAAI Technical Track: Machine Learning