Deep Multi-modal Graph Clustering via Graph Transformer Network
DOI:
https://doi.org/10.1609/aaai.v39i8.32844
Abstract
Current deep multi-modal graph clustering methods rely primarily on Graph Neural Networks (GNNs) to exploit attribute features and graph structures through message propagation and low-dimensional feature embedding. However, these methods do not further explore graph structural information, such as the relationships between nodes and shortest paths. They may also fail to sufficiently mine the complementary information among multi-modal graph data. To address these issues, we propose a novel method, Deep Multi-modal Graph Clustering via Graph Transformer Network (DMGC-GTN). The method thoroughly dissects and uses graph structural information, applies graph smoothing to node features, and incorporates various forms of embeddings into the transformer architecture. This yields a unified embedding of graph structure and multi-modal feature attributes that fully exploits the complementary information within multi-modal graph data. Extensive experiments demonstrate the effectiveness of our algorithm.
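As a rough illustration of two ingredients the abstract names, graph smoothing of node features and shortest-path structure, the sketch below may help. It is not the DMGC-GTN implementation: the abstract does not specify the filter or the encoding, so the normalized-Laplacian low-pass filter, the hop-count distance matrix, and the names `smooth_features`, `hop_distances`, and `steps` are all assumptions made for illustration.

```python
from collections import deque

import numpy as np


def smooth_features(adj, feats, steps=2):
    """Apply an assumed low-pass filter (I - 0.5 * L_sym) to node features `steps` times."""
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                      # add self-loops so every degree is >= 1
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    lap = np.eye(n) - a_norm                     # symmetric normalized Laplacian
    filt = np.eye(n) - 0.5 * lap                 # hypothetical choice of smoothing filter
    out = feats.astype(float)
    for _ in range(steps):
        out = filt @ out
    return out


def hop_distances(adj):
    """Unweighted shortest-path (hop) distances via BFS; -1 marks unreachable pairs."""
    n = adj.shape[0]
    dist = np.full((n, n), -1, dtype=int)
    for src in range(n):
        dist[src, src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in np.flatnonzero(adj[u]):
                if dist[src, v] < 0:
                    dist[src, v] = dist[src, u] + 1
                    queue.append(v)
    return dist


# Toy path graph 0 - 1 - 2 - 3 with two-dimensional node features.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])

smoothed = smooth_features(adj, feats, steps=2)
distances = hop_distances(adj)
```

In a transformer-style model, a matrix like `distances` could be mapped to a learned attention bias and `smoothed` could serve as the input token features; whether DMGC-GTN does exactly this is not stated in the abstract.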
Published
2025-04-11
How to Cite
Wang, Q., Xu, H., Zhang, Z., Feng, W., & Gao, Q. (2025). Deep Multi-modal Graph Clustering via Graph Transformer Network. Proceedings of the AAAI Conference on Artificial Intelligence, 39(8), 7835–7843. https://doi.org/10.1609/aaai.v39i8.32844
Issue
Section
AAAI Technical Track on Computer Vision VII