Constructing Superior Representations Beyond the Original Documents via a Contrastive Gaussian Fusion Network for Clustering

Authors

  • Ao Shen, Guizhou University
  • Ruizhang Huang, Guizhou University
  • Jingjing Xue, Guizhou University
  • Ruina Bai, Guizhou University

DOI:

https://doi.org/10.1609/aaai.v40i39.40572

Abstract

Document clustering plays an important role in text mining and information retrieval. Existing methods focus primarily on document-intrinsic features and overlook dataset-level features, and consequently fail to construct superior representations. We propose a Contrastive Gaussian Fusion Network (CGFN) that constructs superior representations beyond the original documents. Specifically, CGFN fuses the Gaussian distributions of neighbor-derived information and intrinsic textual features in the latent space. Incorporating contrastive learning into the fusion process lets the method learn high-quality representations while mitigating noise and minimizing information loss. Experiments on four real-world datasets show that CGFN outperforms state-of-the-art methods, achieving superior clustering by robustly capturing holistic distributions and neighbor patterns.
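
As a rough illustration of the two ingredients named in the abstract, the sketch below fuses two diagonal Gaussians in a latent space and scores the result with a contrastive loss. It is not the authors' implementation: the abstract does not specify the fusion rule or the loss, so the precision-weighted product of Gaussians, the InfoNCE objective, the temperature value, and every name here (fuse_gaussians, info_nce, mu_text, mu_nbr) are assumptions chosen only to make the idea concrete.

```python
import numpy as np


def fuse_gaussians(mu_a, var_a, mu_b, var_b):
    """Fuse two diagonal Gaussians by a precision-weighted product.

    ASSUMPTION: product-of-Gaussians fusion is a common choice, not
    necessarily the rule used by CGFN.
    """
    prec_a, prec_b = 1.0 / var_a, 1.0 / var_b
    var = 1.0 / (prec_a + prec_b)                # fused variance
    mu = var * (prec_a * mu_a + prec_b * mu_b)   # fused mean
    return mu, var


def info_nce(z1, z2, temperature=0.5):
    """InfoNCE loss: row i of z1 and row i of z2 form the positive pair.

    ASSUMPTION: a generic contrastive objective, used here as a stand-in.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


# Hypothetical latent Gaussians for 8 documents in a 4-dimensional space.
rng = np.random.default_rng(0)
n, d = 8, 4
mu_text, var_text = rng.normal(size=(n, d)), np.full((n, d), 1.0)  # intrinsic text
mu_nbr, var_nbr = rng.normal(size=(n, d)), np.full((n, d), 0.5)    # neighbor-derived

mu_fused, var_fused = fuse_gaussians(mu_text, var_text, mu_nbr, var_nbr)
loss = info_nce(mu_fused, mu_text)
print(mu_fused.shape, var_fused.shape, loss)
```

Under this assumed fusion rule, the fused variance is always smaller than either input variance, and the fused mean leans toward whichever source is more confident. That is one plausible reading of how combining neighbor-derived and intrinsic distributions could reduce noise.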

Published

2026-03-14

How to Cite

Shen, A., Huang, R., Xue, J., & Bai, R. (2026). Constructing Superior Representations Beyond the Original Documents via a Contrastive Gaussian Fusion Network for Clustering. Proceedings of the AAAI Conference on Artificial Intelligence, 40(39), 32911–32919. https://doi.org/10.1609/aaai.v40i39.40572

Section

AAAI Technical Track on Natural Language Processing IV