iCD: An Implicit Clustering Distillation Method for Structural Information Mining (Student Abstract)

Authors

  • Xiang Xue Inner Mongolia University of Technology
  • Yatu Ji Inner Mongolia University of Technology
  • Qing-Dao-Er-Ji Ren Inner Mongolia University of Technology
  • Bao Shi Inner Mongolia University of Technology
  • Min Lu Inner Mongolia University of Technology
  • Nier Wu Inner Mongolia University of Technology
  • Xufei Zhuang Inner Mongolia University of Technology
  • Haiteng Xu Inner Mongolia University of Technology
  • Gan-Qi-Qi-Ge Cha Inner Mongolia Autonomous Region Water Conservancy Development Center

DOI:

https://doi.org/10.1609/aaai.v40i48.42297

Abstract

Logit knowledge distillation has attracted substantial research interest in recent years because it is simple and does not require intermediate feature alignment; however, its decision-making process offers limited interpretability. To address this, we propose implicit Clustering Distillation (iCD), a simple and effective method that mines and transfers interpretable structural knowledge from logits without requiring ground-truth labels or feature-space alignment. iCD leverages Gram matrices over decoupled local logit representations to enable student models to learn latent semantic structural patterns. Extensive experiments on benchmark datasets demonstrate the effectiveness of iCD across diverse teacher-student architectures, with particularly strong performance in fine-grained classification tasks, where it achieves a peak improvement of +5.08% over the baseline.
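
To make the idea of Gram matrices over local logit representations concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how logits are decoupled, how they are normalized, or which loss is used. The sketch assumes that "local" means contiguous chunks along the class axis, that each sample's chunk is L2-normalized, and that student and teacher Gram matrices are matched with a mean squared error. The names `local_gram_matrices`, `gram_structure_loss`, and `num_chunks` are hypothetical.

```python
import numpy as np


def local_gram_matrices(logits, num_chunks):
    """Return one sample-by-sample Gram matrix per chunk of the class axis.

    logits: array of shape (batch, classes).
    Assumption: "decoupled local" logits are contiguous class-axis chunks.
    """
    chunks = np.array_split(logits, num_chunks, axis=1)
    grams = []
    for chunk in chunks:
        # L2-normalize each sample's chunk so entries are cosine similarities.
        norms = np.linalg.norm(chunk, axis=1, keepdims=True)
        unit = chunk / np.maximum(norms, 1e-12)
        grams.append(unit @ unit.T)  # shape (batch, batch)
    return np.stack(grams)  # shape (num_chunks, batch, batch)


def gram_structure_loss(student_logits, teacher_logits, num_chunks=2):
    """Mean squared difference between student and teacher Gram matrices.

    Assumption: MSE is used here for illustration only. No labels are needed,
    and the two models only have to agree on the number of classes.
    """
    gs = local_gram_matrices(student_logits, num_chunks)
    gt = local_gram_matrices(teacher_logits, num_chunks)
    return float(np.mean((gs - gt) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = rng.normal(size=(8, 10))  # 8 samples, 10 classes
    student = rng.normal(size=(8, 10))
    print(gram_structure_loss(student, teacher, num_chunks=2))
```

Because each Gram matrix records pairwise similarities between samples in a batch, matching it pushes the student to group samples the way the teacher does, which is one plausible reading of "implicit clustering". Under the normalization assumed here, the loss ignores the overall scale of the logits.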

Published

2026-03-14

How to Cite

Xue, X., Ji, Y., Ren, Q.-D.-E.-J., Shi, B., Lu, M., Wu, N., … Cha, G.-Q.-Q.-G. (2026). iCD: An Implicit Clustering Distillation Method for Structural Information Mining (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 40(48), 41436–41438. https://doi.org/10.1609/aaai.v40i48.42297