MLC-NC: Long-Tailed Multi-Label Image Classification Through the Lens of Neural Collapse

Authors

  • Zijian Tao Nanjing University of Aeronautics and Astronautics MIIT Key Laboratory of Pattern Analysis and Machine Intelligence
  • Shao-Yuan Li Nanjing University of Aeronautics and Astronautics MIIT Key Laboratory of Pattern Analysis and Machine Intelligence State Key Laboratory for Novel Software Technology, Nanjing University
  • Wenhai Wan Huazhong University of Science and Technology
  • Jinpeng Zheng Nanjing University of Aeronautics and Astronautics MIIT Key Laboratory of Pattern Analysis and Machine Intelligence
  • Jia-Yao Chen Nanjing University of Aeronautics and Astronautics MIIT Key Laboratory of Pattern Analysis and Machine Intelligence
  • Yuchen Li HoHai University
  • Sheng-Jun Huang Nanjing University of Aeronautics and Astronautics MIIT Key Laboratory of Pattern Analysis and Machine Intelligence
  • Songcan Chen Nanjing University of Aeronautics and Astronautics MIIT Key Laboratory of Pattern Analysis and Machine Intelligence

DOI:

https://doi.org/10.1609/aaai.v39i19.34298

Abstract

Long-tailed (LT) data distribution is common in multi-label image classification (MLC) and can significantly impact the performance of classification models. One reason is the challenge of learning unbiased instance representations (i.e. features) for imbalanced datasets. Additionally, the co-occurrence of head/tail classes within the same instance, along with complex label dependencies, introduces further challenges. In this work, we delve into this problem through the lens of neural collapse (NC). NC refers to a phenomenon where the last-layer features and classifier of a deep neural network model exhibit a simplex Equiangular Tight Frame (ETF) structure during its terminal training phase. This structure creates an optimal linearly separable state. However, this phenomenon typically occurs in balanced datasets but rarely applies to the typical imbalanced problem. To induce NC properties under Long-tailed multi-label classification (LT-MLC) conditions, we propose an approach named MLC-NC, which aims to learn high-quality data representations and improve the model’s generalization ability. Specifically, MLC-NC accounts for the fact that different labels correspond to different feature parts located in images. MLC-NC extracts class-wise features from each instance through a cross-attention mechanism. To guide the features toward the ETF structure, we introduce visual-semantic feature alignment with a fixed ETF structured label embedding, which helps to learn evenly distributed class centers. To reduce within-class feature variation, we introduce collapse calibration within a lower-dimensional feature space. To mitigate classification bias, we concatenate features and feed them into a binarized fixed ETF classifier. As an orthogonal approach to existing methods, MLC-NC can be seamlessly integrated into various frameworks. Extensive experiments on widely-used benchmarks demonstrate the effectiveness of our method.

Downloads

Published

2025-04-11

How to Cite

Tao, Z., Li, S.-Y., Wan, W., Zheng, J., Chen, J.-Y., Li, Y., … Chen, S. (2025). MLC-NC: Long-Tailed Multi-Label Image Classification Through the Lens of Neural Collapse. Proceedings of the AAAI Conference on Artificial Intelligence, 39(19), 20850–20858. https://doi.org/10.1609/aaai.v39i19.34298

Issue

Section

AAAI Technical Track on Machine Learning V