Compositional Attribute Imbalance in Vision Datasets

Authors

  • Yanbiao Ma, Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; Beijing Key Laboratory of Research on Large Models and Intelligent Governance; Engineering Research Center of Next-Generation Intelligent Search and Recommendation, MOE
  • Jiayi Chen, Xidian University
  • Wei Dai, Xidian University
  • Dong Zhao, Xidian University
  • Zeyu Zhang, The Australian National University
  • Yuting Yang, Xidian University
  • Bowei Liu, Tsinghua University
  • Jiaxuan Zhao, Xidian University
  • Andi Zhang, University of Manchester

DOI:

https://doi.org/10.1609/aaai.v40i10.37727

Abstract

Visual attribute imbalance is a common yet underexplored issue in image classification, significantly impacting model performance and generalization. In this work, we first define the first-level and second-level attributes of images and then introduce a CLIP-based framework to construct a visual attribute dictionary, enabling automatic evaluation of image attributes. By systematically analyzing both single-attribute imbalance and compositional attribute imbalance, we reveal how the rarity of attributes affects model performance. To tackle these challenges, we propose adjusting the sampling probability of samples based on the rarity of their compositional attributes. This strategy is further integrated with various data augmentation techniques (such as CutMix, FMix, and SaliencyMix) to enhance the model's ability to represent rare attributes. Extensive experiments on benchmark datasets demonstrate that our method effectively mitigates attribute imbalance, thereby improving the robustness and fairness of deep neural networks. Our research highlights the importance of modeling visual attribute distributions and provides a scalable solution for long-tail image classification tasks.
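
The rarity-based sampling idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual rarity measure or sampling rule, so everything here is an assumption for illustration: the function name `rarity_sampling_probs`, the inverse-frequency weighting, the `power` smoothing exponent, and the toy attribute tuples are all hypothetical.

```python
from collections import Counter
import random


def rarity_sampling_probs(attr_combos, power=1.0):
    """Return one sampling probability per sample.

    attr_combos: list of hashable attribute combinations, one per sample
                 (e.g. a tuple of attribute labels assigned to an image).
    power:       hypothetical smoothing exponent; 0 gives uniform sampling,
                 1 gives plain inverse-frequency weighting.
    NOTE: inverse frequency is an assumed stand-in for "rarity", not the
    formula used in the paper.
    """
    counts = Counter(attr_combos)
    # Samples whose attribute combination is rare get a larger weight.
    weights = [(1.0 / counts[c]) ** power for c in attr_combos]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical toy data: three samples share a common combination,
# one sample has a rare combination.
combos = [
    ("red", "round"),
    ("red", "round"),
    ("red", "round"),
    ("blue", "square"),
]
probs = rarity_sampling_probs(combos)

# Draw a batch of sample indices according to the rarity-aware probabilities.
rng = random.Random(0)
batch = rng.choices(range(len(combos)), weights=probs, k=8)
```

In this toy case the single sample with the rare combination receives the same total probability mass as the three common samples combined. In a training pipeline, per-sample weights of this kind could be passed to a weighted sampler (for example `torch.utils.data.WeightedRandomSampler` in PyTorch) before a mixing augmentation is applied to the drawn batch; how the paper combines the two is not specified in the abstract.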

Published

2026-03-14

How to Cite

Ma, Y., Chen, J., Dai, W., Zhao, D., Zhang, Z., Yang, Y., Liu, B., Zhao, J., & Zhang, A. (2026). Compositional Attribute Imbalance in Vision Datasets. Proceedings of the AAAI Conference on Artificial Intelligence, 40(10), 7836-7846. https://doi.org/10.1609/aaai.v40i10.37727

Section

AAAI Technical Track on Computer Vision VII