T-distributed Spherical Feature Representation for Imbalanced Classification

Authors

  • Xiaoyu Yang College of Electronics and Information Engineering, Tongji University, Shanghai, China
  • Yufei Chen College of Electronics and Information Engineering, Tongji University, Shanghai, China
  • Xiaodong Yue School of Computer Engineering and Science, Shanghai University, Shanghai, China; Artificial Intelligence Institute of Shanghai University, Shanghai, China; VLN Lab, NAVI MedTech Co., Ltd., Shanghai, China
  • Shaoxun Xu College of Electronics and Information Engineering, Tongji University, Shanghai, China
  • Chao Ma Department of Radiology, Changhai Hospital of Shanghai, Shanghai, China

DOI:

https://doi.org/10.1609/aaai.v37i9.26284

Keywords:

ML: Classification and Regression, CV: Medical and Biological Imaging, ML: Deep Neural Network Algorithms

Abstract

Real-world classification tasks are often extremely imbalanced. Extreme imbalance introduces a strong bias: the classifier's decision boundary is dominated by the categories with abundant samples, known as the head categories. Current methods alleviate the impact of imbalance in three main ways: class re-balancing, decoupling, and domain adaptation. However, the existing winner-take-all criterion still leads to a crowding problem in the eigenspace. The head categories, with many samples, yield more accurate features but occupy most of the eigenspace. The tail categories share the remaining narrow region and are too crowded together for their features to be extracted accurately. To address these issues, we propose a novel T-distributed spherical metric that equalizes the eigenspace in imbalanced classification, with the following innovations: 1) We design the T-distributed spherical metric, which is characterized by high kurtosis. Instead of following the winner-take-all strategy, the metric produces a high logit only when the extracted feature is close enough to the category center, without a strong bias against other categories. 2) We integrate the T-distributed spherical metric into the classifier, which equalizes the eigenspace and alleviates the crowding issue in imbalanced problems. The eigenspace equalized by the T-distributed spherical classifier improves the accuracy of the tail categories while maintaining the accuracy of the head categories, and significantly promotes the intraclass compactness and interclass separability of features. Extensive experiments on large-scale imbalanced datasets verify our method, which shows superior results on long-tailed CIFAR-100/-10 with imbalance ratios IR = 100/50. Our method also achieves excellent results on the large-scale ImageNet-LT and iNaturalist datasets with various backbones.
In addition, we provide a case study on the real clinical classification of pancreatic tumor subtypes with 6 categories. The largest category, PDAC, has 315 cases, while the smallest, CP, has only 8. With 4-fold cross-validation, our method achieves a top-1 accuracy of 69.04%.
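
The abstract does not give the formula for the T-distributed spherical metric, so the following is only a minimal sketch of the general idea under stated assumptions: features and category centers are L2-normalized onto the unit sphere, and a Student-t style kernel (as used in t-SNE) is applied to their squared distance. The function names and the parameters nu and scale are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Project a vector onto the unit sphere."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity(feature, center):
    """Cosine similarity between a feature and a category center."""
    f, c = l2_normalize(feature), l2_normalize(center)
    return sum(a * b for a, b in zip(f, c))


def t_spherical_logit(feature, center, nu=1.0, scale=0.5):
    """Heavy-tailed similarity on the unit sphere (illustrative only).

    ASSUMPTION: a Student-t kernel over the squared Euclidean distance
    between unit vectors; nu (degrees of freedom) and scale are
    hypothetical parameters, not values from the paper.
    """
    cos = cosine_similarity(feature, center)
    # For unit vectors, ||f - c||^2 = 2 - 2 * cos(f, c).
    d2 = max(0.0, 2.0 - 2.0 * cos)
    return (1.0 + d2 / (nu * scale ** 2)) ** (-(nu + 1.0) / 2.0)


if __name__ == "__main__":
    center = [1.0, 0.0]
    for angle in (0.0, 0.1, 0.5, 1.0):
        feature = [math.cos(angle), math.sin(angle)]
        print(angle, cosine_similarity(feature, center),
              t_spherical_logit(feature, center))
```

In this sketch the logit equals 1 only when the feature coincides with the center direction and decays quickly as the feature moves away, which mirrors the abstract's description of a high logit only near the category center. How the actual method defines and trains the metric should be taken from the paper itself.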

Published

2023-06-26

How to Cite

Yang, X., Chen, Y., Yue, X., Xu, S., & Ma, C. (2023). T-distributed Spherical Feature Representation for Imbalanced Classification. Proceedings of the AAAI Conference on Artificial Intelligence, 37(9), 10825-10833. https://doi.org/10.1609/aaai.v37i9.26284

Section

AAAI Technical Track on Machine Learning IV