scCluBench: Comprehensive Benchmarking of Clustering Algorithms for Single-Cell RNA Sequencing

Authors

  • Ping Xu (Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China)
  • Zaitian Wang (Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China)
  • Zhirui Wang (Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China; University of Chinese Academy of Sciences, Beijing, China)
  • Pengjiang Li (Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China)
  • Jiajia Wang (Computer Network Information Center, Chinese Academy of Sciences, Beijing, China)
  • Ran Zhang (Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China)
  • Pengfei Wang (Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China; University of Chinese Academy of Sciences, Beijing, China)
  • Yuanchun Zhou (Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China; University of Chinese Academy of Sciences, Beijing, China)

DOI:

https://doi.org/10.1609/aaai.v40i2.37110

Abstract

Cell clustering is crucial for uncovering cellular heterogeneity in single-cell RNA sequencing (scRNA-seq) data by identifying cell types and marker genes. Despite its importance, existing benchmarks for scRNA-seq clustering remain fragmented, lacking standardized protocols and often omitting recent advances in artificial intelligence. To fill these gaps, we present scCluBench, a comprehensive benchmark of clustering algorithms for scRNA-seq data. scCluBench provides 36 scRNA-seq datasets collected from diverse public sources and covering multiple tissues, all uniformly processed to ensure consistency for systematic evaluation and downstream analyses. To assess performance, we collect and reproduce a range of scRNA-seq clustering methods, including traditional, deep learning-based, and graph-based methods as well as biological foundation models. We evaluate each method both quantitatively and qualitatively, using core performance metrics and visualization analyses. Furthermore, we construct representative downstream biological tasks, such as marker gene identification and cell type annotation, to assess the practical utility of each method. scCluBench then investigates the performance differences and applicability boundaries of the clustering models across these analytical tasks, systematically assessing their robustness and scalability in real-world scenarios. Overall, scCluBench offers a user-friendly benchmark for scRNA-seq clustering, with standardized datasets, unified evaluation protocols, and transparent analyses, facilitating informed method selection and providing insight into model generalizability and application scope.

Published

2026-03-14

How to Cite

Xu, P., Wang, Z., Wang, Z., Li, P., Wang, J., Zhang, R., … Zhou, Y. (2026). scCluBench: Comprehensive Benchmarking of Clustering Algorithms for Single-Cell RNA Sequencing. Proceedings of the AAAI Conference on Artificial Intelligence, 40(2), 1364–1372. https://doi.org/10.1609/aaai.v40i2.37110

Section

AAAI Technical Track on Application Domains II