From Bias to Breakdown: Benchmarking Failure Mode Analysis
of Single-cell RNA Sequencing Foundation Models in Acute
Myeloid Leukemia

Amirreza Naziri; Arash Asgari; Aijun An; Eleftherios Sachlos; Laleh Seyyed-Kalantari

doi:10.1609/aaaiss.v7i1.36931

Authors

Amirreza Naziri (York University; Vector Institute; Connected Minds)
Arash Asgari (York University; Vector Institute)
Aijun An (York University; Connected Minds)
Eleftherios Sachlos (York University; Connected Minds)
Laleh Seyyed-Kalantari (York University; Vector Institute; Connected Minds)


Abstract

Foundation models (FMs) trained on large-scale single-cell RNA-seq (scRNA-seq) data have shown strong performance across a range of biological tasks. This performance is typically reported in aggregate, over a large set of test benchmarks and across all samples. However, the pretraining data of these models are often highly imbalanced across disease types, patient conditions, and demographics. For instance, disease samples are rarer and harder to collect, so pretraining sets contain many more healthy cells. Such imbalances can hurt performance on underrepresented disease cases and undermine the equity of model outcomes. To evaluate this hypothesis, we benchmark off-the-shelf scRNA-seq foundation models on cell-type classification in acute myeloid leukemia (AML), a rare but clinically important disease that is representative of low-prevalence settings. In addition to overall performance, we conduct a subgroup analysis of outcomes across cell types and disease conditions (clinical timepoints). Our results suggest that, despite high overall F1 scores in cell-type classification, performance drops in disease conditions and varies across cell types. These findings highlight a limitation of current scRNA-seq foundation models and motivate more balanced pretraining, along with failure mode analysis rather than reporting overall performance alone.
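
The contrast between an overall score and a subgroup breakdown can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes macro-averaged F1 (the abstract does not say which F1 variant is used), and the function names, cell-type labels, and the subgroup names "healthy" and "diagnosis" are hypothetical. The data are a toy example constructed for illustration, not results from the paper.

```python
from collections import defaultdict


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in y_true or y_pred.

    Assumption: macro averaging; the source only says "F1 scores".
    """
    labels = set(y_true) | set(y_pred)
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


def f1_by_subgroup(y_true, y_pred, groups):
    """Compute macro F1 separately for each subgroup (e.g. a clinical timepoint)."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: macro_f1(ts, ps) for g, (ts, ps) in buckets.items()}


# Toy labels only: "T" and "B" stand in for two cell types.
y_true = ["T", "T", "B", "B", "T", "T", "B", "B"]
y_pred = ["T", "T", "B", "B", "T", "B", "T", "B"]
groups = ["healthy"] * 4 + ["diagnosis"] * 4  # hypothetical subgroup names

overall = macro_f1(y_true, y_pred)
per_group = f1_by_subgroup(y_true, y_pred, groups)
print(f"overall: {overall:.2f}")
for name, score in per_group.items():
    print(f"{name}: {score:.2f}")
```

In this toy case every error falls in one subgroup, so the aggregate score sits between the two subgroup scores and hides the gap. The same grouping can be applied per cell type by passing the true cell-type label as the group key.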
