scMBERT: A Pre-Trained Deep Learning Model for Single-Cell Multiomic Data Representation and Prediction (Student Abstract)

Authors

  • Xiaojian Chen: Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health; Department of Biomedical Engineering, Johns Hopkins University
  • Kuai Yu: Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health; Department of Biomedical Engineering, Johns Hopkins University
  • Min-Zhi Jiang: Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
  • Cihan Xiao: Center for Language and Speech Processing, Johns Hopkins University
  • Ziqi Fu: Department of Biostatistics, Harvard T.H. Chan School of Public Health
  • Weiqiang Zhou: Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health

DOI:

https://doi.org/10.1609/aaai.v39i28.35242

Abstract

Recent advances in single-cell sequencing technologies enable the measurement of multiple modalities in individual cells, offering insights into the transcriptome and regulome of various biological systems and human diseases at unprecedented resolution. However, effectively using these ultra-high-dimensional, large-scale multiomic data to understand gene regulation remains challenging. Inspired by the success of adapting large language models to genomics, we develop scMBERT, a BERT-based deep learning model pre-trained on single-cell multiomic data. We show that scMBERT increases model flexibility and performance in downstream tasks such as cell type annotation and batch-effect correction, demonstrating the potential of leveraging multiomic data to improve single-cell genomic data analyses.

Published

2025-04-11

How to Cite

Chen, X., Yu, K., Jiang, M.-Z., Xiao, C., Fu, Z., & Zhou, W. (2025). scMBERT: A Pre-Trained Deep Learning Model for Single-Cell Multiomic Data Representation and Prediction (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 39(28), 29334-29336. https://doi.org/10.1609/aaai.v39i28.35242