CryoDomain: Sequence-free Protein Domain Identification from Low-resolution Cryo-EM Density Maps

Authors

  • Muzhi Dai, Tsinghua University
  • Zhuoer Dong, Tsinghua University
  • Weining Fu, Tsinghua University
  • Kui Xu, Tsinghua University
  • Qiangfeng Cliff Zhang, Tsinghua University

DOI:

https://doi.org/10.1609/aaai.v39i1.31987

Abstract

Cryo-electron microscopy (cryo-EM) has revolutionized the field of structural biology, determining structures of large protein machines and sharpening the understanding of fundamental biological processes. Despite cryo-EM’s unique capacity to discover novel proteins from unpurified samples and reveal the intricate structures of protein complexes within native cellular environments, the advancement of protein identification methods for cryo-EM lags behind. Without prior knowledge, such as sequence, protein identification from low-resolution density maps remains challenging. Here we introduce CryoDomain, a method for identifying protein domains (the conserved constituent units of proteins) from low-resolution cryo-EM density maps without requiring prior knowledge of protein sequences. CryoDomain leverages cross-modal alignment to correlate cryo-EM density maps with atomic structures, transferring the knowledge learned on a large atomic structure dataset to a sparse density map dataset. On two protein domain benchmarks constructed from CATH and SCOPe, CryoDomain significantly outperforms state-of-the-art methods for domain identification from low-resolution density maps. CryoDomain liberates structural biologists from the tedious tasks of density inspection and database searching during protein identification. It has the potential to extend the boundaries of unbiased structure discovery and cellular landscape investigation using cryo-EM.
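
The abstract names cross-modal alignment but gives no loss function or architecture. As a rough illustration of the general idea only, the sketch below assumes a symmetric contrastive (InfoNCE-style) objective over paired embeddings, which is one common way to align two modalities and is not confirmed to be the authors' method. All function names, the temperature value, and the toy embeddings and labels are hypothetical.

```python
import math

def l2_normalize(v):
    # Scale a non-zero vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_matrix(a, b):
    # Pairwise cosine similarity between every row of a and every row of b.
    a = [l2_normalize(v) for v in a]
    b = [l2_normalize(v) for v in b]
    return [[sum(x * y for x, y in zip(u, w)) for w in b] for u in a]

def cross_entropy_diagonal(logits):
    # Mean cross-entropy where the correct class for row i is column i.
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(map_emb, struct_emb, temperature=0.1):
    # Assumed objective: row i of map_emb and row i of struct_emb describe the
    # same domain; matched pairs are pulled together, mismatched pairs apart.
    sim = cosine_matrix(map_emb, struct_emb)
    logits = [[s / temperature for s in row] for row in sim]
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))

def nearest_domain(query_emb, struct_emb, labels):
    # Identification by retrieval: return the label of the most similar
    # atomic-structure embedding for a density-map embedding.
    sims = cosine_matrix([query_emb], struct_emb)[0]
    best = max(range(len(sims)), key=sims.__getitem__)
    return labels[best]

# Toy 2-D embeddings standing in for encoder outputs (hypothetical values).
structure_embeddings = [[1.0, 0.0], [0.0, 1.0]]
density_embeddings = [[0.9, 0.1], [0.2, 0.8]]
loss = symmetric_contrastive_loss(density_embeddings, structure_embeddings)
label = nearest_domain(density_embeddings[0], structure_embeddings, ["domain_A", "domain_B"])
```

Under this assumption, an encoder trained on a large atomic-structure dataset supplies the reference embeddings, and a density map is identified by looking up its nearest structure embedding in the shared space.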

Published

2025-04-11

How to Cite

Dai, M., Dong, Z., Fu, W., Xu, K., & Zhang, Q. C. (2025). CryoDomain: Sequence-free Protein Domain Identification from Low-resolution Cryo-EM Density Maps. Proceedings of the AAAI Conference on Artificial Intelligence, 39(1), 119-127. https://doi.org/10.1609/aaai.v39i1.31987

Section

AAAI Technical Track on Application Domains