Improving PTM Site Prediction by Coupling of Multi-Granularity Structure and Multi-Scale Sequence Representation

Authors

  • Zhengyi Li, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
  • Menglu Li, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
  • Lida Zhu, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
  • Wen Zhang, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China; Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan 430070, China; Engineering Research Center of Intelligent Technology for Agriculture, Ministry of Education, Wuhan 430070, China

DOI:

https://doi.org/10.1609/aaai.v38i1.27770

Keywords:

APP: Natural Sciences, ML: Applications, ML: Representation Learning

Abstract

Protein post-translational modification (PTM) site prediction is a fundamental task in bioinformatics. Several computational methods have been developed to predict PTM sites. However, existing methods ignore structural information and rely solely on protein sequences. Furthermore, a more fine-grained structure representation learning method is urgently needed, as PTM is a biological event that occurs at the atom granularity. In this paper, we propose a PTM site prediction method by Coupling of Multi-Granularity structure and Multi-Scale sequence representation, PTM-CMGMS for brevity. Specifically, multi-granularity structure-aware representation learning is designed to learn neighborhood structure representations at the amino acid, atom, and whole-protein granularity from AlphaFold-predicted structures, followed by contrastive learning to optimize the structure representations. Additionally, multi-scale sequence representation learning is used to extract context sequence information, and a motif generated by aligning all context sequences of PTM sites assists the prediction. Extensive experiments on three datasets show that PTM-CMGMS outperforms the state-of-the-art methods. Source code can be found at https://github.com/LZY-HZAU/PTM-CMGMS.
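
To make the idea of a motif built from aligned context sequences concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a context sequence is a fixed-width window of residues centered on a candidate site and that the motif can be summarized as per-position residue frequencies. The function names, the padding character, and the window half-width are hypothetical choices made for illustration; calling `context_window` with several half-widths would be one simple way to obtain contexts at more than one scale.

```python
from collections import Counter


def context_window(sequence, site, half_width, pad="-"):
    """Return the residues within half_width of a 0-based site index.

    Positions that fall outside the sequence are filled with `pad`, so
    every window has length 2 * half_width + 1 (an assumed convention).
    """
    left = max(0, site - half_width)
    right = min(len(sequence), site + half_width + 1)
    left_pad = pad * (half_width - (site - left))
    right_pad = pad * (half_width - (right - 1 - site))
    return left_pad + sequence[left:right] + right_pad


def position_frequencies(windows):
    """Compute column-wise residue frequencies over equal-length windows.

    The windows are already aligned because each one is centered on its
    site, so column i of every window refers to the same relative offset.
    """
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    total = len(windows)
    return [
        {residue: count / total for residue, count in Counter(column).items()}
        for column in zip(*windows)
    ]


# Toy example: two lysine (K) sites in a made-up sequence.
toy_sequence = "MKTAYIAKQR"
toy_windows = [context_window(toy_sequence, site, 2) for site in (1, 7)]
toy_motif = position_frequencies(toy_windows)
```

In this toy example the central column of `toy_motif` contains only K, since both windows are centered on a lysine, while the flanking columns record how often each residue appears at each offset from the site.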

Published

2024-03-25

How to Cite

Li, Z., Li, M., Zhu, L., & Zhang, W. (2024). Improving PTM Site Prediction by Coupling of Multi-Granularity Structure and Multi-Scale Sequence Representation. Proceedings of the AAAI Conference on Artificial Intelligence, 38(1), 188-196. https://doi.org/10.1609/aaai.v38i1.27770

Issue

Section

AAAI Technical Track on Application Domains