Accurate Nucleic Acid-Binding Residue Identification Based Domain-Adaptive Protein Language Model and Explainable Geometric Deep Learning

Authors

  • Wenwu Zeng Hunan University
  • Liangrui Pan Hunan University
  • Boya Ji Hunan University
  • Liwen Xu Hunan University
  • Shaoliang Peng Hunan University

DOI:

https://doi.org/10.1609/aaai.v39i1.32086

Abstract

Protein-nucleic acid interactions play a fundamental and critical role in a wide range of life activities. Accurate identification of nucleic acid-binding residues helps to understand the intrinsic mechanisms of these interactions. However, the accuracy and interpretability of existing computational methods for recognizing nucleic acid-binding residues need to be further improved. Here, we propose a novel method called GeSite, based on a domain-adaptive protein language model and an E(3)-equivariant graph neural network. Prediction results across multiple benchmark test sets demonstrate that GeSite is superior or comparable to state-of-the-art prediction methods. The MCC values of GeSite are 0.522 and 0.326 on one DNA-binding residue test set and one RNA-binding residue test set, which are 0.57 and 38.14% higher than those of the second-best method, respectively. Detailed experimental results suggest that the advanced performance of GeSite lies in the well-designed nucleic acid-binding protein adaptive language model. Additionally, interpretability analysis exposes the perception of the prediction model on various remote and close functional domains, which is the source of its discernment ability.
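
For readers unfamiliar with the metric reported above, the Matthews correlation coefficient (MCC) for per-residue binary labels can be sketched as follows. This is a minimal illustration only: the function name `mcc` and the toy labels are assumptions made for this example and are not taken from the paper or from the GeSite code.

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = binding, 0 = non-binding)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Toy residue labels (hypothetical, not data from the paper).
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
print(round(mcc(y_true, y_pred), 3))
```

MCC ranges from -1 to 1 and stays informative when binding residues are a small minority of all residues, which is why it is commonly used for this kind of imbalanced task.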

Published

2025-04-11

How to Cite

Zeng, W., Pan, L., Ji, B., Xu, L., & Peng, S. (2025). Accurate Nucleic Acid-Binding Residue Identification Based Domain-Adaptive Protein Language Model and Explainable Geometric Deep Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 39(1), 1004–1012. https://doi.org/10.1609/aaai.v39i1.32086

Section

AAAI Technical Track on Application Domains