Multimodal Mixture-of-Experts with Retrieval Augmentation for Protein Active Site Identification

Authors

  • Jiayang Wu Westlake University
  • Jiale Zhou Westlake University
  • Rubo Wang University of the Chinese Academy of Sciences
  • Xingyi Zhang Mohamed bin Zayed University of Artificial Intelligence
  • Xun Lin Westlake University
  • Tianxu Lv Jiangnan University
  • Leong Hou U University of Macau
  • Yefeng Zheng Westlake University

DOI:

https://doi.org/10.1609/aaai.v40i32.39903

Abstract

Accurate identification of protein active sites at the residue level is crucial for understanding protein function and advancing drug discovery. However, current methods face two critical challenges: vulnerability in single-instance prediction due to sparse training data, and inadequate modality reliability estimation that leads to performance degradation when unreliable modalities dominate fusion processes. To address these challenges, we introduce Multimodal Mixture-of-Experts with Retrieval Augmentation (MERA), the first retrieval-augmented framework for protein active site identification. MERA employs hierarchical multi-expert retrieval that dynamically aggregates contextual information from chain, sequence, and active-site perspectives through residue-level mixture-of-experts gating. To prevent modality degradation, we propose a reliability-aware fusion strategy based on Dempster–Shafer evidence theory that quantifies modality trustworthiness through belief mass functions and learnable discounting coefficients, enabling principled multimodal integration. Extensive experiments on ProTAD-Gen and TS125 datasets demonstrate that MERA achieves state-of-the-art performance, with 90% AUPRC on active site prediction and significant gains on peptide-binding site identification, validating the effectiveness of retrieval-augmented multi-expert modeling and reliability-guided fusion.
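As a rough illustration of the reliability-aware fusion idea mentioned in the abstract, the sketch below applies classical Shafer discounting followed by Dempster's rule of combination to two belief mass functions for a single residue. It is not the authors' implementation: the binary frame {active, inactive}, the modality names, the mass values, and the fixed discounting coefficients (learnable in MERA, per the abstract) are all assumptions chosen for illustration.

```python
from typing import Dict

# Hypothetical frame of discernment for one residue: "active" or "inactive".
# "theta" holds the mass assigned to the whole frame (uncommitted belief).
Mass = Dict[str, float]


def discount(m: Mass, alpha: float) -> Mass:
    """Shafer discounting: keep a fraction alpha of the committed mass and
    move the remainder to theta. alpha=1 trusts the source fully; alpha=0
    yields the vacuous mass function."""
    return {
        "active": alpha * m["active"],
        "inactive": alpha * m["inactive"],
        "theta": 1.0 - alpha + alpha * m["theta"],
    }


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule of combination on the binary frame."""
    # Conflict: mass the two sources assign to contradictory hypotheses.
    conflict = m1["active"] * m2["inactive"] + m1["inactive"] * m2["active"]
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict
    active = (m1["active"] * m2["active"]
              + m1["active"] * m2["theta"]
              + m1["theta"] * m2["active"]) / norm
    inactive = (m1["inactive"] * m2["inactive"]
                + m1["inactive"] * m2["theta"]
                + m1["theta"] * m2["inactive"]) / norm
    theta = m1["theta"] * m2["theta"] / norm
    return {"active": active, "inactive": inactive, "theta": theta}


# Illustrative masses for two modalities (values are made up).
sequence_mass = {"active": 0.7, "inactive": 0.1, "theta": 0.2}
structure_mass = {"active": 0.2, "inactive": 0.6, "theta": 0.2}

# Assumed reliabilities: the sequence modality is trusted more here.
fused = combine(discount(sequence_mass, 0.9), discount(structure_mass, 0.3))
print(fused)
```

Because the less reliable modality has most of its mass moved to theta before combination, it has less influence on the fused belief, which is the general mechanism by which discounting keeps an unreliable modality from dominating fusion.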

Published

2026-03-14

How to Cite

Wu, J., Zhou, J., Wang, R., Zhang, X., Lin, X., Lv, T., … Zheng, Y. (2026). Multimodal Mixture-of-Experts with Retrieval Augmentation for Protein Active Site Identification. Proceedings of the AAAI Conference on Artificial Intelligence, 40(32), 26913–26921. https://doi.org/10.1609/aaai.v40i32.39903

Issue

Section

AAAI Technical Track on Machine Learning IX