S²Drug: Bridging Protein Sequence and 3D Structure in Contrastive Representation Learning for Virtual Screening

Authors

  • Bowei He City University of Hong Kong
  • Bowen Gao Institute for AI Industry Research (AIR), Tsinghua University
  • Yankai Chen University of Illinois Chicago
  • Yanyan Lan Institute for AI Industry Research (AIR), Tsinghua University; Beijing Academy of Artificial Intelligence
  • Chen Ma City University of Hong Kong
  • Philip S. Yu University of Illinois Chicago
  • Ya-Qin Zhang Institute for AI Industry Research (AIR), Tsinghua University
  • Wei-Ying Ma Institute for AI Industry Research (AIR), Tsinghua University

DOI:

https://doi.org/10.1609/aaai.v40i1.36997

Abstract

Virtual screening (VS) is an essential task in drug discovery, focusing on the identification of small-molecule ligands that bind to specific protein pockets. Existing deep learning methods, from early regression models to recent contrastive learning approaches, primarily rely on structural data while overlooking protein sequences, which are more accessible and can enhance generalizability. However, directly integrating protein sequences poses challenges due to the redundancy and noise in large-scale protein-ligand datasets. To address these limitations, we propose S²Drug, a two-stage framework that explicitly incorporates protein Sequence information and 3D Structure context in protein-ligand contrastive representation learning. In the first stage, we perform protein sequence pretraining on ChEMBL using an ESM2-based backbone, combined with a tailored data sampling strategy to reduce redundancy and noise on both protein and ligand sides. In the second stage, we fine-tune on PDBbind by fusing sequence and structure information through a residue-level gating module, while introducing an auxiliary binding site prediction task. This auxiliary task guides the model to accurately localize binding residues within the protein sequence and capture their 3D spatial arrangement, thereby refining protein-ligand matching. Across multiple benchmarks, S²Drug consistently improves virtual screening performance and achieves strong results on binding site prediction, demonstrating the value of bridging sequence and structure in contrastive learning.
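
The abstract does not give the exact form of the residue-level gating module or of the contrastive objective, so the following is only a minimal NumPy sketch under stated assumptions: a sigmoid gate computed from concatenated per-residue sequence and structure features, mean pooling into a single pocket embedding, and a symmetric InfoNCE-style loss over matched protein-ligand pairs. The function names (`gated_fusion`, `info_nce`), the gate parameters `W` and `b`, and the temperature value are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_fusion(seq_feats, struct_feats, W, b):
    """Fuse per-residue sequence and structure features with a learned gate.

    ASSUMPTION: the gate is sigmoid([seq; struct] @ W + b); the paper's
    actual gating module may differ.
    seq_feats, struct_feats: (num_residues, dim)
    W: (2 * dim, dim), b: (dim,)
    """
    joint = np.concatenate([seq_feats, struct_feats], axis=-1)
    gate = sigmoid(joint @ W + b)  # values in (0, 1), one per residue and channel
    return gate * seq_feats + (1.0 - gate) * struct_feats


def info_nce(protein_emb, ligand_emb, temperature=0.07):
    """Symmetric InfoNCE loss; row i of each matrix is a matched pair.

    ASSUMPTION: a CLIP-style objective stands in for the paper's
    contrastive loss, which the abstract does not specify.
    """
    p = protein_emb / np.linalg.norm(protein_emb, axis=1, keepdims=True)
    l = ligand_emb / np.linalg.norm(ligand_emb, axis=1, keepdims=True)
    logits = (p @ l.T) / temperature  # cosine similarities, scaled

    def cross_entropy_on_diagonal(z):
        z = z - z.max(axis=1, keepdims=True)  # numerical stability
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the protein-to-ligand and ligand-to-protein directions.
    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(logits.T))


# Toy usage with random features (hypothetical sizes).
rng = np.random.default_rng(0)
num_residues, dim = 12, 8
seq = rng.normal(size=(num_residues, dim))
struct = rng.normal(size=(num_residues, dim))
W = rng.normal(scale=0.1, size=(2 * dim, dim))
b = np.zeros(dim)

fused = gated_fusion(seq, struct, W, b)  # (num_residues, dim)
pocket_embedding = fused.mean(axis=0)    # mean pooling over residues
```

In this sketch the gate lets each residue weight its sequence and structure channels independently, which is one plausible reading of "residue-level" fusion; the auxiliary binding site prediction task described in the abstract is not modeled here.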

Published

2026-03-14

How to Cite

He, B., Gao, B., Chen, Y., Lan, Y., Ma, C., Yu, P. S., … Ma, W.-Y. (2026). S²Drug: Bridging Protein Sequence and 3D Structure in Contrastive Representation Learning for Virtual Screening. Proceedings of the AAAI Conference on Artificial Intelligence, 40(1), 354–362. https://doi.org/10.1609/aaai.v40i1.36997

Section

AAAI Technical Track on Application Domains I