Hierarchical Structure-Property Alignment for Data-Efficient Molecular Generation and Editing

Authors

  • Ziyu Fan, Central South University
  • Zhijian Huang, Central South University
  • Yahan Li, Central South University
  • Xiaowen Hu, Central South University
  • Siyuan Shen, Central South University
  • Yunliang Wang, Xinjiang University
  • Zeyu Zhong, Central South University
  • Shuhong Liu, Central South University
  • Shuning Yang, Central South University
  • Shangqian Wu, Central South University
  • Min Wu, Institute for Infocomm Research (I2R), A*STAR
  • Lei Deng, Central South University

DOI:

https://doi.org/10.1609/aaai.v40i25.39245

Abstract

Property-constrained molecular generation and editing are crucial in AI-driven drug discovery but remain hindered by two factors: (i) the complex relationships between molecular structures and multiple properties are difficult to capture, and (ii) the narrow coverage and incomplete annotations of molecular properties weaken the effectiveness of property-based models. To tackle these limitations, we propose HSPAG, a data-efficient framework featuring hierarchical structure–property alignment. By treating SMILES and molecular properties as complementary modalities, the model learns their relationships at the atom, substructure, and whole-molecule levels. Moreover, we select representative samples through scaffold clustering and hard samples via an auxiliary variational auto-encoder (VAE), substantially reducing the required pre-training data. In addition, we incorporate a property relevance-aware masking mechanism and diversified perturbation strategies to enhance generation quality under sparse annotations. Experiments demonstrate that HSPAG captures fine-grained structure–property relationships and supports controllable generation under multiple property constraints. Two real-world case studies further validate the editing capabilities of HSPAG.

Published

2026-03-14

How to Cite

Fan, Z., Huang, Z., Li, Y., Hu, X., Shen, S., Wang, Y., … Deng, L. (2026). Hierarchical Structure-Property Alignment for Data-Efficient Molecular Generation and Editing. Proceedings of the AAAI Conference on Artificial Intelligence, 40(25), 21029–21037. https://doi.org/10.1609/aaai.v40i25.39245

Section

AAAI Technical Track on Machine Learning II