Learning Cell-Aware Hierarchical Multi-Modal Representations for Robust Molecular Modeling

Authors

  • Mengran Li: School of Intelligent Systems Engineering, Sun Yat-sen University; Westlake University; Center for Artificial Intelligence and Robotics (CAIR), Hong Kong Institute of Science and Innovation (HKISI)
  • Zelin Zang: Westlake University; Center for Artificial Intelligence and Robotics (CAIR), Hong Kong Institute of Science and Innovation (HKISI); Center for Integrated Circuits and Artificial Intelligence, Tsientang Institute for Advanced Study
  • Wenbin Xing: School of Intelligent Systems Engineering, Sun Yat-sen University
  • Junzhou Chen: School of Intelligent Systems Engineering, Sun Yat-sen University
  • Ronghui Zhang: School of Intelligent Systems Engineering, Sun Yat-sen University
  • Jiebo Luo: Center for Artificial Intelligence and Robotics (CAIR), Hong Kong Institute of Science and Innovation (HKISI)
  • Stan Z. Li: Westlake University

DOI

https://doi.org/10.1609/aaai.v40i1.37027

Abstract

Understanding how chemical perturbations propagate through biological systems is essential for robust molecular property prediction. While most existing methods focus on chemical structures alone, recent advances highlight the crucial role of cellular responses such as morphology and gene expression in shaping drug effects. However, current cell-aware approaches face two key limitations: (1) modality incompleteness in external biological data, and (2) insufficient modeling of hierarchical dependencies across molecular, cellular, and genomic levels. We propose CHMR (Cell-aware Hierarchical Multi-Modal Representations), a robust framework that jointly models local-global dependencies between molecules and cellular responses and captures latent biological hierarchies via a novel tree-structured vector quantization module. Evaluated on public benchmarks spanning 696 tasks, CHMR outperforms state-of-the-art baselines, yielding average improvements of 3.6% on classification and 17.2% on regression tasks. These results demonstrate the advantage of hierarchy-aware, multi-modal learning for reliable and biologically grounded molecular representations, offering a generalizable framework for integrative biomedical modeling.
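The abstract names a tree-structured vector quantization module but does not describe its internals. As a loose illustration of the general idea only, and not the CHMR implementation, the sketch below shows a generic two-level quantizer: an embedding is first matched to its nearest coarse (parent) code, then to the nearest fine (child) code under that parent. The function names, the two-level depth, the squared Euclidean distance, and the toy codebook values are all assumptions made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest(x, codes):
    """Index of the code vector closest to x."""
    return min(range(len(codes)), key=lambda i: sq_dist(x, codes[i]))


def tree_quantize(x, parents, children):
    """Quantize x with a hypothetical two-level codebook.

    parents:  list of coarse code vectors
    children: children[p] is the list of fine code vectors under parent p
    Returns (parent index, child index, quantized vector).
    """
    p = nearest(x, parents)        # level 1: pick the coarse code
    c = nearest(x, children[p])    # level 2: search only that parent's children
    return p, c, children[p][c]


# Toy codebook (hypothetical values): two parents, two children each.
parents = [[0.0, 0.0], [10.0, 10.0]]
children = [
    [[-1.0, 0.0], [1.0, 0.0]],
    [[9.0, 10.0], [11.0, 10.0]],
]

result_a = tree_quantize([0.9, 0.1], parents, children)
result_b = tree_quantize([9.2, 9.8], parents, children)
```

Restricting the second search to the chosen parent's children is what makes the codes hierarchical: two embeddings that share a parent index are grouped at the coarse level even when their child indices differ.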

Published

2026-03-14

How to Cite

Li, M., Zang, Z., Xing, W., Chen, J., Zhang, R., Luo, J., & Li, S. Z. (2026). Learning Cell-Aware Hierarchical Multi-Modal Representations for Robust Molecular Modeling. Proceedings of the AAAI Conference on Artificial Intelligence, 40(1), 623–631. https://doi.org/10.1609/aaai.v40i1.37027

Section

AAAI Technical Track on Application Domains I