Energy-Motivated Equivariant Pretraining for 3D Molecular Graphs

Authors

  • Rui Jiao: Beijing National Research Center for Information Science and Technology (BNRist), Department of Computer Science and Technology, Tsinghua University; Institute for AI Industry Research (AIR), Tsinghua University
  • Jiaqi Han: Beijing National Research Center for Information Science and Technology (BNRist), Department of Computer Science and Technology, Tsinghua University; Institute for AI Industry Research (AIR), Tsinghua University
  • Wenbing Huang: Gaoling School of Artificial Intelligence, Renmin University of China; Beijing Key Laboratory of Big Data Management and Analysis Methods
  • Yu Rong: Tencent AI Lab
  • Yang Liu: Beijing National Research Center for Information Science and Technology (BNRist), Department of Computer Science and Technology, Tsinghua University; Institute for AI Industry Research (AIR), Tsinghua University; Beijing Academy of Artificial Intelligence

DOI:

https://doi.org/10.1609/aaai.v37i7.25978

Keywords:

ML: Graph-based Machine Learning, ML: Unsupervised & Self-Supervised Learning

Abstract

Pretraining molecular representation models without labels is fundamental to various applications. Conventional methods mainly process 2D molecular graphs and focus solely on 2D tasks, so their pretrained models cannot characterize 3D geometry and are ill-suited to downstream 3D tasks. In this work, we tackle 3D molecular pretraining in a complete and novel sense. In particular, we first propose adopting an equivariant energy-based model as the backbone for pretraining, which has the merit of respecting the symmetry of 3D space. We then develop a node-level pretraining loss for force prediction, in which we exploit the Riemann-Gaussian distribution to ensure that the loss is E(3)-invariant, which improves robustness. Moreover, a graph-level noise scale prediction task is used to further improve final performance. We evaluate our model, pretrained on the large-scale 3D dataset GEOM-QM9, on two challenging 3D benchmarks: MD17 and QM9. Experimental results demonstrate the efficacy of our method against current state-of-the-art pretraining approaches and verify the validity of each proposed component. Code is available at https://github.com/jiaor17/3D-EMGP.
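
The symmetry property mentioned in the abstract can be illustrated with a minimal sketch. It assumes that forces are taken as the negative gradient of a scalar energy that depends only on interatomic distances. The energy below (`toy_energy`) is a hypothetical stand-in chosen for simplicity, not the model from the paper, and all function names are illustrative. The sketch only shows that an E(3)-invariant energy yields forces that rotate with the molecule and ignore translations.

```python
import numpy as np


def toy_energy(x):
    """Hypothetical invariant energy: sum over atom pairs of (d_ij - 1)^2.

    x has shape (n_atoms, 3). Only pairwise distances enter, so the value is
    unchanged by rotations, reflections, and translations of x.
    """
    diff = x[:, None, :] - x[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    upper = np.triu_indices(len(x), k=1)  # count each pair once
    return float(np.sum((dist[upper] - 1.0) ** 2))


def toy_force(x):
    """Analytic force F = -dE/dx for toy_energy, shape (n_atoms, 3)."""
    diff = x[:, None, :] - x[None, :, :]   # (n, n, 3), diff[i, j] = x_i - x_j
    dist = np.linalg.norm(diff, axis=-1)   # (n, n)
    np.fill_diagonal(dist, 1.0)            # avoid 0/0; diagonal terms vanish anyway
    coeff = 2.0 * (dist - 1.0) / dist      # dE/dd_ij divided by d_ij
    grad = np.sum(coeff[:, :, None] * diff, axis=1)
    return -grad


def random_orthogonal(rng):
    """Random 3x3 orthogonal matrix (rotation or reflection) via QR."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return q


rng = np.random.default_rng(0)
coords = rng.normal(size=(5, 3))           # a made-up 5-atom conformation
rot = random_orthogonal(rng)
shift = rng.normal(size=3)
moved = coords @ rot.T + shift             # apply an E(3) transformation

# Invariance of the energy and equivariance of the forces.
print(np.isclose(toy_energy(moved), toy_energy(coords)))
print(np.allclose(toy_force(moved), toy_force(coords) @ rot.T))
```

Both checks compare the transformed molecule with the original: the energy should agree exactly up to floating-point error, and the forces should agree after applying the same orthogonal matrix to the original forces.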

Published

2023-06-26

How to Cite

Jiao, R., Han, J., Huang, W., Rong, Y., & Liu, Y. (2023). Energy-Motivated Equivariant Pretraining for 3D Molecular Graphs. Proceedings of the AAAI Conference on Artificial Intelligence, 37(7), 8096-8104. https://doi.org/10.1609/aaai.v37i7.25978

Section

AAAI Technical Track on Machine Learning II