ProAR: Probabilistic Autoregressive Modeling for Molecular Dynamics

Authors

  • Kaiwen Cheng: School of Electronic and Computer Engineering, Peking University, Shenzhen, China; AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China
  • Yutian Liu: School of Computer Science, Peking University, Beijing, China
  • Zhiwei Nie: School of Electronic and Computer Engineering, Peking University, Shenzhen, China; AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China
  • Mujie Lin: School of Electronic and Computer Engineering, Peking University, Shenzhen, China; AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China
  • Yanzhen Hou: School of Physics, Peking University, Beijing, China
  • Yiheng Tao: School of Computer Science, Peking University, Beijing, China
  • Chang Liu: Department of Automation and BNRist, Tsinghua University, Beijing, China
  • Jie Chen: School of Electronic and Computer Engineering, Peking University, Shenzhen, China; AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China
  • Youdong Mao: School of Physics, Peking University, Beijing, China; Peking-Tsinghua Joint Center for Life Sciences, Peking University, Beijing, China; Center for Quantitative Biology, Peking University, Beijing, China; National Biomedical Imaging Center, Peking University, Beijing, China; AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China
  • Yonghong Tian: School of Electronic and Computer Engineering, Peking University, Shenzhen, China; School of Computer Science, Peking University, Beijing, China; AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China

DOI:

https://doi.org/10.1609/aaai.v40i1.36974

Abstract

Understanding the structural dynamics of biomolecules is crucial for uncovering biological functions. As molecular dynamics (MD) simulation data becomes more widely available, deep generative models have been developed to synthesize realistic MD trajectories. However, existing methods produce fixed-length trajectories by jointly denoising high-dimensional spatiotemporal representations, which conflicts with MD's frame-by-frame integration process and fails to capture time-dependent conformational diversity. Inspired by MD's sequential nature, we introduce a new probabilistic autoregressive (ProAR) framework for trajectory generation. ProAR uses a dual-network system that models each frame as a multivariate Gaussian distribution and employs an anti-drifting sampling strategy to reduce cumulative errors. This approach captures conformational uncertainty and time-coupled structural changes while allowing flexible generation of trajectories of arbitrary length. Experiments on ATLAS, a large-scale protein MD dataset, demonstrate that for long-trajectory generation, our model achieves a 7.5% reduction in reconstruction RMSE and an average 25.8% improvement in conformation change accuracy compared to previous state-of-the-art methods. On the conformation sampling task, it performs comparably to specialized time-independent models, providing a flexible and dependable alternative to standard MD simulations.
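The frame-by-frame idea in the abstract can be illustrated with a minimal toy sketch. It is not the authors' implementation: `predict_frame_distribution` is a hypothetical stand-in for the learned networks, the covariance is assumed diagonal for simplicity, the "dynamics" are invented, and the anti-drifting sampling strategy is omitted because the abstract does not describe it. The sketch only shows why sampling each frame from a Gaussian conditioned on the previous frame allows trajectories of any length.

```python
import random


def predict_frame_distribution(prev_frame):
    """Hypothetical stand-in for a learned model (assumption, not ProAR itself).

    Returns the mean and per-coordinate standard deviation of a Gaussian
    over the next frame, given the previous frame.
    """
    mean = [0.9 * x for x in prev_frame]  # toy dynamics: relax toward the origin
    std = [0.1 for _ in prev_frame]       # toy diagonal covariance (assumption)
    return mean, std


def rollout(first_frame, num_steps, rng):
    """Generate a trajectory one frame at a time.

    Each new frame is sampled from the Gaussian predicted from the most
    recent frame, so num_steps can be chosen freely at generation time.
    """
    trajectory = [list(first_frame)]
    for _ in range(num_steps):
        mean, std = predict_frame_distribution(trajectory[-1])
        next_frame = [rng.gauss(m, s) for m, s in zip(mean, std)]
        trajectory.append(next_frame)
    return trajectory


rng = random.Random(0)  # seeded for reproducibility
start = [1.0, -2.0, 0.5]  # toy "conformation" with three coordinates
short_traj = rollout(start, 5, rng)
long_traj = rollout(start, 50, rng)
print(len(short_traj), len(long_traj))  # prints "6 51"
```

The same `rollout` function produces both the 6-frame and the 51-frame trajectory, in contrast to a model that denoises a fixed-size block of frames jointly. Because each sample feeds the next prediction, errors can accumulate over long rollouts, which is the problem the abstract's anti-drifting strategy is said to address.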

Published

2026-03-14

How to Cite

Cheng, K., Liu, Y., Nie, Z., Lin, M., Hou, Y., Tao, Y., Liu, C., Chen, J., Mao, Y., & Tian, Y. (2026). ProAR: Probabilistic Autoregressive Modeling for Molecular Dynamics. Proceedings of the AAAI Conference on Artificial Intelligence, 40(1), 147-155. https://doi.org/10.1609/aaai.v40i1.36974

Section

AAAI Technical Track on Application Domains I