Regressor-guided Diffusion Model for De Novo Peptide Sequencing with Explicit Mass Control

Authors

  • Shaorong Chen: Zhejiang University, Hangzhou, China, 310058; AI Lab, Westlake University, Hangzhou, China, 310030
  • Jingbo Zhou: Zhejiang University, Hangzhou, China, 310058; AI Lab, Westlake University, Hangzhou, China, 310030
  • Jun Xia: AIMS Lab, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China, 511453; The Hong Kong University of Science and Technology, Hong Kong, China, 999077

DOI:

https://doi.org/10.1609/aaai.v40i1.36968

Abstract

The discovery of novel proteins relies on sensitive protein identification, for which de novo peptide sequencing (DNPS) from mass spectra is a crucial approach. While deep learning has advanced DNPS, existing models inadequately enforce the fundamental mass consistency constraint: a predicted peptide's mass must match the experimentally measured precursor mass. Previous DNPS methods often treat this critical information as a simple input feature or use it only in post-processing, which leads to numerous implausible predictions that violate this fundamental physical property. To address this limitation, we introduce DiffuNovo, a novel regressor-guided diffusion model for de novo peptide sequencing that provides explicit peptide-level mass control. Our approach integrates the mass constraint at two critical stages: during training, a novel peptide-level mass loss guides model optimization, while at inference, regressor-based guidance, applied through gradient-based updates in the latent space, steers generation so that the predicted peptide adheres to the mass constraint. Comprehensive evaluations on established benchmarks demonstrate that DiffuNovo surpasses state-of-the-art methods in DNPS accuracy. Additionally, as the first DNPS model to employ a diffusion model as its core backbone, DiffuNovo leverages the powerful controllability of the diffusion architecture and achieves a significant reduction in mass error, thereby producing much more physically plausible peptides. These innovations represent a substantial advancement toward robust and broadly applicable DNPS. The source code is available in the supplementary material.
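
The mass consistency constraint named in the abstract can be illustrated with a minimal sketch. This is not the DiffuNovo implementation: the function names and the 20 ppm tolerance are assumptions chosen for illustration, and only the standard monoisotopic residue masses, the mass of water, and the proton mass are established values. The sketch compares a candidate peptide's theoretical mass with the neutral mass implied by a precursor m/z and charge.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # Da, added once per peptide for the free termini
PROTON = 1.007276   # Da, mass of the charge carrier


def peptide_mass(sequence: str) -> float:
    """Theoretical neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def precursor_neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass implied by a measured precursor m/z and charge state."""
    return (mz - PROTON) * charge


def mass_error_ppm(sequence: str, mz: float, charge: int) -> float:
    """Signed difference between theoretical and observed mass, in ppm."""
    observed = precursor_neutral_mass(mz, charge)
    return (peptide_mass(sequence) - observed) / observed * 1e6


def is_mass_consistent(sequence: str, mz: float, charge: int,
                       tol_ppm: float = 20.0) -> bool:
    """True if the peptide mass matches the precursor within tol_ppm.

    The 20 ppm default is an illustrative assumption, not a value
    taken from the paper.
    """
    return abs(mass_error_ppm(sequence, mz, charge)) <= tol_ppm


if __name__ == "__main__":
    # Simulate a doubly charged precursor for the peptide PEPTIDE.
    mz = peptide_mass("PEPTIDE") / 2 + PROTON
    print(is_mass_consistent("PEPTIDE", mz, 2))  # correct sequence
    print(is_mass_consistent("PEPTIDA", mz, 2))  # one residue changed
```

A check of this kind is what post-processing filters apply after decoding; the abstract's point is that DiffuNovo instead builds the constraint into training and inference rather than relying on such a filter alone.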

Published

2026-03-14

How to Cite

Chen, S., Zhou, J., & Xia, J. (2026). Regressor-guided Diffusion Model for De Novo Peptide Sequencing with Explicit Mass Control. Proceedings of the AAAI Conference on Artificial Intelligence, 40(1), 92–100. https://doi.org/10.1609/aaai.v40i1.36968

Issue

Section

AAAI Technical Track on Application Domains I