PC-Flow: Preference Alignment in Flow Matching via Classifier

Authors

  • Shaomeng Wang, Nanjing University of Science and Technology
  • He Wang, Nanjing University of Science and Technology
  • Longquan Dai, Nanjing University of Science and Technology
  • Jinhui Tang, Nanjing Forestry University

DOI:

https://doi.org/10.1609/aaai.v40i12.37971

Abstract

Flow Matching (FM) is an efficient generative modeling framework, but aligning it with human preferences remains underexplored. Although applying Direct Preference Optimization (DPO) to diffusion models has yielded improvements, directly extending DPO-like methods to FM poses three challenges: 1) incompatibility with ODE-based models, 2) heavy computational cost from full model fine-tuning, and 3) reliance on reference model quality. To address these limitations, we propose Preference Classifier for Flow Matching (PC-Flow), a novel reference-free preference alignment framework. Specifically, we reinterpret FM’s deterministic ODE as an equivalent SDE to enable DPO-style learning. Then, we introduce a lightweight classifier to model relative preferences exclusively. This approach decouples alignment from the generative model, eliminating the need for costly fine-tuning or a reference model. Theoretically, PC-Flow guarantees consistent preference-guided distribution evolution, achieves a DPO-equivalent objective without a reference model, and progressively steers generation toward preferred outputs. Experiments show that PC-Flow achieves DPO-level alignment with significantly lower training costs.
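
To give a rough sense of what "modeling relative preferences" with a small classifier can look like, the sketch below implements a generic Bradley-Terry style pairwise loss with a linear scorer. It is an illustration under stated assumptions, not the PC-Flow classifier or objective: the abstract does not specify the architecture, the inputs, or the loss, and the names `score`, `pairwise_loss`, and `sgd_step` are hypothetical.

```python
import math
from typing import List, Sequence


def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def score(w: Sequence[float], x: Sequence[float]) -> float:
    """Hypothetical linear preference score; a stand-in for a lightweight classifier."""
    return sum(wi * xi for wi, xi in zip(w, x))


def pairwise_loss(w: Sequence[float],
                  preferred: Sequence[float],
                  rejected: Sequence[float]) -> float:
    """Bradley-Terry negative log-likelihood that `preferred` beats `rejected`."""
    margin = score(w, preferred) - score(w, rejected)
    return -log_sigmoid(margin)


def sgd_step(w: Sequence[float],
             preferred: Sequence[float],
             rejected: Sequence[float],
             lr: float = 0.1) -> List[float]:
    """One gradient-descent step on the pairwise loss.

    The gradient of -log(sigmoid(m)) with respect to w is
    -(1 - sigmoid(m)) * (preferred - rejected), where m is the score margin.
    """
    margin = score(w, preferred) - score(w, rejected)
    coeff = 1.0 - math.exp(log_sigmoid(margin))
    return [wi + lr * coeff * (p - r) for wi, p, r in zip(w, preferred, rejected)]


if __name__ == "__main__":
    # Toy feature vectors standing in for a preferred and a rejected sample.
    w = [0.0, 0.0]
    good, bad = [1.0, 0.0], [0.0, 1.0]
    for _ in range(5):
        print(round(pairwise_loss(w, good, bad), 4))
        w = sgd_step(w, good, bad)
```

Only the small scorer is updated here, which mirrors the general idea of keeping preference learning separate from the generative model; how PC-Flow actually couples its classifier to the sampling dynamics is described in the paper itself.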

Published

2026-03-14

How to Cite

Wang, S., Wang, H., Dai, L., & Tang, J. (2026). PC-Flow: Preference Alignment in Flow Matching via Classifier. Proceedings of the AAAI Conference on Artificial Intelligence, 40(12), 10047–10055. https://doi.org/10.1609/aaai.v40i12.37971

Section

AAAI Technical Track on Computer Vision IX