PC-Flow: Preference Alignment in Flow Matching via Classifier
DOI:
https://doi.org/10.1609/aaai.v40i12.37971
Abstract
Flow Matching (FM) is an efficient generative modeling framework, but aligning it with human preferences remains underexplored. Although applying Direct Preference Optimization (DPO) to diffusion models has yielded improvements, directly extending DPO-like methods to FM poses three challenges: 1) incompatibility with ODE-based models, 2) heavy computational cost from full-model fine-tuning, and 3) reliance on reference-model quality. To address these limitations, we propose the Preference Classifier for Flow Matching (PC-Flow), a novel reference-free preference-alignment framework. Specifically, we reinterpret FM's deterministic ODE as an equivalent SDE to enable DPO-style learning, and we introduce a lightweight classifier that models only relative preferences. This approach decouples alignment from the generative model, eliminating the need for costly fine-tuning or a reference model. Theoretically, PC-Flow guarantees consistent preference-guided distribution evolution, achieves a DPO-equivalent objective without a reference model, and progressively steers generation toward preferred outputs. Experiments show that PC-Flow achieves DPO-level alignment with significantly lower training costs.
Published
2026-03-14
How to Cite
Wang, S., Wang, H., Dai, L., & Tang, J. (2026). PC-Flow: Preference Alignment in Flow Matching via Classifier. Proceedings of the AAAI Conference on Artificial Intelligence, 40(12), 10047–10055. https://doi.org/10.1609/aaai.v40i12.37971
Section
AAAI Technical Track on Computer Vision IX
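The abstract's key mechanism, recasting a flow-matching ODE as a marginal-preserving SDE and steering it with a classifier gradient, can be sketched as follows. This is a minimal illustration of the general ODE-to-SDE conversion (drift augmented by a score term, plus diffusion), not the authors' implementation; the `velocity`, `score`, and `classifier_grad` functions and the noise scale `g` are all hypothetical toy stand-ins.

```python
import numpy as np

def sde_sample(x0, velocity, score, classifier_grad,
               g=0.5, steps=100, scale=1.0, seed=0):
    """Euler-Maruyama sampling of an SDE that shares marginals with the
    probability-flow ODE dx = velocity(x, t) dt, via the standard rewrite
    dx = [v + (g^2 / 2) * score] dt + g dW, with an added classifier-gradient
    term as preference guidance (hypothetical stand-in for PC-Flow's classifier).
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        drift = (velocity(x, t)
                 + 0.5 * g**2 * score(x, t)       # score term from ODE-to-SDE rewrite
                 + scale * classifier_grad(x, t))  # preference guidance
        x = x + drift * dt + g * np.sqrt(dt) * rng.standard_normal(x.shape)
    return x

# Toy stand-ins: linear-interpolant velocity toward a target, the score of a
# unit Gaussian centered at that target, and a constant "preference" pull.
target = np.array([2.0, -1.0])
velocity = lambda x, t: (target - x) / max(1.0 - t, 1e-3)
score = lambda x, t: -(x - target)
classifier_grad = lambda x, t: 0.1 * np.ones_like(x)

x_final = sde_sample(np.zeros(2), velocity, score, classifier_grad)
print(x_final.shape)  # prints (2,)
```

The guidance enters only through the extra drift term, so the generative model (`velocity`) is never retrained; this mirrors the paper's claim that alignment is decoupled from the generative model.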