TabGeoFlow: A Geometric Flow Matching Model for Tabular Data Synthesis

Authors

  • Jong In Choi, Korea Credit Information Services

DOI:

https://doi.org/10.1609/aaai.v40i25.39192

Abstract

Tabular data synthesis is a key technique for protecting data privacy and addressing class imbalance, yet existing generative models struggle to capture the complex intrinsic structure of the data. To overcome this limitation, we propose TabGeoFlow, a novel geometric flow matching model for tabular data synthesis. The core innovation of TabGeoFlow is the injection of an explicit geometric inductive bias into the conditional flow matching framework. We decompose the learned vector field into components that are locally tangent and normal to the data manifold. By dynamically suppressing the predicted normal component via a controlling loss function, we constrain the generative path to follow the data's intrinsic structure. Implemented with a shared backbone for parameter efficiency, TabGeoFlow achieves competitive or better fidelity and utility while exhibiting near-random black-box membership inference attack (MIA) accuracy and a distance to closest record (DCR) score of ≈ 50%, suggesting reduced memorization without sacrificing quality.
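The tangent/normal decomposition described in the abstract can be illustrated with a small NumPy sketch. This is not the paper's implementation. It assumes the tangent space is estimated by local PCA over the k nearest neighbours, and that the normal component is suppressed with a fixed penalty weight `lam` rather than a dynamic schedule. The function names (`local_tangent_basis`, `split_tangent_normal`, `geometric_cfm_loss`) and the toy circle data are hypothetical, chosen only for illustration.

```python
import numpy as np


def local_tangent_basis(x, data, k=10, d=1):
    """Estimate an orthonormal tangent basis (D x d) at x.

    Assumption: local PCA over the k nearest neighbours of x stands in
    for whatever manifold estimate the paper actually uses.
    """
    dists = np.linalg.norm(data - x, axis=1)
    nbrs = data[np.argsort(dists)[:k]]
    centered = nbrs - nbrs.mean(axis=0)
    # Rows of vt are principal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:d].T


def split_tangent_normal(v, basis):
    """Split v into its projection onto span(basis) and the remainder."""
    tangent = basis @ (basis.T @ v)
    normal = v - tangent
    return tangent, normal


def geometric_cfm_loss(v_pred, v_target, basis, lam=1.0):
    """Squared flow matching error plus a penalty on the normal component.

    Assumption: lam is a fixed weight; the abstract describes the
    suppression as dynamic but does not give the schedule.
    """
    _, normal = split_tangent_normal(v_pred, basis)
    fm_term = np.sum((v_pred - v_target) ** 2)
    normal_penalty = np.sum(normal ** 2)
    return fm_term + lam * normal_penalty


# Toy manifold: 200 points on the unit circle embedded in 2-D.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
data = np.stack([np.cos(theta), np.sin(theta)], axis=1)

x = data[0]  # the point (1, 0), where the true tangent is the y-axis
basis = local_tangent_basis(x, data, k=7, d=1)

v = np.array([0.3, 2.0])  # a hypothetical predicted velocity at x
tangent, normal = split_tangent_normal(v, basis)

# With the tangent part as the target, the only remaining error is normal.
loss_plain = geometric_cfm_loss(v, tangent, basis, lam=0.0)
loss_geo = geometric_cfm_loss(v, tangent, basis, lam=1.0)
```

In this toy setting the penalty grows with the part of the predicted velocity that points off the circle, so minimising it would push generated paths to move along the data rather than across it. In a real model, `v_pred` would come from the network and `v_target` from the conditional flow matching path.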

Published

2026-03-14

How to Cite

Choi, J. I. (2026). TabGeoFlow: A Geometric Flow Matching Model for Tabular Data Synthesis. Proceedings of the AAAI Conference on Artificial Intelligence, 40(25), 20562–20569. https://doi.org/10.1609/aaai.v40i25.39192

Issue

Section

AAAI Technical Track on Machine Learning II