FT-NCFM: An Influence-Aware Data Distillation Framework for Efficient VLA Models
DOI:
https://doi.org/10.1609/aaai.v40i22.38880

Abstract
The powerful generalization of Vision-Language-Action (VLA) models is bottlenecked by their heavy reliance on massive, redundant, and unevenly valued datasets, which hinders their widespread application. Existing model-centric optimization paths, such as model compression (which often degrades performance) or policy distillation (whose products are model-dependent and lack generality), fail to address this challenge at the data level. To this end, this paper introduces FT-NCFM, a fundamentally different, data-centric generative data distillation framework. Our framework employs a self-contained Fact-Tracing (FT) engine that combines causal attribution with programmatic contrastive verification to assess the intrinsic value of samples. Guided by these assessments, an adversarial NCFM process synthesizes a model-agnostic, information-dense, and reusable data asset. Experimental results on several mainstream VLA benchmarks show that models trained on a distilled coreset comprising just 5% of the data achieve 85-90% of the success rate of full-dataset training, while reducing training time by over 80%. Our work demonstrates that intelligent data distillation is a highly promising new path for building efficient, high-performance VLA models.

Published
2026-03-14
How to Cite
Chen, K., Long, Y., Li, S., & Shang, M. (2026). FT-NCFM: An Influence-Aware Data Distillation Framework for Efficient VLA Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(22), 18180–18188. https://doi.org/10.1609/aaai.v40i22.38880
Section
AAAI Technical Track on Intelligent Robotics