FDC-Ground: Improving GRPO for GUI Grounding via Exponential Rewards and Fact-Aligned Pruning
DOI:
https://doi.org/10.1609/aaai.v40i33.40038
Abstract
This paper presents FDC-Ground, a reinforcement learning framework that addresses the high-cost, low-signal challenge of GUI grounding training. The framework introduces two core contributions: (1) the Exponentially Decayed Distance Reward (EDDR), which provides resolution-robust and continuous feedback for position predictions, and (2) the Fact-Aligned Dynamic Completions Pruning (FDC-Pruning) strategy, which selectively retains completions whose advantage signs align with factual correctness, thereby reducing computational overhead while enhancing gradient quality and training stability. Using only 3.2K training samples and a single epoch, our 7B-parameter model achieves 88.3% and 91.0% accuracy on ScreenSpot and ScreenSpot-v2, outperforming several RL-based models such as UIShift and SE-GUI. Our 3B-parameter model based on Qwen2.5-VL-3B surpasses its original performance by +26.6%, demonstrating the effectiveness of our reward design and pruning strategy under low-resource conditions. Furthermore, the proposed FDC-Pruning strategy achieves a 1.18× training speedup and a +5.9% accuracy improvement over standard GRPO, and expanding the exploration space to 4× yields an additional +10.5% gain, confirming both the scalability and the training efficiency of our approach. These findings highlight that combining EDDR with FDC-Pruning offers a practical path toward scalable and efficient RL-based GUI grounding, even in low-resource settings.
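The abstract names the two components without giving their formulas, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a reward of the form exp(-d / tau), where d is the click-to-target distance normalized by the screen diagonal, and a pruning rule that keeps a completion only when the sign of its group-normalized advantage matches whether the prediction was correct. The function names, the decay scale `tau`, the diagonal normalization, and the use of the population standard deviation are all hypothetical choices made for this example.

```python
import math


def eddr_reward(pred, target, width, height, tau=0.1):
    """Hypothetical exponentially decayed distance reward in (0, 1].

    pred and target are (x, y) pixel coordinates. The distance is divided
    by the screen diagonal so the same relative error yields the same
    reward at any resolution. tau is an assumed decay scale.
    """
    dist = math.hypot(pred[0] - target[0], pred[1] - target[1])
    normalized = dist / math.hypot(width, height)
    return math.exp(-normalized / tau)


def group_advantages(rewards):
    """GRPO-style advantages: rewards standardized within one sampled group.

    Uses the population standard deviation (an assumption); returns zeros
    when every reward in the group is identical.
    """
    n = len(rewards)
    mean = sum(rewards) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / n)
    if std == 0.0:
        return [0.0] * n
    return [(r - mean) / std for r in rewards]


def fact_aligned_prune(advantages, is_correct):
    """Return indices of completions whose advantage sign matches correctness.

    A correct completion is kept only if its advantage is positive, and an
    incorrect one only if its advantage is negative (assumed rule).
    """
    keep = []
    for i, (adv, ok) in enumerate(zip(advantages, is_correct)):
        if (ok and adv > 0) or (not ok and adv < 0):
            keep.append(i)
    return keep


if __name__ == "__main__":
    rewards = [0.9, 0.8, 0.2, 0.1]
    correct = [True, False, True, False]
    kept = fact_aligned_prune(group_advantages(rewards), correct)
    print(kept)
```

In this toy group, the second completion is incorrect but has an above-average reward, and the third is correct but has a below-average reward; both would be dropped under the assumed rule, so only the remaining completions would contribute to the policy gradient.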
Published
2026-03-14
How to Cite
Zeng, X., Li, W., Wu, Q., & Zhang, L. (2026). FDC-Ground: Improving GRPO for GUI Grounding via Exponential Rewards and Fact-Aligned Pruning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(33), 28122–28130. https://doi.org/10.1609/aaai.v40i33.40038
Section
AAAI Technical Track on Machine Learning X