FDC-Ground: Improving GRPO for GUI Grounding via Exponential Rewards and Fact-Aligned Pruning

Authors

  • Xiangjian Zeng (School of Journalism and Communication, Xiamen University)
  • Wenjing Li (State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications)
  • Qingqiang Wu (School of Film, Xiamen University; School of Informatics, Xiamen University; Xiamen Key Laboratory of Intelligent Storage and Computing, Xiamen University; Key Laboratory of Digital Protection and Intelligent Processing of Intangible Cultural Heritage of Fujian and Taiwan, Ministry of Culture and Tourism, Xiamen University; Institute of Artificial Intelligence, Xiamen University)
  • Liang Zhang (Xiaohongshu Inc.)

DOI:

https://doi.org/10.1609/aaai.v40i33.40038

Abstract

This paper presents FDC-Ground, a reinforcement learning framework that addresses the high-cost, low-signal challenge of GUI grounding training. The framework introduces two core contributions: (1) the Exponentially Decayed Distance Reward (EDDR), which provides resolution-robust and continuous feedback for position predictions, and (2) the Fact-Aligned Dynamic Completions Pruning (FDC-Pruning) strategy, which selectively retains completions whose advantage signs align with factual correctness, thereby reducing computational overhead while enhancing gradient quality and training stability. Using only 3.2K training samples and a single epoch, our 7B-parameter model achieves 88.3% and 91.0% accuracy on ScreenSpot and ScreenSpot-v2, outperforming several RL-based models such as UIShift and SE-GUI. Our 3B-parameter model based on Qwen2.5-VL-3B surpasses its original performance by +26.6%, demonstrating the effectiveness of our reward design and pruning strategy under low-resource conditions. Furthermore, the proposed FDC-Pruning strategy achieves a 1.18× training speedup and a +5.9% accuracy improvement over standard GRPO, and expanding the exploration space to 4× yields an additional +10.5% gain, confirming both the scalability and the training efficiency of our approach. These findings highlight that combining EDDR with FDC-Pruning offers a practical path toward scalable and efficient RL-based GUI grounding, even in low-resource settings.
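The abstract describes the two components only in words, so the following Python sketch is one possible reading rather than the authors' implementation. It assumes a reward of the form exp(-alpha * d), where d is the distance from the predicted click point to the target box center, normalized by image width and height so that the value does not depend on resolution. It also assumes that "factual correctness" means the predicted point falls inside the target box, and that advantages are the usual GRPO group-normalized rewards. The names `eddr_reward`, `is_hit`, `group_advantages`, `fact_aligned_keep_mask`, and the value `alpha=5.0` are all hypothetical.

```python
import math


def eddr_reward(pred, target_box, image_size, alpha=5.0):
    """Hypothetical exponentially decayed distance reward in (0, 1]."""
    (px, py), (x1, y1, x2, y2), (w, h) = pred, target_box, image_size
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    # Normalize each axis by the image size (assumption) so the same
    # relative error gives the same reward at any resolution.
    d = math.hypot((px - cx) / w, (py - cy) / h)
    return math.exp(-alpha * d)


def is_hit(pred, target_box):
    """Assumed correctness test: the point lies inside the target box."""
    (px, py), (x1, y1, x2, y2) = pred, target_box
    return x1 <= px <= x2 and y1 <= py <= y2


def group_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: rewards standardized within one group."""
    mean = sum(rewards) / len(rewards)
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / len(rewards))
    return [(r - mean) / (std + eps) for r in rewards]


def fact_aligned_keep_mask(advantages, hits):
    """Keep a completion only if its advantage sign matches correctness:
    positive advantage for hits, negative advantage for misses.
    Completions with zero advantage are dropped in this sketch."""
    return [(a > 0 and hit) or (a < 0 and not hit)
            for a, hit in zip(advantages, hits)]


# Illustrative group of four completions for one prompt (made-up numbers).
box, size = (100, 100, 200, 200), (1000, 1000)
preds = [(150, 150), (199, 199), (210, 150), (900, 900)]

rewards = [eddr_reward(p, box, size) for p in preds]
hits = [is_hit(p, box) for p in preds]
keep = fact_aligned_keep_mask(group_advantages(rewards), hits)
```

In this made-up group, the third completion lands just outside the box yet scores above the group mean, so its positive advantage contradicts its factual status and it is pruned. Only the retained completions would go on to the policy-gradient step, which is where a saving in computation of the kind the abstract reports would come from under this reading.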

Published

2026-03-14

How to Cite

Zeng, X., Li, W., Wu, Q., & Zhang, L. (2026). FDC-Ground: Improving GRPO for GUI Grounding via Exponential Rewards and Fact-Aligned Pruning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(33), 28122–28130. https://doi.org/10.1609/aaai.v40i33.40038

Section

AAAI Technical Track on Machine Learning X