FDC-Ground: Improving GRPO for GUI Grounding via Exponential Rewards and Fact-Aligned Pruning

Xiangjian Zeng; Wenjing Li; Qingqiang Wu; Liang Zhang

doi:10.1609/aaai.v40i33.40038

Authors

Xiangjian Zeng: School of Journalism and Communication, Xiamen University
Wenjing Li: State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications
Qingqiang Wu: School of Film, Xiamen University; School of Informatics, Xiamen University; Xiamen Key Laboratory of Intelligent Storage and Computing, Xiamen University; Key Laboratory of Digital Protection and Intelligent Processing of Intangible Cultural Heritage of Fujian and Taiwan, Ministry of Culture and Tourism, Xiamen University; Institute of Artificial Intelligence, Xiamen University
Liang Zhang: Xiaohongshu Inc.

DOI:

https://doi.org/10.1609/aaai.v40i33.40038

Abstract

This paper presents FDC-Ground, a reinforcement learning framework that addresses the high-cost, low-signal challenge of GUI grounding training. The framework makes two core contributions: (1) the Exponentially Decayed Distance Reward (EDDR), which provides resolution-robust, continuous feedback for position predictions, and (2) the Fact-Aligned Dynamic Completions Pruning (FDC-Pruning) strategy, which retains only completions whose advantage signs align with factual correctness, reducing computational overhead while improving gradient quality and training stability. Using only 3.2K training samples and a single epoch, our 7B-parameter model achieves 88.3% and 91.0% accuracy on ScreenSpot and ScreenSpot-v2, respectively, outperforming several RL-based models such as UIShift and SE-GUI. Our 3B-parameter model, based on Qwen2.5-VL-3B, improves on the base model's performance by 26.6%, demonstrating the effectiveness of our reward design and pruning strategy under low-resource conditions. Furthermore, FDC-Pruning achieves a 1.18× training speedup and a 5.9% accuracy improvement over standard GRPO, and expanding the exploration space to 4× yields an additional 10.5% gain, confirming both the scalability and the training efficiency of our approach. These findings indicate that combining EDDR with FDC-Pruning offers a practical path toward scalable and efficient RL-based GUI grounding, even in low-resource settings.
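
The abstract describes the two mechanisms only at a high level, so the following is a minimal Python sketch of how they might fit together, not the authors' implementation. The function names, the decay constant, the normalization of distance by image width and height, and the strict sign-matching rule are all assumptions made for illustration; the paper's exact formulations may differ.

```python
import math


def eddr_reward(pred, target, width, height, decay=5.0):
    """Hypothetical exponentially decayed distance reward in (0, 1].

    Distances are divided by the image size (an assumption) so that the
    reward depends on relative position, not on screen resolution.
    """
    dx = (pred[0] - target[0]) / width
    dy = (pred[1] - target[1]) / height
    return math.exp(-decay * math.hypot(dx, dy))


def group_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: rewards standardized within one group."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


def fact_aligned_keep(advantages, correct):
    """Indices of completions whose advantage sign matches correctness.

    Assumed rule: keep correct completions with positive advantage and
    incorrect completions with negative advantage; drop everything else.
    """
    return [
        i
        for i, (adv, ok) in enumerate(zip(advantages, correct))
        if (adv > 0 and ok) or (adv < 0 and not ok)
    ]


def inside_box(point, box):
    """True if point (x, y) lies inside box (x0, y0, x1, y1)."""
    x, y = point
    x0, y0, x1, y1 = box
    return x0 <= x <= x1 and y0 <= y <= y1


# Toy group of four predicted click points for one 1000x1000 screenshot.
box = (80, 80, 120, 120)   # ground-truth element (illustrative values)
center = (100, 100)
preds = [(100, 100), (110, 105), (130, 100), (600, 600)]

rewards = [eddr_reward(p, center, 1000, 1000) for p in preds]
correct = [inside_box(p, box) for p in preds]
kept = fact_aligned_keep(group_advantages(rewards), correct)
```

In this toy group, the third prediction lands just outside the target box yet still earns an above-average reward, so its advantage is positive even though it is factually wrong. Under the assumed rule it is pruned, while the two correct predictions and the clearly wrong fourth prediction are kept for the gradient update.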
