GUI-G^2: Gaussian Reward Modeling for GUI Grounding
Fei Tang, Zhangxuan Gu, Zhengxi Lu, Xuyang Liu, Shuheng Shen, Changhua Meng, Wen Wang, Wenqi Zhang, Yongliang Shen, Weiming Lu, Jun Xiao, Yueting Zhuang
2025-07-22
Summary
This paper introduces GUI-G^2, a reward framework that helps AI agents interact with computer interfaces by scoring their actions with Gaussian distributions instead of simple right-or-wrong feedback.
What's the problem?
Previous methods gave the AI very limited feedback when it tried to click or interact with buttons and icons: each attempt counted only as a hit or a miss, which made learning slower and less accurate.
What's the solution?
The authors designed GUI-G^2 to treat interface elements as smooth, continuous regions shaped like Gaussian curves, so the AI gets richer, more detailed feedback based on how close it is to the target. They also added a mechanism that adapts the feedback to the size of each button or icon, so that it stays meaningful for both small and large elements.
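To make the contrast concrete, here is a minimal sketch of a hit-or-miss reward next to a Gaussian-shaped one. It is an illustration under stated assumptions, not the authors' formulation: the function names, the `(x1, y1, x2, y2)` box format, and the `alpha` factor that ties the Gaussian spread to the element's width and height are all hypothetical choices made for this example.

```python
import math

def binary_reward(px, py, box):
    """Hit-or-miss feedback: 1.0 if the click lands inside the box, else 0.0."""
    x1, y1, x2, y2 = box
    return 1.0 if (x1 <= px <= x2 and y1 <= py <= y2) else 0.0

def gaussian_reward(px, py, box, alpha=0.5):
    """Smooth feedback that peaks at the box center and decays with distance.

    alpha is a hypothetical scaling factor (not taken from the paper): the
    standard deviation along each axis is alpha times the box size on that
    axis, so larger elements tolerate larger pixel offsets.
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    # Guard against degenerate (zero-width or zero-height) boxes.
    sigma_x = max(alpha * (x2 - x1), 1e-6)
    sigma_y = max(alpha * (y2 - y1), 1e-6)
    z = ((px - cx) / sigma_x) ** 2 + ((py - cy) / sigma_y) ** 2
    return math.exp(-0.5 * z)

if __name__ == "__main__":
    button = (100, 100, 200, 140)  # assumed pixel coordinates
    for click in [(150, 120), (205, 120), (300, 120)]:
        print(click, binary_reward(*click, button), round(gaussian_reward(*click, button), 3))
```

In this sketch a click just outside the button still earns partial credit, while the binary reward drops straight to zero, which is the kind of graded signal the summary describes.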
Why does it matter?
This matters because it helps AI learn to interact with user interfaces more efficiently and accurately, which enables more precise automation in tasks like controlling software or exploring digital environments.
Abstract
A new reward framework, GUI-G^2, models GUI elements as continuous Gaussian distributions to improve autonomous interaction through dense gradient signals, outperforming existing methods in spatial reasoning tasks.