UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience
Zichuan Lin, Feiyu Liu, Yijun Yang, Jiafei Lyu, Yiming Gao, Yicheng Liu, Zhicong Lu, Yangbin Yu, Mingyu Yang, Junyou Li, Deheng Ye, Jie Jiang
2026-03-26
Summary
This paper introduces UI-Voyager, a new AI system designed to automate tasks on smartphone apps, like a virtual assistant that can use apps for you.
What's the problem?
Current AI systems struggle to learn how to complete complex tasks on phones because it's hard for them to figure out *why* they failed when they don't get clear feedback, and learning from mistakes takes a lot of trial and error. It's like trying to learn a new game without knowing which moves were wrong and why. Existing methods aren't efficient at learning from these failed attempts, especially when the task requires many steps.
What's the solution?
UI-Voyager tackles this in two stages. First, it uses 'Rejection Fine-Tuning' (RFT), in which the training data and the model improve together in a fully autonomous loop, without needing a human to label anything. Second, it uses 'Group Relative Self-Distillation' (GRSD): the agent attempts the same task several times, finds the critical 'fork points' where a failed attempt diverged from a successful one, and uses the successful attempts to build step-by-step corrections for the failed ones. Essentially, it learns from its successes to correct its mistakes.
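The first stage can be illustrated with a minimal sketch. It assumes the common reading of rejection fine-tuning, where only rollouts judged successful are kept as training data for the next round; the `Trajectory` type and both function names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Trajectory:
    """One attempt at a task (hypothetical structure for illustration)."""
    task: str
    actions: List[str]
    success: bool


def rejection_filter(rollouts: List[Trajectory]) -> List[Trajectory]:
    """Keep only the rollouts that completed their task."""
    return [t for t in rollouts if t.success]


def to_training_examples(
    trajectories: List[Trajectory],
) -> List[Tuple[str, Tuple[str, ...], str]]:
    """Turn each kept trajectory into (task, action history, next action) examples."""
    examples = []
    for t in trajectories:
        for i, action in enumerate(t.actions):
            examples.append((t.task, tuple(t.actions[:i]), action))
    return examples


rollouts = [
    Trajectory("set an alarm", ["open_clock", "tap_add", "tap_save"], True),
    Trajectory("set an alarm", ["open_clock", "tap_back"], False),
]
kept = rejection_filter(rollouts)
examples = to_training_examples(kept)
```

In a loop of this kind, the model fine-tuned on `examples` would generate the next batch of rollouts, which is one way data and model could improve together.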
Why does it matter?
This research matters because it points toward AI agents that can automate phone tasks more effectively and efficiently, without requiring large amounts of manually labeled data. On the AndroidWorld benchmark, the 4B model reaches an 81.0% Pass@1 success rate, which the authors report exceeds human-level performance, a notable step toward genuinely helpful automated mobile assistants.
Abstract
Autonomous mobile GUI agents have attracted increasing attention along with the advancement of Multimodal Large Language Models (MLLMs). However, existing methods still suffer from inefficient learning from failed trajectories and ambiguous credit assignment under sparse rewards for long-horizon GUI tasks. To that end, we propose UI-Voyager, a novel two-stage self-evolving mobile GUI agent. In the first stage, we employ Rejection Fine-Tuning (RFT), which enables the continuous co-evolution of data and models in a fully autonomous loop. The second stage introduces Group Relative Self-Distillation (GRSD), which identifies critical fork points in group rollouts and constructs dense step-level supervision from successful trajectories to correct failed ones. Extensive experiments on AndroidWorld show that our 4B model achieves an 81.0% Pass@1 success rate, outperforming numerous recent baselines and exceeding human-level performance. Ablation and case studies further verify the effectiveness of GRSD. Our method represents a significant leap toward efficient, self-evolving, and high-performance mobile GUI automation without expensive manual data annotation.
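To make the fork-point idea concrete, here is a rough sketch of how a divergence between a successful and a failed rollout of the same task might be located. It assumes that observations can be compared by simple equality and that a fork is the first step where both rollouts see the same screen but choose different actions. These are simplifying assumptions for illustration, and `find_fork_point` and `corrective_example` are hypothetical names, not the paper's implementation.

```python
from typing import Dict, List, Optional, Tuple

# A step is (observation_id, action); both are plain strings in this sketch.
Step = Tuple[str, str]


def find_fork_point(success: List[Step], failure: List[Step]) -> Optional[int]:
    """Return the first index where both rollouts share an observation
    but take different actions, or None if no such step exists."""
    for i, (good, bad) in enumerate(zip(success, failure)):
        if good[0] != bad[0]:
            return None  # the rollouts no longer share a state to compare
        if good[1] != bad[1]:
            return i
    return None


def corrective_example(
    success: List[Step], failure: List[Step]
) -> Optional[Dict[str, str]]:
    """Build one step-level supervision record at the fork point."""
    i = find_fork_point(success, failure)
    if i is None:
        return None
    observation, good_action = success[i]
    return {
        "observation": observation,
        "target_action": good_action,
        "rejected_action": failure[i][1],
    }


success = [("home", "open_app"), ("app", "tap_search"), ("search", "type_query")]
failure = [("home", "open_app"), ("app", "tap_settings"), ("settings", "scroll")]
fork = find_fork_point(success, failure)
record = corrective_example(success, failure)
```

A record like this gives the failed rollout a training signal at one specific step, rather than a single success-or-failure reward for the whole episode.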