HiconAgent: History Context-aware Policy Optimization for GUI Agents

Xurui Zhou, Gongwei Chen, Yuquan Xie, Zaijing Li, Kaiwen Zhou, Shuai Wang, Shuo Yang, Zhuotao Tian, Rui Shao

2025-12-02

Summary

This paper introduces a new way to help computer agents interact with graphical user interfaces, like apps on your phone or computer, by allowing them to remember and use past actions more effectively.

What's the problem?

When you're trying to get a computer to automate tasks on a screen, it needs to remember what it's already done to make smart decisions. Simply feeding it *everything* it's ever done takes a lot of computing power and can confuse the agent with unimportant details. It's like trying to remember every single step you took today when you just need to recall what you did five minutes ago.

What's the solution?

The researchers created an agent called HiconAgent, trained with a technique called History Context-aware Policy Optimization, or HCPO. HCPO has two main parts. First, Dynamic Context Sampling (DCS) shows the agent histories of varying length during training, so it learns to rely on the most relevant context. Second, Anchor-guided History Compression (AHC) adds a compressed branch during policy updates that drops past observations while keeping past actions as anchors. An alignment loss ties the compressed and uncompressed branches together, so the agent uses history consistently while staying efficient.
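
To make the first part more concrete, here is a minimal sketch of what sampling variable-length histories could look like. It is not the authors' implementation: the `Step` representation, the function names, and the choice of a uniform distribution over history lengths are all assumptions made for illustration.

```python
import random
from typing import List, Tuple

# Hypothetical representation of one past step: (observation, action).
Step = Tuple[str, str]


def sample_history_window(history: List[Step], max_len: int,
                          rng: random.Random) -> List[Step]:
    """Keep only the k most recent steps, with k drawn at random.

    Assumption: k is uniform over 0..min(max_len, len(history)).
    The source does not state which distribution is used.
    """
    k = rng.randint(0, min(max_len, len(history)))
    # Slice from an explicit start index so that k == 0 yields an empty list
    # (history[-0:] would return the whole list instead).
    return history[len(history) - k:]


def sample_contexts(history: List[Step], num_rollouts: int, max_len: int,
                    seed: int = 0) -> List[List[Step]]:
    """Draw one history window per rollout, so rollouts see different context lengths."""
    rng = random.Random(seed)
    return [sample_history_window(history, max_len, rng)
            for _ in range(num_rollouts)]


history = [("screen_0", "open_app"), ("screen_1", "tap_search"),
           ("screen_2", "type_query"), ("screen_3", "tap_result")]
contexts = sample_contexts(history, num_rollouts=8, max_len=3)
```

Each entry in `contexts` is a suffix of the full history, so the agent always sees the most recent steps first and never sees more than `max_len` of them.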

Why does it matter?

This work matters because it enables smaller, faster AI agents that can still perform complex tasks on computers. Despite being smaller, HiconAgent-3B outperforms GUI-R1-7B on GUI-Odyssey (+8.46 percent grounding accuracy and +11.32 percent step success rate) and achieves comparable results on AndroidControl and AITW, with up to 2.47x computational speedup and 60 percent FLOPs reduction. That makes practical, accessible automation tools easier to build.

Abstract

Graphical User Interface (GUI) agents require effective use of historical context to perform sequential navigation tasks. While incorporating past actions and observations can improve decision making, naive use of full history leads to excessive computational overhead and distraction from irrelevant information. To address this, we introduce HiconAgent, a GUI agent trained with History Context-aware Policy Optimization (HCPO) for efficient and effective utilization of historical information. HCPO optimizes history usage in both sampling and policy updates through two complementary components: (1) Dynamic Context Sampling (DCS) presents the agent with variable length histories during sampling, enabling adaptive use of the most relevant context; (2) Anchor-guided History Compression (AHC) refines the policy update phase with a dual branch strategy where the compressed branch removes history observations while keeping history actions as information flow anchors. The compressed and uncompressed branches are coupled through a history-enhanced alignment loss to enforce consistent history usage while maintaining efficiency. Experiments on mainstream GUI navigation benchmarks demonstrate strong performance. Despite being smaller, HiconAgent-3B outperforms GUI-R1-7B by +8.46 percent grounding accuracy and +11.32 percent step success rate on GUI-Odyssey, while achieving comparable results on AndroidControl and AITW with up to 2.47x computational speedup and 60 percent FLOPs reduction.
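
The dual-branch idea in AHC can be illustrated with a small sketch. The abstract states only that the compressed branch removes history observations, keeps history actions, and is coupled to the uncompressed branch through an alignment loss. Everything else below is an assumption: the tagged-tuple context format, the function names, and the use of a KL divergence as a stand-in for the history-enhanced alignment loss, whose actual form is not given here.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical representation of one past step: (observation, action).
Step = Tuple[str, str]
# Hypothetical context entry: ("obs" or "act", content).
Entry = Tuple[str, str]


def build_full_context(history: List[Step], current_obs: str) -> List[Entry]:
    """Uncompressed branch: every past observation and action, then the current observation."""
    entries: List[Entry] = []
    for obs, act in history:
        entries.append(("obs", obs))
        entries.append(("act", act))
    entries.append(("obs", current_obs))
    return entries


def build_compressed_context(history: List[Step], current_obs: str) -> List[Entry]:
    """Compressed branch: past observations removed, past actions kept as anchors."""
    entries: List[Entry] = [("act", act) for _, act in history]
    entries.append(("obs", current_obs))
    return entries


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) between two discrete action distributions.

    Assumption: used here only as a placeholder for the alignment loss.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


history = [("screen_0", "open_app"), ("screen_1", "tap_search")]
full = build_full_context(history, "screen_2")
compressed = build_compressed_context(history, "screen_2")

# Made-up action distributions from the two branches, for illustration only.
p_full = [0.7, 0.2, 0.1]
p_compressed = [0.6, 0.3, 0.1]
alignment = kl_divergence(p_full, p_compressed)
```

With n past steps, the full context holds 2n + 1 entries and the compressed one n + 1. Since observations in GUI tasks are typically far larger than actions, dropping them is where the computational savings would come from, while a penalty on disagreement between the branches pushes the compressed branch to behave like the full one.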
