
A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression

Jincheng Ren, Siwei Wu, Yizhi Li, Kang Zhu, Shu Xu, Boyu Feng, Ruibin Yuan, Wei Zhang, Riza Batista-Navarro, Jian Yang, Chenghua Lin

2026-04-23


Summary

This paper addresses the challenge of making AI agents more efficient at complex, long-horizon tasks. Such agents rely on their interaction history to make decisions, but keeping everything they have seen consumes a lot of computation and money, and the cost grows rapidly as tasks get longer.

What's the problem?

When AI agents work on tasks over a long period, they need to remember past interactions to make good decisions. However, because the full history is re-sent to the model at every step, the cumulative cost grows quadratically with the number of steps, quickly becoming expensive in processing power and 'tokens' (units of text). Simply compressing this information with a fixed rule doesn't work well either: different terminal tasks need different details preserved, so a one-size-fits-all approach is ineffective.
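The quadratic growth can be seen with some simple arithmetic (illustrative numbers, not from the paper): if each step adds roughly `obs` tokens of raw observation and the full history is re-sent every step, the prompt at step i contains about i × obs tokens, so the total over n steps is obs × n(n+1)/2. The functions below compare that against compressing past observations to a fraction of their size:

```python
def total_tokens(n_steps: int, obs_tokens: int) -> int:
    """Cumulative prompt tokens when the full history is re-sent verbatim."""
    return sum(step * obs_tokens for step in range(1, n_steps + 1))

def total_tokens_compressed(n_steps: int, obs_tokens: int, keep_ratio: float) -> int:
    """Same, but each *past* observation is compressed to keep_ratio of its size;
    the newest observation is still seen in full."""
    compressed = int(obs_tokens * keep_ratio)
    return sum((step - 1) * compressed + obs_tokens for step in range(1, n_steps + 1))

# 50 steps, ~400 tokens of raw terminal output per step:
print(total_tokens(50, 400))                   # → 510000 (quadratic)
print(total_tokens_compressed(50, 400, 0.2))   # → 118000 (history kept at 20%)
```

Even this crude fixed-ratio compression cuts total cost more than fourfold; the paper's point is that *which* 20% to keep depends on the task, which is why a learned, task-aware rule beats a fixed heuristic.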

What's the solution?

The researchers developed TACO (Terminal Agent Compression), a self-evolving framework that automatically discovers and refines compression rules from an agent's own interaction trajectories, tailored to the task it's working on. It is plug-and-play: it can be added to existing AI agents without major changes. TACO essentially figures out which details are important and which can be safely discarded, improving efficiency without sacrificing performance.
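To make the "plug-and-play compression rules" idea concrete, here is a minimal sketch of a compressor that could wrap an existing agent loop. The class name, the trigger-to-summary rule format, and the head/tail fallback are illustrative assumptions, not TACO's actual learned rules:

```python
from dataclasses import dataclass, field

@dataclass
class ObservationCompressor:
    # Hypothetical rule store: a trigger substring maps to a short summary.
    rules: dict = field(default_factory=dict)
    max_lines: int = 5

    def compress(self, observation: str) -> str:
        # If a learned rule matches, replace the whole observation with its summary.
        for trigger, summary in self.rules.items():
            if trigger in observation:
                return summary
        lines = observation.splitlines()
        if len(lines) <= self.max_lines:
            return observation
        # Generic fallback: keep head and tail, drop the redundant middle.
        kept = lines[:2] + [f"... ({len(lines) - 4} lines omitted) ..."] + lines[-2:]
        return "\n".join(kept)

    def evolve(self, trigger: str, summary: str) -> None:
        """Add a rule mined from past trajectories (the self-evolution step)."""
        self.rules[trigger] = summary

comp = ObservationCompressor()
comp.evolve("BUILD SUCCESSFUL", "[build ok]")
print(comp.compress("BUILD SUCCESSFUL in 12s"))   # → [build ok]
long_log = "\n".join(f"line {i}" for i in range(20))
print(comp.compress(long_log))                    # 5 lines: head, "... omitted ...", tail
```

Because the compressor only rewrites observations before they enter the history, the agent framework itself is untouched, which is the sense in which such a component is "plug-and-play".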

Why it matters?

This work is important because it allows AI agents to tackle more complex and lengthy tasks without becoming prohibitively expensive. By making agents more efficient, TACO opens the door to more powerful AI systems for real-world problems that require long-term planning and reasoning, like software development, debugging, and other terminal-based problem solving.

Abstract

As model capabilities advance, research has increasingly shifted toward long-horizon, multi-turn terminal-centric agentic tasks, where raw environment feedback is often preserved in the interaction history to support future decisions. However, repeatedly retaining such feedback introduces substantial redundancy and causes cumulative token cost to grow quadratically with the number of steps, hindering long-horizon reasoning. Although observation compression can mitigate this issue, the heterogeneity of terminal environments makes heuristic-based or fixed-prompt methods difficult to generalize. We propose TACO, a plug-and-play, self-evolving Terminal Agent Compression framework that automatically discovers and refines compression rules from interaction trajectories for existing terminal agents. Experiments on TerminalBench (TB 1.0 and TB 2.0) and four additional terminal-related benchmarks (i.e., SWE-Bench Lite, CompileBench, DevEval, and CRUST-Bench) show that TACO consistently improves performance across mainstream agent frameworks and strong backbone models. With MiniMax-2.5, it improves performance on most benchmarks while reducing token overhead by around 10%. On TerminalBench, it brings consistent gains of 1%-4% across strong agentic models, and further improves accuracy by around 2%-3% under the same token budget. These results demonstrate the effectiveness and generalization of self-evolving, task-aware compression for terminal agents.