Combee: Scaling Prompt Learning for Self-Improving Language Model Agents

Hanchen Li, Runyuan He, Qizheng Zhang, Changxiu Ji, Qiuyang Mang, Xiaokun Chen, Lakshya A Agrawal, Wei-Liang Liao, Eric Yang, Alvin Cheung, James Zou, Kunle Olukotun, Ion Stoica, Joseph E. Gonzalez

2026-04-09


Summary

This paper introduces Combee, a system that helps AI agents improve faster by learning from their past runs. It targets 'prompt learning', in which an agent's instructions (its prompt) are revised based on experience instead of by changing the model's parameters, and makes that process faster and more efficient when many agents run at the same time.

What's the problem?

AI agents can already learn from their mistakes and successes to get better at tasks, but existing prompt learning methods (such as ACE and GEPA) were designed for a single agent or a small number of agents. That makes them slow at working through a large collection of agent traces, and simply running them with more parallelism degrades the quality of what is learned. It's like trying to teach a huge class: it becomes harder to give individual attention and ensure everyone understands.

What's the solution?

Combee solves this problem by running prompt learning in parallel, so many agents can contribute at the same time without degrading the result. It relies on three techniques: parallel scans that aggregate what was learned from all the agents' traces, an augmented shuffle mechanism that mixes the data before it is processed, and a dynamic batch size controller that adjusts how much data each learning step uses to balance quality against delay. Think of it as a well-organized study group where everyone contributes and learns efficiently.
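To make the idea of shuffling followed by parallel aggregation concrete, here is a minimal sketch. It is not Combee's implementation: the summary does not describe the mechanics, so the sketch assumes a plain seeded shuffle and a tree-style pairwise reduction (a simpler relative of a parallel scan). The names `shuffle_into_batches`, `tree_merge`, and `toy_merge` are hypothetical, and `toy_merge` stands in for what would be an LLM call that reconciles two sets of learned lessons.

```python
import random
from typing import Callable, List


def shuffle_into_batches(traces: List[str], batch_size: int, seed: int = 0) -> List[List[str]]:
    """Shuffle traces with a fixed seed, then split them into batches.

    Assumption: a plain shuffle; the paper's 'augmented shuffle' may do more.
    """
    rng = random.Random(seed)
    shuffled = list(traces)
    rng.shuffle(shuffled)
    return [shuffled[i:i + batch_size] for i in range(0, len(shuffled), batch_size)]


def tree_merge(items: List[str], merge: Callable[[str, str], str]) -> str:
    """Combine items pairwise, level by level, until one remains.

    The merges within a level do not depend on each other, so they could be
    run concurrently; this sketch runs them sequentially for simplicity.
    """
    if not items:
        raise ValueError("nothing to merge")
    level = list(items)
    while len(level) > 1:
        merged = [merge(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            merged.append(level[-1])  # odd item carries over to the next level
        level = merged
    return level[0]


def toy_merge(a: str, b: str) -> str:
    """Hypothetical stand-in for an LLM merge: union of lesson lines, deduplicated."""
    return "\n".join(sorted(set(a.split("\n")) | set(b.split("\n"))))


traces = ["lesson A", "lesson B", "lesson A", "lesson C", "lesson D"]
batches = shuffle_into_batches(traces, batch_size=2)
summaries = ["\n".join(sorted(set(batch))) for batch in batches]
playbook = tree_merge(summaries, toy_merge)
```

With n batch summaries, the tree needs about log2(n) rounds of merging instead of n - 1 sequential updates, which is where a parallel scheme of this kind can save wall-clock time.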

Why does it matter?

This research matters because it paves the way for more powerful and adaptable AI systems. As more agents are deployed to solve complex problems, learning quickly and efficiently from their combined experience becomes crucial. On AppWorld, Terminal-Bench, Formula, and FiNER, Combee reports up to a 17x speedup over previous methods with comparable or better accuracy and equivalent cost, which means agents can be improved faster without giving up quality.

Abstract

Recent advances in prompt learning allow large language model agents to acquire task-relevant knowledge from inference-time context without parameter changes. For example, existing methods (like ACE or GEPA) can learn system prompts to improve accuracy based on previous agent runs. However, these methods primarily focus on single-agent or low-parallelism settings. This fundamentally limits their ability to efficiently learn from a large set of collected agentic traces. It would be efficient and beneficial to run prompt learning in parallel to accommodate the growing trend of learning from many agentic traces or parallel agent executions. Yet without a principled strategy for scaling, current methods suffer from quality degradation with high parallelism. To improve both the efficiency and quality of prompt learning, we propose Combee, a novel framework to scale parallel prompt learning for self-improving agents. Combee speeds up learning and enables running many agents in parallel while learning from their aggregate traces without quality degradation. To achieve this, Combee leverages parallel scans and employs an augmented shuffle mechanism; Combee also introduces a dynamic batch size controller to balance quality and delay. Evaluations on AppWorld, Terminal-Bench, Formula, and FiNER demonstrate that Combee achieves up to 17x speedup over previous methods with comparable or better accuracy and equivalent cost.
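The abstract does not say how the dynamic batch size controller decides when to change the batch size. As a purely illustrative sketch, the class below assumes a simple rule: double the batch size while a validation score holds up, and halve it when the score drops by more than a tolerance. `BatchSizeController`, its fields, and the doubling and halving rule are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class BatchSizeController:
    """Hypothetical controller trading learning quality against delay."""
    batch_size: int = 8
    min_size: int = 1
    max_size: int = 64
    tolerance: float = 0.01  # score drop that counts as a real regression

    def update(self, prev_score: float, new_score: float) -> int:
        """Return the batch size to use for the next learning step."""
        if new_score + self.tolerance < prev_score:
            # Quality regressed: use smaller batches (slower, more careful).
            self.batch_size = max(self.min_size, self.batch_size // 2)
        else:
            # Quality held: use larger batches (more traces per step, less delay).
            self.batch_size = min(self.max_size, self.batch_size * 2)
        return self.batch_size


controller = BatchSizeController()
sizes = [
    controller.update(prev_score=0.50, new_score=0.60),  # improved: grow
    controller.update(prev_score=0.60, new_score=0.40),  # regressed: shrink
    controller.update(prev_score=0.40, new_score=0.40),  # unchanged: grow
]
```

Any real controller would need a quality signal to react to; the validation score used here is one assumed choice.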
