ParallelMuse: Agentic Parallel Thinking for Deep Information Seeking
Baixuan Li, Dingchu Zhang, Jialong Wu, Wenbiao Yin, Zhengwei Tao, Yida Zhao, Liwen Zhang, Haiyang Shen, Runnan Fang, Pengjun Xie, Jingren Zhou, Yong Jiang
2025-10-29
Summary
This paper introduces ParallelMuse, a method for improving how AI agents explore information and solve complex problems. It builds on the idea of 'parallel thinking', in which an AI considers multiple approaches at once, but addresses key limitations in how existing methods apply it.
What's the problem?
When AI agents try to solve complex problems by exploring different ideas in parallel, they run into two main issues. First, it is inefficient: the agent repeatedly starts its reasoning from scratch for each new idea, even when earlier attempts share the same initial steps. Second, it is hard to combine long, complex lines of reasoning into a final answer, because the model can only hold a limited amount of context at a time and therefore loses important information from the full exploration.
What's the solution?
ParallelMuse tackles these problems in two stages. First, it partitions the agent's generated reasoning into functional parts and reuses successful reasoning paths instead of always starting over, branching off from an existing path at the points where the model is most uncertain. Second, it compresses the answer-relevant parts of the reasoning, removing redundancy across parallel paths, so the agent can synthesize a final answer that reflects the full exploration without exceeding its context limit.
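The reuse-and-branch idea can be sketched in a few lines. The following is a toy illustration, not the paper's implementation: the function names (`branch_point`, `reuse_and_branch`), the log-probability-based uncertainty proxy, and the threshold value are all assumptions made for the example.

```python
import math

# Hypothetical sketch of uncertainty-guided path reuse and branching.
# A trajectory is a list of (step_text, avg_token_logprob) pairs; we reuse
# the confident prefix and branch at the first high-uncertainty step instead
# of rolling out each parallel attempt from scratch.

def branch_point(trajectory, threshold=0.5):
    """Index of the first step whose uncertainty exceeds the threshold.

    Uncertainty here is a toy proxy: 1 - exp(mean token log-prob).
    """
    for i, (_, avg_logprob) in enumerate(trajectory):
        uncertainty = 1.0 - math.exp(avg_logprob)
        if uncertainty > threshold:
            return i
    return len(trajectory)  # fully confident: the whole path is reusable

def reuse_and_branch(trajectory, n_branches=3, threshold=0.5):
    """Reuse the confident prefix and plan n_branches continuations from it."""
    cut = branch_point(trajectory, threshold)
    prefix = trajectory[:cut]
    # Each branch would be completed by a fresh model rollout from the shared
    # prefix; here we simply return the prefix each planned branch starts from.
    return [list(prefix) for _ in range(n_branches)]

traj = [("search: topic A", -0.05), ("read doc 1", -0.10), ("infer claim", -1.20)]
branches = reuse_and_branch(traj)
print(len(branches), len(branches[0]))  # 3 branches sharing a 2-step prefix
```

The efficiency gain comes from the shared prefix: only the uncertain suffix of each trajectory costs new generation, which is where the abstract's 10-30% reduction in exploratory tokens would come from.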
Why it matters?
This research matters because it substantially boosts the performance of AI agents, showing improvements of up to 62%, while also making them more efficient by cutting exploratory token consumption by 10-30%. This means AI can solve harder problems more effectively and at lower computational cost, bringing us closer to more capable and practical AI systems.
Abstract
Parallel thinking expands exploration breadth, complementing the deep exploration of information-seeking (IS) agents to further enhance problem-solving capability. However, conventional parallel thinking faces two key challenges in this setting: inefficiency from repeatedly rolling out from scratch, and difficulty in integrating long-horizon reasoning trajectories during answer generation, as limited context capacity prevents full consideration of the reasoning process. To address these issues, we propose ParallelMuse, a two-stage paradigm designed for deep IS agents. The first stage, Functionality-Specified Partial Rollout, partitions generated sequences into functional regions and performs uncertainty-guided path reuse and branching to enhance exploration efficiency. The second stage, Compressed Reasoning Aggregation, exploits reasoning redundancy to losslessly compress information relevant to answer derivation and synthesize a coherent final answer. Experiments across multiple open-source agents and benchmarks demonstrate up to 62% performance improvement with a 10-30% reduction in exploratory token consumption.
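The second stage exploits the fact that parallel trajectories often re-derive the same facts. The sketch below is a hypothetical stand-in for Compressed Reasoning Aggregation: the function name `compress_trajectories`, the whitespace-and-case normalization, and the sample data are illustrative assumptions, not the paper's method.

```python
# Toy illustration of compressing redundant reasoning before aggregation.
# Parallel trajectories frequently rediscover the same evidence; keeping only
# the first occurrence of each fact shrinks the context the answer-synthesis
# model must consume, without losing any distinct answer-relevant information.

def compress_trajectories(trajectories):
    """Merge answer-relevant steps from all trajectories, dropping duplicates."""
    seen = set()
    merged = []
    for traj in trajectories:
        for step in traj:
            key = " ".join(step.lower().split())  # normalize whitespace/case
            if key not in seen:
                seen.add(key)
                merged.append(step)
    return merged

paths = [
    ["Founded in 1998", "HQ in Mountain View"],
    ["Founded in 1998", "Revenue grew in 2004"],
]
print(compress_trajectories(paths))
# ['Founded in 1998', 'HQ in Mountain View', 'Revenue grew in 2004']
```

In this toy form the compression is lossless with respect to the distinct facts collected, mirroring the abstract's claim that redundancy, not content, is what gets removed.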