Lost in the Noise: How Reasoning Models Fail with Contextual Distractors
Seongyun Lee, Yongrae Jo, Minju Seo, Moontae Lee, Minjoon Seo
2026-01-13
Summary
This paper investigates how well current AI models, especially those that use information from outside sources, handle messy or distracting information. It shows that these models often struggle significantly when given irrelevant or misleading data, and proposes a new way to train them to be more reliable.
What's the problem?
AI systems increasingly rely on external information to improve their reasoning and problem-solving abilities. However, the real world is full of noise: irrelevant details, incorrect information, and confusing data. Existing tests for these AI models don't reflect this reality, making them seem more robust than they actually are. The core issue is that when presented with distracting information, even the best AI models can suffer performance drops of up to 80%.
What's the solution?
The researchers created a new benchmark called NoisyBench, which spans 11 datasets covering retrieval-augmented generation (RAG), reasoning, alignment, and tool-use tasks, each designed to measure how AI models handle a different kind of noise. Current models tested on it consistently struggled. The researchers then introduced a new training method called Rationale-Aware Reward (RARE), which rewards the model not just for a correct answer but for identifying *why* certain information is helpful, so it learns to ignore the noise. RARE proved much more effective at improving performance in noisy situations than other common training techniques, such as supervised fine-tuning or outcome-only reinforcement learning.
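To make the idea concrete, here is a minimal sketch of what a rationale-aware reward could look like. The paper only describes RARE at a high level (rewarding the identification of helpful information within noise), so everything below, including the function name, the F1-style rationale score, and the weighting, is an illustrative assumption, not the paper's actual implementation:

```python
def rationale_aware_reward(
    answer_correct: bool,
    cited_passages: set[int],   # passage indices the model's rationale cites
    gold_passages: set[int],    # passages known to contain real evidence
    rationale_weight: float = 0.5,
) -> float:
    """Hypothetical RARE-style reward: combine an outcome reward with a
    bonus for citing the genuinely helpful passages (and not distractors)."""
    outcome = 1.0 if answer_correct else 0.0
    if not cited_passages or not gold_passages:
        rationale = 0.0
    else:
        # F1 between cited and gold passages: rewards finding the signal
        # and penalizes leaning on distractor passages.
        overlap = len(cited_passages & gold_passages)
        precision = overlap / len(cited_passages)
        recall = overlap / len(gold_passages)
        rationale = (
            2 * precision * recall / (precision + recall)
            if overlap > 0 else 0.0
        )
    return outcome + rationale_weight * rationale
```

Under this sketch, a correct answer grounded only in distractors scores lower than one grounded in the true evidence, which is the pressure that plain outcome-only rewards cannot apply.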
Why it matters?
This research is important because it highlights a major weakness in current AI systems. As we rely more on AI for complex tasks, it's crucial that they can handle real-world data, which is rarely clean and perfect. Understanding how noise affects AI performance and developing methods like RARE to improve robustness is essential for building trustworthy and reliable AI agents that can function effectively in unpredictable environments.
Abstract
Recent advances in reasoning models and agentic AI systems have led to an increased reliance on diverse external information. However, this shift introduces input contexts that are inherently noisy, a reality that current sanitized benchmarks fail to capture. We introduce NoisyBench, a comprehensive benchmark that systematically evaluates model robustness across 11 datasets in RAG, reasoning, alignment, and tool-use tasks against diverse noise types, including random documents, irrelevant chat histories, and hard negative distractors. Our evaluation reveals a catastrophic performance drop of up to 80% in state-of-the-art models when faced with contextual distractors. Crucially, we find that agentic workflows often amplify these errors by over-trusting noisy tool outputs, and distractors can trigger emergent misalignment even without adversarial intent. We find that prompting, context engineering, SFT, and outcome-reward only RL fail to ensure robustness; in contrast, our proposed Rationale-Aware Reward (RARE) significantly strengthens resilience by incentivizing the identification of helpful information within noise. Finally, we uncover an inverse scaling trend where increased test-time computation leads to worse performance in noisy settings and demonstrate via attention visualization that models disproportionately focus on distractor tokens, providing vital insights for building the next generation of robust, reasoning-capable agents.