
ERGO: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models

Haziq Mohammad Khalid, Athikash Jeyaganthan, Timothy Do, Yicheng Fu, Sean O'Brien, Vasu Sharma, Kevin Zhu

2025-10-20


Summary

This paper addresses a problem with how large language models, or LLMs, perform in extended conversations where information is given bit by bit, rather than all at once.

What's the problem?

LLMs get noticeably worse at responding accurately as a conversation goes on and new information is added gradually. Think about trying to give an AI instructions one step at a time – it starts to struggle to keep everything straight. This is a big issue because most real-world interactions with AI are actually these kinds of ongoing conversations, not just single questions and answers.

What's the solution?

The researchers noticed that when an LLM is becoming confused, its internal 'guesswork' about what word comes next gets more spread out, meaning it's less certain. Formally, this spread is measured as the Shannon entropy of the model's next-token distribution. They created a system called ERGO that continuously monitors this uncertainty. When ERGO detects a sudden jump in entropy, it 'resets' the conversation by consolidating the context into a fresh prompt, helping the LLM refocus. Instead of treating uncertainty as a nuisance to eliminate, ERGO uses it as a signal for when to intervene.
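To make the mechanism concrete, here is a minimal sketch of entropy-spike detection in Python. The entropy calculation is standard Shannon entropy; the `should_reset` rule (a 1.5x jump over the running average of recent turns) and all names are illustrative assumptions, not the paper's actual trigger criterion.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def should_reset(entropy_history, new_entropy, spike_ratio=1.5):
    """Hypothetical spike rule: trigger a context reset when entropy jumps
    well above the average of previous turns. The 1.5x threshold is an
    illustrative placeholder, not ERGO's published criterion."""
    if not entropy_history:
        return False
    avg = sum(entropy_history) / len(entropy_history)
    return new_entropy > spike_ratio * avg

# Example: a confident (peaked) distribution vs. a spread-out (uncertain) one
confident = [0.9, 0.05, 0.03, 0.02]
uncertain = [0.25, 0.25, 0.25, 0.25]
history = [shannon_entropy(confident)]
print(should_reset(history, shannon_entropy(uncertain)))  # prints True
```

In a real pipeline, the probabilities would come from the model's logits at each generated token, and a positive trigger would kick off the prompt-consolidation step rather than a simple print.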

Why it matters?

This work is important because it significantly improves both the accuracy and consistency of LLMs in multi-turn conversations. With ERGO, models achieved a 56.6% average performance gain over standard baselines, a 24.7% increase in peak capability, and a 35.3% reduction in performance variability. This means we can build AI systems that are better at handling complex, ongoing interactions, making them more useful in everyday life.

Abstract

Large Language Models (LLMs) suffer significant performance degradation in multi-turn conversations when information is presented incrementally. Given that multi-turn conversations characterize everyday interactions with LLMs, this degradation poses a severe challenge to real-world usability. We hypothesize that abrupt increases in model uncertainty signal misalignment in multi-turn LLM interactions, and we exploit this insight to dynamically realign conversational context. We introduce ERGO (Entropy-guided Resetting for Generation Optimization), which continuously quantifies internal uncertainty via Shannon entropy over next-token distributions and triggers adaptive prompt consolidation when a sharp spike in entropy is detected. By treating uncertainty as a first-class signal rather than a nuisance to eliminate, ERGO embraces variability in language and modeling, representing and responding to uncertainty. In multi-turn tasks with incrementally revealed instructions, ERGO yields a 56.6% average performance gain over standard baselines, increases aptitude (peak performance capability) by 24.7%, and decreases unreliability (variability in performance) by 35.3%, demonstrating that uncertainty-aware interventions can improve both accuracy and reliability in conversational AI.
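For reference, the Shannon entropy the abstract refers to is computed over the model's next-token distribution. In the notation below (chosen for illustration; the paper may use different symbols), \(p_t(v)\) is the probability the model assigns to vocabulary token \(v\) at generation step \(t\):

```latex
H_t = -\sum_{v \in V} p_t(v) \log p_t(v)
```

A peaked distribution (the model is confident about the next token) gives low \(H_t\); a near-uniform distribution gives \(H_t\) close to \(\log |V|\), and a sharp rise in this quantity is what triggers ERGO's context reset.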