QueryBandits for Hallucination Mitigation: Exploiting Semantic Features for No-Regret Rewriting
Nicole Cho, William Watson, Alec Koppel, Sumitra Ganesh, Manuela Veloso
2025-08-27
Summary
This paper addresses the problem of Large Language Models (LLMs) making things up, also known as 'hallucinations'. It proposes a new method to reduce these hallucinations *before* the LLM even generates an answer, by cleverly rewriting the questions asked of it.
What's the problem?
As LLMs get better at complex reasoning, they also become more prone to hallucinating – confidently stating incorrect or nonsensical information. Current approaches mostly try to *detect* and filter out these hallucinations after they’ve been created. This paper argues that it’s better to prevent them in the first place by changing the way questions are asked, but figuring out *how* to change the questions is difficult because what works for one question might not work for another.
What's the solution?
The researchers developed a system called QueryBandits. It learns which types of question rewrites are most likely to reduce hallucinations. It analyzes different aspects of a question, such as the specific words used, the sentence structure, and the overall complexity, and then tries different rewrites, like paraphrasing or expanding on the question. The system uses a 'bandit' algorithm, a smart trial-and-error process, to figure out which rewrites work best for different kinds of questions. It doesn't require retraining the LLM itself; it only adjusts the input.
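To make the bandit mechanism concrete, here is a minimal sketch of how a contextual Thompson Sampling bandit could pick a rewrite strategy from query features. The class name, arm list, priors, and reward values are illustrative assumptions; only the overall scheme (per-arm linear regression over query features, Thompson Sampling for arm selection) follows the paper's description.

```python
# Sketch, not the paper's implementation: per-arm Bayesian linear regression
# over linguistic query features, with Thompson Sampling for arm selection.
import numpy as np

ARMS = ["no_rewrite", "paraphrase", "expand"]  # hypothetical subset of rewrite strategies
N_FEATURES = 17  # the paper uses 17 linguistic features of the input query


class LinearThompsonSamplingBandit:
    def __init__(self, arms, n_features, prior_var=1.0, noise_var=0.25):
        self.arms = arms
        # Per-arm regression state: precision matrix A and response vector b.
        self.A = {a: np.eye(n_features) / prior_var for a in arms}
        self.b = {a: np.zeros(n_features) for a in arms}
        self.noise_var = noise_var

    def select_arm(self, features):
        """Sample a weight vector from each arm's posterior and pick the best-scoring arm."""
        scores = {}
        for a in self.arms:
            A_inv = np.linalg.inv(self.A[a])
            w = np.random.multivariate_normal(A_inv @ self.b[a], self.noise_var * A_inv)
            scores[a] = features @ w
        return max(scores, key=scores.get)

    def update(self, arm, features, reward):
        """Rank-one update of the chosen arm's regression statistics."""
        self.A[arm] += np.outer(features, features)
        self.b[arm] += reward * features


# Usage: featurize the query, pick a rewrite strategy, observe a hallucination-based
# reward (e.g. 1 if the rewritten query yields a faithful answer), and update the arm.
bandit = LinearThompsonSamplingBandit(ARMS, N_FEATURES)
query_features = np.random.rand(N_FEATURES)  # stand-in for the linguistic features
arm = bandit.select_arm(query_features)
bandit.update(arm, query_features, reward=1.0)  # stand-in for the reward model's signal
```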
Why it matters?
This work is important because it shows that proactively rewriting questions can significantly reduce hallucinations in LLMs, much more effectively than simply asking the model to 'paraphrase' or 'expand' the question. It also demonstrates that there isn't a single 'best' way to rewrite questions; the optimal approach depends on the specific question being asked. This means we can build systems that adapt to different queries and consistently get more reliable answers from LLMs without needing to constantly update the model itself.
Abstract
Advanced reasoning capabilities in Large Language Models (LLMs) have increased the prevalence of hallucinations, yet most mitigation work focuses on after-the-fact filtering rather than shaping the queries that trigger them. We introduce QueryBandits, a bandit framework that designs rewrite strategies to maximize a reward model encapsulating hallucination propensity based on the sensitivities of 17 linguistic features of the input query, thereby proactively steering LLMs away from generating hallucinations. Across 13 diverse QA benchmarks and 1,050 lexically perturbed queries per dataset, our top contextual QueryBandit (Thompson Sampling) achieves an 87.5% win rate over a no-rewrite baseline and also outperforms zero-shot static prompting ("paraphrase" or "expand") by 42.6% and 60.3%, respectively. We therefore empirically substantiate the effectiveness of QueryBandits in mitigating hallucination via an intervention that takes the form of a query rewrite. Interestingly, certain static prompting strategies, which constitute a considerable portion of the current query rewriting literature, incur higher cumulative regret than the no-rewrite baseline, signifying that static rewrites can worsen hallucination. Moreover, the converged per-arm regression feature weight vectors substantiate that no single rewrite strategy is optimal for all queries. In this context, guided rewriting that exploits semantic features with QueryBandits can induce significant shifts in output behavior through forward-pass mechanisms alone, bypassing the need for retraining or gradient-based adaptation.
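For intuition on the cumulative-regret comparison in the abstract, the sketch below shows one way regret could be tracked for a static rewrite policy versus a per-query oracle. The reward table and policies are made-up placeholders; in the paper, rewards come from a reward model that encapsulates hallucination propensity.

```python
# Illustrative cumulative-regret bookkeeping for comparing rewrite policies.
# All numbers below are synthetic placeholders, not the paper's evaluation data.
import numpy as np


def cumulative_regret(rewards_per_arm, chosen_arms):
    """rewards_per_arm: (T, K) realized rewards for every arm at each step.
    chosen_arms: length-T array of arm indices actually chosen by the policy."""
    oracle = rewards_per_arm.max(axis=1)  # best achievable reward per query
    chosen = rewards_per_arm[np.arange(len(chosen_arms)), chosen_arms]
    return np.cumsum(oracle - chosen)  # running total of missed reward


T, K = 1000, 3  # queries x rewrite arms
rng = np.random.default_rng(0)
rewards = rng.binomial(1, [0.55, 0.60, 0.50], size=(T, K)).astype(float)

static_policy = np.full(T, 1)             # always apply the same rewrite
adaptive_policy = rewards.argmax(axis=1)  # idealized per-query contextual policy

print("static policy regret:  ", cumulative_regret(rewards, static_policy)[-1])
print("adaptive policy regret:", cumulative_regret(rewards, adaptive_policy)[-1])
```

A static policy accumulates regret whenever a different rewrite (or no rewrite) would have done better for a given query, which is the failure mode the abstract attributes to fixed prompting strategies.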