A Simple "Try Again" Can Elicit Multi-Turn LLM Reasoning

Licheng Liu, Zihan Wang, Linjie Li, Chenwei Xu, Yiping Lu, Han Liu, Avirup Sil, Manling Li

2025-07-22

Summary

This paper presents a simple method that helps large language models reason through problems step-by-step across a conversation: the models are trained to "try again" when their first answer is incorrect.

What's the problem?

The problem is that AI models often fail to produce accurate, complete answers on multi-step reasoning tasks, especially when the conversation requires several turns to reach the correct conclusion and the model receives little guidance about what went wrong.

What's the solution?

The authors trained the models with multi-turn reinforcement learning using unary feedback: after each incorrect attempt, the model is told only that its answer was wrong and is prompted to try again. Rewarding the model for eventually reaching the correct answer over multiple turns leads to better reasoning and more accurate final results.
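The multi-turn loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `solve` is a hypothetical stand-in for an LLM call, and the reward scheme (1 for success within the turn budget, 0 otherwise) is an assumption for clarity.

```python
import random

def solve(problem, history):
    """Hypothetical stand-in for a language-model call that
    returns a candidate answer given the conversation so far."""
    return random.choice(["4", "5"])

def multi_turn_rollout(problem, answer, max_turns=3):
    """Collect one multi-turn episode with unary feedback:
    on a wrong attempt the model is told only 'try again',
    with no hint about *why* the attempt failed."""
    history = [problem]
    for _ in range(max_turns):
        attempt = solve(problem, history)
        if attempt == answer:
            return history + [attempt], 1.0  # success within budget
        history += [attempt, "Your answer is incorrect. Try again."]
    return history, 0.0  # turn budget exhausted without success

random.seed(0)
trajectory, reward = multi_turn_rollout("What is 2 + 2?", "4")
```

Trajectories and rewards collected this way would then feed a standard reinforcement-learning update; the key point is that the retry prompt carries no task-specific information, so the model must improve its own reasoning rather than exploit hints.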

Why it matters?

This matters because it makes AI assistants more reliable on complex questions and tasks that need careful, step-by-step reasoning, improving their usefulness in real multi-turn conversations.

Abstract

Training large reasoning models with multi-turn reinforcement learning using unary feedback improves both single-turn performance and multi-turn reasoning accuracy.