
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

Tianyu Fu, Yi Ge, Yichen You, Enshu Liu, Zhihang Yuan, Guohao Dai, Shengen Yan, Huazhong Yang, Yu Wang

2025-05-29


Summary

This paper introduces R2R, a new method that helps AI systems work faster and smarter by deciding when a small, quick model is enough and when to call in a bigger, more powerful model for tough reasoning steps.

What's the problem?

The problem is that large language models are really good at solving complicated problems, but they use a lot of computing power and can be slow, while smaller models are faster but not as capable on tricky questions. Using only one type means you either waste resources or miss out on the best answers.

What's the solution?

To fix this, the researchers created a system that routes each token of the response to either a small or a large model depending on how hard it is. If the next step is simple, the small model handles it; if it is a critical point where the small model's reasoning would diverge from the large model's, the system hands that token to the large model. This makes the whole process more efficient without losing accuracy.
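The routing idea can be sketched in a few lines. This is a toy illustration, not the paper's implementation: the two model functions and the difficulty test below are hypothetical stand-ins for a real small LM, a real large LM, and R2R's trained router.

```python
# Toy sketch of token-level small/large routing in the spirit of R2R.
# All functions here are illustrative stand-ins, not the paper's code.

def small_model_next_token(prefix):
    # Stand-in for a lightweight LM: returns a canned continuation.
    canned = {"2+2=": "4", "The capital of France is": " Paris"}
    return canned.get(prefix, "?")

def large_model_next_token(prefix):
    # Stand-in for the expensive LM, consulted only on "hard" tokens.
    # In this toy it behaves like the small model.
    return small_model_next_token(prefix)

def router_is_hard(prefix, proposed_token):
    # Stand-in difficulty test. R2R instead uses a learned router that
    # predicts whether the small model's token would send the reasoning
    # down a different path than the large model's; here we simply flag
    # tokens the small model could not produce.
    return proposed_token == "?"

def generate_token(prefix):
    # Let the small model propose a token; escalate only when needed.
    token = small_model_next_token(prefix)
    if router_is_hard(prefix, token):
        token = large_model_next_token(prefix)  # escalate this token only
    return token
```

In this sketch most tokens stay on the cheap path, and the large model is invoked only at the points the router marks as decisive, which is where the efficiency gain comes from.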

Why it matters?

This is important because it means we can build AI that is both fast and smart, saving energy and time while still getting great results. That is useful for everything from chatbots to research tools.

Abstract

Roads to Rome (R2R) selectively utilizes large language models for critical reasoning tasks to enhance efficiency and performance in lightweight models.