AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization

Haotian Luo, Haiying He, Yibo Wang, Jinluan Yang, Rui Liu, Naiqiang Tan, Xiaochun Cao, Dacheng Tao, Li Shen

2025-05-02


Summary

This paper introduces AdaR1, a framework that combines long and short chain-of-thought (step-by-step) reasoning to solve math problems more efficiently.

What's the problem?

Models that solve math problems with long, detailed step-by-step reasoning are slow and expensive to run, while models that reason briefly are faster but can be less accurate or thorough.

What's the solution?

The researchers built a two-stage framework that teaches the model to choose when to reason at length and when to keep its reasoning short. It is trained with bi-level preference learning, which helps it balance inference cost against accuracy.
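
To make the idea of choosing a reasoning style per problem concrete, here is a toy sketch. It is not the paper's method: the names `Sample`, `accuracy`, `preferred_style`, and `margin` are hypothetical, and the rule itself (prefer short reasoning unless long reasoning is measurably more accurate on that problem) is an assumption for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    # Hypothetical record of one sampled solution to a single problem.
    style: str     # "long" or "short" chain-of-thought
    correct: bool  # whether the final answer was right


def accuracy(samples: List[Sample], style: str) -> float:
    """Fraction of correct samples for one reasoning style (0.0 if none)."""
    group = [s for s in samples if s.style == style]
    return sum(s.correct for s in group) / len(group) if group else 0.0


def preferred_style(samples: List[Sample], margin: float = 0.0) -> str:
    """Prefer short reasoning unless long reasoning beats it by more than `margin`."""
    gain = accuracy(samples, "long") - accuracy(samples, "short")
    return "long" if gain > margin else "short"


# Both styles solve the easy problem, so the cheaper short style is preferred.
easy = [Sample("long", True), Sample("short", True)]
# Only long reasoning solves the hard problem, so long is preferred.
hard = [Sample("long", True), Sample("short", False)]

print(preferred_style(easy), preferred_style(hard))  # prints: short long
```

In a preference-learning setup, labels like these could serve as the "preferred" side of training pairs; how AdaR1 actually builds its pairs is not described in this summary.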

Why does it matter?

This matters because it makes AI better at solving math problems quickly without losing quality, which is helpful for students, teachers, and anyone who needs fast, reliable answers.

Abstract

A two-stage adaptive reasoning framework combining long and short chain-of-thought models, trained with bi-level preference learning, reduces inference costs while maintaining performance across mathematical datasets.
