Thinkless: LLM Learns When to Think
Gongfan Fang, Xinyin Ma, Xinchao Wang
2025-05-20

Summary
This paper introduces Thinkless, a framework that lets a large language model decide for itself whether to answer with a short response or a long chain of reasoning, depending on how hard the problem is.
What's the problem?
Reasoning language models apply the same lengthy chain-of-thought to every query, regardless of difficulty. This wastes compute and adds latency, since many questions could be answered correctly with far less effort.
What's the solution?
To fix this, the researchers train the model to emit a special control token at the start of its response, which selects either a concise answer or a detailed chain-of-thought, depending on what the question demands. The model thereby saves time and compute on easy tasks while remaining thorough on harder ones.
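The routing idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: the token names `<short>` and `<think>`, the length-based stub policy, and the token budgets are all assumptions made for the example; in Thinkless the trained model itself emits the control token as its first decoding step.

```python
# Sketch of control-token routing at inference time.
# Assumed token names and budgets; the stub policy stands in for a
# trained model's own first-token prediction.

SHORT, THINK = "<short>", "<think>"

def generate_control_token(prompt: str) -> str:
    """Stub policy: a trained model would emit this token itself.
    Here we route by a crude difficulty proxy (prompt word count)."""
    return THINK if len(prompt.split()) > 12 else SHORT

def answer(prompt: str,
           max_short_tokens: int = 64,
           max_think_tokens: int = 2048) -> dict:
    """Pick a decoding budget based on the emitted control token."""
    mode = generate_control_token(prompt)
    budget = max_think_tokens if mode == THINK else max_short_tokens
    # A real system would now decode up to `budget` tokens in the
    # chosen mode; we just report the routing decision.
    return {"mode": mode, "token_budget": budget}
```

For example, `answer("What is 2 + 2?")` routes to the short mode with the small budget, while a long multi-clause word problem routes to the thinking mode with the large budget.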
Why it matters?
This matters because it makes AI systems faster and more efficient: they can serve more queries without spending unnecessary compute, which becomes increasingly important as these models are deployed in real-world applications.
Abstract
Thinkless enables LLMs to adaptively choose between short-form and long-form reasoning via control tokens, substantially reducing computational inefficiency on reasoning benchmarks.