REST: Stress Testing Large Reasoning Models by Asking Multiple Problems at Once
Zhuoshi Pan, Qizhi Pei, Yu Li, Qiyao Sun, Zinan Tang, H. Vicky Zhao, Conghui He, Lijun Wu
2025-07-15
Summary
This paper introduces REST, a stress-testing framework for evaluating how well large reasoning models perform when asked to solve multiple problems in a single prompt instead of one at a time.
What's the problem?
Most current benchmarks present models with one problem at a time. This fails to reflect real-world use, where multiple questions or tasks often arrive together, and it hides how models degrade or make mistakes under that added load.
What's the solution?
The researchers created REST to combine multiple questions into a single prompt and ask the model to answer all of them at once. This stress test reveals how models perform under a higher reasoning load and exposes differences between models that appear similar on single-question benchmarks.
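To make the idea concrete, here is a minimal sketch of how several independent problems could be combined into one prompt. The helper name `build_rest_prompt` and the exact instruction wording are illustrative assumptions; the paper's actual prompt template may differ.

```python
def build_rest_prompt(questions):
    """Concatenate several independent problems into a single prompt,
    asking the model to answer each one in order.
    (Hypothetical helper; not the paper's exact template.)"""
    header = (
        "Solve each of the following problems. "
        "Label your answers 'Answer 1:', 'Answer 2:', and so on.\n\n"
    )
    # Number each problem so answers can be matched back to questions.
    body = "\n\n".join(
        f"Problem {i + 1}: {q}" for i, q in enumerate(questions)
    )
    return header + body

prompt = build_rest_prompt([
    "What is 17 * 23?",
    "A train travels 60 km in 45 minutes. What is its speed in km/h?",
])
```

The resulting prompt is then sent to the model as a single request, so its accuracy can be compared against answering the same questions one by one.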
Why does it matter?
REST helps uncover the real strengths and weaknesses of reasoning models, making it possible to improve them for real-world scenarios where problems are more complex and arrive simultaneously.
Abstract
REST, a stress-testing framework, evaluates large reasoning models under simultaneous multi-problem conditions, revealing performance differences and insights into model behavior under real-world reasoning demands.