Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback
Jiakang Yuan, Xiangchao Yan, Botian Shi, Tao Chen, Wanli Ouyang, Bo Zhang, Lei Bai, Yu Qiao, Bowen Zhou
2025-01-08

Summary
This paper introduces Dolphin, an AI framework that carries out scientific research in a closed loop: it generates ideas, runs experiments, and learns from the results.
What's the problem?
Scientific research is time-consuming and complex. Even with AI helping in some areas, there hasn't been a system that can handle the entire research process from start to finish without human intervention.
What's the solution?
The researchers created Dolphin, an AI system that works in a loop. It starts by generating new research ideas based on existing papers. Then, it writes and debugs code to test these ideas. Finally, it analyzes the results and uses what it learned to come up with even better ideas for the next round. This process repeats, allowing Dolphin to continuously improve its research.
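The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Dolphin's implementation: the names `research_loop`, `Result`, `toy_generate`, and `toy_experiment` are hypothetical, and the idea generator and experiment runner are toy stand-ins for the LLM-driven components the paper describes.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Result:
    idea: str
    score: float
    improved: bool  # did the idea beat the baseline score?


def research_loop(
    generate_ideas: Callable[[List[Result]], List[str]],
    run_experiment: Callable[[str], float],
    baseline: float,
    rounds: int,
) -> List[Result]:
    """Run generate -> experiment -> feedback for a fixed number of rounds."""
    history: List[Result] = []
    for _ in range(rounds):
        # 1. Thinking: propose ideas, conditioned on all earlier results.
        ideas = generate_ideas(history)
        for idea in ideas:
            # 2. Practice: run the experiment for this idea.
            score = run_experiment(idea)
            # 3. Feedback: record the outcome so the next round can use it.
            history.append(Result(idea, score, score > baseline))
    return history


# Toy stand-ins (hypothetical): ideas are short strings, and the
# "experiment" simply scores an idea by its length.
def toy_generate(history: List[Result]) -> List[str]:
    tried = {r.idea for r in history}
    candidates = ["a", "bb", "ccc", "dddd"]
    # Skip ideas that were already tried; propose at most two per round.
    return [c for c in candidates if c not in tried][:2]


def toy_experiment(idea: str) -> float:
    return float(len(idea))


results = research_loop(toy_generate, toy_experiment, baseline=2.0, rounds=2)
for r in results:
    print(r.idea, r.score, r.improved)
```

The point of the sketch is the data flow: the full history of results is passed back into idea generation, so each round can avoid repeating earlier ideas and build on the ones that worked.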
Why does it matter?
This matters because it could change how scientific research is done. An automated loop can run continuously, potentially speeding up scientific discovery. On some tasks, such as 2D image classification and 3D point classification, Dolphin already proposes methods comparable to the state of the art. In the future, systems like Dolphin could help solve complex problems in science and technology more quickly, leading to faster progress in many fields.
Abstract
The scientific research paradigm is undergoing a profound transformation owing to the development of Artificial Intelligence (AI). Recent works demonstrate that various AI-assisted research methods can largely improve research efficiency by improving data analysis, accelerating computation, and fostering novel idea generation. To further move towards the ultimate goal (i.e., automatic scientific research), in this paper, we propose Dolphin, the first closed-loop open-ended auto-research framework to further build the entire process of human scientific research. Dolphin can generate research ideas, perform experiments, and get feedback from experimental results to generate higher-quality ideas. More specifically, Dolphin first generates novel ideas based on relevant papers which are ranked by the topic and task attributes. Then, the codes are automatically generated and debugged with the exception-traceback-guided local code structure. Finally, Dolphin automatically analyzes the results of each idea and feeds the results back to the next round of idea generation. Experiments are conducted on the benchmark datasets of different topics and results show that Dolphin can generate novel ideas continuously and complete the experiment in a loop. We highlight that Dolphin can automatically propose methods that are comparable to the state-of-the-art in some tasks such as 2D image classification and 3D point classification.
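The abstract mentions debugging guided by the exception traceback and local code structure but does not spell out the mechanism. The sketch below shows one plausible reading, under stated assumptions: run the generated code, and on failure use the traceback to extract only the lines around the failing statement, which could then be handed to a repair step. The function names `local_context` and `run_and_diagnose` and the `<generated>` filename are hypothetical and are not taken from the paper.

```python
import traceback
from typing import Optional


def local_context(source: str, filename: str, exc: BaseException,
                  radius: int = 1) -> Optional[str]:
    """Return the source lines around the deepest traceback frame in `filename`."""
    frames = [f for f in traceback.extract_tb(exc.__traceback__)
              if f.filename == filename]
    if not frames:
        return None  # e.g. a SyntaxError raised before any line executed
    lineno = frames[-1].lineno
    lines = source.splitlines()
    start = max(0, lineno - 1 - radius)
    end = min(len(lines), lineno + radius)
    # Mark the failing line with ">>" and keep a few neighbours for context.
    return "\n".join(
        f"{'>>' if i + 1 == lineno else '  '} {i + 1}: {lines[i]}"
        for i in range(start, end)
    )


def run_and_diagnose(source: str, filename: str = "<generated>") -> Optional[str]:
    """Execute `source`; return None on success, else an error report."""
    try:
        exec(compile(source, filename, "exec"), {})
    except Exception as exc:
        context = local_context(source, filename, exc) or "(no local context)"
        return f"{type(exc).__name__}: {exc}\n{context}"
    return None


# A deliberately buggy "generated" script (assumption: for illustration only).
buggy = (
    "values = [1, 2, 3]\n"
    "total = sum(values)\n"
    "mean = total / 0\n"
    "print(mean)\n"
)
report = run_and_diagnose(buggy)
print(report)
```

Restricting the report to the failing line and its neighbours keeps the repair prompt short and focused, which is the intuition behind using the traceback to localize the bug rather than resubmitting the whole program.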