Jr. AI Scientist and Its Risk Report: Autonomous Scientific Exploration from a Baseline Paper

Atsuyuki Miyai, Mashiro Toyooka, Takashi Otonari, Zaiying Zhao, Kiyoharu Aizawa

2025-11-06

Summary

This paper introduces Jr. AI Scientist, an AI system designed to work like a novice student researcher, and examines both its capabilities and the risks it poses.

What's the problem?

Currently, it's hard to know how reliable and trustworthy autonomous AI "scientist" systems really are. Previous attempts at AI-driven research either weren't fully automated, still requiring substantial human help, or only worked on very small, simple codebases. There's a need to understand what these AI scientists can actually do and what risks they pose to the way science is normally done.

What's the solution?

The researchers built Jr. AI Scientist to mimic how a student researcher works: it starts from an existing baseline paper, identifies its weaknesses, formulates new ideas for improvement, runs experiments to test those ideas, and then writes a new paper detailing the results. Importantly, it uses modern coding agents to handle complex, multi-file code, allowing it to tackle more realistic scientific problems. The papers Jr. AI Scientist produced were then evaluated using AI reviewers, by the researchers themselves, and through submissions to Agents4Science, a venue for AI-driven scientific contributions.
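The pipeline described above can be pictured as a simple staged loop. The sketch below is purely illustrative: the function names, the stubbed stage bodies, and the string-based data passing are assumptions made for clarity, not the system's actual implementation (which drives coding agents over real multi-file codebases).

```python
# Hypothetical sketch of the staged workflow described above.
# All stage names and return values are illustrative stubs, not the
# actual Jr. AI Scientist API.

def analyze_limitations(baseline_paper: str) -> list[str]:
    """Stage 1: identify weaknesses in the baseline paper (stubbed)."""
    return [f"limitation of {baseline_paper}"]

def formulate_hypotheses(limitations: list[str]) -> list[str]:
    """Stage 2: propose an improvement for each limitation."""
    return [f"improve: {lim}" for lim in limitations]

def run_experiments(hypotheses: list[str]) -> dict[str, str]:
    """Stage 3: validate hypotheses; here every experiment trivially 'passes'."""
    return {h: "supported" for h in hypotheses}

def write_paper(baseline_paper: str, results: dict[str, str]) -> str:
    """Stage 4: summarize validated findings into a new draft."""
    findings = "; ".join(f"{h} ({r})" for h, r in results.items())
    return f"Draft extending {baseline_paper}: {findings}"

def jr_ai_scientist(baseline_paper: str) -> str:
    """Run the four stages end to end, passing each stage's output forward."""
    limitations = analyze_limitations(baseline_paper)
    hypotheses = formulate_hypotheses(limitations)
    results = run_experiments(hypotheses)
    return write_paper(baseline_paper, results)

print(jr_ai_scientist("baseline.pdf"))
```

In the real system, the "run experiments" stage is where the coding agents come in: they modify and execute complex, multi-file research code rather than returning canned results.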

Why it matters?

This work is important because it helps us understand the current state of AI in science. While Jr. AI Scientist outperforms previous fully automated systems, the researchers also found clear limitations and potential dangers, such as the risk of flawed research being produced and entering the academic record. By openly sharing these findings and risks, they hope to guide future development of AI scientists and ensure they advance science in a trustworthy, responsible way.

Abstract

Understanding the current capabilities and risks of AI Scientist systems is essential for ensuring trustworthy and sustainable AI-driven scientific progress while preserving the integrity of the academic ecosystem. To this end, we develop Jr. AI Scientist, a state-of-the-art autonomous AI scientist system that mimics the core research workflow of a novice student researcher: Given the baseline paper from the human mentor, it analyzes its limitations, formulates novel hypotheses for improvement, validates them through rigorous experimentation, and writes a paper with the results. Unlike previous approaches that assume full automation or operate on small-scale code, Jr. AI Scientist follows a well-defined research workflow and leverages modern coding agents to handle complex, multi-file implementations, leading to scientifically valuable contributions. For evaluation, we conducted automated assessments using AI Reviewers, author-led evaluations, and submissions to Agents4Science, a venue dedicated to AI-driven scientific contributions. The findings demonstrate that Jr. AI Scientist generates papers receiving higher review scores than existing fully automated systems. Nevertheless, we identify important limitations from both the author evaluation and the Agents4Science reviews, indicating the potential risks of directly applying current AI Scientist systems and key challenges for future research. Finally, we comprehensively report various risks identified during development. We hope these insights will deepen understanding of current progress and risks in AI Scientist development.