Towards a Medical AI Scientist

Hongtao Wu, Boyun Zheng, Dingjie Song, Yu Jiang, Jianfeng Gao, Lei Xing, Lichao Sun, Yixuan Yuan

2026-03-31

Summary

This paper introduces a new AI system, called Medical AI Scientist, designed to independently conduct medical research, from forming ideas to writing up the results. It's a step towards using artificial intelligence to speed up discoveries in healthcare.

What's the problem?

Current AI systems that try to do scientific research aren't very good at handling the specific complexities of medical research. Medical research needs to be based on very solid evidence, and uses specialized types of data like medical images and patient records. Existing AI often struggles to understand and work with these things effectively, and it can be hard to follow *why* the AI came up with a particular research idea.

What's the solution?

The researchers built Medical AI Scientist, a framework designed specifically for medical research. It surveys existing medical literature and then uses a clinician-engineer co-reasoning mechanism to turn that evidence into testable research ideas whose origins can be traced. It also drafts the manuscript in a way that follows standard medical writing conventions and ethical policies. The system operates in three modes with increasing levels of autonomy: reproducing an existing paper, proposing innovations inspired by the literature, and exploring a given task on its own. It was evaluated by both large language models and human experts.

Why does it matter?

This work suggests that AI can automate much of the work involved in medical research, which could lead to faster breakthroughs in treating diseases and improving healthcare. Across 171 cases, 19 clinical tasks, and 6 data modalities, the system's research ideas were rated substantially higher in quality than those produced by commercial LLMs. In double-blind evaluations, the manuscripts it produced approached MICCAI-level quality and consistently surpassed papers from ISBI and BIBM.

Abstract

Autonomous systems that generate scientific hypotheses, conduct experiments, and draft manuscripts have recently emerged as a promising paradigm for accelerating discovery. However, existing AI Scientists remain largely domain-agnostic, limiting their applicability to clinical medicine, where research is required to be grounded in medical evidence with specialized data modalities. In this work, we introduce Medical AI Scientist, the first autonomous research framework tailored to clinical research. It enables clinically grounded ideation by transforming extensively surveyed literature into actionable evidence through a clinician-engineer co-reasoning mechanism, which improves the traceability of generated research ideas. It further facilitates evidence-grounded manuscript drafting guided by structured medical compositional conventions and ethical policies. The framework operates under three research modes, namely paper-based reproduction, literature-inspired innovation, and task-driven exploration, each corresponding to a distinct level of automated scientific inquiry with progressively increasing autonomy. Comprehensive evaluations by both large language models and human experts demonstrate that the ideas generated by the Medical AI Scientist are of substantially higher quality than those produced by commercial LLMs across 171 cases, 19 clinical tasks, and 6 data modalities. Meanwhile, our system achieves strong alignment between the proposed method and its implementation, while also demonstrating significantly higher success rates in executable experiments. Double-blind evaluations by human experts and the Stanford Agentic Reviewer suggest that the generated manuscripts approach MICCAI-level quality, while consistently surpassing those from ISBI and BIBM. The proposed Medical AI Scientist highlights the potential of leveraging AI for autonomous scientific discovery in healthcare.
