Virtuous Machines: Towards Artificial General Science

Gabrielle Wehr, Reuben Rideaux, Amaya J. Fox, David R. Lightfoot, Jason Tangen, Jason B. Mattingley, Shane E. Ehrhardt

2025-08-21

Summary

This paper explores how artificial intelligence can be used for general scientific discovery, going beyond just specific research tasks. It demonstrates an AI system that can independently manage the entire scientific process, from coming up with ideas to writing research papers.

What's the problem?

While AI is good at specific scientific tasks, it's usually limited to those tasks and needs humans to guide it. Researchers are overwhelmed by the sheer amount of scientific information and the increasing specialization in different fields, making it hard to connect ideas across disciplines and create big, unifying theories. This calls for AI that can do science more generally.

What's the solution?

The researchers built a domain-agnostic AI system, meaning it is not specialized for any one field. The system generated hypotheses, designed and ran three psychological studies (on visual working memory, mental rotation, and imagery vividness), collected new data online from 288 participants, wrote the code to analyze the data, and produced complete manuscripts. Developing the analysis pipelines involved continuous coding sessions of more than eight hours.

Why does it matter?

This matters because it shows that an AI system can carry out real, non-trivial scientific research with theoretical reasoning and methodological rigor comparable to that of experienced researchers, although it still falls short in conceptual nuance and theoretical interpretation. The authors describe it as a step toward embodied AI that can test hypotheses through real-world experiments, which could speed up discovery by exploring research areas that humans lack the time or resources to pursue. It also raises questions about the nature of scientific understanding and how scientific credit should be attributed.

Abstract

Artificial intelligence systems are transforming scientific discovery by accelerating specific research tasks, from protein structure prediction to materials design, yet remain confined to narrow domains requiring substantial human oversight. The exponential growth of scientific literature and increasing domain specialisation constrain researchers' capacity to synthesise knowledge across disciplines and develop unifying theories, motivating exploration of more general-purpose AI systems for science. Here we show that a domain-agnostic, agentic AI system can independently navigate the scientific workflow - from hypothesis generation through data collection to manuscript preparation. The system autonomously designed and executed three psychological studies on visual working memory, mental rotation, and imagery vividness, executed one new online data collection with 288 participants, developed analysis pipelines through 8-hour+ continuous coding sessions, and produced completed manuscripts. The results demonstrate the capability of AI scientific discovery pipelines to conduct non-trivial research with theoretical reasoning and methodological rigour comparable to experienced researchers, though with limitations in conceptual nuance and theoretical interpretation. This is a step toward embodied AI that can test hypotheses through real-world experiments, accelerating discovery by autonomously exploring regions of scientific space that human cognitive and resource constraints might otherwise leave unexplored. It raises important questions about the nature of scientific understanding and the attribution of scientific credit.
