Pillar-0: A New Frontier for Radiology Foundation Models
Kumar Krishna Agrawal, Longchao Liu, Long Lian, Michael Nercessian, Natalia Harguindeguy, Yufu Wu, Peter Mikhael, Gigin Lin, Lecia V. Sequist, Florian Fintelmann, Trevor Darrell, Yutong Bai, Maggie Chung, Adam Yala
2025-11-25
Summary
This paper introduces Pillar-0, a new artificial intelligence model designed to help doctors read medical images like CT scans and MRIs, and RATE, a system to accurately label findings in those images.
What's the problem?
Radiologists are facing a huge increase in the number of scans they need to review, but there aren't enough radiologists to keep up. Existing AI models for medical imaging fall short because they treat 3D scans as a series of flat 2D pictures, lose important grayscale detail in the images, and aren't tested in ways that reflect how doctors actually use them in real-world practice.
What's the solution?
The researchers created Pillar-0 by training a model on a massive collection of over 155,000 scans – CT scans of the abdomen, chest, and head, as well as breast MRIs. They also developed RATE, which uses large language models to automatically and very accurately label 366 different findings radiologists look for in scans. They then tested Pillar-0 on various datasets, comparing its performance to other leading AI models like those from Google, Microsoft, Alibaba, and Stanford, and it consistently performed better.
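Each of the 366 findings is scored as a separate binary detection task and summarized by the mean AUROC across findings. As a minimal sketch of that aggregation (toy data; the finding names and scores here are hypothetical, and AUROC is computed directly via the Mann-Whitney statistic rather than any code from the paper):

```python
import numpy as np

def auroc(labels, scores):
    """AUROC as the probability that a random positive case is scored
    above a random negative case (ties count half)."""
    labels, scores = np.asarray(labels), np.asarray(scores)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy benchmark: per-finding ground-truth labels and model scores.
findings = {
    "hemorrhage": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "nodule":     ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.7]),
}
per_finding = {name: auroc(y, s) for name, (y, s) in findings.items()}
mean_auroc = float(np.mean(list(per_finding.values())))
```

Ranking models by mean AUROC over many findings, as done here, rewards broad competence rather than strength on any single task.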
Why does it matter?
Pillar-0 and RATE represent a significant step forward in medical imaging AI. This new model is more accurate than existing options, can handle complex tasks like predicting lung cancer risk, and can even work well with limited data. By providing an open and reliable foundation, this work could enable new AI applications that were previously impossible, ultimately helping radiologists provide faster and more accurate diagnoses.
Abstract
Radiology plays an integral role in modern medicine, yet rising imaging volumes have far outpaced workforce growth. Foundation models offer a path toward assisting with the full spectrum of radiology tasks, but existing medical models remain limited: they process volumetric CT and MRI as low-fidelity 2D slices, discard critical grayscale contrast information, and lack evaluation frameworks that reflect real clinical practice. We introduce Pillar-0, a radiology foundation model pretrained on 42,990 abdomen-pelvis CTs, 86,411 chest CTs, 14,348 head CTs, and 11,543 breast MRIs from a large academic center, together with RATE, a scalable framework that extracts structured labels for 366 radiologic findings with near-perfect accuracy using LLMs. Across internal test sets of 14,230 abdomen-pelvis CTs, 10,646 chest CTs, 4,906 head CTs, and 1,585 breast MRIs, Pillar-0 establishes a new performance frontier, achieving mean AUROCs of 86.4, 88.0, 90.1, and 82.9, outperforming MedGemma (Google), MedImageInsight (Microsoft), Lingshu (Alibaba), and Merlin (Stanford) by 7.8-15.8 AUROC points and ranking best in 87.2% (319/366) of tasks. Pillar-0 similarly outperforms all baselines in an external validation on the Stanford Abdominal CT dataset, including Merlin (82.2 vs 80.6 AUROC). Pillar-0 extends to tasks beyond its pretraining, such as long-horizon lung cancer risk prediction, where it improves upon the state-of-the-art Sybil by 3.0 C-index points on NLST, and generalizes with gains of 5.9 (MGH) and 1.9 (CGMH). In brain hemorrhage detection, Pillar-0 obtained a >95 AUROC when using only 1/20th of the data of the next most sample-efficient baseline. Pillar-0 and RATE together provide an open, clinically rigorous foundation for building high-performance radiology systems, enabling applications that were previously infeasible due to computational, data, and evaluation constraints.