AERIS: Argonne Earth Systems Model for Reliable and Skillful Predictions
Väinö Hatanpää, Eugene Ku, Jason Stock, Murali Emani, Sam Foreman, Chunyong Jung, Sandeep Madireddy, Tung Nguyen, Varuni Sastry, Ray A. O. Sinurat, Sam Wheeler, Huihuo Zheng, Troy Arcomano, Venkatram Vishwanath, Rao Kotamarthi
2025-09-18
Summary
This paper introduces a new machine learning model, AERIS, designed for improved weather forecasting and climate prediction. It focuses on making these predictions more accurate and reliable, especially when dealing with very detailed, high-resolution data.
What's the problem?
Diffusion-based machine learning methods for weather forecasting reduce spectral biases and produce better-calibrated ensembles than deterministic methods, but they have so far been difficult to scale to detailed, high-resolution data. Training becomes unstable as these models grow, and the computational cost of processing every grid point at full resolution rises sharply.
What's the solution?
The researchers developed AERIS, a diffusion model with 1.3 to 80 billion parameters, built on a pixel-level Swin transformer architecture, which applies attention within local windows of the input. They also created a technique called SWiPe, which combines window parallelism with sequence and pipeline parallelism so the model can be split across many processors without added communication cost or a larger global batch size. This allows AERIS to train on high-resolution data at scale, sustaining over 10 ExaFLOPS on the Aurora supercomputer.
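The reason window-based attention lends itself to sharding can be shown with a small NumPy sketch. This is not the SWiPe implementation, only an illustration of the underlying property: attention computed inside non-overlapping windows gives the same result whether all windows sit on one device or are split across several. The function names (`window_partition`, `shard_windows`, `local_attention`), the toy grid size, and the weight-free single-head attention are all assumptions made for illustration.

```python
import numpy as np

def window_partition(x, win):
    """Split an (H, W, C) field into non-overlapping win x win windows.

    Returns an array of shape (num_windows, win * win, C).
    """
    H, W, C = x.shape
    assert H % win == 0 and W % win == 0, "grid must be divisible by window size"
    x = x.reshape(H // win, win, W // win, win, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, win * win, C)

def shard_windows(windows, n_ranks):
    """Assign contiguous groups of windows to hypothetical ranks."""
    return np.array_split(windows, n_ranks, axis=0)

def local_attention(tokens):
    """Weight-free single-head self-attention within each window."""
    scale = np.sqrt(tokens.shape[-1])
    scores = tokens @ np.swapaxes(tokens, 1, 2) / scale
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ tokens

rng = np.random.default_rng(0)
field = rng.standard_normal((8, 16, 4))   # toy (lat, lon, channel) grid
windows = window_partition(field, win=4)  # 8 windows of 16 tokens each

# All windows on one "device" versus windows split over 4 "ranks".
full = local_attention(windows)
sharded = np.concatenate(
    [local_attention(shard) for shard in shard_windows(windows, n_ranks=4)]
)
matches = np.allclose(full, sharded)
```

Because each window only attends to its own tokens, no rank needs data from another during the attention step in this sketch. A real Swin model also shifts windows between layers, which this toy example does not cover.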
Why does it matter?
This work demonstrates the potential of billion-parameter diffusion models for weather and climate prediction. AERIS outperforms the IFS ENS, an established ensemble forecasting system, and remains stable on seasonal scales out to 90 days, which could support more accurate seasonal forecasts and a better understanding of long-term climate trends.
Abstract
Generative machine learning offers new opportunities to better understand complex Earth system dynamics. Recent diffusion-based methods address spectral biases and improve ensemble calibration in weather forecasting compared to deterministic methods, yet have so far proven difficult to scale stably at high resolutions. We introduce AERIS, a 1.3 to 80B parameter pixel-level Swin diffusion transformer to address this gap, and SWiPe, a generalizable technique that composes window parallelism with sequence and pipeline parallelism to shard window-based transformers without added communication cost or increased global batch size. On Aurora (10,080 nodes), AERIS sustains 10.21 ExaFLOPS (mixed precision) and a peak performance of 11.21 ExaFLOPS with 1 × 1 patch size on the 0.25° ERA5 dataset, achieving 95.5% weak scaling efficiency and 81.6% strong scaling efficiency. AERIS outperforms the IFS ENS and remains stable on seasonal scales to 90 days, highlighting the potential of billion-parameter diffusion models for weather and climate prediction.
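To put the performance figures in context, the short sketch below derives two quantities directly from the numbers quoted in the abstract (sustained throughput per node and the sustained-to-peak ratio) and writes out the conventional definitions of weak and strong scaling efficiency. The abstract does not state which baseline node count or exact formula the authors used, so the two efficiency functions are the common textbook forms and should be read as an assumption; their argument names are hypothetical.

```python
# Figures quoted in the abstract.
SUSTAINED_FLOPS = 10.21e18  # sustained, mixed precision
PEAK_FLOPS = 11.21e18       # peak
NODES = 10_080              # Aurora nodes used

# Average sustained throughput per node, roughly 1.01 PFLOPS.
per_node_flops = SUSTAINED_FLOPS / NODES

# Fraction of peak that is sustained, roughly 0.91.
sustained_to_peak = SUSTAINED_FLOPS / PEAK_FLOPS

def weak_scaling_efficiency(base_throughput, base_nodes, throughput, nodes):
    """Common definition (assumed): per-node work is fixed as nodes grow.

    Efficiency is measured throughput divided by the throughput expected
    if the baseline scaled perfectly linearly with node count.
    """
    ideal = base_throughput * (nodes / base_nodes)
    return throughput / ideal

def strong_scaling_efficiency(base_time, base_nodes, time, nodes):
    """Common definition (assumed): total problem size is fixed.

    Efficiency is measured speedup divided by the ideal speedup,
    which equals the ratio of node counts.
    """
    speedup = base_time / time
    return speedup / (nodes / base_nodes)
```

For example, under these definitions a run that is 8 times faster on 10 times as many nodes has a strong scaling efficiency of 0.8. These example numbers are illustrative and are not taken from the paper.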