Intern-S1: A Scientific Multimodal Foundation Model

Lei Bai, Zhongrui Cai, Maosong Cao, Weihan Cao, Chiyu Chen, Haojiong Chen, Kai Chen, Pengcheng Chen, Ying Chen, Yongkang Chen, Yu Cheng, Pei Chu, Tao Chu, Erfei Cui, Ganqu Cui, Long Cui, Ziyun Cui, Nianchen Deng, Ning Ding, Nanqin Dong, Peijie Dong, et al.

2025-08-22

Summary

This paper introduces Intern-S1, a new artificial intelligence model designed to perform well on both general tasks and complex scientific problems.

What's the problem?

While AI models have gotten incredibly good at things like writing and creating images, they still struggle with specialized scientific fields like chemistry and materials science. Existing open-source AI models don't perform as well as the more powerful, but often private, AI systems used by experts in these areas, creating a gap in capability and hindering scientific progress.

What's the solution?

The researchers created Intern-S1, a large multimodal AI model with a structure called a 'Mixture-of-Experts', which routes each input to a small set of specialized expert networks so that only a fraction of the model's parameters is active at a time (the routing idea is sketched in code below). They continually pre-trained it on 5 trillion tokens of data, more than half of which came from scientific papers and datasets. They then used a training technique called 'Mixture-of-Rewards' to fine-tune the model with reinforcement learning on more than 1,000 different tasks simultaneously, making it better at reasoning and problem-solving.
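To make the 'Mixture-of-Experts' idea concrete, here is a minimal PyTorch sketch of token-level top-k expert routing. Everything in it (layer sizes, the number of experts, the routing rule) is an illustrative assumption, not Intern-S1's actual architecture, which activates roughly 28 billion of its 241 billion total parameters per input.

```python
# Minimal sketch of Mixture-of-Experts routing (illustrative assumptions;
# not Intern-S1's actual architecture). A router scores each token, the
# top-k experts process it, and their outputs are combined with the router
# weights, so only a fraction of all parameters is active per token.
import torch
import torch.nn as nn


class MoELayer(nn.Module):
    def __init__(self, dim=512, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts)  # one score per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (tokens, dim)
        scores = self.router(x)                        # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1) # keep the top-k experts
        weights = weights.softmax(dim=-1)              # normalize their weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, k] == e                  # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, k, None] * self.experts[e](x[mask])
        return out


layer = MoELayer()
tokens = torch.randn(16, 512)
print(layer(tokens).shape)  # torch.Size([16, 512])
```

The key design point is sparsity: the router's top-k selection means each token pays the compute cost of only a couple of expert networks, which is how a model can grow its total parameter count without a proportional increase in inference cost.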

Why it matters?

Intern-S1 represents a significant step forward in creating AI that can truly assist scientists. It performs competitively with other open-source models on general tasks, and, more importantly, it surpasses them, and even some closed-source models, in challenging scientific areas such as planning molecular syntheses and predicting the properties of materials. This could accelerate research, potentially lead to breakthroughs across scientific fields, and move us closer to more generally intelligent AI.

Abstract

In recent years, a plethora of open-source foundation models have emerged, achieving remarkable progress in widely studied fields, with performance quite close to that of closed-source models. However, in high-value but more challenging professional scientific fields, either the fields still rely on expert models, or general foundation models lag significantly behind their progress in popular areas, far from sufficient for transforming scientific research, leaving a substantial gap between open-source and closed-source models in these scientific domains. To mitigate this gap and explore a step further toward Artificial General Intelligence (AGI), we introduce Intern-S1, a specialized generalist equipped with general understanding and reasoning capabilities and the expertise to analyze data from multiple scientific modalities. Intern-S1 is a multimodal Mixture-of-Experts (MoE) model with 28 billion activated parameters and 241 billion total parameters, continually pre-trained on 5T tokens, including over 2.5T tokens from scientific domains. In the post-training stage, Intern-S1 undergoes offline and then online reinforcement learning (RL) in InternBootCamp, where we propose Mixture-of-Rewards (MoR) to synergize RL training across more than 1,000 tasks simultaneously. Through integrated innovations in algorithms, data, and training systems, Intern-S1 achieved top-tier performance in online RL training. On comprehensive evaluation benchmarks, Intern-S1 demonstrates competitive performance on general reasoning tasks among open-source models and significantly outperforms open-source models in scientific domains, surpassing closed-source state-of-the-art models in professional tasks such as molecular synthesis planning, reaction condition prediction, and predicting the thermodynamic stability of crystals. Our models are available at https://huggingface.co/internlm/Intern-S1.
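As a rough illustration of the 'Mixture-of-Rewards' idea, the sketch below dispatches each RL rollout to a task-appropriate reward source: a rule-based verifier for tasks with checkable answers, and a stand-in for a learned reward model for open-ended ones. The task names, the dispatch rule, and the scoring functions are hypothetical assumptions for illustration; the paper's actual MoR design is not reproduced here.

```python
# Hedged sketch of a Mixture-of-Rewards dispatcher (illustrative
# assumptions, not the paper's actual design): verifiable tasks get a
# rule-based reward, open-ended tasks fall back to a learned reward model,
# so one RL loop can train on many task types at once.
from typing import Callable, Dict


def exact_match_reward(response: str, reference: str) -> float:
    """Rule-based reward for tasks with a single checkable answer."""
    return 1.0 if response.strip() == reference.strip() else 0.0


def reward_model_score(response: str, reference: str) -> float:
    """Stand-in for a learned reward model on open-ended tasks."""
    # Hypothetical: a real system would call a trained scorer here.
    overlap = len(set(response.split()) & set(reference.split()))
    return overlap / max(len(reference.split()), 1)


# Map each task type to its reward source (task names are illustrative).
REWARD_FNS: Dict[str, Callable[[str, str], float]] = {
    "math": exact_match_reward,
    "molecule_synthesis": exact_match_reward,
    "open_ended_qa": reward_model_score,
}


def mixture_of_rewards(task: str, response: str, reference: str) -> float:
    """Dispatch a rollout to the reward function registered for its task."""
    return REWARD_FNS[task](response, reference)


print(mixture_of_rewards("math", "42", " 42 "))               # 1.0
print(mixture_of_rewards("open_ended_qa", "a b c", "a c d"))  # ~0.67
```

The point of a dispatcher like this is that every task, however heterogeneous, reduces to a scalar reward, which is what lets a single RL training loop consume rollouts from over a thousand tasks simultaneously.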