MDAgent2: Large Language Model for Code Generation and Knowledge Q&A in Molecular Dynamics

Zhuofan Shi, Hubao A, Yufei Shao, Mengyan Dai, Yadong Yu, Pan Xiang, Dongliang Huang, Hongxu An, Chunxiao Xin, Haiyang Shen, Zhenyu Wang, Yunshan Na, Gang Huang, Xiang Jing

2026-01-08


Summary

This paper introduces MDAgent2, a new system that uses large language models (LLMs) to automatically write and run simulations for materials science. It aims to make these complex simulations easier to set up and use, even for people who aren't experts in coding the simulation software.

What's the problem?

Currently, creating simulations of how atoms behave in materials, using software like LAMMPS, requires specialized coding skills and takes a lot of time. While LLMs are getting good at writing code generally, they struggle with these kinds of scientific simulations because there isn't much existing data specifically for this field, the best LLMs are expensive to use, and the code they generate often doesn't actually run correctly.

What's the solution?

The researchers built MDAgent2 in four parts. First, they created high-quality datasets specifically for materials science simulations, covering knowledge about the field, question answering, and code generation. Second, they trained two LLMs, MD-Instruct and MD-Code, in three stages: continued pre-training to familiarize them with the general language of the field, supervised fine-tuning to teach them specific tasks, and reinforcement learning with a reward based on how well the generated simulations actually run. Third, they built MDAgent2-RUNTIME, a system that automatically generates code, runs it, checks the results, and tries to fix any errors. Finally, they created a benchmark, MD-EvalBench, to test how well the system performs.
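The generate, run, check, and fix loop described above can be illustrated with a minimal Python sketch. It is not the MDAgent2-RUNTIME implementation: the names `generate_and_repair`, `RunResult`, `toy_generate`, and `toy_execute` are hypothetical, the generator and executor are toy stand-ins for an LLM and a LAMMPS run, and the "missing units" check is an invented rule used only to make the example self-contained.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class RunResult:
    ok: bool   # True if the (toy) run finished without errors
    log: str   # log or error text fed back to the generator


def generate_and_repair(
    task: str,
    generate: Callable[[str, Optional[str]], str],
    execute: Callable[[str], RunResult],
    max_attempts: int = 3,
) -> Tuple[Optional[str], List[RunResult]]:
    """Generate a script, run it, and retry with the error log as feedback."""
    feedback: Optional[str] = None
    results: List[RunResult] = []
    for _ in range(max_attempts):
        script = generate(task, feedback)
        result = execute(script)
        results.append(result)
        if result.ok:
            return script, results
        feedback = result.log  # pass the failure back for the next attempt
    return None, results


# Hypothetical stand-in for an LLM: forgets `units` until told about it.
def toy_generate(task: str, feedback: Optional[str]) -> str:
    body = "atom_style atomic\nrun 100\n"
    if feedback and "units" in feedback:
        return "units metal\n" + body
    return body


# Hypothetical stand-in for a simulator run: an invented check, not LAMMPS.
def toy_execute(script: str) -> RunResult:
    if "units" not in script:
        return RunResult(False, "ERROR: missing 'units' command")
    return RunResult(True, "run completed")


script, results = generate_and_repair(
    "melt a copper block", toy_generate, toy_execute
)
```

In this toy run the first attempt fails, the error text is passed back, and the second attempt succeeds. Passing the generator and executor in as callables keeps the loop independent of any particular model or simulator.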

Why does it matter?

This work shows that LLMs can be successfully adapted to handle complex, real-world scientific simulations. It provides a method for automatically generating code for these simulations, which could speed up research and development in materials science and other fields. It's a step towards using AI to automate more of the scientific process, making it more efficient and accessible.

Abstract

Molecular dynamics (MD) simulations are essential for understanding atomic-scale behaviors in materials science, yet writing LAMMPS scripts remains a highly specialized and time-consuming task. Although LLMs show promise in code generation and domain-specific question answering, their performance in MD scenarios is limited by scarce domain data, the high deployment cost of state-of-the-art LLMs, and low code executability. Building upon our prior MDAgent, we present MDAgent2, the first end-to-end framework capable of performing both knowledge Q&A and code generation within the MD domain. We construct a domain-specific data-construction pipeline that yields three high-quality datasets spanning MD knowledge, question answering, and code generation. Based on these datasets, we adopt a three-stage post-training strategy, consisting of continued pre-training (CPT), supervised fine-tuning (SFT), and reinforcement learning (RL), to train two domain-adapted models, MD-Instruct and MD-Code. Furthermore, we introduce MD-GRPO, a closed-loop RL method that leverages simulation outcomes as reward signals and recycles low-reward trajectories for continual refinement. We also build MDAgent2-RUNTIME, a deployable multi-agent system that integrates code generation, execution, evaluation, and self-correction. Together with MD-EvalBench, the first benchmark for LAMMPS code generation and question answering, which is also proposed in this work, our models and system achieve performance surpassing several strong baselines. This work systematically demonstrates the adaptability and generalization capability of large language models in industrial simulation tasks, laying a methodological foundation for automatic code generation in AI for Science and industrial-scale simulations. URL: https://github.com/FredericVAN/PKU_MDAgent2
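The abstract does not spell out how MD-GRPO computes its rewards, so the following Python sketch only illustrates the general idea under stated assumptions. It uses the group-relative advantage from standard GRPO (each sample's reward minus the group mean, divided by the group standard deviation) and a simple threshold rule for picking low-reward samples to revisit. The reward values, the function names `simulation_reward`, `group_relative_advantages`, and `select_for_recycling`, and the threshold are all hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def simulation_reward(ran: bool, passed_checks: bool) -> float:
    """Hypothetical reward: 0 if the script fails, 0.5 if it only runs,
    1.0 if it runs and passes result checks."""
    if not ran:
        return 0.0
    return 1.0 if passed_checks else 0.5


def group_relative_advantages(
    rewards: Sequence[float], eps: float = 1e-8
) -> List[float]:
    """Standardize rewards within one group of samples for the same prompt.
    Population standard deviation is an assumption of this sketch."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def select_for_recycling(
    samples: Sequence[str], rewards: Sequence[float], threshold: float
) -> List[str]:
    """Return samples whose reward is strictly below the threshold
    (hypothetical rule for choosing trajectories to refine again)."""
    return [s for s, r in zip(samples, rewards) if r < threshold]


# Four candidate scripts generated for the same prompt (labels only).
samples = ["a", "b", "c", "d"]
outcomes = [(True, True), (False, False), (True, False), (True, False)]
rewards = [simulation_reward(ran, ok) for ran, ok in outcomes]

advantages = group_relative_advantages(rewards)
recycled = select_for_recycling(samples, rewards, threshold=0.5)
```

Here the script that runs and passes its checks gets a positive advantage, the failing script gets a negative one, and only the failing script falls below the threshold and is selected for another round.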
