Towards Fully-Automated Materials Discovery via Large-Scale Synthesis Dataset and Expert-Level LLM-as-a-Judge

Heegyu Kim, Taeyang Jeon, Seungtaek Choi, Jihoon Hong, Dongwon Jeon, Sungbum Cho, Ga-Yeon Baek, Kyung-Won Kwak, Dong-Hee Lee, Sun-Jin Choi, Jisu Bae, Chihoon Lee, Yunseo Kim, Jinsung Park, Hyunsouk Cho

2025-02-24

Summary

This paper introduces AlchemyBench, a benchmark for testing how well AI language models can predict how to make materials. It is built on a large collection of expert-verified synthesis recipes and is meant to help scientists create new materials more efficiently.

What's the problem?

Creating new materials is crucial for developing better technologies, but it's often a slow process that relies on scientists trying many different combinations through trial and error. This takes a lot of time and resources, and success often depends on the scientist's experience and intuition.

What's the solution?

The researchers created AlchemyBench, a benchmark built on a dataset of about 17,000 expert-verified recipes for making materials, collected from open-access scientific literature. The benchmark tests how well large language models (LLMs) can predict the raw materials and equipment a synthesis needs, generate the procedure for combining them, and forecast what characterization of the result will show. The researchers also propose an LLM-as-a-Judge framework, in which an LLM scores these predictions automatically, and report that its scores agree strongly with assessments by expert scientists.

Why does it matter?

This matters because it could speed up the discovery of new materials that we need for things like better batteries, faster computers, and more effective medical treatments. By using AI to learn from thousands of existing recipes, scientists can get ideas for new materials more quickly and with fewer failed experiments. This could lead to faster scientific progress and new technologies that improve our lives.

Abstract

Materials synthesis is vital for innovations such as energy storage, catalysis, electronics, and biomedical devices. Yet, the process relies heavily on empirical, trial-and-error methods guided by expert intuition. Our work aims to support the materials science community by providing a practical, data-driven resource. We have curated a comprehensive dataset of 17K expert-verified synthesis recipes from open-access literature, which forms the basis of our newly developed benchmark, AlchemyBench. AlchemyBench offers an end-to-end framework that supports research in large language models applied to synthesis prediction. It encompasses key tasks, including raw materials and equipment prediction, synthesis procedure generation, and characterization outcome forecasting. We propose an LLM-as-a-Judge framework that leverages large language models for automated evaluation, demonstrating strong statistical agreement with expert assessments. Overall, our contributions offer a supportive foundation for exploring the capabilities of LLMs in predicting and guiding materials synthesis, ultimately paving the way for more efficient experimental design and accelerated innovation in materials science.
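
The abstract reports strong statistical agreement between the LLM judge and expert assessments but does not name the statistic. As a rough illustration only, the sketch below assumes that agreement is measured with a Pearson correlation between per-recipe scores. The function name `pearson` and both score lists are hypothetical and are not taken from the paper or its dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each list.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five predicted recipes (made-up values, not paper data).
expert_scores = [4.0, 2.5, 3.0, 5.0, 1.5]  # assumed: ratings given by human experts
judge_scores = [4.5, 2.0, 3.0, 4.5, 2.0]   # assumed: ratings given by the LLM judge

r = pearson(expert_scores, judge_scores)
print(f"Pearson r between judge and expert scores: {r:.3f}")
```

A value of r close to 1 would mean the judge ranks and scales recipes much as the experts do. A rank-based statistic such as Spearman correlation would be a reasonable alternative if only the ordering of the scores matters.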
