Executable Knowledge Graphs for Replicating AI Research
Yujie Luo, Zhuoyun Yu, Xuehai Wang, Yuqi Zhu, Ningyu Zhang, Lanning Wei, Lun Du, Da Zheng, Huajun Chen
2025-10-21
Summary
This paper addresses the difficulty of automatically replicating research in artificial intelligence, specifically when that research involves large language models. It introduces a new system designed to help AI agents understand and execute the steps described in scientific papers.
What's the problem?
Currently, AI agents struggle to actually *do* the experiments described in AI research papers. They often can't generate working code, both because they lack the necessary background knowledge and because existing methods for pulling information from papers (called retrieval-augmented generation, or RAG) miss important technical details hidden in referenced work. These methods also pay too little attention to actual code examples within the papers and don't organize information in a way that makes it easy to find and reuse at different levels of detail.
What's the solution?
The researchers created something called Executable Knowledge Graphs, or xKG. Think of it as a special database that automatically gathers technical details, code snippets, and specialized knowledge from research papers. It's modular and pluggable, so it can be added to existing AI agent systems. By testing xKG with three agent frameworks and two language models, they showed it delivered substantial gains (up to 10.9% with o3-mini) on a benchmark task called PaperBench.
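To make the idea concrete, here is a minimal, hypothetical sketch of what an "executable" knowledge graph node might look like: each node pairs a technical concept with a runnable code snippet extracted from the literature, and an agent can retrieve nodes by topic and execute their snippets. All names and fields here (`XKGNode`, `retrieve`, `run_snippet`) are illustrative assumptions, not the paper's actual implementation.

```python
import math
from dataclasses import dataclass, field

# Hypothetical node structure: concept + executable snippet + domain tag.
@dataclass
class XKGNode:
    concept: str                             # technical insight from a paper
    snippet: str                             # code extracted alongside it
    domain: str = "general"                  # tag supporting coarser retrieval
    neighbors: list = field(default_factory=list)  # links to related nodes

def retrieve(graph, query):
    """Return nodes whose concept mentions the query term (toy keyword match)."""
    return [n for n in graph if query.lower() in n.concept.lower()]

def run_snippet(node):
    """Execute a node's snippet and return the resulting namespace."""
    scope = {}
    exec(node.snippet, scope)
    return scope

# A tiny graph with one executable node.
graph = [
    XKGNode(
        concept="cosine learning-rate schedule",
        snippet=(
            "import math\n"
            "def cosine_lr(step, total, base=0.1):\n"
            "    return 0.5 * base * (1 + math.cos(math.pi * step / total))\n"
        ),
        domain="optimization",
    )
]

hits = retrieve(graph, "learning-rate")
scope = run_snippet(hits[0])
print(round(scope["cosine_lr"](0, 100), 3))  # schedule starts at the base LR
```

The point of the sketch is only the shape of the idea: instead of retrieving flat text passages (as in plain RAG), the agent retrieves structured nodes whose attached code can be executed and reused directly.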
Why does it matter?
This work is important because it makes it easier to automatically reproduce AI research. Being able to reliably replicate results is a cornerstone of the scientific method, and this system helps to automate that process, potentially speeding up progress in the field of artificial intelligence. It provides a general solution that can be used with various AI agents and language models.
Abstract
Replicating AI research is a crucial yet challenging task for large language model (LLM) agents. Existing approaches often struggle to generate executable code, primarily due to insufficient background knowledge and the limitations of retrieval-augmented generation (RAG) methods, which fail to capture latent technical details hidden in referenced papers. Furthermore, previous approaches tend to overlook valuable implementation-level code signals and lack structured knowledge representations that support multi-granular retrieval and reuse. To overcome these challenges, we propose Executable Knowledge Graphs (xKG), a modular and pluggable knowledge base that automatically integrates technical insights, code snippets, and domain-specific knowledge extracted from scientific literature. When integrated into three agent frameworks with two different LLMs, xKG shows substantial performance gains (10.9% with o3-mini) on PaperBench, demonstrating its effectiveness as a general and extensible solution for automated AI research replication. Code will be released at https://github.com/zjunlp/xKG.