LeetCodeDataset: A Temporal Dataset for Robust Evaluation and Efficient Training of Code LLMs

Yunhui Xia, Wei Shen, Yan Wang, Jason Klein Liu, Huifeng Sun, Siyue Wu, Jian Hu, Xiaolong Xu

2025-04-22

Summary

This paper introduces LeetCodeDataset, a collection of coding problems for evaluating and training AI models that write code, with a focus on problems that require logical, step-by-step reasoning.

What's the problem?

Most datasets used to train and evaluate code-writing AI models are either too simple or do not test a model's ability to solve problems that involve complex reasoning, an ability that matters for real-world programming tasks.

What's the solution?

The researchers created LeetCodeDataset, a wide range of coding problems that are organized chronologically and emphasize challenging reasoning. The dataset can be used both to evaluate a code model and to train it, through supervised fine-tuning, on tough, realistic problems.
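
To illustrate how a time-organized collection could serve both purposes, here is a minimal sketch of a date-based split: problems released before a cutoff go to training, and later ones are held out for evaluation. The `Problem` class, its `slug` and `released` fields, the sample entries, and the cutoff date are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Problem:
    """Hypothetical record; the real dataset's schema may differ."""
    slug: str
    released: date


def temporal_split(problems, cutoff):
    """Split problems by release date around an assumed cutoff.

    Problems released strictly before `cutoff` go to the training set;
    problems released on or after it are held out for evaluation.
    """
    train = [p for p in problems if p.released < cutoff]
    test = [p for p in problems if p.released >= cutoff]
    return train, test


# Made-up sample entries, for illustration only.
problems = [
    Problem("example-problem-a", date(2023, 3, 1)),
    Problem("example-problem-b", date(2024, 6, 30)),
    Problem("example-problem-c", date(2024, 7, 1)),
    Problem("example-problem-d", date(2025, 1, 15)),
]

train_set, test_set = temporal_split(problems, cutoff=date(2024, 7, 1))
```

Choosing a cutoff later than a model's training data would keep the held-out problems unseen by that model, which is the usual motivation for a date-based split.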

Why does it matter?

This matters because it helps developers build AI that can solve real coding challenges, making these tools more useful for students, programmers, and anyone who needs help with complex coding tasks.

Abstract

LeetCodeDataset provides a benchmark for evaluating and training code-generation models with reasoning-focused coding tasks and supports efficient supervised fine-tuning.