D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning
Rafael Rafailov, Kyle Hatch, Anikait Singh, Laura Smith, Aviral Kumar, Ilya Kostrikov, Philippe Hansen-Estruch, Victor Kolev, Philip Ball, Jiajun Wu, Chelsea Finn, Sergey Levine
2024-08-19

Summary
This paper introduces D5RL, a new benchmark of diverse datasets for offline reinforcement learning (RL) that provides realistic challenges drawn from robotic manipulation and locomotion tasks.
What's the problem?
Training RL algorithms typically requires large amounts of real-world interaction data, which can be expensive and dangerous to collect. Existing benchmarks for evaluating these algorithms often fail to reflect the complexity of real-world tasks, so algorithms that score well on them may still perform poorly in practical settings.
What's the solution?
D5RL offers a variety of datasets that simulate realistic robotic tasks, such as controlling robots to manipulate objects or navigate environments. The datasets span multiple data sources, including scripted policies and play-style data collected by human teleoperators, as well as both state-based and image-based observations, allowing researchers to test their algorithms across a wide range of conditions. Training on such diverse data helps produce algorithms that are more robust and capable when faced with real-world situations.
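The core idea behind the offline setting, learning entirely from a fixed, pre-collected dataset with no live environment interaction, can be sketched with a toy example. The snippet below is purely illustrative (it uses a synthetic dataset and a simple behavior-cloning baseline, not D5RL's actual API or tasks):

```python
import numpy as np

# Hypothetical offline RL sketch: the learner sees only logged
# (state, action) pairs from a pre-collected dataset and never
# queries the environment. A least-squares behavior-cloning fit
# stands in for a full offline RL algorithm.

rng = np.random.default_rng(0)

# Pre-collected dataset: states and the actions a (noisy) expert took.
true_policy = np.array([[0.5, -0.2],
                        [0.1,  0.8]])          # expert's linear state->action map
states = rng.normal(size=(500, 2))
actions = states @ true_policy + 0.01 * rng.normal(size=(500, 2))

# "Training" uses only the logged data -- no environment rollouts.
learned_policy, *_ = np.linalg.lstsq(states, actions, rcond=None)

# Offline evaluation: how closely does the cloned policy match the expert?
error = float(np.abs(learned_policy - true_policy).max())
print(f"max parameter error: {error:.4f}")
```

In a real benchmark like D5RL the dataset would come from scripted or teleoperated robot trajectories and the learner would be a full offline RL algorithm, but the data flow is the same: a fixed dataset in, a policy out.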
Why it matters?
This research is important because it provides a better way to train and evaluate RL algorithms without needing extensive real-world data. By focusing on realistic simulations, D5RL can help develop more effective robotic systems that can be used in various applications, from manufacturing to healthcare.
Abstract
Offline reinforcement learning algorithms hold the promise of enabling data-driven RL methods that do not require costly or dangerous real-world exploration and benefit from large pre-collected datasets. This in turn can facilitate real-world applications, as well as a more standardized approach to RL research. Furthermore, offline RL methods can provide effective initializations for online finetuning to overcome challenges with exploration. However, evaluating progress on offline RL algorithms requires effective and challenging benchmarks that capture properties of real-world tasks, provide a range of task difficulties, and cover a range of challenges both in terms of the parameters of the domain (e.g., length of the horizon, sparsity of rewards) and the parameters of the data (e.g., narrow demonstration data or broad exploratory data). While considerable progress in offline RL in recent years has been enabled by simpler benchmark tasks, the most widely used datasets are increasingly saturating in performance and may fail to reflect properties of realistic tasks. We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments, based on models of real-world robotic systems, and comprising a variety of data sources, including scripted data, play-style data collected by human teleoperators, and other data sources. Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation, with some of the tasks specifically designed to require both pre-training and fine-tuning. We hope that our proposed benchmark will facilitate further progress on both offline RL and fine-tuning algorithms. Website with code, examples, tasks, and data is available at https://sites.google.com/view/d5rl/