Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k
Xiangyu Peng, Zangwei Zheng, Chenhui Shen, Tom Young, Xinying Guo, Binluo Wang, Hang Xu, Hongxin Liu, Mingyan Jiang, Wenjun Li, Yuhui Wang, Anbang Ye, Gang Ren, Qianran Ma, Wanying Liang, Xiang Lian, Xiwen Wu, Yuting Zhong, Zhuangyan Li, Chaoyu Gong, Guojun Lei, Leijun Cheng
2025-03-14
Summary
This paper presents Open-Sora 2.0, a video generation model that achieves commercial-level quality on a training budget of just $200k. It focuses on making advanced video generation technology more affordable and accessible.
What's the problem?
Video generation models have made great progress, but that progress has come with larger models, more training data, and greater demand for training compute. This makes it difficult for smaller organizations or individuals to train such models themselves.
What's the solution?
The researchers developed Open-Sora 2.0 by combining efficiency techniques across four areas: data curation, model architecture, training strategy, and system optimization. These methods allowed them to train the model at a much lower cost while still achieving results that, on human evaluation and VBench scores, are comparable to leading models such as the open-source HunyuanVideo and the closed-source Runway Gen-3 Alpha.
Why does it matter?
This work matters because it shows that cutting-edge video generation technology can be made more affordable and widely available. By open-sourcing Open-Sora 2.0, the researchers aim to encourage innovation and creativity in video content creation for everyone.
Abstract
Video generation models have achieved remarkable progress in the past year. The quality of AI video continues to improve, but at the cost of larger model size, increased data quantity, and greater demand for training compute. In this report, we present Open-Sora 2.0, a commercial-level video generation model trained for only $200k. With this model, we demonstrate that the cost of training a top-performing video generation model is highly controllable. We detail all techniques that contribute to this efficiency breakthrough, including data curation, model architecture, training strategy, and system optimization. According to human evaluation results and VBench scores, Open-Sora 2.0 is comparable to global leading video generation models including the open-source HunyuanVideo and the closed-source Runway Gen-3 Alpha. By making Open-Sora 2.0 fully open-source, we aim to democratize access to advanced video generation technology, fostering broader innovation and creativity in content creation. All resources are publicly available at: https://github.com/hpcaitech/Open-Sora.