Phantom-Data : Towards a General Subject-Consistent Video Generation Dataset
Zhuowei Chen, Bingchuan Li, Tianxiang Ma, Lijie Liu, Mingcong Liu, Yi Zhang, Gen Li, Xinghui Li, Siyu Zhou, Qian He, Xinglong Wu
2025-06-24
Summary
This paper introduces Phantom-Data, a large new dataset built to improve AI models that generate videos showing the same subject consistently, even when the subject appears in different scenes or contexts.
What's the problem?
Existing AI video models often simply copy and paste the subject's appearance from the reference into the generated video, which limits their ability to follow detailed text instructions or produce realistic, varied videos while keeping the subject's identity consistent.
What's the solution?
The researchers built Phantom-Data by collecting and carefully verifying about one million image-video pairs in which the same subject appears in very different settings. Training on these cross-context pairs teaches AI models to keep a subject's identity stable while the background and context change, improving both prompt alignment and visual quality.
Why it matters?
This matters because it enables AI to create more diverse and realistic videos with consistent subjects, making applications such as filmmaking, virtual avatars, and creative content generation work better and more convincingly.
Abstract
A cross-pair dataset called Phantom-Data improves subject-to-video generation by enhancing prompt alignment and visual quality while maintaining identity consistency.