LongAnimation proposes a dynamic global-local paradigm that extracts the global color-consistent features relevant to the current generation. The framework has three main components: a SketchDiT, a Dynamic Global-Local Memory (DGLM) module, and a Color Consistency Reward. The SketchDiT captures hybrid reference features to support the DGLM module, which uses a long video understanding model to dynamically compress global historical features and adaptively fuse them with the current generation features. This design lets LongAnimation maintain both short-term and long-term color consistency in open-domain animation colorization.
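The compress-then-fuse idea behind the DGLM module can be illustrated with a toy sketch. This is not the authors' implementation: the names `compress_history`, `fuse`, `num_slots`, and `gate` are hypothetical, plain average pooling stands in for the long video understanding model, and a fixed scalar gate stands in for whatever adaptive fusion the framework actually learns. Features are plain lists of floats so the example runs with the standard library alone.

```python
import math


def compress_history(history, num_slots):
    """Compress a list of history feature vectors into at most `num_slots`
    memory vectors. ASSUMPTION: average pooling over contiguous chunks is
    only a stand-in for the long video understanding model."""
    if not history:
        return []
    num_slots = min(num_slots, len(history))
    dim = len(history[0])
    memory = []
    for s in range(num_slots):
        start = s * len(history) // num_slots
        end = (s + 1) * len(history) // num_slots
        chunk = history[start:end]
        memory.append([sum(v[d] for v in chunk) / len(chunk) for d in range(dim)])
    return memory


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(current, memory, gate=0.5):
    """Fuse current features with compressed memory. Each current vector
    attends over the memory vectors (scaled dot-product attention), and the
    attended read-out is blended in with a fixed gate. ASSUMPTION: the scalar
    gate is a hypothetical simplification of adaptive fusion."""
    if not memory:
        return [list(q) for q in current]
    dim = len(memory[0])
    fused = []
    for q in current:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in memory]
        weights = softmax(scores)
        read = [sum(w * k[d] for w, k in zip(weights, memory)) for d in range(dim)]
        fused.append([(1 - gate) * qd + gate * rd for qd, rd in zip(q, read)])
    return fused


# Usage: four history vectors are compressed into two memory slots,
# then fused with the features of the segment being generated.
history = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
memory = compress_history(history, num_slots=2)
fused = fuse([[1.0, 1.0]], memory, gate=0.5)
```

The point of the sketch is the division of labor: the history is reduced to a small, fixed-size memory before it meets the current features, so the cost of fusion does not grow with video length. With `gate=0` the current features pass through unchanged; larger values pull them toward the attended memory.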
LongAnimation has been shown to generate high-quality videos whose colors stay consistent over long durations, while leaving the user a high degree of freedom. The colors of the reference image can be changed freely, and the generated videos transition smoothly between frames. The framework is also effective for text-guided background generation: given a prompt, it generates a long-term dynamic background for the foreground. These properties make LongAnimation a valuable tool for animation production and for other applications where color consistency is crucial.