Key Features

Unified framework for pose-driven character animation and image pose transfer that works with a single reference and multiple motion sources.[web:1]
Alignment-free handling of references with arbitrary layouts and mismatched skeletal structures, reducing the need for carefully prepared reference–pose pairs.[web:1]
Self-supervised outpainting-based training that converts diverse reference layouts into a unified occluded-input format for robust generation (see the sketch after this list).[web:1]
Reference extractor module that extracts comprehensive identity features from partially visible or occluded references.[web:1]
Hybrid reference fusion attention mechanism to progressively inject reference features while supporting variable resolutions and dynamic sequence lengths in videos.[web:1]
Identity-robust pose control that decouples appearance from pose to mitigate overfitting to specific skeletal configurations.[web:1]
Token replace strategy for temporally coherent long video generation, helping maintain identity consistency over many frames.[web:1]
Support for cross-scale video animation and cross-scale image pose transfer, allowing a single reference image to be driven by arbitrary dance or motion videos across different spatial scales.[web:1]
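
To make the outpainting reformulation concrete, here is a minimal sketch of how a reference with an arbitrary layout might be mapped into a unified occluded-input format during training. The canvas size, visibility ratio, and function names are illustrative assumptions, not values or code from the paper.

```python
import torch
import torch.nn.functional as F

def make_occluded_input(reference, canvas_hw=(768, 512), visible_ratio=0.6):
    """Place an arbitrarily laid-out reference crop onto a fixed canvas and
    mask part of it, yielding a unified occluded input plus a visibility
    mask the generator can condition on.

    reference: (3, H, W) tensor in [-1, 1]. All shapes and ratios here are
    illustrative, not the paper's actual configuration.
    """
    C, H, W = reference.shape
    canvas_h, canvas_w = canvas_hw

    # Resize the reference to fit the canvas while keeping aspect ratio.
    scale = min(canvas_h / H, canvas_w / W)
    new_h, new_w = int(H * scale), int(W * scale)
    resized = F.interpolate(reference[None], size=(new_h, new_w),
                            mode="bilinear", align_corners=False)[0]

    # Drop it at a random position so varied layouts map to one format.
    top = torch.randint(0, canvas_h - new_h + 1, (1,)).item()
    left = torch.randint(0, canvas_w - new_w + 1, (1,)).item()
    canvas = torch.zeros(C, canvas_h, canvas_w)
    mask = torch.zeros(1, canvas_h, canvas_w)
    canvas[:, top:top + new_h, left:left + new_w] = resized
    mask[:, top:top + new_h, left:left + new_w] = 1.0

    # Randomly occlude part of the visible region; the model is trained to
    # outpaint the full character from what remains.
    occlusion = (torch.rand(1, canvas_h, canvas_w) < visible_ratio).float()
    occluded = canvas * mask * occlusion
    visibility = mask * occlusion
    return occluded, visibility
```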

The framework introduces several key technical components to handle the difficult case of misaligned or partially visible references. During training, One-to-All Animation reformulates the task as a self-supervised outpainting problem, where the model learns to transform diverse-layout reference inputs into a unified occluded-input representation and then generate the full character conditioned on driving poses. A dedicated reference extractor module is used to capture comprehensive identity features from incomplete or occluded reference regions, and these features are injected progressively through a hybrid reference fusion attention mechanism that can flexibly accommodate variable resolutions and dynamic sequence lengths in videos.[web:1]
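
As a rough illustration of how reference features could be injected while tolerating variable token counts, the block below sketches a fusion layer that attends over concatenated video and reference tokens and then cross-attends back to the reference stream. The module name, dimensions, and exact attention layout are assumptions for this sketch, not the published architecture.

```python
import torch
import torch.nn as nn

class ReferenceFusionAttention(nn.Module):
    """Illustrative fusion block that injects reference features into video
    latent tokens of varying length. Names and sizes are placeholders."""

    def __init__(self, dim=1024, heads=16):
        super().__init__()
        self.joint_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, video_tokens, reference_tokens):
        # video_tokens: (B, N_video, dim), where N_video changes with the
        # clip's resolution and frame count; reference_tokens: (B, N_ref, dim)
        # produced by the reference extractor.
        x = video_tokens
        # Attend over the concatenation of video and reference tokens, then
        # cross-attend onto the reference stream, so identity cues are
        # injected progressively rather than in a single step.
        joint = torch.cat([x, reference_tokens], dim=1)
        fused, _ = self.joint_attn(self.norm1(x), joint, joint)
        x = x + fused
        injected, _ = self.cross_attn(self.norm2(x), reference_tokens,
                                      reference_tokens)
        return x + injected
```

Because attention operates on token sequences rather than fixed grids, a layer of this kind naturally accepts different resolutions and frame counts without architectural changes, which is one way the reported flexibility could be realized.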

From a control and quality perspective, One-to-All Animation incorporates an identity-robust pose control strategy that decouples appearance from skeletal structure to alleviate pose overfitting and reduce artifacts when the driving motion deviates strongly from the reference body configuration. In addition, a token replace strategy is applied for long video generation, which helps maintain temporal consistency and avoid identity drift over extended sequences. Extensive experiments reported by the authors indicate that this approach outperforms existing pose-driven animation baselines across cross-scale video animation, cross-scale image pose transfer, and long-form video generation, enabling a single character reference to be animated convincingly by multiple motions at different spatial scales.[web:1]
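
The token replace idea can be pictured as a chunked rollout in which the leading latent frames of each new chunk are overwritten with the tail of the previously generated chunk before denoising. The sketch below follows that reading; `model.sample`, `frozen_prefix`, and `latent_dim` are hypothetical placeholders, not the released API.

```python
import torch

@torch.no_grad()
def generate_long_video(model, ref_tokens, pose_chunks, chunk_len=16, overlap=4):
    """Chunked long-video rollout with a token-replace style hand-off between
    chunks, so identity and motion stay consistent across boundaries."""
    video_latents = []
    prev_tail = None
    for poses in pose_chunks:  # poses: (B, chunk_len, ...) driving-pose tokens
        # Start each chunk from noise.
        init = torch.randn(poses.shape[0], chunk_len, model.latent_dim)
        if prev_tail is not None:
            # Token replace: pin the first `overlap` latent frames to the
            # already-generated tail instead of resampling them from noise.
            init[:, :overlap] = prev_tail
        chunk = model.sample(init, ref_tokens=ref_tokens, pose_tokens=poses,
                             frozen_prefix=overlap if prev_tail is not None else 0)
        keep = chunk if prev_tail is None else chunk[:, overlap:]
        video_latents.append(keep)
        prev_tail = chunk[:, -overlap:]
    return torch.cat(video_latents, dim=1)
```

In this reading, the frozen prefix acts as an anchor that the denoiser must remain consistent with, which is one plausible way a token replace scheme could limit identity drift over long sequences.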
