Key Features

Transfers character motion end to end without mandatory skeleton-map intermediates.
Supports single-character animation, multi-character scenes, and cross-identity character replacement.
Uses driving-video latent concatenation to retain rich motion and appearance cues.
Introduces in-context mask conditioning for unified soft guidance across tasks.
Uses mode-specific RoPE to separate task modes within one framework.
Builds on MotionPair-60K, a synthetic dataset spanning multiple character animation subtasks.
Applies Bias-Aware DPO to reduce synthetic-data artifacts in detailed regions.
Provides a public paper, code, model assets, and demo videos.
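To make the mode-specific RoPE item more concrete, here is a minimal sketch of one way a rotary position embedding could be made mode-dependent: shift the position index by a per-mode offset before applying the standard rotation. The offset scheme, the mode names, and the helpers `rope_rotate` and `mode_rope` are illustrative assumptions, not details taken from the SCAIL-2 paper or code.

```python
import math

# Hypothetical per-mode position offsets (illustrative values only,
# not taken from SCAIL-2).
MODE_OFFSETS = {"animation": 0, "replacement": 1000}


def rope_rotate(vec, position, base=10000.0):
    """Apply standard rotary position embedding to one even-length vector."""
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even embedding dimension")
    out = []
    for i in range(0, dim, 2):
        # Each consecutive pair (i, i + 1) is rotated at its own frequency.
        angle = position * base ** (-i / dim)
        cos_a, sin_a = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos_a - y * sin_a, x * sin_a + y * cos_a])
    return out


def mode_rope(vec, position, mode):
    """Assumed scheme: offset the position by a mode constant, then rotate."""
    return rope_rotate(vec, position + MODE_OFFSETS[mode])


token = [1.0, 0.0, 0.5, -0.5]
as_animation = mode_rope(token, 3, "animation")
as_replacement = mode_rope(token, 3, "replacement")
```

Under this assumption, the same token at the same frame position receives different rotations in different modes, which is one simple way a single network could tell task modes apart. The actual mechanism in SCAIL-2 may differ.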

SCAIL-2 unifies single-character animation, multi-character animation, character replacement, and zero-shot animation under an end-to-end in-context conditioning design. Its project page describes four components: mode-specific RoPE, in-context mask conditioning, the synthetic MotionPair-60K dataset, and Bias-Aware DPO refinement for detailed regions such as fingers.
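For readers unfamiliar with DPO, the sketch below computes the standard Direct Preference Optimization loss for a single preferred/rejected pair. It shows plain DPO only. The bias-aware modification used by SCAIL-2 is not specified here, and the function name `dpo_loss`, its scalar inputs, and the default `beta` are assumptions chosen for illustration.

```python
import math


def dpo_loss(policy_win, policy_lose, ref_win, ref_lose, beta=0.1):
    """Standard DPO loss for one preference pair.

    Inputs are log-likelihoods (or a surrogate score) of the preferred
    ("win") and rejected ("lose") samples under the trained policy and
    under a frozen reference model.
    """
    # How much more the policy favors the preferred sample than the
    # reference model does.
    margin = (policy_win - ref_win) - (policy_lose - ref_lose)
    z = beta * margin
    # -log(sigmoid(z)) written as a numerically stable softplus(-z).
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# The loss shrinks as the policy ranks the preferred sample higher.
neutral = dpo_loss(0.0, 0.0, 0.0, 0.0)
improved = dpo_loss(2.0, -2.0, 0.0, 0.0)
```

In a setting like the one described, the preferred sample would plausibly be a clean rendering and the rejected one a rendering with artifacts in fine regions such as fingers, though how SCAIL-2 builds its preference pairs is not stated on this page.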


SCAIL-2 is useful for researchers and creators who need controllable character motion transfer across identities, multiple characters, and unusual driving sources. Public links to arXiv, code, and Hugging Face make it practical to inspect, reproduce, and build on the method.
