Key Features

Feedforward subject-driven video customization
Multimodal control conditions
Composes different input signals to customize a video
Enables zero-shot more-subject customization at inference
Extracts guidance from control signals
Flexibly composes different control conditions
Performs text-to-4D generation
Dynamic multi-view generation

OmniVCus has been demonstrated to achieve impressive results across a range of video customization tasks, including single-subject, instructive-editing subject-driven, double-subject, zero-shot more-subject, camera-controlled, depth-controlled, and mask-controlled subject-driven video customization. The method can also flexibly compose different control conditions when customizing a video, and in comparison examples it has been shown to outperform state-of-the-art methods.
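
For illustration only, here is a minimal sketch of what flexibly composing control conditions could look like: each optional signal (subject image, depth, mask, camera) is encoded into tokens and concatenated so a video diffusion backbone can attend to whichever subset is provided. The module, feature dimensions, and encoders below are assumptions, not OmniVCus's actual implementation.

```python
# Hypothetical sketch (not the paper's code): compose optional control signals
# into a single sequence of conditioning tokens.
import torch
import torch.nn as nn


class ControlComposer(nn.Module):
    """Projects optional control signals into a shared token space and
    concatenates whichever ones are provided."""

    def __init__(self, dim: int = 1024):
        super().__init__()
        self.subject_proj = nn.Linear(768, dim)   # e.g. subject-image features
        self.depth_proj = nn.Linear(256, dim)     # per-frame depth features
        self.mask_proj = nn.Linear(256, dim)      # per-frame mask features
        self.camera_proj = nn.Linear(12, dim)     # per-frame camera extrinsics (3x4)

    def forward(self, subject=None, depth=None, mask=None, camera=None):
        tokens = []
        if subject is not None:
            tokens.append(self.subject_proj(subject))
        if depth is not None:
            tokens.append(self.depth_proj(depth))
        if mask is not None:
            tokens.append(self.mask_proj(mask))
        if camera is not None:
            tokens.append(self.camera_proj(camera))
        # The video diffusion backbone would attend to these condition tokens.
        return torch.cat(tokens, dim=1) if tokens else None


composer = ControlComposer()
subject = torch.randn(1, 4, 768)   # 4 subject-image tokens
camera = torch.randn(1, 16, 12)    # 16 frames of camera parameters
cond = composer(subject=subject, camera=camera)
print(cond.shape)  # torch.Size([1, 20, 1024])
```

Because any subset of signals maps into the same token space, the same backbone can serve single-condition and multi-condition customization without architectural changes.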


Beyond video customization, OmniVCus can also perform text-to-4D generation, i.e., dynamic multi-view generation. This is achieved by training the model on a mixture of customization data and text-to-3D data. Demonstrated text-to-4D results include a baby dragon hatching out of a stone egg, a building on fire, and a clownfish swimming through a coral reef.
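
As a rough illustration of mixing customization data with text-to-3D data during training, the sketch below samples from two datasets with an assumed ratio; the class names, mixing ratio, and dataset contents are hypothetical and not taken from the OmniVCus paper.

```python
# Hypothetical sketch: per-sample mixing of two training sources.
import random
import torch
from torch.utils.data import Dataset


class MixedDataset(Dataset):
    """Draws each sample from either the customization set or the
    text-to-3D set, with probability p_custom for the former."""

    def __init__(self, customization_set, text_to_3d_set, p_custom: float = 0.7):
        self.customization_set = customization_set
        self.text_to_3d_set = text_to_3d_set
        self.p_custom = p_custom

    def __len__(self):
        return len(self.customization_set) + len(self.text_to_3d_set)

    def __getitem__(self, idx):
        if random.random() < self.p_custom:
            return self.customization_set[idx % len(self.customization_set)]
        return self.text_to_3d_set[idx % len(self.text_to_3d_set)]


# Minimal stand-in data: in practice these would be video and multi-view datasets.
custom_set = [{"video": torch.randn(16, 3, 64, 64), "text": "a cat"} for _ in range(8)]
t3d_set = [{"views": torch.randn(8, 3, 64, 64), "text": "a chair"} for _ in range(8)]

mixed = MixedDataset(custom_set, t3d_set, p_custom=0.7)
print(sorted(mixed[0].keys()))
```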
