TTM takes an input image and a user-specified motion, then automatically builds a coarse warped reference video and a mask marking the controlled region. The image-to-video diffusion model is conditioned on the clean input image and initialized from a noisy version of the warped reference, anchoring appearance while injecting the intended motion. During sampling, dual-clock denoising applies a stronger constraint toward the warped reference inside the masked region and a weaker one outside, enforcing the commanded motion while allowing the rest of the scene to develop natural dynamics.
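To make the sampling procedure concrete, the sketch below shows one way a dual-clock loop could be structured: the masked region stays tied to a re-noised copy of the warped reference for a longer portion of the denoising trajectory than the background. The `denoise_step` callable, the `scheduler` interface, the mask layout, and the two release timesteps are illustrative assumptions, not the released TTM implementation.

```python
import torch

@torch.no_grad()
def dual_clock_sampling(
    denoise_step,        # assumed callable: one reverse-diffusion step (x_t, t, cond) -> x_{t-1}
    scheduler,           # assumed DDPM-style: .timesteps (high -> low) and .add_noise(x0, noise, t)
    cond_image,          # clean input image that anchors appearance
    warped_ref,          # coarse warped reference video, shape (B, C, T, H, W)
    mask,                # 1 inside the user-controlled region, 0 elsewhere (broadcastable to warped_ref)
    release_masked=700,  # masked region follows the warp while t > release_masked (longer clock)
    release_bg=900,      # background follows the warp only while t > release_bg (shorter clock)
):
    """Hypothetical dual-clock sampler: the two regions are released from the
    warped reference at different noise levels."""
    # Initialize from a noisy version of the warped reference video.
    t0 = scheduler.timesteps[0]
    x = scheduler.add_noise(warped_ref, torch.randn_like(warped_ref), t0)

    for t in scheduler.timesteps:
        # Re-noise the warped reference to the current noise level and paste it
        # in on two different clocks: the controlled region tracks the warp for
        # longer, the background is released earlier so it can develop natural
        # dynamics on its own.
        ref_t = scheduler.add_noise(warped_ref, torch.randn_like(x), t)
        keep = mask * float(int(t) > release_masked) + (1 - mask) * float(int(t) > release_bg)
        x = keep * ref_t + (1 - keep) * x

        # One regular denoising step, conditioned on the clean input image.
        x = denoise_step(x, t, cond_image)

    return x
```

In this sketch the two "clocks" are simply two cutoff timesteps: a later cutoff for the background gives the model more freedom there, while the earlier cutoff inside the mask keeps the controlled region pinned to the warped reference long enough to commit to the commanded motion.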
Time-to-Move enables joint control over motion and appearance, allowing new objects to be inserted from outside the original image and the appearance of existing objects to be modified. Experiments show that TTM matches or surpasses training-based baselines in both realism and motion fidelity. The approach yields realistic dynamics without artifacts, making it a practical tool for video generation and manipulation.

