Shape-for-Motion: Precise and Consistent Video Editing with 3D Proxy
Yuhao Liu, Tengfei Wang, Fang Liu, Zhenwei Wang, Rynson W. H. Lau
2025-06-30
Summary
Shape-for-Motion is a new video editing framework that combines 3D proxy meshes with a decoupled video diffusion model, so that edits to objects in a video are precise and stay consistent across frames.
What's the problem?
Editing a video frame by frame in 2D tends to introduce errors and inconsistencies, because it is hard to keep an edit aligned and natural-looking over time, especially when making detailed changes to an object.
What's the solution?
The authors propose a framework that first converts the target object in the video into a time-consistent 3D mesh, which serves as a 3D proxy. The user edits this single proxy mesh once, and the change is propagated automatically to every frame. The edited proxy is then turned back into video by a decoupled video diffusion model that blends geometry and texture smoothly and consistently across the entire video.
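The idea of editing one proxy and having the change reach every frame can be illustrated with a small sketch. This is not the authors' implementation: it assumes, purely for illustration, that the proxy is stored as one canonical set of vertices plus a per-frame displacement for each vertex, and that a user edit is a per-vertex displacement on the canonical mesh. The function name propagate_edit and all array names are hypothetical.

```python
import numpy as np

def propagate_edit(canonical, frame_offsets, edit_delta):
    """Apply one canonical-space edit to every frame (illustrative only).

    canonical:     (V, 3) vertex positions of the proxy mesh
    frame_offsets: (T, V, 3) per-frame displacement of each vertex
                   (an assumed representation, not taken from the paper)
    edit_delta:    (V, 3) user edit as a per-vertex displacement
    """
    edited_canonical = canonical + edit_delta
    # Broadcasting adds the same edited mesh to each frame's motion,
    # so the edit appears identically in all T frames.
    return edited_canonical[None, :, :] + frame_offsets

# Toy proxy: one triangle tracked over 4 frames, drifting 0.1 per frame.
canonical = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]])
frame_offsets = np.stack([np.full((3, 3), 0.1 * t) for t in range(4)])

# The user pulls vertex 1 up by 0.5 along z, once, on the proxy.
edit_delta = np.zeros_like(canonical)
edit_delta[1] = [0.0, 0.0, 0.5]

edited_frames = propagate_edit(canonical, frame_offsets, edit_delta)
```

Because every frame shares the same vertex indexing in this toy setup, a single edit is enough; a real system would also have to handle rotation, non-rigid deformation, and texture, none of which this sketch covers.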
Why does it matter?
The approach offers a way to edit video that is more accurate and more stable over time than frame-by-frame 2D editing. It could be used in high-quality video production, where it makes complex edits easier and more reliable for creators.
Abstract
A novel framework integrates 3D proxy meshes and a decoupled video diffusion model to achieve precise and consistent video editing.