MotionShop: Zero-Shot Motion Transfer in Video Diffusion Models with Mixture of Score Guidance

Hidir Yesiltepe, Tuna Han Salih Meral, Connor Dunlop, Pinar Yanardag

2024-12-10

Summary

This paper introduces MotionShop, a method for transferring motion between videos using Mixture of Score Guidance (MSG), a sampling technique that enables creative and flexible video transformations without any additional training.

What's the problem?

Existing methods for transferring motion in videos often require a lot of training data and struggle to maintain the quality of the original scene while changing the movements of objects. This makes it hard to create realistic and dynamic video content that can adapt to different situations or objects.

What's the solution?

The authors introduce MotionShop, which uses MSG to separate a video's motion from its content. This lets the model transfer motion patterns from one video to another while keeping the background and other scene elements intact. Because MSG works at sampling time on a pre-trained video diffusion model, no additional training or fine-tuning is needed. MotionShop handles a range of scenarios, including transferring motion from a single object to multiple objects and reproducing complex camera movements. The authors also introduce MotionBench, a dataset of 200 source videos and 1000 transferred motions, to evaluate the method.
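To make the idea of combining a motion signal and a content signal at sampling time more concrete, here is a minimal toy sketch in Python. It is not the paper's implementation: the two "scores" are stand-in Gaussian scores rather than outputs of a video diffusion model, and the function names (`gaussian_score`, `mix_scores`, `guided_step`) and the weights are hypothetical, chosen only for illustration.

```python
import numpy as np

def gaussian_score(x, mean, var=1.0):
    """Score (gradient of the log-density) of an isotropic Gaussian N(mean, var * I)."""
    return -(x - mean) / var

def mix_scores(scores, weights):
    """Weighted sum of score estimates. The weights are hypothetical guidance scales."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Contract the weight vector against the leading (per-score) axis.
    return np.tensordot(weights, scores, axes=1)

def guided_step(x, step_size, scores, weights, rng=None):
    """One Langevin-style update along the mixed score (noise-free if rng is None)."""
    mixed = mix_scores(scores, weights)
    noise = 0.0 if rng is None else np.sqrt(2.0 * step_size) * rng.standard_normal(x.shape)
    return x + step_size * mixed + noise

# Toy stand-ins: a "motion" target and a "content" target in 2D (assumed values).
motion_mean = np.array([1.0, 0.0])
content_mean = np.array([0.0, 1.0])

x = np.zeros(2)
for _ in range(200):
    scores = [gaussian_score(x, motion_mean), gaussian_score(x, content_mean)]
    x = guided_step(x, step_size=0.1, scores=scores, weights=[0.5, 0.5])

# With equal weights, the sample settles near the midpoint of the two targets.
print(x)
```

In this toy setting, changing the weights shifts where the sample settles between the two targets, which loosely mirrors trading off motion fidelity against content preservation. A real system would replace the Gaussian scores with score estimates from a pre-trained video diffusion model.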

Why does it matter?

This research matters because it makes motion transfer possible without additional training or fine-tuning. With zero-shot motion transfer, filmmakers, game developers, and content creators could produce dynamic, high-quality video more efficiently, leaving more room for creativity and storytelling in visual media.

Abstract

In this work, we propose the first motion transfer approach in diffusion transformer through Mixture of Score Guidance (MSG), a theoretically-grounded framework for motion transfer in diffusion models. Our key theoretical contribution lies in reformulating conditional score to decompose motion score and content score in diffusion models. By formulating motion transfer as a mixture of potential energies, MSG naturally preserves scene composition and enables creative scene transformations while maintaining the integrity of transferred motion patterns. This novel sampling operates directly on pre-trained video diffusion models without additional training or fine-tuning. Through extensive experiments, MSG demonstrates successful handling of diverse scenarios including single object, multiple objects, and cross-object motion transfer as well as complex camera motion transfer. Additionally, we introduce MotionBench, the first motion transfer dataset consisting of 200 source videos and 1000 transferred motions, covering single/multi-object transfers, and complex camera motions.
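The link between a "mixture of potential energies" and a combined score can be illustrated with a generic identity. The block below is a sketch under stated assumptions, not the paper's exact formulation: it assumes the combined energy is a weighted sum of component energies $E_i$ with hypothetical weights $w_i$, and it does not reproduce how the paper defines its motion and content terms.

```latex
% Assumption: the density is defined by a weighted sum of energies.
p(x) \propto \exp\bigl(-E(x)\bigr), \qquad E(x) = \sum_i w_i \, E_i(x)

% The normalizing constant does not depend on x, so the score is
\nabla_x \log p(x) = -\nabla_x E(x) = \sum_i w_i \bigl(-\nabla_x E_i(x)\bigr)
                   = \sum_i w_i \, s_i(x),
\qquad s_i(x) := \nabla_x \log p_i(x), \quad p_i(x) \propto \exp\bigl(-E_i(x)\bigr)
```

Under this assumption, sampling from the combined distribution only requires the component scores $s_i$, which is consistent with the abstract's statement that the method operates on pre-trained models without additional training.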