ReDirector: Creating Any-Length Video Retakes with Rotary Camera Encoding
Byeongjun Park, Byung-Hoon Kim, Hyungjin Chung, Jong Chul Ye
2025-11-26
Summary
This paper introduces a technique called ReDirector that can generate retakes of a video, re-rendering it along a new camera trajectory, even when the original footage was shot with a moving camera and regardless of the clip's length. It makes video editing more controllable and realistic.
What's the problem?
Previous methods for generating video retakes struggled when the camera moved in complex ways or when the video length changed. They often misused a technique called RoPE (Rotary Position Embedding), which misaligned the positions of the original video and the edited version, producing blurry or inconsistent results, especially for moving objects against static backgrounds.
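To see what "misaligned positions" means, here is a minimal illustrative sketch, not the paper's actual implementation: if the retake's frames are indexed sequentially after the input's frames, RoPE rotates corresponding frames by different angles, so the model never sees them as depicting the same moment. Assigning the retake the same temporal indices as the input (one plausible reading of the alignment the paper describes) removes that spurious offset. All names and sizes below are hypothetical.

```python
import numpy as np

def rope_angles(pos, dim=8, base=10000.0):
    # Standard 1-D RoPE: each position index maps to one rotation
    # angle per frequency band.
    freqs = 1.0 / (base ** (np.arange(dim // 2) * 2.0 / dim))
    return np.outer(pos, freqs)

T = 4  # frames in the input video (and in its retake)

# Sequential (misaligned) indexing: the retake is appended after the
# input, so retake frame t sits T positions away from input frame t.
seq_input, seq_retake = np.arange(T), np.arange(T, 2 * T)

# Aligned indexing: retake frame t reuses input frame t's temporal
# index, since both depict the same moment from different cameras.
ali_input, ali_retake = np.arange(T), np.arange(T)

# Relative rotation between corresponding frames:
mis = rope_angles(seq_retake) - rope_angles(seq_input)  # nonzero offset
fix = rope_angles(ali_retake) - rope_angles(ali_input)  # exactly zero

print(np.allclose(fix, 0.0))  # True: aligned frames share one phase
print(np.allclose(mis, 0.0))  # False: sequential indexing adds a shift
```

The nonzero `mis` offset is the same for every frame pair, which is why attention between the input and the retake ends up systematically skewed under the naive scheme.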
What's the solution?
ReDirector solves this by correcting how RoPE is applied, so that the original video and the retake line up in both space and time. The authors also introduce Rotary Camera Encoding (RoCE), which injects the camera's position and motion directly into the positional encoding. This lets the system model the relationships between different viewpoints in the video and produce realistic edits, even for camera movements and video lengths it has never seen before.
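The summary describes RoCE as a camera-conditioned RoPE phase shift, but not its exact form. The sketch below illustrates that idea only: standard RoPE angles plus an extra per-frame phase derived from camera parameters, with `proj` standing in for whatever learned mapping the actual method uses. Everything here (shapes, the flattened-extrinsics input, the linear projection) is a hypothetical stand-in, not the paper's architecture.

```python
import numpy as np

def rotate(x, angles):
    # Apply a RoPE-style rotation: consecutive channel pairs of x
    # are rotated by the corresponding angle.
    x1, x2 = x[..., 0::2], x[..., 1::2]
    cos, sin = np.cos(angles), np.sin(angles)
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def rope_angles(pos, dim, base=10000.0):
    freqs = 1.0 / (base ** (np.arange(dim // 2) * 2.0 / dim))
    return np.outer(pos, freqs)

def roce_angles(pos, cam_params, proj, dim, base=10000.0):
    # RoCE-style sketch: RoPE angles plus a camera-conditioned phase
    # shift. `proj` is a hypothetical learned projection from camera
    # parameters to one phase per frequency band.
    shift = cam_params @ proj  # (T, dim // 2) phase shift per frame
    return rope_angles(pos, dim, base) + shift

rng = np.random.default_rng(0)
T, dim = 4, 8
q = rng.normal(size=(T, dim))           # toy per-frame query features
cam = rng.normal(size=(T, 12))          # flattened 3x4 extrinsics per frame
proj = rng.normal(size=(12, dim // 2))  # hypothetical learned projection

q_rot = rotate(q, roce_angles(np.arange(T), cam, proj, dim))
print(q_rot.shape)  # (4, 8)
```

Because the shift enters as a rotation angle, it changes only the relative phase between tokens and preserves feature norms, which fits the summary's claim that camera conditions are integrated into RoPE itself rather than added as a separate feature stream.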
Why it matters?
This research is important because it makes video editing much more flexible and user-friendly. It allows for more precise control over edits, especially in dynamic scenes, and improves the quality of the final result. This could be useful for filmmakers, content creators, and anyone who needs to edit videos with complex camera movements.
Abstract
We present ReDirector, a novel camera-controlled video retake generation method for dynamically captured variable-length videos. In particular, we rectify a common misuse of RoPE in previous works by aligning the spatiotemporal positions of the input video and the target retake. Moreover, we introduce Rotary Camera Encoding (RoCE), a camera-conditioned RoPE phase shift that captures and integrates multi-view relationships within and across the input and target videos. By integrating camera conditions into RoPE, our method generalizes to out-of-distribution camera trajectories and video lengths, yielding improved dynamic object localization and static background preservation. Extensive experiments further demonstrate significant improvements in camera controllability, geometric consistency, and video quality across various trajectories and lengths.