Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation

Qihua Chen, Yue Ma, Hongfa Wang, Junkun Yuan, Wenzhe Zhao, Qi Tian, Hongmei Wang, Shaobo Min, Qifeng Chen, Wei Liu

2024-09-04

Summary

This paper presents Follow-Your-Canvas, a diffusion-based method for video outpainting: extending a source video beyond its original frame to produce a larger, higher-resolution video with newly generated content.

What's the problem?

When asked to extend a video by a large factor, existing outpainting methods tend to generate low-quality content, and the output size they can handle is capped by the memory available on the graphics processing unit (GPU). Together, these limits make it hard to produce high-resolution outpainted videos that stay consistent with the source.

What's the solution?

Follow-Your-Canvas addresses these issues with two design choices. First, instead of outpainting the whole video in a single shot, it distributes the task across spatial windows and merges the results seamlessly, so the output size is no longer bounded by GPU memory. Second, it injects the source video and its position relative to each window into that window's generation process, so the layout generated inside each window harmonizes with the existing content.
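The window-and-merge idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `window_starts` and `merge_windows`, the `generate` callback that stands in for the diffusion model, and the uniform averaging of overlapping regions. The text above does not specify how the method merges windows, so this is not the authors' implementation.

```python
def window_starts(total, window, overlap):
    """Start offsets of windows of length `window` that cover [0, total)."""
    if window >= total:
        return [0]
    stride = window - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than window")
    starts = list(range(0, total - window, stride))
    starts.append(total - window)  # last window sits flush with the far edge
    return starts


def merge_windows(height, width, window, overlap, generate):
    """Fill a height x width canvas from overlapping windows.

    `generate(top, left, h, w)` is a hypothetical stand-in for the model:
    it returns an h x w grid of values for the window at (top, left).
    Overlapping pixels are averaged uniformly (an assumption).
    """
    acc = [[0.0] * width for _ in range(height)]
    cnt = [[0] * width for _ in range(height)]
    h, w = min(window, height), min(window, width)
    for top in window_starts(height, window, overlap):
        for left in window_starts(width, window, overlap):
            tile = generate(top, left, h, w)
            for i in range(h):
                for j in range(w):
                    acc[top + i][left + j] += tile[i][j]
                    cnt[top + i][left + j] += 1
    return [[acc[i][j] / cnt[i][j] for j in range(width)] for i in range(height)]


def constant_tile(top, left, h, w):
    """Toy generator: every window returns the same constant content."""
    return [[1.0] * w for _ in range(h)]


canvas = merge_windows(6, 10, window=4, overlap=2, generate=constant_tile)
```

Because only one window is processed at a time, peak memory in this sketch depends on the window size rather than on the canvas size, which is the property the summary attributes to the method.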

Why does it matter?

Large-scale outpainting makes it possible to turn a small source clip into a much larger, higher-resolution video, which could be useful in fields such as film, advertising, and virtual reality. According to the authors, Follow-Your-Canvas achieves the best quantitative results across the resolution and scale setups they tested.

Abstract

This paper explores higher-resolution video outpainting with extensive content generation. We point out common issues faced by existing methods when attempting to largely outpaint videos: the generation of low-quality content and limitations imposed by GPU memory. To address these challenges, we propose a diffusion-based method called Follow-Your-Canvas. It builds upon two core designs. First, instead of employing the common practice of "single-shot" outpainting, we distribute the task across spatial windows and seamlessly merge them. This allows us to outpaint videos of any size and resolution without being constrained by GPU memory. Second, the source video and its relative positional relation are injected into the generation process of each window. This makes the generated spatial layout within each window harmonize with the source video. Coupling these two designs enables us to generate higher-resolution outpainting videos with rich content while keeping spatial and temporal consistency. Follow-Your-Canvas excels in large-scale video outpainting, e.g., from 512×512 to 1152×2048 (9×), while producing high-quality and aesthetically pleasing results. It achieves the best quantitative results across various resolution and scale setups. The code is released at https://github.com/mayuelala/FollowYourCanvas
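As a quick check of the 9× figure, and to make "relative positional relation" concrete, here is a small sketch. The helper names, the `(top, left, height, width)` box layout, and the centered placement of the source on the target canvas are assumptions made for illustration. The abstract does not say how the relation is encoded, so the normalized offsets below are only one plausible representation.

```python
def area_scale(src_hw, dst_hw):
    """Ratio of target area to source area, both given as (height, width)."""
    return (dst_hw[0] * dst_hw[1]) / (src_hw[0] * src_hw[1])


def relative_position(source_box, window_box):
    """Position and size of a window relative to the source video.

    Boxes are (top, left, height, width) in canvas pixels. Offsets and
    sizes are normalized by the source size (a hypothetical encoding).
    """
    s_top, s_left, s_h, s_w = source_box
    w_top, w_left, w_h, w_w = window_box
    return (
        (w_top - s_top) / s_h,
        (w_left - s_left) / s_w,
        w_h / s_h,
        w_w / s_w,
    )


# 512x512 source outpainted to 1152x2048: 2,359,296 / 262,144 pixels.
scale = area_scale((512, 512), (1152, 2048))

# Assume the source is centered on the target canvas.
source_box = ((1152 - 512) // 2, (2048 - 512) // 2, 512, 512)

# A 512x512 window in the top-left corner of the canvas.
rel = relative_position(source_box, (0, 0, 512, 512))
```

Here `scale` evaluates to 9.0, matching the 9× in the abstract, and `rel` describes a window that lies above and to the left of the source.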
