In-2-4D: Inbetweening from Two Single-View Images to 4D Generation
Sauradip Nag, Daniel Cohen-Or, Hao Zhang, Ali Mahdavi-Amiri
2025-04-14
Summary
This paper introduces In-2-4D, a technique that generates smooth motion between two ordinary images, filling in the movement that would plausibly happen between them. It uses generative AI methods that model both 3D shape and how that shape changes over time, producing realistic 4D sequences.
What's the problem?
Creating smooth animations or transitions between just two images is hard, especially when the results should look natural and capture how objects move and change in 3D space over time. Most existing methods either produce choppy or unrealistic results, or need many more images or additional data to work well.
What's the solution?
The researchers developed a hierarchical approach that combines Gaussian Splatting, multi-view diffusion, and self-attention across timesteps to fill in the missing frames between the two images. The method recovers both shape and motion, producing a full 4D animation (3D plus time) from just two pictures.
Why does it matter?
This work makes it possible to create smooth, realistic animations from just two photos, which could be useful for movies, games, virtual reality, or even bringing old family pictures to life. It opens up new creative possibilities and makes high-quality animation more accessible.
Abstract
A hierarchical 4D inbetweening approach generates smooth motion from two input images using Gaussian Splatting, multi-view diffusion, and self-attention across timesteps.
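To give a feel for what "hierarchical" inbetweening can mean, here is a toy sketch of coarse-to-fine frame insertion: a midpoint is placed between the two endpoints, then each half is subdivided again. This is only an illustration under strong assumptions and not the paper's method. The names `blend` and `inbetween` are hypothetical, and the linear average in `blend` is a stand-in for whatever generative model would actually produce an intermediate state.

```python
def blend(a, b):
    # Placeholder for a learned model: plain average of two states.
    # (Assumption for illustration only; the paper does not use linear blending.)
    return tuple(0.5 * (x + y) for x, y in zip(a, b))


def inbetween(start, end, depth):
    """Recursively insert midpoints between two states, coarse to fine."""
    if depth == 0:
        return [start, end]
    mid = blend(start, end)
    left = inbetween(start, mid, depth - 1)
    right = inbetween(mid, end, depth - 1)
    # The midpoint ends `left` and starts `right`; keep it only once.
    return left + right[1:]


# Two toy "states" (e.g. a single 3D point position at the first and last frame).
frames = inbetween((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), depth=2)
for i, frame in enumerate(frames):
    print(i, frame)
```

With `depth` levels of subdivision the sequence has `2**depth + 1` frames, so each extra level roughly doubles the temporal resolution while reusing the frames already placed at coarser levels.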