HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation
Haiyang Zhou, Wangbo Yu, Jiawen Guan, Xinhua Cheng, Yonghong Tian, Li Yuan
2025-05-07
Summary
This paper introduces HoloTime, a system that turns a panoramic image into a realistic 360-degree video you can move around in, making it feel like you're actually inside the scene for virtual reality and augmented reality experiences.
What's the problem?
Making high-quality, immersive 4D scenes from ordinary panoramic images is hard: the motion has to stay smooth, the details sharp, and everything has to line up correctly in both space and time, all while the scene looks right from every viewing angle.
What's the solution?
The researchers built a two-stage model. First, a panoramic animation stage turns a still panoramic image into a smooth panoramic video. Then, a reconstruction stage estimates depth and structure over time to turn that video into a detailed 4D scene, so everything stays consistent and realistic as you move around.
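To make the two-stage flow concrete, here is a minimal, hypothetical sketch of how the stages chain together. The function names and placeholder implementations are assumptions for illustration only: the paper's actual stages are a panoramic video diffusion model and a space-time depth estimation method, neither of which is reproduced here.

```python
import numpy as np

def animate_panorama(pano_image, num_frames=16):
    """Stage 1 (hypothetical stand-in): animate a static panorama into a
    panoramic video. The paper uses a panoramic video diffusion model;
    this placeholder simply repeats the input frame."""
    return np.stack([pano_image] * num_frames)  # shape: (T, H, W, 3)

def estimate_spacetime_depth(pano_video):
    """Stage 2 (hypothetical stand-in): estimate a depth map per frame.
    The real method enforces consistency across space and time; this
    placeholder returns uniform depth."""
    t, h, w, _ = pano_video.shape
    return np.ones((t, h, w))

def holotime_pipeline(pano_image, num_frames=16):
    """Chain the two stages: still panorama -> panoramic video ->
    4D scene (here represented as frames paired with per-frame depth)."""
    video = animate_panorama(pano_image, num_frames)
    depth = estimate_spacetime_depth(video)
    return video, depth

pano = np.zeros((512, 1024, 3), dtype=np.float32)  # toy equirectangular input
video, depth = holotime_pipeline(pano, num_frames=8)
print(video.shape, depth.shape)  # (8, 512, 1024, 3) (8, 512, 1024)
```

The key design point the sketch captures is that the 4D scene is not generated in one shot: motion is synthesized first in the panoramic video domain, and geometry is recovered afterwards from that video.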
Why does it matter?
This matters because it allows for the creation of much more immersive and lifelike VR and AR experiences, letting people explore digital worlds that look and feel real, which is important for gaming, education, and many creative applications.
Abstract
HoloTime uses a two-stage panoramic diffusion model and a space-time depth estimation method to generate high-fidelity 4D assets for immersive VR and AR experiences.