LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation

Shuai Yang, Jing Tan, Mengchen Zhang, Tong Wu, Yixuan Li, Gordon Wetzstein, Ziwei Liu, Dahua Lin

2024-08-26

Summary

This paper introduces LayerPano3D, a new method for creating detailed, immersive 3D scenes from a single text prompt, allowing users to explore these scenes freely.

What's the problem?

Creating realistic 3D scenes is difficult because existing methods struggle to keep views consistent from every angle and to handle complex scene detail. They either expand a scene step by step through inpainting, which causes the content to drift semantically, or use a flat panorama that cannot represent occlusion between overlapping objects.

What's the solution?

The authors developed LayerPano3D, which decomposes a 2D panorama (a wide, full-view image) into multiple layers at different depth levels, using a diffusion prior to fill in the parts of the scene hidden behind closer objects. They also introduce a text-guided anchor view synthesis pipeline that generates high-quality, consistent panoramas from text descriptions, and they lift the layered panorama into 3D Gaussians so the scene can be explored in detail along unconstrained viewing paths.
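To make the layering idea concrete, here is a minimal sketch of the depth-based decomposition step. This is not the paper's implementation: it assumes a per-pixel depth map is already available and simply bins pixels into layers by depth thresholds, where the masked-out regions of each layer are the areas a diffusion model would later need to inpaint. The function name and the threshold scheme are illustrative assumptions.

```python
import numpy as np

def split_into_depth_layers(rgb, depth, boundaries):
    """Split a panorama into layers by binning pixels on depth.

    rgb:        (H, W, 3) color panorama
    depth:      (H, W) per-pixel depth estimate (assumed given)
    boundaries: ascending depth values separating the layers
    Returns a list of (layer_rgb, mask) pairs, nearest layer first.
    Pixels outside a layer's depth bin are zeroed; the mask marks
    the visible pixels, and its complement is what a diffusion
    prior would inpaint to reveal occluded content.
    """
    edges = [-np.inf, *boundaries, np.inf]
    layers = []
    for near, far in zip(edges[:-1], edges[1:]):
        mask = (depth >= near) & (depth < far)
        layer = np.where(mask[..., None], rgb, 0)
        layers.append((layer, mask))
    return layers

# Toy example: a 4x6 "panorama" whose left half is a near object.
rgb = np.full((4, 6, 3), 200, dtype=np.uint8)
depth = np.full((4, 6), 10.0)
depth[:, :3] = 2.0  # left half is close to the camera
layers = split_into_depth_layers(rgb, depth, boundaries=[5.0])
print(len(layers))  # 2 layers: foreground, background
```

Each layer keeps only the pixels in its depth range, so stacking the completed layers back-to-front reconstructs the scene while still exposing the content hidden behind foreground objects.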

Why it matters?

This research is significant because it improves how we generate and interact with 3D scenes, making it easier to create realistic virtual environments for applications like video games, virtual reality, and simulations. By enhancing the quality and detail of these scenes, LayerPano3D could lead to more engaging experiences in various digital media.

Abstract

3D immersive scene generation is a challenging yet critical task in computer vision and graphics. A desired virtual 3D scene should 1) exhibit omnidirectional view consistency, and 2) allow for free exploration in complex scene hierarchies. Existing methods either rely on successive scene expansion via inpainting or employ panorama representation to represent large FOV scene environments. However, the generated scene suffers from semantic drift during expansion and is unable to handle occlusion among scene hierarchies. To tackle these challenges, we introduce LayerPano3D, a novel framework for full-view, explorable panoramic 3D scene generation from a single text prompt. Our key insight is to decompose a reference 2D panorama into multiple layers at different depth levels, where each layer reveals the unseen space from the reference views via diffusion prior. LayerPano3D comprises multiple dedicated designs: 1) we introduce a novel text-guided anchor view synthesis pipeline for high-quality, consistent panorama generation. 2) We pioneer the Layered 3D Panorama as underlying representation to manage complex scene hierarchies and lift it into 3D Gaussians to splat detailed 360-degree omnidirectional scenes with unconstrained viewing paths. Extensive experiments demonstrate that our framework generates state-of-the-art 3D panoramic scene in both full view consistency and immersive exploratory experience. We believe that LayerPano3D holds promise for advancing 3D panoramic scene creation with numerous applications.