HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation
Wenqi Dong, Bangbang Yang, Zesong Yang, Yuan Li, Tao Hu, Hujun Bao, Yuewen Ma, Zhaopeng Cui
2025-04-21
Summary
This paper presents HiScene, a hierarchical framework that builds detailed, interactive 3D scenes by combining 2D image generation with 3D object generation, so that the scenes look realistic and complete from any angle.
What's the problem?
The problem is that creating 3D scenes that look good and make sense from every viewpoint is really hard, especially when you only have flat 2D images to start with. Most existing methods struggle to fill in the gaps for parts of objects you can’t see, and the final 3D scenes often look messy or unrealistic.
What's the solution?
The researchers developed HiScene, a hierarchical framework that first generates a 2D isometric view of the scene and then builds the 3D objects from it. A video-diffusion model handles amodal completion, meaning it infers the parts of objects that are hidden in the image, while shape priors guide the 3D object generation. As a result, the whole scene looks natural, and users can interact with it and view it from different perspectives.
Why does it matter?
This matters because it makes it much easier to create high-quality 3D environments for things like video games, virtual reality, and digital art, helping artists and developers bring their ideas to life in a more realistic and interactive way.
Abstract
A hierarchical framework called HiScene combines 2D image and 3D object generation using video-diffusion for amodal completion and shape priors to produce coherent, high-fidelity, interactive 3D scenes.