HoloScene: Simulation-Ready Interactive 3D Worlds from a Single Video

Hongchi Xia, Chih-Hao Lin, Hao-Yu Hsu, Quentin Leboutet, Katelyn Gao, Michael Paulitsch, Benjamin Ummenhofer, Shenlong Wang

2025-10-08

Summary

This paper introduces HoloScene, a system for turning a single video of a real-world environment into a detailed, physically realistic 3D model. Think of it as a way to turn a real room or object into a fully interactive digital copy you can use in things like video games or virtual reality.

What's the problem?

Currently, making these 3D models is hard. Existing methods often struggle to create models that are both visually accurate *and* physically realistic. They might get the shape right, but the objects won't behave like they would in real life – they might fall through floors or not react correctly when touched. It's difficult to get everything right: complete shapes, how objects interact, realistic appearance, and believable physics.

What's the solution?

HoloScene solves this by building a detailed 'scene graph', which acts as a blueprint of the environment. This blueprint stores not just the shape and color of each object, but also its physical properties and how it relates to other objects (for example, a cup resting on a table). Reconstruction is then framed as an energy-based optimization: the system first explores candidate configurations with sampling, then refines the best ones with gradient-based optimization. It's like sculpting, starting with a rough shape and carefully adding detail until the result matches the video, obeys physics, and looks right.
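To make the scene-graph idea concrete, here is a minimal sketch of what such a hierarchical representation could look like. This is an illustrative toy, not HoloScene's actual data structures: the class and field names (`SceneNode`, `geometry`, `appearance`, `physics`) are assumptions chosen for clarity.

```python
# Hypothetical sketch of an interactive scene-graph node; field names are
# illustrative assumptions, not HoloScene's actual API.
from dataclasses import dataclass, field

@dataclass
class SceneNode:
    name: str
    geometry: list                 # placeholder for mesh/shape data
    appearance: dict               # placeholder for material/color parameters
    physics: dict                  # e.g. {"mass": ..., "friction": ...}
    children: list = field(default_factory=list)  # hierarchical relations

    def add_child(self, node: "SceneNode") -> None:
        self.children.append(node)

# Build a tiny scene: a room containing a table that supports a cup.
room = SceneNode("room", geometry=[], appearance={}, physics={"static": True})
table = SceneNode("table", geometry=[], appearance={"color": "oak"},
                  physics={"mass": 20.0, "friction": 0.6})
cup = SceneNode("cup", geometry=[], appearance={"color": "white"},
                physics={"mass": 0.3, "friction": 0.4})
table.add_child(cup)   # the "supports" relation is encoded by hierarchy here
room.add_child(table)

def names(node: SceneNode) -> list:
    """Depth-first traversal of the hierarchy."""
    return [node.name] + [n for c in node.children for n in names(c)]

print(names(room))  # ['room', 'table', 'cup']
```

The key point is that each node carries appearance *and* physics side by side, so the same representation can drive both rendering and simulation.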

Why does it matter?

This is important because it allows for much more realistic and immersive experiences in virtual and augmented reality, gaming, and robotics. Imagine a video game where you can interact with objects just like you would in real life, or a robot that can navigate a virtual replica of your home before entering the real one. HoloScene brings us closer to that level of realism and opens up new possibilities for these technologies.

Abstract

Digitizing the physical world into accurate simulation-ready virtual environments offers significant opportunities in a variety of fields such as augmented and virtual reality, gaming, and robotics. However, current 3D reconstruction and scene-understanding methods commonly fall short in one or more critical aspects, such as geometry completeness, object interactivity, physical plausibility, photorealistic rendering, or realistic physical properties for reliable dynamic simulation. To address these limitations, we introduce HoloScene, a novel interactive 3D reconstruction framework that simultaneously achieves these requirements. HoloScene leverages a comprehensive interactive scene-graph representation, encoding object geometry, appearance, and physical properties alongside hierarchical and inter-object relationships. Reconstruction is formulated as an energy-based optimization problem, integrating observational data, physical constraints, and generative priors into a unified, coherent objective. Optimization is efficiently performed via a hybrid approach combining sampling-based exploration with gradient-based refinement. The resulting digital twins exhibit complete and precise geometry, physical stability, and realistic rendering from novel viewpoints. Evaluations conducted on multiple benchmark datasets demonstrate superior performance, while practical use-cases in interactive gaming and real-time digital-twin manipulation illustrate HoloScene's broad applicability and effectiveness. Project page: https://xiahongchi.github.io/HoloScene.
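The abstract's "hybrid approach combining sampling-based exploration with gradient-based refinement" can be illustrated on a toy 1-D energy. This is a minimal sketch of the general idea only; the energy terms, weights, and step sizes are invented for illustration and bear no relation to the paper's actual objective.

```python
# Toy illustration of hybrid optimization: sample to explore, then refine
# with gradient descent. All constants here are illustrative assumptions.
import random

def energy(x: float) -> float:
    # Stand-in for a unified objective: an "observation" fit term plus a
    # "physics" penalty for implausible states (here, x < 0).
    observation = (x - 2.0) ** 2
    physics = max(0.0, -x) * 10.0
    return observation + physics

def grad(x: float, eps: float = 1e-5) -> float:
    # Central finite-difference gradient of the energy.
    return (energy(x + eps) - energy(x - eps)) / (2 * eps)

random.seed(0)

# 1) Sampling-based exploration: draw candidates, keep the lowest-energy one.
candidates = [random.uniform(-5.0, 5.0) for _ in range(50)]
x = min(candidates, key=energy)

# 2) Gradient-based refinement: descend from the best sample.
for _ in range(200):
    x -= 0.1 * grad(x)

print(round(x, 3))  # converges to the energy minimum at x = 2
```

Sampling helps escape bad local basins that pure gradient descent could get stuck in, while the gradient step polishes the chosen candidate to high precision.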