Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery

Jie-Ying Lee, Yi-Ruei Liu, Shr-Ruei Tsai, Wei-Cheng Chang, Chung-Ho Wu, Jiewen Chan, Zhenjun Zhao, Chieh Hubert Lin, Yu-Lun Liu

2025-10-20

Summary

This paper introduces a new method, called Skyfall-GS, for automatically creating large, detailed, and realistic 3D city environments that can be explored virtually in real time.

What's the problem?

Creating realistic 3D city models is really hard because it usually requires detailed real-world 3D scans, which are expensive and time-consuming to obtain. Without that scan data, existing methods struggle to build large areas that look good from all viewing angles and have realistic textures.

What's the solution?

The researchers found a way around needing those expensive scans. They combined readily available satellite images, which capture the coarse overall geometry of a city, with a powerful image generation technique called a diffusion model, which fills in realistic close-up textures and appearances for buildings and other objects. A curriculum-driven iterative refinement process then improves the 3D model step by step, progressively making its geometry more complete and its textures more photorealistic.
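The loop described above can be sketched in a few lines of Python. This is a toy illustration of the overall flow only, not the paper's implementation: every function here (the reconstruction, renderer, diffusion step, and update rule) is a hypothetical stand-in operating on plain numbers instead of an actual 3D scene.

```python
# Toy sketch of a curriculum-driven iterative refinement loop.
# All names and operations are illustrative stand-ins, not Skyfall-GS itself.

def reconstruct_from_satellite(images):
    """Stand-in: fit a coarse scene to satellite views (here, just an average)."""
    return sum(images) / len(images)

def render_novel_view(scene, altitude):
    """Stand-in: render the scene from a given viewpoint altitude."""
    return scene * (1.0 - 0.1 * altitude)  # toy "projection"

def diffusion_refine(view):
    """Stand-in for the diffusion model that adds plausible close-up detail."""
    return view + 0.05

def update_scene(scene, refined_view, lr=0.5):
    """Stand-in: pull the scene toward the refined rendering."""
    return scene + lr * (refined_view - scene)

def iterative_refinement(satellite_images, curriculum):
    scene = reconstruct_from_satellite(satellite_images)
    # Curriculum: start from high, satellite-like viewpoints and move lower,
    # refining renderings and folding the detail back into the scene.
    for altitude in curriculum:
        view = render_novel_view(scene, altitude)
        refined = diffusion_refine(view)
        scene = update_scene(scene, refined)
    return scene
```

The key idea the sketch conveys is the feedback cycle: render a novel view, let the generative model improve it, and use the improved view to update the 3D representation, repeating with progressively lower (more street-level) viewpoints.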

Why it matters?

This work is important because it makes it much easier and cheaper to create large-scale 3D city environments. This opens up possibilities for things like realistic simulations, immersive video games, and better tools for urban planning and virtual tourism, all without needing costly and difficult-to-obtain 3D scans.

Abstract

Synthesizing large-scale, explorable, and geometrically accurate 3D urban scenes is a challenging yet valuable task in providing immersive and embodied applications. The challenges lie in the lack of large-scale and high-quality real-world 3D scans for training generalizable generative models. In this paper, we take an alternative route to create large-scale 3D scenes by synergizing the readily available satellite imagery that supplies realistic coarse geometry and the open-domain diffusion model for creating high-quality close-up appearances. We propose Skyfall-GS, the first city-block scale 3D scene creation framework without costly 3D annotations, also featuring real-time, immersive 3D exploration. We tailor a curriculum-driven iterative refinement strategy to progressively enhance geometric completeness and photorealistic textures. Extensive experiments demonstrate that Skyfall-GS provides improved cross-view consistent geometry and more realistic textures compared to state-of-the-art approaches. Project page: https://skyfall-gs.jayinnn.dev/