Constructing a 3D Town from a Single Image

Kaizhi Zheng, Ruijian Zhang, Jing Gu, Jie Yang, Xin Eric Wang

2025-05-22

Summary

This paper presents 3DTown, a training-free framework that builds a detailed 3D model of a town from a single top-down image, such as a map or a satellite image.

What's the problem?

Building a realistic 3D model of a place from a single image is hard because one image carries no information about how things look from other angles. Most existing methods compensate with large amounts of training data or additional photos.

What's the solution?

The researchers developed 3DTown, which divides the input image into regions, generates 3D content for each region, and uses spatial-aware 3D inpainting to fill in the missing details, producing a full 3D scene from a single picture.
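To make the idea of dividing an image into regions concrete, here is a minimal sketch of one generic way to do it: tiling the image with overlapping square windows. This is an illustration only, not the paper's method. The function name `split_into_regions` and the `tile` and `overlap` parameters are hypothetical, and the summary does not say how 3DTown actually chooses its regions.

```python
def split_into_regions(width, height, tile, overlap):
    """Return (left, top, right, bottom) boxes that cover a width x height image.

    Hypothetical helper for illustration; not taken from the 3DTown paper.
    Neighbouring boxes share `overlap` pixels so that content generated for
    one region could later be blended with its neighbours.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    stride = tile - overlap

    def starts(size):
        # A single box is enough when the image is smaller than one tile.
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, stride))
        # Place the final tile flush with the image edge.
        positions.append(size - tile)
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


# Example: a 1024 x 1024 top-down image split into 512-pixel tiles.
regions = split_into_regions(1024, 1024, tile=512, overlap=128)
print(len(regions), regions[0], regions[-1])
```

In this sketch the overlap is the design choice that matters: shared borders give a later stage some common context between adjacent regions, which is the kind of gap that an inpainting step would be expected to fill.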

Why does it matter?

This matters because a training-free, single-image approach makes it easier and faster to create 3D models for video games, virtual reality, or urban planning, even when only one image is available.

Abstract

A training-free framework named 3DTown generates realistic 3D scenes from a single top-down image using region-based generation and spatial-aware 3D inpainting techniques.
