iLRM: An Iterative Large 3D Reconstruction Model

Gyeongjin Kang, Seungtae Nam, Xiangyu Sun, Sameh Khamis, Abdelrahman Mohamed, Eunbyung Park

2025-08-01

Summary

This paper introduces iLRM, a new model that makes 3D reconstruction from images more scalable and efficient by decoupling how scenes are represented, applying attention in two stages, and injecting high-resolution details step by step.

What's the problem?

The problem is that building accurate 3D models from images or video is slow and costly, especially for large or complex scenes, because current methods struggle to handle all the details and the relationships between views efficiently.

What's the solution?

iLRM solves this by decoupling the scene representation into simpler parts, applying attention in two stages so each stage focuses on a different aspect of the scene, and injecting high-resolution detail gradually over several refinement steps. This makes the process faster and easier to scale while still producing high-quality 3D reconstructions.
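The paper's exact architecture isn't spelled out here, but the general pattern it describes, alternating two attention stages inside an iterative refinement loop, can be sketched as follows. All names, shapes, and the specific attention ordering below are illustrative assumptions for this summary, not the authors' implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Plain scaled dot-product attention (single head, no projections).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def iterative_two_stage_update(scene_tokens, image_tokens, n_iters=3):
    """Hypothetical sketch of iterative two-stage refinement.

    Stage 1: scene tokens cross-attend to image tokens, pulling in
             (progressively higher-resolution) image detail.
    Stage 2: scene tokens self-attend, propagating information
             across the scene representation.
    Repeating both stages refines the representation iteratively.
    """
    x = scene_tokens
    for _ in range(n_iters):
        x = x + attention(x, image_tokens, image_tokens)  # stage 1: inject detail
        x = x + attention(x, x, x)                        # stage 2: global mixing
    return x
```

The point of the two-stage split is that each attention pass operates over a smaller set of token pairs than one monolithic all-to-all pass would, which is what makes the approach cheaper to scale to large scenes.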

Why it matters?

This matters because efficient and scalable 3D reconstruction helps in many fields like virtual reality, robotics, and mapping, making it possible to create detailed digital worlds and environments more quickly and with less computing power.

Abstract

iLRM, an Iterative Large 3D Reconstruction Model, improves scalability and efficiency in 3D reconstruction by decoupling the scene representation, using a two-stage attention scheme, and injecting high-resolution information.