
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

Jay Zhangjie Wu, Yuxuan Zhang, Haithem Turki, Xuanchi Ren, Jun Gao, Mike Zheng Shou, Sanja Fidler, Zan Gojcic, Huan Ling

2025-03-04

Summary

This paper introduces Difix3D+, a new method that improves 3D reconstructions and produces better novel views of 3D scenes using an AI technique called single-step diffusion models.

What's the problem?

Current 3D reconstruction methods, such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS), produce impressive results but struggle to render realistic views from viewpoints far from the original cameras. They often produce artifacts, especially in underconstrained regions where the input images provide too little information.

What's the solution?

The researchers created Difix3D+, which uses a single-step diffusion model called Difix to clean up and improve 3D reconstructions. Difix plays two roles: first, during reconstruction it cleans up pseudo-training views rendered from the model, which are then distilled back into the 3D representation; second, at inference time it enhances the final rendered images to remove any remaining artifacts. The same model works with different types of 3D reconstruction methods and significantly improves the quality of the resulting images.
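
To make the two roles concrete, here is a minimal sketch of the pipeline in Python. All names here (difix3d_plus, recon.render, recon.fit, difix.enhance) are hypothetical placeholders invented for illustration; they are not the authors' actual code or API.

```python
# Hypothetical sketch of the Difix3D+ pipeline described above.
# `recon` stands for any 3D reconstruction (NeRF or 3DGS) and
# `difix` for the single-step diffusion enhancer. Names are
# illustrative, not the authors' real interface.

def difix3d_plus(recon, difix, novel_poses, rounds=3):
    for _ in range(rounds):
        # Role 1 (reconstruction): render pseudo-training views from
        # the current model, clean their artifacts with Difix, and
        # distill the cleaned views back into the 3D representation.
        pseudo_views = [recon.render(pose) for pose in novel_poses]
        cleaned_views = [difix.enhance(view) for view in pseudo_views]
        recon.fit(images=cleaned_views, poses=novel_poses)

    def render_enhanced(pose):
        # Role 2 (inference): run Difix once more on each rendered
        # frame to remove residual artifacts before display.
        return difix.enhance(recon.render(pose))

    return render_enhanced
```

Because Difix is a single-step model, the inference-time enhancement adds only one extra network evaluation per frame, which keeps the post-processing cost low.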

Why it matters?

This matters because it could lead to more realistic and accurate 3D models and views, which is important for things like virtual reality, video games, and scientific visualization. The improvements made by Difix3D+ could make these technologies look much better and more lifelike, enhancing user experiences across various fields.

Abstract

Neural Radiance Fields and 3D Gaussian Splatting have revolutionized 3D reconstruction and novel-view synthesis tasks. However, achieving photorealistic rendering from extreme novel viewpoints remains challenging, as artifacts persist across representations. In this work, we introduce Difix3D+, a novel pipeline designed to enhance 3D reconstruction and novel-view synthesis through single-step diffusion models. At the core of our approach is Difix, a single-step image diffusion model trained to enhance and remove artifacts in rendered novel views caused by underconstrained regions of the 3D representation. Difix serves two critical roles in our pipeline. First, it is used during the reconstruction phase to clean up pseudo-training views that are rendered from the reconstruction and then distilled back into 3D. This greatly enhances underconstrained regions and improves the overall 3D representation quality. More importantly, Difix also acts as a neural enhancer during inference, effectively removing residual artifacts arising from imperfect 3D supervision and the limited capacity of current reconstruction models. Difix3D+ is a general solution, a single model compatible with both NeRF and 3DGS representations, and it achieves an average 2× improvement in FID score over baselines while maintaining 3D consistency.
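
For intuition on the "single-step" part of the abstract, below is a minimal, hypothetical sketch of how such an enhancer could be applied: the artifact-laden render is treated as a partially noised image at one fixed noise level and denoised in a single forward pass. The model interface and the noise level tau are assumptions made for illustration; the paper's actual architecture, conditioning, and noise schedule may differ.

```python
import torch

@torch.no_grad()
def single_step_enhance(model, rendered, tau=0.2):
    # Hypothetical sketch: treat the rendered image (with artifacts)
    # as if it were a clean image noised to level `tau`, then denoise
    # it in ONE forward pass of a diffusion denoiser fine-tuned at
    # that fixed noise level. `model` and `tau` are assumptions.
    batch = rendered.shape[0]
    t = torch.full((batch,), tau, device=rendered.device)
    return model(rendered, t)  # a single network call per frame
```

A single denoising step, rather than the tens of steps typical of standard diffusion sampling, is what makes it cheap enough to use both inside the reconstruction loop and as a per-frame enhancer at render time.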