
BoostMVSNeRFs: Boosting MVS-based NeRFs to Generalizable View Synthesis in Large-scale Scenes

Chih-Hai Su, Chih-Yao Hu, Shr-Ruei Tsai, Jie-Ying Lee, Chin-Yang Lin, Yu-Lun Liu

2024-07-23


Summary

This paper discusses BoostMVSNeRFs, a new method for improving the quality of 3D rendering in large-scale scenes using Multi-View Stereo (MVS) combined with Neural Radiance Fields (NeRFs). The goal is to synthesize more realistic images of a scene from new viewpoints.

What's the problem?

While Neural Radiance Fields (NeRFs) produce high-quality images, they take a long time to train. MVS-based NeRFs, by contrast, generalize to new scenes without lengthy per-scene training, but they often sacrifice image quality. This creates a challenge in achieving both fast training and high-quality rendering, especially in large and complex scenes.

What's the solution?

The authors introduce BoostMVSNeRFs to tackle these issues by improving how MVS-based NeRFs render images. They identify two key limitations: restricted viewport coverage of the scene and visual artifacts caused by having too few input views. BoostMVSNeRFs selects and combines multiple cost volumes during volume rendering, which lets the renderer draw on information from more input views without any retraining (see the sketch below). The method can be plugged into any existing MVS-based NeRF model in a feed-forward fashion and also supports end-to-end fine-tuning on specific scenes.
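To make the core idea concrete, here is a minimal sketch (not the authors' released code) of combining several cost volumes during volume rendering: each sample point along a ray is queried in every selected cost volume, the per-volume predictions are weighted by a coverage/validity score, and the blended densities and colors are alpha-composited as in standard NeRF rendering. The `query_cost_volume` helper and the `sigma_fn` / `rgb_fn` / `validity_fn` fields are hypothetical placeholders standing in for an MVS-based NeRF backbone.

```python
# Illustrative sketch only: blends per-point predictions from several cost
# volumes during volume rendering. The volume interface below is assumed,
# not taken from the BoostMVSNeRFs codebase.
import numpy as np

def query_cost_volume(volume, points):
    """Query one cost volume at the sample points.

    Returns per-point densities, colors, and a validity weight that
    approximates how well the points are covered by the views that built
    this cost volume (0 = outside the volume's frustum).
    """
    sigma = volume["sigma_fn"](points)        # (n,) densities
    rgb = volume["rgb_fn"](points)            # (n, 3) colors
    validity = volume["validity_fn"](points)  # (n,) coverage weights in [0, 1]
    return sigma, rgb, validity

def render_ray_multi_volume(volumes, points, deltas):
    """Blend several cost volumes per sample, then alpha-composite along the ray."""
    sigmas, rgbs, weights = [], [], []
    for vol in volumes:
        s, c, w = query_cost_volume(vol, points)
        sigmas.append(s)
        rgbs.append(c)
        weights.append(w)
    sigmas = np.stack(sigmas)     # (V, n)
    rgbs = np.stack(rgbs)         # (V, n, 3)
    weights = np.stack(weights)   # (V, n)

    # Normalize coverage weights across volumes at each sample point.
    norm = weights / np.clip(weights.sum(axis=0, keepdims=True), 1e-8, None)
    sigma = (norm * sigmas).sum(axis=0)            # (n,) blended density
    rgb = (norm[..., None] * rgbs).sum(axis=0)     # (n, 3) blended color

    # Standard volume rendering (alpha compositing along the ray).
    alpha = 1.0 - np.exp(-sigma * deltas)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    w_render = alpha * trans
    return (w_render[:, None] * rgb).sum(axis=0)   # (3,) pixel color

if __name__ == "__main__":
    # Toy demo: two "volumes" whose coverage differs along the ray.
    rng = np.random.default_rng(0)
    pts = rng.uniform(size=(64, 3))
    deltas = np.full(64, 0.02)
    vols = [
        {"sigma_fn": lambda p: np.full(len(p), 5.0),
         "rgb_fn": lambda p: np.tile([1.0, 0.0, 0.0], (len(p), 1)),
         "validity_fn": lambda p: p[:, 0]},         # covers one side better
        {"sigma_fn": lambda p: np.full(len(p), 5.0),
         "rgb_fn": lambda p: np.tile([0.0, 0.0, 1.0], (len(p), 1)),
         "validity_fn": lambda p: 1.0 - p[:, 0]},   # covers the other side
    ]
    print(render_ray_multi_volume(vols, pts, deltas))
```

The key design choice this sketch illustrates is that the coverage weights are normalized per sample point, so regions seen by only one cost volume still render correctly while overlapping regions blend smoothly across volumes.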

Why it matters?

This research is significant because it lets AI systems generate realistic novel views of large scenes from captured images in a more efficient way. By improving rendering quality without requiring long per-scene training, BoostMVSNeRFs can be applied in fields such as gaming, virtual reality, and film production, where high-quality visuals are essential.

Abstract

While Neural Radiance Fields (NeRFs) have demonstrated exceptional quality, their protracted training duration remains a limitation. Generalizable and MVS-based NeRFs, although capable of mitigating training time, often incur tradeoffs in quality. This paper presents a novel approach called BoostMVSNeRFs to enhance the rendering quality of MVS-based NeRFs in large-scale scenes. We first identify limitations in MVS-based NeRF methods, such as restricted viewport coverage and artifacts due to limited input views. Then, we address these limitations by proposing a new method that selects and combines multiple cost volumes during volume rendering. Our method does not require training and can adapt to any MVS-based NeRF methods in a feed-forward fashion to improve rendering quality. Furthermore, our approach is also end-to-end trainable, allowing fine-tuning on specific scenes. We demonstrate the effectiveness of our method through experiments on large-scale datasets, showing significant rendering quality improvements in large-scale scenes and unbounded outdoor scenarios. We release the source code of BoostMVSNeRFs at https://su-terry.github.io/BoostMVSNeRFs/.