
MVGS: Multi-view-regulated Gaussian Splatting for Novel View Synthesis

Xiaobiao Du, Yida Wang, Xin Yu

2024-10-04

Summary

This paper introduces MVGS, a method that improves novel view synthesis (rendering a 3D scene from viewpoints not seen during training) by supervising a technique called Gaussian Splatting with multiple views at once, producing more accurate and detailed visuals.

What's the problem?

Current methods for rendering 3D scenes, including vanilla 3D Gaussian Splatting, are trained with supervision from only one viewpoint per iteration. The model can overfit those specific views, so quality drops when it renders new perspectives: the resulting images may look unrealistic or lack important details, and the underlying 3D geometry can be imprecise.

What's the solution?

To address these issues, the authors propose a multi-view training strategy that lets the model learn from several perspectives simultaneously in each optimization step, helping it capture the overall structure and appearance of the scene. On top of this, they introduce cross-intrinsic guidance, which trains coarse-to-fine across resolutions; cross-ray densification, which adds Gaussians where rays from several views intersect; and a multi-view augmented densification strategy for views that differ dramatically. Together, these changes significantly improve the quality of the rendered images.
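To make the multi-view idea concrete, here is a minimal PyTorch-style sketch of one optimization step that accumulates photometric loss over several training views before updating the Gaussians. The `render` function, the `gaussians` parameter container, and the `view.camera` / `view.image` fields are hypothetical stand-ins for illustration, not the authors' actual implementation.

```python
import torch
import torch.nn.functional as F

def multi_view_step(gaussians, render, views, optimizer, num_views=4):
    """One training step supervised by several randomly sampled views at once,
    instead of the single view per iteration used by vanilla 3DGS.
    `render`, `gaussians`, and the view fields are assumed stand-ins."""
    optimizer.zero_grad()
    idx = torch.randperm(len(views))[:num_views].tolist()
    total_loss = 0.0
    for i in idx:
        view = views[i]
        pred = render(gaussians, view.camera)   # differentiable rasterization (stand-in)
        total_loss = total_loss + F.l1_loss(pred, view.image)
    loss = total_loss / num_views
    loss.backward()                             # joint gradient over all sampled views
    optimizer.step()
    return loss.item()
```

Because gradients from all sampled views are combined before each parameter update, the Gaussian attributes cannot drift toward any single training view, which is exactly the overfitting the method sets out to prevent.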

Why it matters?

This research is important because it enhances the ability of AI systems to create realistic 3D images from various angles, which is crucial for applications in fields like virtual reality, video games, and medical imaging. By improving how these models work, MVGS can lead to better visual experiences and more accurate representations of complex scenes.

Abstract

Recent works in volume rendering, e.g. NeRF and 3D Gaussian Splatting (3DGS), have significantly advanced rendering quality and efficiency with the help of a learned implicit neural radiance field or 3D Gaussians. Rendering on top of an explicit representation, vanilla 3DGS and its variants deliver real-time efficiency by optimizing the parametric model with single-view supervision per training iteration, a scheme adopted from NeRF. Consequently, certain views are overfitted, leading to unsatisfactory appearance in novel-view synthesis and imprecise 3D geometry. To solve the aforementioned problems, we propose a new 3DGS optimization method embodying four key novel contributions: 1) We transform the conventional single-view training paradigm into a multi-view training strategy. With our proposed multi-view regulation, 3D Gaussian attributes are further optimized without overfitting certain training views. As a general solution, we improve overall accuracy in a variety of scenarios and across different Gaussian variants. 2) Inspired by the benefit of additional views, we further propose a cross-intrinsic guidance scheme, leading to a coarse-to-fine training procedure across different resolutions. 3) Built on top of our multi-view regulated training, we further propose a cross-ray densification strategy, densifying more Gaussian kernels in the regions where rays from a selection of views intersect. 4) By further investigating the densification strategy, we found that its effect should be enhanced when certain views differ dramatically. As a solution, we propose a novel multi-view augmented densification strategy, in which 3D Gaussians are encouraged to densify to a sufficient number accordingly, resulting in improved reconstruction accuracy.
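As a rough illustration of the cross-intrinsic, coarse-to-fine idea in contribution 2, the sketch below supervises training at progressively higher resolutions by scaling down the camera intrinsics. The `Camera` fields, the 4x/2x/1x factors, and the schedule boundaries are assumptions chosen for illustration; the paper's actual schedule and camera model may differ.

```python
from dataclasses import dataclass, replace

@dataclass
class Camera:
    # Minimal pinhole intrinsics; an assumed model, not the paper's camera class.
    fx: float
    fy: float
    width: int
    height: int

def scale_camera(cam: Camera, factor: int) -> Camera:
    # Rendering with these scaled intrinsics yields a lower-resolution image
    # of the same field of view, giving coarse supervision early in training.
    return replace(cam,
                   fx=cam.fx / factor, fy=cam.fy / factor,
                   width=cam.width // factor, height=cam.height // factor)

def resolution_factor(iteration: int, max_iters: int) -> int:
    """Assumed 4x -> 2x -> 1x schedule; the paper's actual schedule may differ."""
    if iteration < max_iters // 3:
        return 4      # coarse: quarter resolution
    if iteration < 2 * max_iters // 3:
        return 2      # medium: half resolution
    return 1          # fine: full resolution

if __name__ == "__main__":
    cam = Camera(fx=1000.0, fy=1000.0, width=1600, height=1200)
    for it in (0, 15000, 29999):
        print(it, scale_camera(cam, resolution_factor(it, 30000)))
```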