MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D

Wei Cheng, Juncheng Mu, Xianfang Zeng, Xin Chen, Anqi Pang, Chi Zhang, Zhibin Wang, Bin Fu, Gang Yu, Ziwei Liu, Liang Pan

2024-11-05

Summary

This paper presents MVPaint, a new framework for creating high-quality 3D textures that are consistent across different views. It aims to improve the texturing process in 3D asset production by addressing common issues found in existing methods.

What's the problem?

Texturing is a vital step in making 3D models look realistic, but current techniques often produce poor results. They can create textures that are inconsistent when viewed from different angles, leave some areas unpainted, and depend heavily on how well the model's surface is flattened into 2D (a process known as UV unwrapping). The result is textures with visible seams and uneven quality.

What's the solution?

MVPaint tackles these problems with a three-part approach. First, Synchronized Multi-view Generation (SMG) generates images of the object from several viewpoints at once, which are combined into a coarse initial texture. Second, Spatial-aware 3D Inpainting (S3I) fills in texture areas that none of the views observed. Finally, a UV Refinement (UVR) step upscales the texture in UV space and smooths out the discontinuities introduced by UV unwrapping. The authors also created two evaluation benchmarks to compare their method against existing techniques. The sketch below outlines how the three stages fit together.
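To make the pipeline concrete, here is a minimal Python sketch of the generation-refinement flow. All function names, signatures, and the texture resolution are hypothetical stand-ins, not the authors' actual implementation; each stage is stubbed out to show only how data moves between them.

```python
# Hypothetical sketch of MVPaint's three-stage pipeline (names are illustrative).
import numpy as np

def synchronized_multiview_generate(mesh, prompt, n_views=6):
    """Stage 1 (SMG): jointly denoise renderings from n_views cameras so the
    views agree, then fuse them into a coarse UV texture. Returns the texture
    and a mask marking texels that at least one view observed. (Stub.)"""
    texture = np.zeros((1024, 1024, 3), dtype=np.float32)
    observed = np.zeros((1024, 1024), dtype=bool)
    return texture, observed

def spatial_aware_inpaint(mesh, texture, observed):
    """Stage 2 (S3I): fill texels no camera saw, using neighborhoods on the
    3D surface rather than in 2D UV space, so fills ignore chart seams. (Stub.)"""
    return texture

def uv_refine(mesh, texture, scale=4):
    """Stage 3 (UVR): UV-space super-resolution followed by spatial-aware
    seam smoothing across UV-chart boundaries. (Stub: nearest-neighbor upsample.)"""
    return texture.repeat(scale, axis=0).repeat(scale, axis=1)

def mvpaint(mesh, prompt):
    tex, observed = synchronized_multiview_generate(mesh, prompt)
    tex = spatial_aware_inpaint(mesh, tex, observed)
    return uv_refine(mesh, tex)
```

The key design choice running through all three stages is that each operates with awareness of the 3D surface, rather than treating the UV image as an ordinary 2D picture.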

Why it matters?

This research is important because it significantly improves how textures are applied to 3D models, making them look more realistic and visually appealing. By solving the issues of inconsistency and quality in texturing, MVPaint can benefit industries like gaming, animation, and virtual reality, where high-quality graphics are essential.

Abstract

Texturing is a crucial step in the 3D asset production workflow, which enhances the visual appeal and diversity of 3D assets. Despite recent advancements in Text-to-Texture (T2T) generation, existing methods often yield subpar results, primarily due to local discontinuities, inconsistencies across multiple views, and their heavy dependence on UV unwrapping outcomes. To tackle these challenges, we propose a novel generation-refinement 3D texturing framework called MVPaint, which can generate high-resolution, seamless textures while emphasizing multi-view consistency. MVPaint mainly consists of three key modules. 1) Synchronized Multi-view Generation (SMG). Given a 3D mesh model, MVPaint first simultaneously generates multi-view images by employing an SMG model, which leads to coarse texturing results with unpainted parts due to missing observations. 2) Spatial-aware 3D Inpainting (S3I). To ensure complete 3D texturing, we introduce the S3I method, specifically designed to effectively texture previously unobserved areas. 3) UV Refinement (UVR). Furthermore, MVPaint employs a UVR module to improve the texture quality in the UV space, which first performs a UV-space Super-Resolution, followed by a Spatial-aware Seam-Smoothing algorithm for revising spatial texturing discontinuities caused by UV unwrapping. Moreover, we establish two T2T evaluation benchmarks: the Objaverse T2T benchmark and the GSO T2T benchmark, based on selected high-quality 3D meshes from the Objaverse dataset and the entire GSO dataset, respectively. Extensive experimental results demonstrate that MVPaint surpasses existing state-of-the-art methods. Notably, MVPaint could generate high-fidelity textures with minimal Janus issues and highly enhanced cross-view consistency.
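As one example of what "spatial-aware" means in the UVR stage: seams arise because UV unwrapping cuts the surface into charts, so texels on opposite sides of a cut map to the same 3D location but land far apart in the UV image. A simple way to express seam smoothing, assuming corresponding texel pairs across seams are already known, is to blend each pair toward a shared value. This is an illustrative simplification, not the paper's exact algorithm:

```python
import numpy as np

def smooth_seams(texture, seam_pairs, strength=0.5, iterations=10):
    """texture: (H, W, 3) float array in UV space.
    seam_pairs: list of ((y0, x0), (y1, x1)) texel pairs that correspond to
    the same point on the 3D surface but sit on opposite sides of a UV seam."""
    tex = texture.copy()
    for _ in range(iterations):
        for (y0, x0), (y1, x1) in seam_pairs:
            # Pull both texels toward their average so the seam fades out.
            avg = 0.5 * (tex[y0, x0] + tex[y1, x1])
            tex[y0, x0] += strength * (avg - tex[y0, x0])
            tex[y1, x1] += strength * (avg - tex[y1, x1])
    return tex
```

Because the pairing is defined by 3D adjacency rather than UV-image adjacency, colors stay consistent across chart boundaries that an ordinary 2D smoothing filter would never connect.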