RoCoTex: A Robust Method for Consistent Texture Synthesis with Diffusion Models
Jangyeong Kim, Donggoo Kang, Junyoung Choi, Jeonga Wi, Junho Gwon, Jiun Bae, Dumim Yoon, Junghyun Han
2024-10-07

Summary
This paper introduces RoCoTex, a new method for generating consistent, seamless textures for 3D meshes from text descriptions using advanced diffusion models.
What's the problem?
Current methods for creating textures from text often face several issues, such as inconsistencies in how the texture looks from different angles, visible seams where textures meet, and misalignment with the 3D shapes they are applied to. These problems can make the generated textures look unrealistic or poorly integrated into 3D models.
What's the solution?
To solve these issues, the authors developed RoCoTex, which builds on state-of-the-art 2D diffusion models (SDXL combined with multiple ControlNets) to create high-quality textures that capture structural features and align well with the underlying 3D mesh. The method uses a symmetrical view synthesis strategy together with regional prompts to enhance consistency across different viewpoints. Additionally, RoCoTex introduces new texture blending and soft-inpainting techniques that minimize visible seams. Through extensive experiments, the authors showed that their method outperforms existing techniques in generating realistic textures.
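To give a feel for the SDXL-plus-ControlNets building block the method relies on, here is a minimal sketch of conditioning a single SDXL generation on multiple ControlNets using the Hugging Face diffusers library. The checkpoint names, conditioning images (a depth map and an edge map rendered from the mesh), prompt, and strengths are illustrative assumptions, not the paper's actual configuration.

```python
import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from diffusers.utils import load_image

# Hypothetical conditioning images rendered from the target mesh for one view;
# the paper's exact conditioning signals may differ.
depth_map = load_image("renders/view0_depth.png")
edge_map = load_image("renders/view0_edges.png")

# Two ControlNets guiding the same SDXL generation
# (checkpoint names are illustrative, not the paper's setup).
controlnets = [
    ControlNetModel.from_pretrained(
        "diffusers/controlnet-depth-sdxl-1.0", torch_dtype=torch.float16
    ),
    ControlNetModel.from_pretrained(
        "diffusers/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
    ),
]

pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnets,
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="a rusty sci-fi robot, front view, high detail",
    image=[depth_map, edge_map],               # one condition per ControlNet
    controlnet_conditioning_scale=[0.8, 0.5],  # per-ControlNet strengths
).images[0]
image.save("view0_texture.png")
```

Conditioning on geometry-derived signals like these is what keeps the generated image aligned with the mesh, so the result can be back-projected into the texture with less drift.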
Why it matters?
This research is important because it improves the quality of texture generation for 3D models, which is crucial in fields like video game design, animation, and virtual reality. By addressing common problems in texture synthesis, RoCoTex can help create more immersive and visually appealing digital environments.
Abstract
Text-to-texture generation has recently attracted increasing attention, but existing methods often suffer from the problems of view inconsistencies, apparent seams, and misalignment between textures and the underlying mesh. In this paper, we propose a robust text-to-texture method for generating consistent and seamless textures that are well aligned with the mesh. Our method leverages state-of-the-art 2D diffusion models, including SDXL and multiple ControlNets, to capture structural features and intricate details in the generated textures. The method also employs a symmetrical view synthesis strategy combined with regional prompts for enhancing view consistency. Additionally, it introduces novel texture blending and soft-inpainting techniques, which significantly reduce the seam regions. Extensive experiments demonstrate that our method outperforms existing state-of-the-art methods.
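To make the texture blending and soft-inpainting ideas in the abstract concrete, the sketch below shows one common way such steps can be realized: a confidence-weighted running average of back-projected views, plus a feathered (non-binary) inpainting mask so new content fades into existing texture rather than seaming. The function names, the cosine-based confidence, and the Gaussian feathering are assumptions for illustration, not the paper's actual algorithm.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blend_view_into_atlas(atlas, atlas_w, view_tex, view_conf):
    """Confidence-weighted running average of back-projected views.

    atlas:     (H, W, 3) current texture estimate in UV space
    atlas_w:   (H, W)    accumulated per-texel confidence
    view_tex:  (H, W, 3) colors back-projected from the new view
    view_conf: (H, W)    per-texel confidence for this view, e.g. the
                         cosine between view direction and surface normal
    """
    total = atlas_w + view_conf + 1e-8
    blended = (
        atlas * atlas_w[..., None] + view_tex * view_conf[..., None]
    ) / total[..., None]
    return blended, total

def soft_inpaint_mask(hard_mask, feather_sigma=4.0):
    """Blur a binary 'needs texture' mask into a soft mask so newly
    generated content blends into existing texture instead of seaming."""
    soft = gaussian_filter(hard_mask.astype(np.float32), feather_sigma)
    return np.clip(soft, 0.0, 1.0)

# Usage: composite diffusion output into the atlas with the soft mask.
# m = soft_inpaint_mask(untextured)                       # (H, W) in [0, 1]
# atlas = m[..., None] * generated + (1 - m[..., None]) * atlas
```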