Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

Zibo Zhao, Zeqiang Lai, Qingxiang Lin, Yunfei Zhao, Haolin Liu, Shuhui Yang, Yifei Feng, Mingxin Yang, Sheng Zhang, Xianghui Yang, Huiwen Shi, Sicong Liu, Junta Wu, Yihang Lian, Fan Yang, Ruining Tang, Zebin He, Xinzhou Wang, Jian Liu, Xuhui Zuo, Zhuo Chen, Biwen Lei

2025-01-22

Summary

This paper introduces Hunyuan3D 2.0, an AI system that generates detailed, textured 3D assets from 2D images. It works like a digital artist that turns a flat picture into a realistic 3D model.

What's the problem?

Creating high-quality 3D models and textures is usually a time-consuming process that requires a lot of skill and effort from artists. It's especially challenging to make 3D objects that match a specific 2D image accurately. Current AI tools for 3D creation often struggle with getting the details right or making textures that look realistic.

What's the solution?

The researchers developed Hunyuan3D 2.0, which has two main parts: Hunyuan3D-DiT for creating the 3D shape, and Hunyuan3D-Paint for adding realistic textures. The shape model is built on a flow-based diffusion transformer so that the generated geometry closely matches the input image. They also created Hunyuan3D-Studio, a user-friendly production platform that lets both professionals and beginners edit and animate these 3D models. The team evaluated the system against other leading open-source and closed-source 3D generation models and reports that Hunyuan3D 2.0 performs better in geometric detail, alignment with the input image, and texture quality.

Why does it matter?

This matters because it could speed up 3D content creation for video games, movies, and virtual reality, and make complex 3D modeling accessible to people who aren't professional 3D artists. That could lead to more diverse and creative 3D content in many fields. Additionally, by making their code and pre-trained models publicly available, the researchers give the open-source 3D community a large-scale foundation model to build on and improve.

Abstract

We present Hunyuan3D 2.0, an advanced large-scale 3D synthesis system for generating high-resolution textured 3D assets. This system includes two foundation components: a large-scale shape generation model -- Hunyuan3D-DiT, and a large-scale texture synthesis model -- Hunyuan3D-Paint. The shape generative model, built on a scalable flow-based diffusion transformer, aims to create geometry that properly aligns with a given condition image, laying a solid foundation for downstream applications. The texture synthesis model, benefiting from strong geometric and diffusion priors, produces high-resolution and vibrant texture maps for either generated or hand-crafted meshes. Furthermore, we build Hunyuan3D-Studio -- a versatile, user-friendly production platform that simplifies the re-creation process of 3D assets. It allows both professional and amateur users to manipulate or even animate their meshes efficiently. We systematically evaluate our models, showing that Hunyuan3D 2.0 outperforms previous state-of-the-art models, both open-source and closed-source, in geometry details, condition alignment, texture quality, and more. Hunyuan3D 2.0 is publicly released in order to fill the gaps in the open-source 3D community for large-scale foundation generative models. The code and pre-trained weights of our models are available at: https://github.com/Tencent/Hunyuan3D-2
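
To give a rough intuition for what "flow-based" sampling means, here is a toy sketch in Python. It is not the Hunyuan3D-DiT implementation. It assumes the common flow-matching formulation, in which a sample starts as Gaussian noise at t = 0 and is moved to a data point at t = 1 by integrating a velocity field. The functions `toy_velocity` and `euler_sample` are hypothetical names, and the closed-form velocity toward a known target stands in for the learned, image-conditioned transformer.

```python
import random


def toy_velocity(x, t, target):
    """Hypothetical stand-in for the learned network.

    Returns the velocity that carries x to `target` along a straight line,
    arriving at t = 1. A real model would predict this from x, t, and the
    condition image instead of knowing the target.
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def euler_sample(target, num_steps=10, seed=0):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with Euler steps."""
    rng = random.Random(seed)
    # Start from Gaussian noise at t = 0.
    x = [rng.gauss(0.0, 1.0) for _ in target]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so 1 - t is never zero
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


sample = euler_sample([0.5, -1.0, 2.0])
```

In this toy setting the final Euler step lands on the target up to floating-point error, because the velocity field is exact. With a learned velocity field the result is only approximate, and the number of steps trades speed against quality.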
