
DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation

Wang Zhao, Yan-Pei Cao, Jiale Xu, Yuejiang Dong, Ying Shan

2024-12-20

Summary

This paper presents DI-PCG, a new method for creating high-quality 3D assets via inverse Procedural Content Generation (PCG). Rather than generating 3D geometry directly, it efficiently recovers the procedural parameters needed to construct a desired 3D shape from an input image.

What's the problem?

Creating 3D models with procedural content generation is challenging because producing a desired shape usually requires extensive manual parameter tuning. Existing automatic approaches fall short in different ways: sampling-based methods need many iterations to converge, while neural network-based methods offer only limited control over the final output.

What's the solution?

DI-PCG introduces a lightweight diffusion transformer model that simplifies this process. Instead of denoising images, the model treats the PCG parameters themselves as the denoising target, using the observed image as a condition to guide parameter generation. The approach is efficient: the network has only 7.6M parameters and takes about 30 GPU hours to train, yet it recovers parameters accurately and generalizes well to in-the-wild images. The authors' experiments show that DI-PCG performs strongly on both inverse PCG and image-to-3D generation tasks.
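To make the core idea concrete, here is a minimal sketch of diffusion over a PCG parameter vector conditioned on image features. All names and the toy denoiser are illustrative assumptions, not the authors' code; a real implementation would replace the stand-in denoiser with the trained diffusion transformer.

```python
# Toy sketch: denoise a PCG parameter vector (not an image), conditioned
# on image features. Schedule and update follow the standard DDPM recipe.
import numpy as np

rng = np.random.default_rng(0)

T = 50                                   # number of diffusion steps
betas = np.linspace(1e-4, 0.02, T)       # linear noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def forward_noise(params, t):
    """q(x_t | x_0): corrupt clean PCG parameters to timestep t."""
    eps = rng.standard_normal(params.shape)
    x_t = np.sqrt(alpha_bars[t]) * params + np.sqrt(1 - alpha_bars[t]) * eps
    return x_t, eps

def toy_denoiser(x_t, t, image_cond):
    """Stand-in for the diffusion transformer: predicts the noise from the
    noisy parameters, the timestep, and the image condition."""
    # A real model would be a learned transformer; this constant-ish
    # output just lets the reverse loop run end to end.
    return 0.0 * x_t + 0.01 * image_cond.mean()

def sample_params(image_cond, dim=8):
    """Reverse diffusion: start from Gaussian noise, denoise to a
    PCG parameter vector guided by the image condition."""
    x = rng.standard_normal(dim)
    for t in reversed(range(T)):
        eps_hat = toy_denoiser(x, t, image_cond)
        # DDPM mean update; add noise except at the final step
        x = (x - betas[t] / np.sqrt(1 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:
            x = x + np.sqrt(betas[t]) * rng.standard_normal(dim)
    return x

image_features = rng.standard_normal(16)  # e.g. from a frozen image encoder
pcg_params = sample_params(image_features)
print(pcg_params.shape)  # (8,)
```

The sampled vector would then be fed to the procedural generator (e.g. a Blender-style PCG pipeline) to construct the final 3D asset.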

Why it matters?

This research is significant because it makes it easier and faster to create detailed 3D models, which are essential in fields like gaming, animation, and virtual reality. By improving the efficiency of generating 3D assets, DI-PCG can help artists and developers save time and resources while enhancing the quality of their work.

Abstract

Procedural Content Generation (PCG) is powerful in creating high-quality 3D content, yet controlling it to produce desired shapes is difficult and often requires extensive parameter tuning. Inverse Procedural Content Generation aims to automatically find the best parameters under the input condition. However, existing sampling-based and neural network-based methods still suffer from numerous sample iterations or limited controllability. In this work, we present DI-PCG, a novel and efficient method for Inverse PCG from general image conditions. At its core is a lightweight diffusion transformer model, where PCG parameters are directly treated as the denoising target and the observed images as conditions to control parameter generation. DI-PCG is efficient and effective. With only 7.6M network parameters and 30 GPU hours to train, it demonstrates superior performance in recovering parameters accurately and generalizing well to in-the-wild images. Quantitative and qualitative experimental results validate the effectiveness of DI-PCG in inverse PCG and image-to-3D generation tasks. DI-PCG offers a promising approach for efficient inverse PCG and represents a valuable exploration step towards a 3D generation path that models how to construct a 3D asset using parametric models.