DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization

Zhenglin Zhou, Xiaobo Xia, Fan Ma, Hehe Fan, Yi Yang, Tat-Seng Chua

2025-02-11

Summary

This paper introduces DreamDPO, a new way to create 3D content from text descriptions that better matches what people actually want and like.

What's the problem?

Current methods for turning text into 3D content often fail to produce what people really want or expect. This makes these tools less useful and harder to apply across different purposes.

What's the solution?

The researchers created DreamDPO, which uses a structured process to capture what people prefer. It works by generating pairs of candidate 3D outputs, using a reward model or a large multimodal model to judge which candidate better matches human preferences, and then nudging the 3D representation toward the preferred one. This method doesn't need precise numerical ratings of each output, just a judgment of which one is preferred.
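The pairwise loop above can be illustrated with a minimal, hypothetical sketch. This is not the paper's actual pipeline (which renders and optimizes a real 3D representation and queries a learned reward or multimodal model); here the "3D representation" is a plain parameter vector and the reward model is a toy scoring function, so only the generate-compare-update pattern carries over:

```python
import numpy as np

rng = np.random.default_rng(0)

def reward(render):
    # Stand-in for a learned reward model: prefers outputs
    # close to an arbitrary all-ones target.
    return -np.mean((render - 1.0) ** 2)

def preference_step(params, lr=0.1):
    # 1. Construct a pair of candidates by perturbing the representation.
    noise = 0.1 * rng.standard_normal(params.shape)
    cand_a, cand_b = params + noise, params - noise
    # 2. Compare: only ask WHICH candidate is preferred,
    #    not for a precise score of each.
    winner = cand_a if reward(cand_a) > reward(cand_b) else cand_b
    # 3. Optimize: nudge the representation toward the preferred candidate.
    return params + lr * (winner - params)

params = np.zeros(8)
initial = reward(params)
for _ in range(300):
    params = preference_step(params)
final = reward(params)
```

Because each update moves part of the way toward the preferred candidate, the score never decreases, even though the "judge" only ever reports a binary preference.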

Why it matters?

This matters because it could make 3D content creation tools much more useful and easier to use. It could help in fields like video game design, movie special effects, or even education, where 3D models can help explain complex ideas. By producing 3D content that better matches what people want, these tools could see wider adoption in many different areas.

Abstract

Text-to-3D generation automates 3D content creation from textual descriptions, which offers transformative potential across various fields. However, existing methods often struggle to align generated content with human preferences, limiting their applicability and flexibility. To address these limitations, in this paper, we propose DreamDPO, an optimization-based framework that integrates human preferences into the 3D generation process through direct preference optimization. Practically, DreamDPO first constructs pairwise examples, then compares their alignment with human preferences using reward or large multimodal models, and lastly optimizes the 3D representation with a preference-driven loss function. By leveraging pairwise comparison to reflect preferences, DreamDPO reduces reliance on precise pointwise quality evaluations while enabling fine-grained controllability through preference-guided optimization. Experiments demonstrate that DreamDPO achieves competitive results, and provides higher-quality and more controllable 3D content compared to existing methods. The code and models will be open-sourced.
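For reference, the "direct preference optimization" the abstract builds on comes from DPO for language models, whose standard objective is shown below. DreamDPO's preference-driven loss adapts this idea to 3D optimization, so its exact form may differ; this is the general DPO loss, where $y_w$ and $y_l$ are the preferred and rejected examples in a pair:

$$
\mathcal{L}_{\mathrm{DPO}} = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
$$

Intuitively, the loss raises the model's relative likelihood of the preferred example over the rejected one, using only the pairwise preference signal rather than absolute quality scores.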