DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation
Jiwook Kim, Seonho Lee, Jaeyo Shin, Jiho Choi, Hyunjung Shim
2024-07-17

Summary
This paper introduces DreamCatalyst, a new framework for fast, high-quality text-driven 3D editing that reduces training time while improving the fidelity of edited scenes.
What's the problem?
Current methods for editing 3D scenes, especially those based on a technique called score distillation sampling (SDS), require long training times and often produce low-quality results. SDS is popular for 3D editing because it naturally enforces 3D consistency, but existing SDS-based methods deviate from the sampling dynamics of the diffusion models they build on, which is the main source of these problems.
What's the solution?
DreamCatalyst addresses these issues by reinterpreting SDS-based editing as an approximation of the diffusion reverse process. Its objective function is designed to follow the sampling dynamics of diffusion models, which reduces training time and improves the quality of the edited results. DreamCatalyst offers two modes: a faster mode that edits a NeRF scene in about 25 minutes, and a high-quality mode that takes under 70 minutes and produces superior results, outperforming existing methods in both speed and quality.
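To make the idea concrete, here is a toy sketch of SDS-style optimization in which the noise timestep is annealed from high to low, mimicking a diffusion reverse process rather than sampling timesteps uniformly. Everything below (the dummy denoiser, the noise schedule, the weighting) is a hypothetical stand-in for illustration, not the paper's actual objective:

```python
import numpy as np

rng = np.random.default_rng(0)

def dummy_denoiser(x_noisy, t):
    # Stand-in for a pretrained diffusion model's noise prediction;
    # a real implementation would condition on the editing text prompt.
    return 0.1 * x_noisy * t

def sds_gradient(x, t, alpha_bar):
    """SDS-style update direction: w(t) * (predicted noise - injected noise)."""
    eps = rng.standard_normal(x.shape)
    x_noisy = np.sqrt(alpha_bar) * x + np.sqrt(1 - alpha_bar) * eps
    eps_hat = dummy_denoiser(x_noisy, t)
    w = 1 - alpha_bar  # one common weighting choice
    return w * (eps_hat - eps)

# Toy "rendered scene" parameters being optimized (a real method
# would backpropagate into NeRF parameters through the renderer).
x = rng.standard_normal((4, 4))

# Plain SDS samples t uniformly at every step; sweeping t from high
# noise to low noise instead makes the optimization trajectory
# resemble the diffusion reverse process, which is the spirit of
# DreamCatalyst's reinterpretation.
for t in np.linspace(0.9, 0.1, 5):
    alpha_bar = 1 - t  # toy noise schedule
    x -= 0.1 * sds_gradient(x, t, alpha_bar)

print(x.shape)
```

The key design point is only the timestep schedule: the update rule is unchanged, but ordering the noise levels like a reverse diffusion trajectory is what the summary above describes as matching the sampling dynamics.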
Why it matters?
This research is significant because it provides a faster, higher-quality way to edit 3D scenes, which is important for applications such as video games, film, and virtual reality. By streamlining the editing process, DreamCatalyst helps creators produce high-quality visuals more quickly, making advanced graphics tools more accessible.
Abstract
Score distillation sampling (SDS) has emerged as an effective framework in text-driven 3D editing tasks due to its inherent 3D consistency. However, existing SDS-based 3D editing methods suffer from extensive training time and lead to low-quality results, primarily because these methods deviate from the sampling dynamics of diffusion models. In this paper, we propose DreamCatalyst, a novel framework that interprets SDS-based editing as a diffusion reverse process. Our objective function considers the sampling dynamics, thereby making the optimization process of DreamCatalyst an approximation of the diffusion reverse process in editing tasks. DreamCatalyst aims to reduce training time and improve editing quality. DreamCatalyst presents two modes: (1) a faster mode, which edits the NeRF scene in only about 25 minutes, and (2) a high-quality mode, which produces superior results in less than 70 minutes. Specifically, our high-quality mode outperforms current state-of-the-art NeRF editing methods both in terms of speed and quality. See more extensive results on our project page: https://dream-catalyst.github.io.