
Diffusion Models without Classifier-free Guidance

Zhicong Tang, Jianmin Bao, Dong Chen, Baining Guo

2025-02-18


Summary

This paper introduces Model-guidance (MG), a new training objective for diffusion models that removes the need for classifier-free guidance (CFG). It simplifies high-quality image generation while making the models faster and more efficient.

What's the problem?

Diffusion models, which are used to generate images, often rely on a technique called classifier-free guidance (CFG) to control how closely the generated images match a given condition, such as a class label. While CFG works well, it slows down training and doubles the cost of image generation, because the model must be evaluated twice (once with and once without the condition) at every sampling step.
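To see where the extra cost comes from, here is a minimal sketch of a CFG denoising step; the `model` callable, its signature, and the toy values are illustrative stand-ins, not code from the paper.

```python
def cfg_denoise(model, x_t, t, cond, guidance_scale=4.0):
    # CFG needs TWO forward passes of the network per sampling step:
    eps_cond = model(x_t, t, cond)    # prediction with the condition
    eps_uncond = model(x_t, t, None)  # prediction with a null condition
    # Extrapolate away from the unconditional prediction toward the
    # conditional one; larger scales enforce the condition more strongly.
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy stand-in model: predicts 2.0 when conditioned, 1.0 otherwise.
toy_model = lambda x, t, c: 2.0 if c is not None else 1.0
guided = cfg_denoise(toy_model, x_t=0.0, t=0, cond="class_7")
# guided = 1.0 + 4.0 * (2.0 - 1.0) = 5.0
```

Because both forward passes run at every one of the hundreds of sampling steps, removing the second pass roughly halves inference cost.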

What's the solution?

The researchers introduced Model-guidance (MG), which changes how diffusion models are trained by incorporating the posterior probability of the conditions directly into the training objective. This eliminates the need for CFG at sampling time while matching or even surpassing its performance: MG speeds up training and doubles the speed of image generation without sacrificing quality. They tested MG on multiple models and datasets and showed that it outperforms other methods, achieving state-of-the-art results on benchmarks like ImageNet 256.
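The payoff at sampling time can be sketched as follows. This is a schematic of the idea only, not the paper's exact loss: the guidance that CFG would apply at inference is folded into the training target, so a trained model produces already-guided predictions in a single forward pass. All names and the toy model are hypothetical.

```python
def mg_training_target(model, x_t, t, cond, noise, w=1.0):
    # Schematic: offset the plain noise target by the CFG-style gap
    # between conditional and unconditional predictions, so the model
    # learns to emit guided outputs itself (simplified; see the paper).
    eps_cond = model(x_t, t, cond)
    eps_uncond = model(x_t, t, None)
    return noise + w * (eps_cond - eps_uncond)

def mg_sample_step(model, x_t, t, cond):
    # At inference, one forward pass replaces CFG's two passes.
    return model(x_t, t, cond)

# Toy stand-in model: predicts 2.0 when conditioned, 1.0 otherwise.
toy_model = lambda x, t, c: 2.0 if c is not None else 1.0
target = mg_training_target(toy_model, x_t=0.0, t=0, cond="class_7", noise=0.5)
step = mg_sample_step(toy_model, x_t=0.0, t=0, cond="class_7")
```

The single-pass `mg_sample_step` is what yields the doubled inference speed reported in the paper.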

Why it matters?

This matters because MG makes diffusion models faster and more efficient, which could save time and resources in applications like art creation, design, and scientific visualization. By removing the need for CFG, this method simplifies the process while still delivering top-quality results, making it easier to use these models in real-world scenarios.

Abstract

This paper presents Model-guidance (MG), a novel objective for training diffusion models that removes the need for the commonly used Classifier-free guidance (CFG). Our approach goes beyond modeling the data distribution alone to also incorporate the posterior probability of conditions. The proposed technique originates from the idea of CFG and is simple yet effective, making it a plug-and-play module for existing models. Our method significantly accelerates training, doubles the inference speed, and achieves quality that parallels and even surpasses concurrent diffusion models that use CFG. Extensive experiments demonstrate its effectiveness, efficiency, and scalability across different models and datasets. Finally, we establish state-of-the-art performance on the ImageNet 256 benchmark with an FID of 1.34. Our code is available at https://github.com/tzco/Diffusion-wo-CFG.