Mitigating the Noise Shift for Denoising Generative Models via Noise Awareness Guidance

Jincheng Zhong, Boyuan Jiang, Xin Tao, Pengfei Wan, Kun Gai, Mingsheng Long

2025-10-15

Summary

This paper investigates a problem with how current image generation models, called diffusion models, handle noise during the image creation process, and proposes a way to fix it to get better results.

What's the problem?

Diffusion models work by gradually removing noise from a random starting point to create an image. The researchers found that the actual amount of noise in the intermediate states during sampling often drifts away from the amount the model's noise schedule says should be there, a mismatch they call 'noise shift'. Because the model's denoising updates assume the scheduled noise level, this mismatch pushes intermediate states outside the distribution the model was trained on and makes its updates inaccurate, producing lower-quality images.
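The mismatch can be illustrated with a toy numerical sketch (not the paper's setup): a few discretized denoising steps where the denoiser removes slightly less noise than the schedule assumes, so the residual noise level drifts above the scheduled one. All quantities here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a clean "image", a pre-defined noise schedule, and an
# imperfect denoiser that removes only 90% of the noise it should.
x0 = rng.standard_normal(1000)      # clean sample
sigmas = [1.0, 0.6, 0.3]            # scheduled noise levels

x = x0 + sigmas[0] * rng.standard_normal(1000)
for s_prev, s_next in zip(sigmas[:-1], sigmas[1:]):
    noise_est = x - x0              # oracle noise estimate, for clarity
    # Under-correction: only 90% of the prescribed noise reduction.
    x = x - 0.9 * (1 - s_next / s_prev) * noise_est

actual_sigma = np.std(x - x0)
print(f"scheduled: {sigmas[-1]:.2f}, actual: {actual_sigma:.2f}")
```

A perfect denoiser would leave exactly the scheduled 0.30 of noise; the under-correcting one leaves about 0.35, and the gap compounds over steps — this is the kind of systematic bias the paper calls noise shift.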

What's the solution?

To solve this, the researchers developed a technique called Noise Awareness Guidance (NAG). NAG essentially guides the model to pay closer attention to the actual noise levels during image generation, correcting the 'noise shift' and keeping the process on track. They also created a version of NAG that doesn't require a separate 'classifier' to work, making it simpler to implement by training the model to understand noise levels directly through a special dropout method.
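The classifier-free variant can be sketched in the style of classifier-free guidance: a single model trained with noise-condition dropout can predict with or without the noise-level input, and the two predictions are mixed at sampling time. The function names, the stand-in model, and the exact mixing formula below are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def eps_model(x, t, noise_cond=True):
    """Stand-in 'network'. A real model would be a neural net trained
    with noise-condition dropout, so it can run in both modes; here we
    fake the two branches with different constants."""
    return x * (0.9 if noise_cond else 0.8)

def nag_prediction(x, t, w=1.5):
    """Hypothetical classifier-free NAG combination (CFG-style)."""
    eps_cond = eps_model(x, t, noise_cond=True)     # noise-aware branch
    eps_uncond = eps_model(x, t, noise_cond=False)  # noise-dropped branch
    # Extrapolate toward the noise-aware prediction, steering the
    # trajectory to stay consistent with the scheduled noise level.
    return eps_uncond + w * (eps_cond - eps_uncond)

print(nag_prediction(np.ones(4), t=0))
```

The dropout trick mirrors classifier-free guidance for text prompts: randomly hiding the noise-level condition during training forces one network to learn both branches, so no external noise-level classifier is needed at sampling time.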

Why it matters?

This research is important because it identifies a fundamental flaw in many popular image generation models and provides a straightforward way to improve their performance. By fixing the 'noise shift', the models can create higher-quality images and perform better in tasks like fine-tuning for specific styles or subjects, ultimately making these powerful tools more reliable and useful.

Abstract

Existing denoising generative models rely on solving discretized reverse-time SDEs or ODEs. In this paper, we identify a long-overlooked yet pervasive issue in this family of models: a misalignment between the pre-defined noise level and the actual noise level encoded in intermediate states during sampling. We refer to this misalignment as noise shift. Through empirical analysis, we demonstrate that noise shift is widespread in modern diffusion models and exhibits a systematic bias, leading to sub-optimal generation due to both out-of-distribution generalization and inaccurate denoising updates. To address this problem, we propose Noise Awareness Guidance (NAG), a simple yet effective correction method that explicitly steers sampling trajectories to remain consistent with the pre-defined noise schedule. We further introduce a classifier-free variant of NAG, which jointly trains a noise-conditional and a noise-unconditional model via noise-condition dropout, thereby eliminating the need for external classifiers. Extensive experiments, including ImageNet generation and various supervised fine-tuning tasks, show that NAG consistently mitigates noise shift and substantially improves the generation quality of mainstream diffusion models.