Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow
Fu-Yun Wang, Ling Yang, Zhaoyang Huang, Mengdi Wang, Hongsheng Li
2024-10-13

Summary
This paper introduces Rectified Diffusion, which extends the rectification technique used in rectified flow to diffusion models in general by dropping the requirement that the generation path be straight.
What's the problem?
Generative models, particularly diffusion models, have become popular for creating images. However, they generate images slowly because sampling requires numerically solving a generative ODE over many steps. Existing methods, like rectified flow, try to speed up this process by making the path of image generation straight. The authors argue that straightness is not the essential ingredient, and that insisting on it restricts rectification to flow-matching models.
What's the solution?
The authors propose Rectified Diffusion, which argues that the goal should not be a straight generation path but a first-order approximate ODE path. For diffusion forms such as DDPM and Sub-VP this path is inherently curved; a straight path is only the special case that arises for flow-matching models. They attribute the success of rectification mainly to using a pretrained diffusion model to obtain matched pairs of noise and samples and then retraining on those pairs, so the flow-matching form and v-prediction are not required. This simplifies the training process and extends rectification beyond flow-matching models.
Why does it matter?
This research is important because it makes rectification simpler to train and more widely applicable. The authors validate the method on Stable Diffusion v1-5 and Stable Diffusion XL and report better performance than earlier rectified flow-based work such as InstaFlow at lower training cost. By relaxing the requirement for straight paths, Rectified Diffusion lets the broader family of diffusion models benefit from faster generation, which is relevant to applications like art, design, and virtual reality.
Abstract
Diffusion models have greatly improved visual generation but are hindered by slow generation speed due to the computationally intensive nature of solving generative ODEs. Rectified flow, a widely recognized solution, improves generation speed by straightening the ODE path. Its key components include: 1) using the diffusion form of flow-matching, 2) employing v-prediction, and 3) performing rectification (a.k.a. reflow). In this paper, we argue that the success of rectification primarily lies in using a pretrained diffusion model to obtain matched pairs of noise and samples, followed by retraining with these matched noise-sample pairs. Based on this, components 1) and 2) are unnecessary. Furthermore, we highlight that straightness is not an essential training target for rectification; rather, it is a specific case of flow-matching models. The more critical training target is to achieve a first-order approximate ODE path, which is inherently curved for models like DDPM and Sub-VP. Building on this insight, we propose Rectified Diffusion, which generalizes the design space and application scope of rectification to encompass the broader category of diffusion models, rather than being restricted to flow-matching models. We validate our method on Stable Diffusion v1-5 and Stable Diffusion XL. Our method not only greatly simplifies the training procedure of previous rectified flow-based works (e.g., InstaFlow) but also achieves superior performance with even lower training cost. Our code is available at https://github.com/G-U-N/Rectified-Diffusion.
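
The distinction between a straight path and a first-order path can be illustrated with a small numerical sketch. It assumes the usual diffusion parameterization x_t = alpha_t * x_0 + sigma_t * eps with a fixed matched pair (x_0, eps). The cosine schedule, the function names, and the use of the true noise as a stand-in for a perfect epsilon-prediction are assumptions made for illustration; none of them is taken from the paper's code.

```python
import numpy as np

def schedule(t):
    """Hypothetical variance-preserving schedule (alpha_t^2 + sigma_t^2 = 1)."""
    return np.cos(0.5 * np.pi * t), np.sin(0.5 * np.pi * t)

def path_point(x0, eps, t):
    """Point at time t on the path tied to one matched (x0, eps) pair."""
    alpha, sigma = schedule(t)
    return alpha * x0 + sigma * eps

def one_step_x0(xt, eps_pred, t):
    """Recover x0 from x_t in a single step, given an epsilon prediction."""
    alpha, sigma = schedule(t)
    return (xt - sigma * eps_pred) / alpha

rng = np.random.default_rng(0)
x0 = rng.normal(size=4)   # stand-in for a generated sample
eps = rng.normal(size=4)  # the noise it is matched with

# The path is curved: its point at t = 0.5 is not the midpoint of the chord.
curved_mid = path_point(x0, eps, 0.5)
straight_mid = 0.5 * x0 + 0.5 * eps
is_straight = np.allclose(curved_mid, straight_mid)

# Yet one step from any t < 1 lands exactly on x0 when the prediction
# equals the matched noise (t = 1 is skipped because alpha_t is ~0 there).
recovered = [one_step_x0(path_point(x0, eps, t), eps, t) for t in (0.25, 0.5, 0.9)]
all_exact = all(np.allclose(r, x0) for r in recovered)

print("straight:", is_straight, "| one-step exact:", all_exact)
```

Under these assumptions, the property that makes single-step sampling work is that the same (x_0, eps) pair explains every point on the path, not that the path is a straight line.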