ProReflow: Progressive Reflow with Decomposed Velocity
Lei Ke, Haohang Xu, Xuefei Ning, Yu Li, Jiajun Li, Haoling Li, Yuxuan Lin, Dongsheng Jiang, Yujiu Yang, Linfeng Zhang
2025-03-10
Summary
This paper introduces ProReflow, a training method that lets diffusion models generate high-quality images in only a few sampling steps by improving how flow matching (reflow) is trained.
What's the problem?
Diffusion models, which are used to create high-quality images and videos, are slow and computationally expensive because they need many sampling steps to produce a result. Flow matching can cut the number of steps by straightening the generation trajectory, but its original training pipeline is not optimal, which makes it hard to get both speed and quality.
What's the solution?
The researchers introduce two techniques. Progressive reflow makes training easier by first reflowing the model within local timestep windows and then progressively extending this until the whole diffusion process is covered. Aligned v-prediction puts more weight on matching the direction of the predicted velocity than on matching its magnitude. Together, these improvements let the model generate high-quality images in as few as 4 sampling steps while staying close to the quality of its teacher model, which uses 32 DDIM steps.
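The staged structure of progressive reflow can be illustrated with a small sketch. It only builds the timestep windows for each stage and does no training. The halving schedule, the starting window count, and the function names `local_windows` and `progressive_schedule` are assumptions made for illustration; this summary does not state the schedule the authors actually use.

```python
def local_windows(num_windows):
    """Split the time interval [0, 1] into equal-width local windows."""
    edges = [i / num_windows for i in range(num_windows + 1)]
    return list(zip(edges[:-1], edges[1:]))


def progressive_schedule(start_windows):
    """Return one window partition per training stage.

    Assumption: the number of windows is halved after each stage until a
    single window covers the whole interval [0, 1].
    """
    stages = []
    k = start_windows
    while k >= 1:
        stages.append(local_windows(k))
        k //= 2
    return stages


if __name__ == "__main__":
    # Each stage would reflow the model inside every listed window
    # before moving on to the next, coarser partition.
    for stage, windows in enumerate(progressive_schedule(4)):
        print(f"stage {stage}: {windows}")
```

In this sketch, early stages ask the model to straighten only short segments of the trajectory, and the final stage covers the full process with a single window, which mirrors the "local first, whole process last" idea described above.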
Why it matters?
This matters because it makes generation much faster and less resource-intensive, which is important for real-time applications like animation or virtual reality. By reducing computation costs, ProReflow could make advanced AI tools more accessible to creative industries and everyday users.
Abstract
Diffusion models have achieved significant progress in both image and video generation but still suffer from huge computation costs. As an effective solution, flow matching aims to reflow the diffusion process of diffusion models into a straight line for few-step and even one-step generation. However, in this paper, we suggest that the original training pipeline of flow matching is not optimal and introduce two techniques to improve it. First, we introduce progressive reflow, which progressively reflows the diffusion models in local timesteps until the whole diffusion process is covered, reducing the difficulty of flow matching. Second, we introduce aligned v-prediction, which highlights the importance of direction matching over magnitude matching in flow matching. Experimental results on SDv1.5 and SDXL demonstrate the effectiveness of our method; for example, on SDv1.5 it achieves an FID of 10.70 on the MSCOCO2014 validation set with only 4 sampling steps, close to our teacher model (32 DDIM steps, FID = 10.05).
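To make the contrast between direction matching and magnitude matching concrete, here is a minimal sketch of a velocity loss with an added direction term. It is not the paper's loss: combining mean squared error with a cosine-similarity penalty, the `direction_weight` parameter, and all function names are assumptions chosen for illustration.

```python
import math


def mse(pred, target):
    """Mean squared error between two velocity vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def cosine_similarity(pred, target, eps=1e-8):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_t = math.sqrt(sum(t * t for t in target))
    return dot / (norm_p * norm_t + eps)


def direction_aware_loss(pred, target, direction_weight=1.0):
    """Hypothetical loss: MSE plus a penalty for pointing the wrong way."""
    direction_term = 1.0 - cosine_similarity(pred, target)
    return mse(pred, target) + direction_weight * direction_term


if __name__ == "__main__":
    target = [1.0, 0.0]
    right_direction = [0.5, 0.0]  # correct direction, wrong magnitude
    wrong_direction = [1.0, 0.5]  # same MSE, but rotated away from target
    print(mse(right_direction, target), mse(wrong_direction, target))
    print(direction_aware_loss(right_direction, target))
    print(direction_aware_loss(wrong_direction, target))
```

The two example predictions have the same mean squared error, so a pure magnitude-based loss cannot tell them apart. The direction term penalizes only the rotated prediction, which is the kind of preference that direction matching expresses.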