Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening

Ye Tian, Ling Yang, Xinchen Zhang, Yunhai Tong, Mengdi Wang, Bin Cui

2025-02-18

Summary

This paper introduces Diffusion-Sharpening, a new way to make AI image generation models better at creating images that match what people ask for. It's like teaching an artist to paint exactly what you want by showing them how to improve their whole painting process, not just individual brush strokes.

What's the problem?

Current methods for improving AI image generators either focus on small parts of the creation process or make the AI work much harder when creating images. This means the AI either doesn't get better at understanding what people really want, or it becomes too slow to be useful.

What's the solution?

The researchers created Diffusion-Sharpening, which looks at the entire image creation process and figures out the best way to make images that match what people ask for. During training, it uses a path integral framework to score several candidate 'paths' for creating an image, picks the one a reward model rates highest, and then teaches the AI to follow paths like it. Because this selection happens at training time rather than at generation time, the AI learns faster and creates better images without slowing down when it's actually used.

Why it matters?

This matters because it could make AI image generators much better at creating exactly what people want, while still being fast enough to use in real-world applications. Better AI artists could help in fields like design, entertainment, and education by quickly creating custom images that match specific needs. It's a big step towards making AI creativity more useful and accessible to everyone.

Abstract

We propose Diffusion-Sharpening, a fine-tuning approach that enhances downstream alignment by optimizing sampling trajectories. Existing RL-based fine-tuning methods focus on single training timesteps and neglect trajectory-level alignment, while recent sampling trajectory optimization methods incur significant inference NFE costs. Diffusion-Sharpening overcomes this by using a path integral framework to select optimal trajectories during training, leveraging reward feedback, and amortizing inference costs. Our method demonstrates superior training efficiency with faster convergence, and best inference efficiency without requiring additional NFEs. Extensive experiments show that Diffusion-Sharpening outperforms RL-based fine-tuning methods (e.g., Diffusion-DPO) and sampling trajectory optimization methods (e.g., Inference Scaling) across diverse metrics including text alignment, compositional capabilities, and human preferences, offering a scalable and efficient solution for future diffusion model fine-tuning. Code: https://github.com/Gen-Verse/Diffusion-Sharpening