Token Perturbation Guidance for Diffusion Models
Javad Rajabi, Soroush Mehraban, Seyedmorteza Sadat, Babak Taati
2025-06-15
Summary
This paper introduces Token Perturbation Guidance (TPG), a new guidance method that improves the quality of images generated by diffusion models. TPG perturbs the model's intermediate token representations during generation to steer the output, without needing any extra training or changes to the model's architecture.
What's the problem?
Classifier-free guidance (CFG), the most popular guidance method, improves image quality but requires a specific training procedure and only works for conditional generation. Existing training-free methods guide the model less effectively, especially in the early stages of generation, where the main shapes and structure of the image are formed, and so they produce lower-quality results.
What's the solution?
The authors propose TPG, which shuffles the tokens inside the model to produce a clear and stable guidance signal while generating images. The shuffling preserves important global information but disrupts less important local details, and the model is guided using that contrast. Since it doesn't need training or architectural changes, TPG can be used for both conditional and unconditional image generation, giving benefits similar to CFG in a more flexible way.
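As a rough illustration of the idea, here is a minimal NumPy sketch, not the authors' implementation. It assumes tokens are stored as a `(num_tokens, dim)` array and that the guided output is formed with a CFG-style extrapolation away from the perturbed prediction. The function names `shuffle_tokens` and `guided_prediction` and the `scale` parameter are hypothetical and chosen only for this example.

```python
import numpy as np


def shuffle_tokens(tokens, rng):
    """Randomly permute tokens along the sequence axis.

    tokens: array of shape (num_tokens, dim). The same set of tokens is
    returned in a different order, so global statistics such as the
    per-channel mean and the overall norm are unchanged, while the local
    arrangement of tokens is disrupted.
    """
    perm = rng.permutation(tokens.shape[0])
    return tokens[perm]


def guided_prediction(pred, pred_perturbed, scale):
    """CFG-style extrapolation (an assumption made for this sketch).

    Moves the normal prediction further away from the prediction obtained
    with shuffled tokens. With scale = 0 the normal prediction is returned
    unchanged.
    """
    return pred + scale * (pred - pred_perturbed)


rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 8))   # 16 tokens, 8 channels each
shuffled = shuffle_tokens(tokens, rng)

# Shuffling keeps the per-channel mean over tokens (a global statistic).
print(np.allclose(tokens.mean(axis=0), shuffled.mean(axis=0)))
```

In a real diffusion model the shuffle would be applied to intermediate token representations inside the network, and the two predictions would come from two forward passes, one normal and one with shuffled tokens. Those details are omitted here.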
Why does it matter?
This matters because TPG offers a simple way to improve image quality in diffusion models without the training requirements and conditioning restrictions of CFG. Because it works with existing models as they are, it makes high-quality guidance available in more settings, including unconditional generation.
Abstract
Token Perturbation Guidance (TPG) enhances diffusion models with condition-agnostic, training-free guidance, similar to classifier-free guidance (CFG), without requiring architectural changes.