DRAGON: Distributional Rewards Optimize Diffusion Generative Models
Yatong Bai, Jonah Casebeer, Somayeh Sojoudi, Nicholas J. Bryan
2025-04-22

Summary
This paper introduces DRAGON, a new framework for improving AI models that generate images, music, or other content. It uses a more flexible reward system during fine-tuning, so the results better match what people find high-quality.
What's the problem?
Most approaches to fine-tuning generative models depend on large amounts of human feedback about which outputs look or sound good, which is slow and expensive to collect. Even with that feedback, the models may still miss what people actually like, because the reward signal is too simple or too limited.
What's the solution?
The researchers developed DRAGON, which optimizes distributional rewards. Instead of relying on huge sets of human ratings that score each output in isolation, DRAGON can evaluate how good the model's outputs are as a whole, for example by comparing the distribution of generated content against a reference set of exemplars. This richer, more flexible feedback helps the model learn faster and produce higher-quality results that match human preferences, without needing as much human annotation.
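To make the idea of a distributional reward concrete, here is a minimal sketch (not the paper's actual reward): it scores a whole batch of generated outputs by comparing their average embedding to that of a reference set, rather than rating each sample alone. The function names and the choice of statistic (a linear-kernel, MMD-style mean difference) are illustrative assumptions.

```python
# Illustrative sketch of a distributional reward (not DRAGON's exact method).
# Embeddings are represented as plain lists of floats for simplicity.

def mean_embedding(embeddings):
    # Average a batch of embedding vectors component-wise.
    dim = len(embeddings[0])
    n = len(embeddings)
    return [sum(e[i] for e in embeddings) / n for i in range(dim)]

def distributional_reward(generated, reference):
    # Score a *batch* of generations against a reference set as a whole:
    # negative squared distance between the two mean embeddings.
    # Higher (closer to zero) means the generated distribution better
    # matches the reference distribution.
    g = mean_embedding(generated)
    r = mean_embedding(reference)
    return -sum((gi - ri) ** 2 for gi, ri in zip(g, r))
```

For example, a batch whose embeddings cluster near the reference set receives a higher reward than a batch whose embeddings lie far away, even if no individual sample is ever rated by a human.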
Why does it matter?
This matters because it makes it easier and faster to train AI systems to generate images, music, and other creative content that people actually enjoy, which is important for art, entertainment, and many other fields where creative quality matters.
Abstract
DRAGON is a flexible framework for fine-tuning generative models using distributional rewards, outperforming traditional methods in optimizing human-perceived quality without extensive preference annotations.