Finding Dori: Memorization in Text-to-Image Diffusion Models Is Less Local Than Assumed
Antoni Kowalczuk, Dominik Hintersdorf, Lukas Struppek, Kristian Kersting, Adam Dziedzic, Franziska Boenisch
2025-07-24
Summary
This paper examines how text-to-image diffusion models memorize parts of their training images and can reproduce them at generation time, which raises concerns about privacy and originality.
What's the problem?
Current defenses try to stop these models from replicating training images by pruning the model weights thought to store the memorized content. This is not enough: even small adjustments to the text embeddings can re-trigger the model and make it recreate those images again.
What's the solution?
The researchers found that memorization is spread across the model more diffusely than previously assumed, so pruning a small set of weights is not sufficient. They show that stronger methods are needed to truly eliminate memorized content from these models.
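The re-triggering effect can be illustrated with a toy sketch: a small Gaussian perturbation of a text embedding barely changes it in cosine similarity, which is why nearby points in embedding space can still elicit memorized outputs even after pruning. This is a minimal illustration, not the authors' code; the embedding, dimension, and function names are hypothetical.

```python
import numpy as np

def perturb_embedding(embedding, eps=0.01, seed=0):
    # Add a small Gaussian perturbation to a text embedding,
    # standing in for the minor embedding adjustments discussed
    # in the paper (illustrative only).
    rng = np.random.default_rng(seed)
    noise = rng.normal(scale=eps, size=embedding.shape)
    return embedding + noise

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-in for a text embedding (dim 768, as in common CLIP encoders).
emb = np.random.default_rng(42).normal(size=768)
emb_perturbed = perturb_embedding(emb, eps=0.01)

# The perturbed embedding remains almost identical to the original,
# so a model conditioned on it can behave nearly the same way.
sim = cosine_similarity(emb, emb_perturbed)
print(f"cosine similarity: {sim:.4f}")
```

In a real pipeline such a perturbed embedding would be passed to the diffusion model in place of the original prompt embedding; the point here is only that tiny embedding-space moves are nearly invisible by similarity measures.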
Why it matters?
Addressing unwanted memorization matters for protecting copyright and privacy, and for ensuring that AI-generated images are original and safe to use.
Abstract
Pruning-based defenses in text-to-image diffusion models are insufficient: minor adjustments to text embeddings can re-trigger data replication, necessitating methods that truly remove memorized content.