MineTheGap: Automatic Mining of Biases in Text-to-Image Models
Noa Cohen, Nurit Spingarn-Eliezer, Inbar Huberman-Spiegelglas, Tomer Michaeli
2025-12-22
Summary
This paper introduces a new technique called MineTheGap that automatically finds text prompts which cause AI image generators to produce biased results.
What's the problem?
AI models that create images from text often make assumptions when the text leaves details unspecified, leading to unfair or limited results, like always showing a certain race for a particular job. This isn't just a fairness issue; it also means you don't get a wide variety of images when you're trying to generate a set of them, which can be frustrating for users.
What's the solution?
The researchers developed MineTheGap, which uses a process similar to evolution: it starts with a pool of prompts and then repeatedly tweaks them, keeping the ones that most strongly reveal biases in the image generator. They measure how biased a prompt is by comparing the set of images a prompt produces against plausible variations of that prompt written by another AI, looking for a gap between the variety the text variations allow and the variety the images actually show.
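The evolutionary loop described above can be sketched as follows. This is a toy illustration, not the paper's implementation: `bias_score` and `mutate` are stand-ins (in MineTheGap the score compares image and text distributions, and mutations come from an LLM), and all names and parameters here are hypothetical.

```python
import hashlib
import random

def bias_score(prompt):
    """Stand-in for the paper's bias score. The real score compares the
    distribution of images generated from `prompt` with the distribution
    of LLM-written prompt variations; here we derive a deterministic toy
    value from the prompt text so the loop is runnable."""
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    return digest[0] / 255.0

def mutate(prompt, vocab):
    """Toy mutation: swap or append a word. The real method would use an
    LLM to rephrase or extend the prompt."""
    words = prompt.split()
    if len(words) > 1 and random.random() < 0.5:
        words[random.randrange(len(words))] = random.choice(vocab)
    else:
        words.append(random.choice(vocab))
    return " ".join(words)

def mine_biased_prompts(seed_prompts, vocab, generations=10, keep=4):
    """Genetic-style loop: score the pool, keep the most bias-revealing
    prompts, and mutate the survivors to form the next generation."""
    population = list(seed_prompts)
    for _ in range(generations):
        population.sort(key=bias_score, reverse=True)
        survivors = population[:keep]
        children = [mutate(p, vocab) for p in survivors]
        population = survivors + children
    return sorted(population, key=bias_score, reverse=True)

random.seed(0)
top = mine_biased_prompts(
    ["a photo of a doctor", "a portrait of a CEO"],
    vocab=["nurse", "engineer", "teacher", "smiling", "office"],
)
print(top[0])  # prompt with the highest toy bias score found
```

Because the best-scoring prompt always survives into the next generation, the top score is non-decreasing over generations, which is the basic property the optimization relies on.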
Why does it matter?
This work is important because it provides a way to proactively identify and address biases in AI image generators. By automatically finding these problematic prompts, developers can work to improve the models and ensure they produce fairer and more diverse images, ultimately leading to a better experience for everyone.
Abstract
Text-to-Image (TTI) models generate images based on text prompts, which often leave certain aspects of the desired image ambiguous. When faced with these ambiguities, TTI models have been shown to exhibit biases in their interpretations. These biases can have societal impacts, e.g., when showing only a certain race for a stated occupation. They can also degrade user experience by creating redundancy within a set of generated images instead of spanning diverse possibilities. Here, we introduce MineTheGap - a method for automatically mining prompts that cause a TTI model to generate biased outputs. Our method goes beyond merely detecting bias for a given prompt. Rather, it leverages a genetic algorithm to iteratively refine a pool of prompts, seeking those that expose biases. This optimization process is driven by a novel bias score, which ranks biases according to their severity, as we validate on a dataset with known biases. For a given prompt, this score is obtained by comparing the distribution of generated images to the distribution of LLM-generated texts that constitute variations on the prompt. Code and examples are available on the project's webpage.
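The abstract's distribution comparison can be illustrated conceptually. The sketch below assumes both the generated images and the LLM-written prompt variations have been embedded into a shared vector space (e.g., via a CLIP-style encoder) and uses mean pairwise distance as a simple diversity proxy; the paper's actual score may differ, and the gap-based formula here is an assumption for illustration only.

```python
import numpy as np

def diversity(embeddings):
    """Mean pairwise Euclidean distance: a simple measure of how spread
    out a set of embeddings is."""
    n = len(embeddings)
    dists = [np.linalg.norm(embeddings[i] - embeddings[j])
             for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(dists))

def bias_score(image_embs, text_embs):
    """Toy bias score (an assumption, not the paper's formula): how much
    narrower the image distribution is than the distribution of LLM
    prompt variations. Larger values mean the images collapse onto fewer
    possibilities than the ambiguous prompt allows."""
    return diversity(text_embs) - diversity(image_embs)

rng = np.random.default_rng(0)
# Hypothetical embeddings: text variations spread widely, while the
# generated images cluster tightly around one interpretation.
text_embs = rng.normal(0.0, 1.0, size=(8, 16))
image_embs = rng.normal(0.0, 0.1, size=(8, 16))
print(round(bias_score(image_embs, text_embs), 3))  # positive: bias detected
```

A score near zero would indicate the model spans roughly as many interpretations as the prompt's wording permits; a large positive score flags a prompt worth surfacing to developers.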