Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models
NaHyeon Park, Namin An, Kunhee Kim, Soyeon Yoon, Jiahao Huo, Hyunjung Shim
2025-12-05
Summary
This research investigates whether advanced AI image generators, specifically those using large vision-language models, unintentionally create images that reflect and even worsen existing societal biases.
What's the problem?
AI image generators are becoming incredibly popular, but there's a concern that they might be reinforcing harmful stereotypes. The study found that these newer, more complex AI models (those built on large vision-language models) actually produce *more* biased images than older models. The core issue is that the instructions given to these AI models, called 'system prompts,' seem to be subtly pushing them to create images based on pre-existing, potentially unfair assumptions about people and demographics.
What's the solution?
The researchers developed a method called FairPro that doesn't require retraining the AI model itself. Instead, FairPro acts like a 'self-audit' system, allowing the AI to examine its own instructions and rewrite them to be more fair. It essentially helps the AI construct better system prompts that avoid encoding those biased assumptions. They tested this on two LVLM-based image generators, SANA and Qwen-Image, and found it significantly reduced bias in the generated images while still keeping the images faithful to the text descriptions.
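The self-audit idea above can be sketched in a few lines. This is a hypothetical illustration, not the paper's actual implementation: `call_lvlm` is a stand-in for whatever chat call the deployed LVLM exposes, and the audit instructions are paraphrased assumptions about what a fairness-aware meta-prompt might say.

```python
def call_lvlm(prompt: str) -> str:
    """Stub for a real LVLM call. A deployment would send `prompt` to the
    model; here we return a canned fairness-aware rewrite for illustration."""
    return (
        "You are an image-generation assistant. When a user prompt leaves "
        "demographic attributes (gender, age, skin tone) unspecified, do not "
        "infer them from stereotypes; treat them as unspecified."
    )


def fairpro_system_prompt(original_system_prompt: str) -> str:
    """FairPro-style test-time self-audit (hypothetical sketch): ask the model
    to inspect its own system prompt for demographic priors and rewrite it.
    No retraining is involved; only the system prompt changes."""
    meta_prompt = (
        "Audit the following system prompt for implicit demographic "
        "assumptions. Rewrite it so unspecified attributes stay "
        "unspecified rather than being filled in by stereotype:\n\n"
        f"{original_system_prompt}"
    )
    return call_lvlm(meta_prompt)


if __name__ == "__main__":
    audited = fairpro_system_prompt("You are a helpful image-generation assistant.")
    print(audited)
```

The key design point is that the rewritten system prompt simply replaces the original at generation time, which is why the approach is deployable without touching model weights.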
Why it matters?
This work is important because it highlights a hidden source of bias in AI image generation – the system prompts. It provides a practical way to make these powerful tools more responsible and equitable, ensuring they don't perpetuate harmful stereotypes and contribute to unfair representations in the images they create. It offers a way to improve these systems without needing to completely rebuild them, making it easier to implement in real-world applications.
Abstract
Large vision-language model (LVLM) based text-to-image (T2I) systems have become the dominant paradigm in image generation, yet whether they amplify social biases remains insufficiently understood. In this paper, we show that LVLM-based models produce markedly more socially biased images than non-LVLM-based models. We introduce a 1,024-prompt benchmark spanning four levels of linguistic complexity and systematically evaluate demographic bias across multiple attributes. Our analysis identifies system prompts, the predefined instructions guiding LVLMs, as a primary driver of biased behavior. Through decoded intermediate representations, token-probability diagnostics, and embedding-association analyses, we reveal how system prompts encode demographic priors that propagate into image synthesis. Building on this insight, we propose FairPro, a training-free meta-prompting framework that enables LVLMs to self-audit and construct fairness-aware system prompts at test time. Experiments on two LVLM-based T2I models, SANA and Qwen-Image, show that FairPro substantially reduces demographic bias while preserving text-image alignment. We believe our findings provide deeper insight into the central role of system prompts in bias propagation and offer a practical, deployable approach for building more socially responsible T2I systems.