Quantifying Fairness in LLMs Beyond Tokens: A Semantic and Statistical Perspective
Weijie Xu, Yiwen Wang, Chi Xue, Xiangkun Hu, Xi Fang, Guimin Dong, Chandan K. Reddy
2025-06-24
Summary
This paper introduces FiSCo, a method for evaluating fairness in large language models by analyzing the semantic content and statistical properties of their long-form responses to prompts about different demographic groups.
What's the problem?
Language models can treat demographic groups unfairly, but existing fairness tests mostly operate on tokens or short text spans rather than on full, long-form responses, so they miss subtle biases.
What's the solution?
The researchers developed FiSCo, which compares the meanings of the model's long-form answers across demographic groups, using entailment checks to measure semantic differences and statistical hypothesis testing to determine whether those differences reveal hidden unfairness.
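The core idea can be sketched in a few lines. The snippet below is a simplified illustration, not the paper's implementation: `semantic_similarity` is a hypothetical stand-in (here, token Jaccard overlap) for the entailment-based scorer the paper uses, and the group comparison is a Welch two-sample t-test contrasting within-group similarity against between-group similarity.

```python
import math
from itertools import combinations, product
from statistics import mean, variance


def semantic_similarity(a: str, b: str) -> float:
    # Placeholder for an entailment-based semantic scorer (hypothetical);
    # token-level Jaccard overlap stands in for illustration only.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0


def welch_t(x: list, y: list) -> float:
    # Welch's two-sample t statistic: (mean_x - mean_y) / sqrt(vx/nx + vy/ny).
    nx, ny = len(x), len(y)
    vx, vy = variance(x), variance(y)
    return (mean(x) - mean(y)) / math.sqrt(vx / nx + vy / ny)


def fairness_gap(group_a: list, group_b: list) -> float:
    # Similarities among responses for the same group...
    within = [semantic_similarity(a, b) for a, b in combinations(group_a, 2)]
    within += [semantic_similarity(a, b) for a, b in combinations(group_b, 2)]
    # ...versus similarities across the two groups.
    between = [semantic_similarity(a, b) for a, b in product(group_a, group_b)]
    # A large positive statistic suggests responses differ systematically
    # by group, i.e., potential unfairness.
    return welch_t(within, between)
```

In the actual method, the similarity scores come from claim-level entailment between responses, and the hypothesis test decides whether the within-group vs. between-group gap is statistically significant.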
Why does it matter?
This matters because it helps build fairer, less biased AI systems by verifying that their answers treat all groups equitably, especially when the AI gives longer, more complex replies where subtle bias is hardest to spot.
Abstract
FiSCo evaluates LLM fairness by detecting semantic differences in long-form responses across demographic groups, using entailment checks and statistical hypothesis testing to identify subtle biases.