HumanAgencyBench: Scalable Evaluation of Human Agency Support in AI Assistants
Benjamin Sturgeon, Daniel Samuelson, Jacob Haimes, Jacy Reese Anthis
2025-09-11
Summary
This paper explores the risk of losing control over our choices as we rely more on artificial intelligence, and proposes a way to measure how well AI systems support our ability to make our own decisions.
What's the problem?
As AI gets better at making suggestions and even decisions for us, like what to watch online, there's a danger we'll start following those suggestions without really thinking for ourselves, ultimately giving up some of our independence and control over our lives. Current AI systems aren't consistently designed to respect and encourage our own judgment.
What's the solution?
The researchers created a benchmark called HumanAgencyBench, or HAB, to test how well AI assistants support 'human agency' – our ability to act as independent decision-makers. HAB looks at six key areas: whether the AI asks questions to understand what *you* want, avoids pushing its own values onto you, corrects false information, lets you make important choices, encourages you to learn, and respects social boundaries. They then tested several popular AI systems using this benchmark to see how they performed.
Why it matters?
This work is important because it highlights the need to build AI that *helps* us make good decisions, rather than making decisions *for* us. It shows that simply making AI more powerful or better at following instructions isn't enough; we need to specifically design AI to respect our autonomy and support our ability to think for ourselves, and HAB provides a tool to measure and improve this.
Abstract
As humans delegate more tasks and decisions to artificial intelligence (AI), we risk losing control of our individual and collective futures. Relatively simple algorithmic systems already steer human decision-making, such as social media feed algorithms that lead people to unintentionally and absent-mindedly scroll through engagement-optimized content. In this paper, we develop the idea of human agency by integrating philosophical and scientific theories of agency with AI-assisted evaluation methods: using large language models (LLMs) to simulate and validate user queries and to evaluate AI responses. We develop HumanAgencyBench (HAB), a scalable and adaptive benchmark with six dimensions of human agency based on typical AI use cases. HAB measures the tendency of an AI assistant or agent to Ask Clarifying Questions, Avoid Value Manipulation, Correct Misinformation, Defer Important Decisions, Encourage Learning, and Maintain Social Boundaries. We find low-to-moderate agency support in contemporary LLM-based assistants and substantial variation across system developers and dimensions. For example, while Anthropic LLMs most support human agency overall, they are the least supportive LLMs in terms of Avoid Value Manipulation. Agency support does not appear to consistently result from increasing LLM capabilities or instruction-following behavior (e.g., RLHF), and we encourage a shift towards more robust safety and alignment targets.