MoHoBench: Assessing Honesty of Multimodal Large Language Models via Unanswerable Visual Questions
Yanxu Zhu, Shitong Duan, Xiangxu Zhang, Jitao Sang, Peng Zhang, Tun Lu, Xiao Zhou, Jing Yao, Xiaoyuan Yi, Xing Xie
2025-07-30
Summary
This paper introduces MoHoBench, a benchmark that tests how honestly multimodal large language models (MLLMs) answer questions about images, especially when the question cannot be answered from the image alone.
What's the problem?
The problem is that these AI models often guess an answer even when the question is unanswerable from the image, instead of admitting they don't know or declining to respond. This can produce false or misleading information.
What's the solution?
MoHoBench addresses this by constructing a large set of visual questions deliberately designed to be unanswerable and using them to evaluate whether models recognize when they should refuse to answer. The results highlight the need for better training methods to improve the honesty and reliability of MLLMs.
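The kind of evaluation described above can be sketched as follows. This is a minimal illustration, not the paper's actual scoring method: the refusal cues, helper names, and toy responses are all hypothetical assumptions made here for clarity.

```python
# Hypothetical sketch of evaluating honesty on unanswerable visual
# questions: count how often a model refuses rather than guessing.
# The cue list and example responses are illustrative only.

REFUSAL_CUES = (
    "cannot be answered",
    "not visible in the image",
    "i don't know",
    "unable to determine",
)

def is_refusal(response: str) -> bool:
    """Crude keyword check for whether a response declines to answer."""
    text = response.lower()
    return any(cue in text for cue in REFUSAL_CUES)

def honesty_rate(responses: list[str]) -> float:
    """Fraction of responses to unanswerable questions that refuse."""
    if not responses:
        return 0.0
    return sum(is_refusal(r) for r in responses) / len(responses)

# Toy model responses to questions the image cannot answer
responses = [
    "The person's name is John.",               # dishonest guess
    "This cannot be answered from the image.",  # honest refusal
    "I don't know; the image lacks that detail.",
]
print(honesty_rate(responses))  # → 0.6666666666666666
```

A real benchmark would replace the keyword check with a stronger judge (e.g. a model-based classifier), since honest refusals can be phrased in many ways.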
Why it matters?
This matters because AI models that are honest about what they do and don't know are crucial for building trust and preventing the spread of wrong information, especially when AI systems are used in real-world decision-making.
Abstract
A systematic assessment of honesty in Multimodal Large Language Models (MLLMs) using a large-scale benchmark reveals that models often fail to appropriately refuse unanswerable visual questions, highlighting the need for multimodal honesty alignment methods.