Multimodal Situational Safety

Kaiwen Zhou, Chengzhi Liu, Xuandong Zhao, Anderson Compalas, Dawn Song, Xin Eric Wang

2024-10-10

Summary

This paper discusses Multimodal Situational Safety, a new safety challenge for multimodal large language models (MLLMs) that evaluates their ability to assess safety in different situations based on visual and language inputs.

What's the problem?

As MLLMs become more advanced and capable of interacting with both humans and their environments, there are growing concerns about their safety. Current systems often fail to consider how the context of a situation—like the visual environment—affects the safety of their responses. This can lead to unsafe or inappropriate actions based on user queries.

What's the solution?

To address this issue, the authors developed a benchmark called MSSBench, which includes 1,820 pairs of language queries and images. Half of these pairs depict safe situations, while the other half show unsafe contexts. The benchmark allows researchers to evaluate how well MLLMs can understand and respond safely to queries based on the visual context. The study also explores multi-agent systems that work together to improve safety in responses.

Why it matters?

This research is significant because it highlights the need for MLLMs to consider situational safety in their interactions. By creating a framework for evaluating how well these models assess safety based on context, the study aims to improve the reliability of AI systems in real-world applications, ensuring they act safely and appropriately in various scenarios.

Abstract

Multimodal Large Language Models (MLLMs) are rapidly evolving, demonstrating impressive capabilities as multimodal assistants that interact with both humans and their environments. However, this increased sophistication introduces significant safety concerns. In this paper, we present the first evaluation and analysis of a novel safety challenge termed Multimodal Situational Safety, which explores how safety considerations vary based on the specific situation in which the user or agent is engaged. We argue that for an MLLM to respond safely, whether through language or action, it often needs to assess the safety implications of a language query within its corresponding visual context. To evaluate this capability, we develop the Multimodal Situational Safety benchmark (MSSBench) to assess the situational safety performance of current MLLMs. The dataset comprises 1,820 language query-image pairs, half of which the image context is safe, and the other half is unsafe. We also develop an evaluation framework that analyzes key safety aspects, including explicit safety reasoning, visual understanding, and, crucially, situational safety reasoning. Our findings reveal that current MLLMs struggle with this nuanced safety problem in the instruction-following setting and struggle to tackle these situational safety challenges all at once, highlighting a key area for future research. Furthermore, we develop multi-agent pipelines to coordinately solve safety challenges, which shows consistent improvement in safety over the original MLLM response. Code and data: mssbench.github.io.

View Paper