Grounded Misunderstandings in Asymmetric Dialogue: A Perspectivist Annotation Scheme for MapTask

Nan Li, Albert Gatt, Massimo Poesio

2025-11-06

Summary

This research investigates how people truly understand each other during conversations, specifically when one person has more knowledge or control than the other. It focuses on how we build shared understanding, and how that understanding can *seem* complete even when people are actually thinking about different things.

What's the problem?

When people talk, they assume they're on the same page, but sometimes they aren't. This is especially true when one person knows more about a topic than the other. The problem is figuring out how to identify these hidden misunderstandings, where people *think* they agree but are actually referring to different objects or ideas. It's hard to study because we usually only see what people *say*, not what they're actually thinking and understanding.

What's the solution?

The researchers created a way to analyze existing conversation recordings, specifically the HCRC MapTask corpus, in which one person guides another along a route on a map. A computer system powered by a large language model, constrained to follow the researchers' annotation scheme, examined each referring expression, such as a mention of a landmark. The system didn't just record *what* was said; it also judged what the speaker and the listener each understood at that moment, even when those understandings differed. This produced a detailed record of how understanding evolved, or broke down, over the course of each conversation.
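
To make the dual-perspective record concrete, here is a minimal Python sketch of what one annotation and its derived understanding state might look like. The field names, state labels, and landmark IDs are invented for illustration; this is a reading of the approach, not the authors' actual schema or pipeline.

```python
# A hedged sketch of a perspectivist annotation record: one entry per
# referring expression, with the speaker's and addressee's interpretations
# stored separately. All names here are illustrative assumptions.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ReferenceAnnotation:
    utterance_id: str                   # which turn the expression occurs in
    expression: str                     # surface form, e.g. "the white mountain"
    speaker_referent: Optional[str]     # landmark the speaker has in mind
    addressee_referent: Optional[str]   # landmark the addressee resolves it to

def understanding_state(ann: ReferenceAnnotation) -> str:
    """Derive a coarse understanding state by comparing the two perspectives."""
    if ann.addressee_referent is None:
        return "unresolved"     # the addressee formed no interpretation
    if ann.speaker_referent == ann.addressee_referent:
        return "grounded"       # both sides pick out the same entity
    return "misaligned"         # apparent agreement masks a divergence

# Example: the instruction giver means one landmark, the follower another,
# yet nothing in the surface dialogue signals a problem.
ann = ReferenceAnnotation(
    utterance_id="u42",
    expression="the white mountain",
    speaker_referent="white_mountain_west",
    addressee_referent="white_mountain_east",
)
print(understanding_state(ann))  # -> "misaligned"
```

Storing the two interpretations separately, rather than a single "gold" referent, is what lets the analysis distinguish genuine grounding from conversations that merely look grounded.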

Why it matters?

This work is important because it gives us a better way to understand how misunderstandings happen in real-life conversations. It also provides a valuable resource for testing how well artificial intelligence (like chatbots) can handle the complexities of human communication, especially the need to understand different perspectives and ensure true shared understanding. Ultimately, it helps us build AI that can communicate more effectively with people.

Abstract

Collaborative dialogue relies on participants incrementally establishing common ground, yet in asymmetric settings they may believe they agree while referring to different entities. We introduce a perspectivist annotation scheme for the HCRC MapTask corpus (Anderson et al., 1991) that separately captures speaker and addressee grounded interpretations for each reference expression, enabling us to trace how understanding emerges, diverges, and repairs over time. Using a scheme-constrained LLM annotation pipeline, we obtain 13k annotated reference expressions with reliability estimates and analyze the resulting understanding states. The results show that full misunderstandings are rare once lexical variants are unified, but multiplicity discrepancies systematically induce divergences, revealing how apparent grounding can mask referential misalignment. Our framework provides both a resource and an analytic lens for studying grounded misunderstanding and for evaluating (V)LLMs' capacity to model perspective-dependent grounding in collaborative dialogue.
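
The finding that full misunderstandings become rare "once lexical variants are unified" implies a normalization step before the two perspectives are compared. Below is a minimal sketch of what such unification might look like; the variant table and canonical IDs are invented for illustration and are not the paper's actual procedure.

```python
# A hedged sketch of lexical-variant unification: different wordings of the
# same landmark are mapped to one canonical ID before referents are compared,
# so only genuine divergences count as misunderstandings.
VARIANTS = {
    "gt. viewpoint": "great_viewpoint",
    "great view point": "great_viewpoint",
    "great viewpoint": "great_viewpoint",
}

def normalize(label: str) -> str:
    """Map a surface label to a canonical landmark ID (fallback: slugify)."""
    key = label.strip().lower()
    return VARIANTS.get(key, key.replace(" ", "_"))

def same_referent(speaker_label: str, addressee_label: str) -> bool:
    """After unification, compare the two perspectives' referents."""
    return normalize(speaker_label) == normalize(addressee_label)

# "Great View Point" vs. "gt. viewpoint" is a lexical variant, not a
# misunderstanding, so it should not be counted as a divergence.
assert same_referent("Great View Point", "gt. viewpoint")

# A multiplicity discrepancy is different: if one map shows two white
# mountains and the other shows one, the unified label alone cannot tell
# which instance each participant means, and divergence can persist.
```

This is why the abstract singles out multiplicity discrepancies: unification removes spurious mismatches in wording, but it cannot resolve which of several same-type landmarks a participant intends.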