C3: A Bilingual Benchmark for Spoken Dialogue Models Exploring Challenges in Complex Conversations
Chengqian Ma, Wei Tao, Yiwen Guo
2025-08-01
Summary
This paper introduces C3, a new benchmark dataset designed to evaluate how well spoken dialogue models understand and respond in complex conversations in both English and Chinese.
What's the problem?
Current spoken dialogue models struggle with challenging aspects of conversation, such as ambiguous phrasing or meaning that depends on what was said earlier, which makes their responses less natural and less accurate.
What's the solution?
C3 addresses this by providing a large collection of challenging conversation examples in both languages that capture these difficult situations, so developers can better evaluate and improve their dialogue models.
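To make the evaluation idea concrete, here is a minimal sketch of how a bilingual benchmark of this kind might be used to score a dialogue model per language. All names here (the example records, the toy model, the accuracy rule) are hypothetical illustrations, not the paper's actual data format or API.

```python
# Hypothetical benchmark items: each pairs a conversational context with a
# question probing a difficult phenomenon (e.g. ambiguity) and a set of
# acceptable answers. These records are invented for illustration only.
benchmark = [
    {"lang": "en",
     "context": ["A: I saw her duck."],
     "question": "What does 'duck' mean here?",
     "acceptable": {"the bird", "the action of ducking"}},
    {"lang": "zh",
     "context": ["A: 他的话还没说完。"],
     "question": "Is the utterance complete?",
     "acceptable": {"no"}},
]

def toy_model(context, question):
    """Stand-in for a spoken dialogue model's text response (hypothetical)."""
    return "the bird" if "duck" in question else "no"

def evaluate(model, items):
    """Return per-language accuracy: the fraction of items where the
    model's answer falls in the acceptable set."""
    totals, hits = {}, {}
    for item in items:
        lang = item["lang"]
        totals[lang] = totals.get(lang, 0) + 1
        answer = model(item["context"], item["question"])
        if answer in item["acceptable"]:
            hits[lang] = hits.get(lang, 0) + 1
    return {lang: hits.get(lang, 0) / n for lang, n in totals.items()}

scores = evaluate(toy_model, benchmark)
print(scores)  # per-language accuracy, e.g. {'en': 1.0, 'zh': 1.0}
```

Reporting accuracy separately per language, as sketched above, is what lets a bilingual benchmark reveal whether a model handles these phenomena equally well in English and Chinese.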
Why does it matter?
This matters because improving spoken dialogue models helps make voice assistants and chatbots smarter and more reliable, allowing them to understand and respond naturally to people in different languages.
Abstract
A benchmark dataset for Spoken Dialogue Models (SDMs) in English and Chinese is presented to evaluate their performance in understanding and emulating human conversations, addressing challenges like ambiguity and context-dependency.