
Voxlect: A Speech Foundation Model Benchmark for Modeling Dialects and Regional Languages Around the Globe

Tiantian Feng, Kevin Huang, Anfeng Xu, Xuan Shi, Thanathai Lertpetchpun, Jihwan Lee, Yoonjeong Lee, Dani Byrd, Shrikanth Narayanan

2025-08-05


Summary

This paper introduces Voxlect, a new benchmark designed to test how well speech AI models can recognize and work with different dialects and regional languages from around the world.

What's the problem?

Most speech AI models struggle to understand or correctly classify the many dialects and regional variations of languages, which limits their effectiveness in real-world applications.

What's the solution?

Voxlect addresses this by collecting over two million speech samples from 30 publicly available datasets with detailed dialect labels, then using them to evaluate how well existing speech foundation models classify and process dialects under a variety of conditions.
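To make the evaluation idea concrete, here is a minimal sketch of how a dialect benchmark might aggregate results: group a model's predictions by ground-truth dialect label and report per-dialect accuracy. The prediction data and dialect labels below are invented for illustration and are not from the Voxlect benchmark itself.

```python
from collections import defaultdict

def per_dialect_accuracy(samples):
    """Compute classification accuracy broken down by dialect.

    samples: list of (true_dialect, predicted_dialect) pairs,
    e.g. produced by running a speech foundation model over
    dialect-labeled audio clips.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for true_label, predicted in samples:
        total[true_label] += 1
        if predicted == true_label:
            correct[true_label] += 1
    # Per-dialect accuracy exposes which dialects a model handles poorly,
    # even when overall accuracy looks acceptable.
    return {dialect: correct[dialect] / total[dialect] for dialect in total}

# Hypothetical predictions for two English dialects.
predictions = [
    ("en-US", "en-US"), ("en-US", "en-GB"),
    ("en-GB", "en-GB"), ("en-GB", "en-GB"),
]
print(per_dialect_accuracy(predictions))
```

Reporting accuracy per dialect, rather than a single global score, is what lets a benchmark like this reveal gaps in dialect coverage.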

Why it matters?

This matters because better handling of dialects and regional languages in speech AI improves technologies like speech recognition and speech generation, making voice assistants and other tools more accurate and accessible to people everywhere.

Abstract

Voxlect is a benchmark for evaluating speech foundation models on dialect classification and downstream applications across multiple languages and dialects.