Adapting Automatic Speech Recognition for Accented Air Traffic Control Communications

Marcus Yu Zhe Wee, Justin Juin Hng Wong, Lynus Lim, Joe Yu Wei Tan, Prannaya Gupta, Dillion Lim, En Hao Tew, Aloysius Keng Siew Han, Yong Zhi Lim

2025-02-27

Summary

This paper is about improving automatic speech recognition (ASR) systems so they can better understand Southeast Asian accents in air traffic control (ATC) communications. The researchers created a specialized dataset and fine-tuned AI models to recognize these accents more accurately.

What's the problem?

Current speech recognition systems struggle to understand Southeast Asian accents, especially in noisy air traffic control environments. This is a serious safety concern because clear communication is crucial for aviation.

What's the solution?

The researchers built a new dataset of Southeast Asian-accented speech from air traffic control conversations. They then used this dataset to fine-tune existing speech recognition models, making them better at understanding these specific accents. They also applied techniques to help the models perform well in noisy conditions.
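One common way to make an ASR model robust to noise is additive noise augmentation: during training, background noise is mixed into clean recordings at a chosen signal-to-noise ratio (SNR). The paper does not detail its exact noise-handling method, so the sketch below is only an illustration of the general technique; the function name and signal values are made up for the example.

```python
import numpy as np

def add_noise(clean: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix a noise waveform into a clean waveform at a target SNR (in dB)."""
    # Tile/trim the noise so it matches the clean signal's length
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale noise so that 10*log10(clean_power / scaled_noise_power) == snr_db
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise

# Toy stand-ins for a speech clip and cockpit/radio noise (not real ATC audio)
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noise = rng.standard_normal(8000)
noisy = add_noise(speech, noise, snr_db=10.0)
```

Training on such noisy copies alongside the clean originals teaches the model to transcribe speech even when radio static or cockpit noise is present.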

Why it matters?

This research matters because it makes air traffic control communications safer and more efficient for Southeast Asian speakers. It could help prevent misunderstandings that might lead to accidents. The improved speech recognition could also be useful in other areas where understanding different accents is important, like in international business or emergency services. By showing how to create region-specific datasets and train models for particular accents, this work provides a blueprint for making speech recognition technology more inclusive and accurate for people from all parts of the world.

Abstract

Effective communication in Air Traffic Control (ATC) is critical to maintaining aviation safety, yet the challenges posed by accented English remain largely unaddressed in Automatic Speech Recognition (ASR) systems. Existing models struggle with transcription accuracy for Southeast Asian-accented (SEA-accented) speech, particularly in noisy ATC environments. This study presents the development of ASR models fine-tuned specifically for Southeast Asian accents using a newly created dataset. Our research achieves significant improvements, reaching a Word Error Rate (WER) of 0.0982, or 9.82%, on SEA-accented ATC speech. Additionally, the paper highlights the importance of region-specific datasets and accent-focused training, offering a pathway for deploying ASR systems in resource-constrained military operations. The findings emphasize the need for noise-robust training techniques and region-specific datasets to improve transcription accuracy for non-Western accents in ATC communications.
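The headline metric, Word Error Rate (WER), is the word-level edit distance between the model's transcript and the reference, divided by the number of reference words. A minimal sketch of the standard computation (the ATC-style sentences below are invented for illustration, not taken from the paper's dataset):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming table for edit distance over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# Hypothetical ATC-style transcripts: one substitution over 9 reference words
ref = "singapore tower cleared to land runway two zero left"
hyp = "singapore tower clear to land runway two zero left"
print(round(wer(ref, hyp), 4))  # 0.1111
```

A WER of 0.0982 therefore means roughly one word in ten is transcribed incorrectly; in practice, libraries such as jiwer compute this same quantity.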