Investigating Hallucination in Conversations for Low Resource Languages
Amit Das, Md. Najib Hasan, Souvika Sarkar, Zheng Zhang, Fatemeh Jamshidi, Tathagata Bhattacharya, Nilanjana Raychawdhury, Dongji Feng, Vinija Jain, Aman Chadha
2025-08-04
Summary
This paper examines how large language models (LLMs) produce fewer hallucinations, i.e., believable but fabricated statements, when generating conversational responses in Mandarin than in lower-resource languages such as Hindi and Farsi, a pattern that holds across multiple models.
What's the problem?
LLMs sometimes generate false or made-up information that sounds believable, a failure known as hallucination. Hallucination is worse in some languages than in others, especially languages with less training data available.
What's the solution?
The paper measures and compares how often hallucinations occur in conversational responses generated in low-resource languages such as Hindi and Farsi versus Mandarin. Across multiple models, responses in Mandarin contain fewer hallucinations, which helps clarify how a language's resource level relates to model limitations.
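As an illustration only, not the paper's actual evaluation pipeline, the kind of per-language comparison described above could be tallied from responses that have already been labeled as hallucinated or faithful. The record format, field names, and model names in this sketch are assumptions.

```python
from collections import defaultdict

# Hypothetical labeled records: each conversational response has been judged
# as hallucinated (True) or faithful (False). Values are illustrative.
records = [
    {"model": "model-A", "language": "Mandarin", "hallucinated": False},
    {"model": "model-A", "language": "Hindi",    "hallucinated": True},
    {"model": "model-A", "language": "Farsi",    "hallucinated": True},
    {"model": "model-B", "language": "Mandarin", "hallucinated": False},
    {"model": "model-B", "language": "Hindi",    "hallucinated": False},
    {"model": "model-B", "language": "Farsi",    "hallucinated": True},
]

def hallucination_rates(records):
    """Return the fraction of hallucinated responses per (model, language)."""
    counts = defaultdict(lambda: [0, 0])  # (model, language) -> [hallucinated, total]
    for r in records:
        key = (r["model"], r["language"])
        counts[key][0] += int(r["hallucinated"])
        counts[key][1] += 1
    return {key: hall / total for key, (hall, total) in counts.items()}

for (model, language), rate in sorted(hallucination_rates(records).items()):
    print(f"{model:8s} {language:9s} hallucination rate = {rate:.2f}")
```

Comparing these rates across languages for the same model is one simple way to see whether Mandarin responses hallucinate less often than Hindi or Farsi responses.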
Why it matters?
This matters because knowing which languages suffer more hallucinations helps researchers improve AI models, making them more reliable and useful for speakers of a wider range of languages.
Abstract
LLMs generate fewer hallucinations in Mandarin than in Hindi and Farsi across multiple models.