Benchmarking Chinese Knowledge Rectification in Large Language Models

Tianhe Lu, Jizhan Fang, Yunzhi Yao, Xin Xu, Ningyu Zhang, Huajun Chen

2024-09-10

Summary

This paper presents a new way to evaluate and improve how large language models (LLMs) understand and generate knowledge in Chinese, especially when dealing with culture-specific material like classical poetry and idioms.

What's the problem?

Large language models can sometimes generate incorrect or nonsensical information (known as hallucinations), especially when working with complex languages like Chinese. This is particularly problematic when they encounter ancient poetry, proverbs, or idioms that require specific cultural knowledge. Existing methods do not adequately address these issues, leading to poor performance in generating accurate content.

What's the solution?

To tackle this problem, the authors introduce a benchmark called CKnowEdit: a new Chinese dataset covering seven types of knowledge drawn from classical texts, idioms, and online sources. This dataset helps identify the challenges LLMs face in understanding Chinese. The paper also evaluates current knowledge editing techniques on these models, highlighting areas where improvements are needed.
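To make the idea of a knowledge-editing benchmark concrete, here is a minimal sketch of what one editing record might look like. The field names and example values are illustrative assumptions, not the actual CKnowEdit schema; the paper's real data format is in the linked repository.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class EditSample:
    """One hypothetical knowledge-editing record.

    Field names are illustrative, not the actual CKnowEdit schema.
    """
    prompt: str          # query that currently elicits a wrong answer
    target_new: str      # corrected Chinese knowledge the edit should inject
    knowledge_type: str  # e.g. "ancient poetry", "idiom", "proverb"
    # Rephrased queries whose answers should also change after the edit:
    portability: List[str] = field(default_factory=list)
    # Unrelated queries whose answers must NOT change after the edit:
    locality: List[str] = field(default_factory=list)

# Example using a well-known line of classical poetry (Li Bai's "Quiet Night Thought"):
sample = EditSample(
    prompt="“床前明月光”的下一句是什么？",   # "What line follows 'Moonlight before my bed'?"
    target_new="疑是地上霜",                 # "I took it for frost on the ground"
    knowledge_type="ancient poetry",
    portability=["请背出《静夜思》的第二句。"],
    locality=["中国的首都是哪里？"],
)
```

Benchmarks of this kind typically score an edit on reliability (the prompt itself is fixed), generalization/portability (rephrasings are fixed too), and locality (unrelated knowledge is untouched).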

Why it matters?

This research is important because it aims to enhance the performance of language models when dealing with Chinese text, making them more reliable for tasks that involve cultural nuances. By improving how these models handle specific knowledge, we can create better tools for education, translation, and other applications that rely on accurate language understanding.

Abstract

While Large Language Models (LLMs) exhibit remarkable generative capabilities, they are not without flaws, particularly in the form of hallucinations. This issue is even more pronounced when LLMs are applied to specific languages and domains. For example, LLMs may generate nonsense information when handling Chinese ancient poetry, proverbs, or idioms, owing to the lack of specific knowledge. To this end, this paper introduces a benchmark for rectifying Chinese knowledge in LLMs via knowledge editing. Specifically, we introduce a new Chinese dataset, CKnowEdit, by collecting seven types of knowledge from various sources, including classical texts, idioms, and content from Baidu Tieba Ruozhiba, thereby accounting for the unique polyphony, antithesis, and logical constructs inherent in the Chinese language. Through the analysis of this dataset, we uncover the challenges faced by current LLMs in mastering Chinese. Furthermore, our evaluation of state-of-the-art knowledge editing techniques on this dataset unveils the substantial scope for advancement in the rectification of Chinese knowledge. Code and dataset are available at https://github.com/zjunlp/EasyEdit.