Linear Correlation in LM's Compositional Generalization and Hallucination

Letian Peng, Chenyang An, Shibo Hao, Chengyu Dong, Jingbo Shang

2025-02-10

Summary

This paper shows that large language models (LLMs) connect related pieces of knowledge through linear relationships, which helps them generalize but can also cause errors known as hallucinations.

What's the problem?

Language models often struggle to combine different pieces of knowledge correctly, especially when the relationships between them aren't clear. This can make them produce errors or unrealistic outputs, which limits their ability to handle complex reasoning tasks.

What's the solution?

The researchers found that a linear transformation links the model's next-token predictions for related prompts, for example mapping "X lives in the city of" to "X lives in the country of," so that a prediction of 'Paris' carries over to 'France.' This linear relationship survives large-scale fine-tuning. When it matches real-world relationships, it helps the model generalize updated knowledge; when it deviates from them, it causes hallucinations. They also showed that this linear behavior can be learned with a single feedforward network and pre-trained vocabulary representations.

Why does it matter?

This matters because understanding how LLMs generalize knowledge can help improve their accuracy and reliability. It also provides a way to predict when they might make mistakes, which is important for creating smarter and safer AI systems. By studying these linear relationships, researchers can design better models for tasks like answering questions or solving problems.

Abstract

The generalization of language models (LMs) is undergoing active debates, contrasting their potential for general intelligence with their struggles with basic knowledge composition (e.g., reverse/transition curse). This paper uncovers the phenomenon of linear correlations in LMs during knowledge composition. For explanation, there exists a linear transformation between certain related knowledge that maps the next token prediction logits from one prompt to another, e.g., "X lives in the city of" → "X lives in the country of" for every given X. This mirrors the linearity in human knowledge composition, such as Paris → France. Our findings indicate that the linear transformation is resilient to large-scale fine-tuning, generalizing updated knowledge when aligned with real-world relationships, but causing hallucinations when it deviates. Empirical results suggest that linear correlation can serve as a potential identifier of LM's generalization. Finally, we show such linear correlations can be learned with a single feedforward network and pre-trained vocabulary representations, indicating LM generalization heavily relies on the latter.
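
To make the idea of a linear transformation between logits concrete, here is a minimal sketch in Python. It is not the authors' code or experimental setup. It assumes synthetic data in place of real model outputs: `city_logits` and `country_logits` are hypothetical stand-ins for the next-token logits of the two prompts, and the sizes, noise level, and helper names (`fit_linear_map`, `pearson`) are illustrative choices. The sketch fits `W` and `b` by ordinary least squares and then measures how well the fitted map predicts held-out target logits.

```python
import numpy as np


def fit_linear_map(src_logits, tgt_logits):
    """Least-squares fit of tgt_logits ≈ src_logits @ W + b."""
    # Append a column of ones so the bias is estimated jointly with W.
    ones = np.ones((src_logits.shape[0], 1))
    design = np.hstack([src_logits, ones])
    coef, *_ = np.linalg.lstsq(design, tgt_logits, rcond=None)
    return coef[:-1], coef[-1]  # W has shape (n_src, n_tgt), b has shape (n_tgt,)


def pearson(a, b):
    """Pearson correlation between two arrays, flattened."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])


# Synthetic stand-in for LM outputs (an assumption, not real model logits):
# each row is one subject X, each column the logit of one candidate token.
rng = np.random.default_rng(0)
n_subjects, n_cities, n_countries = 200, 8, 3
true_W = rng.normal(size=(n_cities, n_countries))
true_b = rng.normal(size=n_countries)

city_logits = rng.normal(size=(n_subjects, n_cities))
noise = 0.1 * rng.normal(size=(n_subjects, n_countries))
country_logits = city_logits @ true_W + true_b + noise

# Fit on some subjects, then check the map on subjects it has not seen.
train, test = slice(0, 150), slice(150, None)
W, b = fit_linear_map(city_logits[train], country_logits[train])
predicted = city_logits[test] @ W + b
corr = pearson(predicted, country_logits[test])

print(f"held-out Pearson correlation: {corr:.3f}")
```

With real models, the two logit matrices would come from running the paired prompts over many subjects X and restricting the output to the relevant tokens. A high held-out correlation would then indicate that the two prompts are linearly related in the sense the abstract describes.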
