
Eliminating Position Bias of Language Models: A Mechanistic Approach

Ziqi Wang, Hanlin Zhang, Xiner Li, Kuan-Hao Huang, Chi Han, Shuiwang Ji, Sham M. Kakade, Hao Peng, Heng Ji

2024-07-04

Summary

This paper introduces a new approach to eliminating position bias in language models: the tendency to prioritize information based on where it appears in the input, which can hurt the models' performance and reliability.

What's the problem?

The main problem is that modern language models often weight content differently depending on where it appears in the text. This position bias can lead to unexpected errors and makes the models less reliable across tasks such as answering questions or judging between responses. It is caused by two components found in nearly all state-of-the-art models: causal attention, which tends to favor distant content, and relative positional encodings such as RoPE, which favor nearby content.
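The claim that relative positional encodings like RoPE favor nearby content can be illustrated with a small numerical sketch (a simplified RoPE implementation for illustration, not the paper's code): the attention score between an identical query and key is largest when their positions coincide and shrinks as the distance between them grows.

```python
import numpy as np

def rope_rotate(x, pos, base=10000.0):
    """Apply Rotary Position Embedding (RoPE) to vector x at position pos.
    Pairs of dimensions (x[i], x[i + d/2]) are rotated by angles
    pos * base**(-2i/d), so attention scores depend only on relative position."""
    d = len(x)
    half = d // 2
    freqs = base ** (-np.arange(half) * 2.0 / d)
    angles = pos * freqs
    x1, x2 = x[:half], x[half:]
    return np.concatenate([
        x1 * np.cos(angles) - x2 * np.sin(angles),
        x1 * np.sin(angles) + x2 * np.cos(angles),
    ])

# Score an identical query and key at growing distances:
# with RoPE, the dot product peaks at distance 0 and decays.
v = np.ones(64)
scores = [rope_rotate(v, 0) @ rope_rotate(v, d) for d in (0, 8, 16, 24, 32)]
```

Here `scores[0]` (distance 0) is the largest value, showing the "prefer nearby" effect the authors attribute to RoPE.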

What's the solution?

To address this issue, the authors propose a method called Position-INvariant inferencE (PINE) that eliminates position bias without any additional training. Instead of relying on the order of segments provided in the input, their method uses bidirectional attention between segments so that all of them are treated equally, and lets the model's own attention values decide the relative order of segments. This allows the model to weigh different pieces of information regardless of where they appear in the input. The authors demonstrate that this approach leads to significant improvements on tasks like retrieval-augmented question answering and evaluating reasoning.
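The masking side of this idea can be sketched as follows (a hypothetical simplification: the full PINE method also re-orders segments using attention values and adjusts positional encodings accordingly). Ordinary tokens keep the usual causal mask, while tokens inside the designated segments may also attend to every other segment, in either direction.

```python
import numpy as np

def pine_style_mask(segment_ids):
    """Build a boolean attention mask illustrating PINE's core idea.

    segment_ids[t] = -1 for ordinary (prompt/query) tokens,
                     k >= 0 for tokens belonging to segment k
                     (e.g. a retrieved document or an answer option).
    Returns M where M[i, j] is True if token i may attend to token j.
    """
    ids = np.asarray(segment_ids)
    n = len(ids)
    # Standard causal mask: token i sees tokens 0..i.
    mask = np.tril(np.ones((n, n), dtype=bool))
    # Inter-segment bidirectional attention: a token inside one segment
    # may also see all tokens of any *other* segment, regardless of position.
    in_seg = ids >= 0
    cross = in_seg[:, None] & in_seg[None, :] & (ids[:, None] != ids[None, :])
    return mask | cross

# Two segments (ids 0 and 1) sandwiched between prompt tokens (-1).
m = pine_style_mask([-1, 0, 0, 1, 1, -1])
print(m[1, 3])  # True: a token of segment 0 can "look ahead" at segment 1
```

Within a segment attention stays causal, and ordinary tokens still never attend to future positions, so only the relative ordering *between* segments is neutralized.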

Why it matters?

This research is important because it makes language models more reliable and robust. By eliminating position bias, models behave consistently no matter how the input segments are ordered, which helps in real-world applications such as judging between candidate answers or answering questions from retrieved documents. This advancement can lead to more trustworthy AI systems across a variety of fields.

Abstract

Position bias has proven to be a prevalent issue of modern language models (LMs), where the models prioritize content based on its position within the given context. This bias often leads to unexpected model failures and hurts performance, robustness, and reliability across various applications. Our mechanistic analysis attributes the position bias to two components employed in nearly all state-of-the-art LMs: causal attention and relative positional encodings. Specifically, we find that causal attention generally causes models to favor distant content, while relative positional encodings like RoPE prefer nearby ones based on the analysis of retrieval-augmented question answering (QA). Further, our empirical study on object detection reveals that position bias is also present in vision-language models (VLMs). Based on the above analyses, we propose to ELIMINATE position bias caused by different input segment orders (e.g., options in LM-as-a-judge, retrieved documents in QA) in a TRAINING-FREE ZERO-SHOT manner. Our method changes the causal attention to bidirectional attention between segments and utilizes model attention values to decide the relative orders of segments instead of using the order provided in input prompts, therefore enabling Position-INvariant inferencE (PINE) at the segment level. By eliminating position bias, models achieve better performance and reliability in downstream tasks where position bias widely exists, such as LM-as-a-judge and retrieval-augmented QA. Notably, PINE is especially useful when adapting LMs for evaluating reasoning pairs: it consistently provides 8 to 10 percentage points performance gains in most cases, and makes Llama-3-70B-Instruct perform even better than GPT-4-0125-preview on the RewardBench reasoning subset.