GroupRank: A Groupwise Reranking Paradigm Driven by Reinforcement Learning
Duolin Sun, Meixiu Long, Dan Yang, Yihan Jiao, Zhehao Tan, Jie Feng, Junjie Wang, Yue Shen, Peng Wei, Jian Wang, Jinjie Gu
2025-11-18
Summary
This paper focuses on improving how a Retrieval-Augmented Generation (RAG) system chooses the best documents for a query; RAG combines searching for information with using a large language model to create answers. The authors aim to make that document-selection (reranking) step more accurate.
What's the problem?
Currently, there are two main ways to re-rank documents after an initial search. One way, called 'pointwise,' looks at each document individually, which is easy but can miss how documents relate to each other – it might pick a good document but miss a *better* one in the set. The other way, 'listwise,' considers all documents together, but this becomes incredibly slow and difficult to manage when you have a lot of potential documents to choose from. It's a trade-off between accuracy and practicality.
What's the solution?
The researchers propose a new method called 'groupwise' reranking. Instead of looking at documents one by one or all at once, they divide the documents into smaller groups. The large language model then compares documents *within* each group to determine their relevance to the query. This keeps the flexibility of looking at documents individually while also allowing the model to understand relationships between them. They also developed a way to create realistic training data to help the model learn effectively, and a special training process to improve performance.
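The grouping-and-scoring procedure described above can be sketched in a few lines. This is a hedged illustration, not the paper's implementation: `score_group` is a hypothetical stand-in for the LLM call that jointly reads all documents in one group and returns a relevance score for each, and the group size and shuffling are assumptions.

```python
import random

def groupwise_rerank(query, docs, score_group, group_size=8, seed=0):
    """Rerank `docs` by scoring them in small groups.

    `score_group(query, group)` is a hypothetical stand-in for the LLM
    call: it sees all documents in one group jointly and returns one
    relevance score per document (the within-group comparison).
    """
    rng = random.Random(seed)
    docs = list(docs)
    rng.shuffle(docs)  # avoid grouping purely by initial retrieval order
    scores = {}
    for i in range(0, len(docs), group_size):
        group = docs[i:i + group_size]
        for doc, s in zip(group, score_group(query, group)):
            scores[doc] = s
    # Each document gets an individual score, so all groups
    # merge into a single global ranking (pointwise flexibility).
    return sorted(docs, key=lambda d: scores[d], reverse=True)
```

Because each document still receives its own score, the per-group results can be merged directly, which is what preserves the flexibility of pointwise reranking while gaining listwise-style comparison inside each group.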
Why it matters
This research is important because it addresses a key bottleneck in RAG systems. By making reranking more efficient and accurate, we can improve the quality of answers generated by these systems, especially when dealing with complex questions that require careful consideration of multiple sources. Better RAG systems mean more reliable and helpful AI assistants.
Abstract
Large Language Models have shown strong potential as rerankers to enhance the overall performance of RAG systems. However, existing reranking paradigms are constrained by a core theoretical and practical dilemma: Pointwise methods, while simple and highly flexible, evaluate documents independently, making them prone to the Ranking Myopia Trap of overlooking the relative importance between documents. In contrast, Listwise methods can perceive the global ranking context, but suffer from inherent List Rigidity, leading to severe scalability and flexibility issues when handling large candidate sets. To address these challenges, we propose Groupwise, a novel reranking paradigm. In this approach, the query and a group of candidate documents are jointly fed into the model, which performs within-group comparisons to assign individual relevance scores to each document. This design retains the flexibility of Pointwise methods while enabling the comparative capability of Listwise methods. We further adopt GRPO for model training, equipped with a heterogeneous reward function that integrates ranking metrics with a distributional reward aimed at aligning score distributions across groups. To overcome the bottleneck caused by the scarcity of high-quality labeled data, we also propose an innovative pipeline for synthesizing high-quality retrieval and ranking data; the resulting data can be leveraged not only for training the reranker but also for training the retriever. Extensive experiments on two reasoning-intensive retrieval benchmarks, BRIGHT and R2MED, validate the effectiveness of our approach.
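One plausible instantiation of the heterogeneous reward described in the abstract is sketched below. This is an assumption-laden illustration, not the paper's exact formulation: it combines an NDCG ranking term with a simple mean-alignment term (`distribution_reward`) as a stand-in for the distributional reward that keeps scores comparable across groups; the weighting `alpha` is also assumed.

```python
import math

def ndcg_reward(pred_scores, relevance):
    """NDCG of the ordering induced by pred_scores (ranking-metric term)."""
    order = sorted(range(len(pred_scores)),
                   key=lambda i: pred_scores[i], reverse=True)
    dcg = sum(relevance[i] / math.log2(rank + 2)
              for rank, i in enumerate(order))
    idcg = sum(r / math.log2(rank + 2)
               for rank, r in enumerate(sorted(relevance, reverse=True)))
    return dcg / idcg if idcg > 0 else 0.0

def distribution_reward(group_scores_a, group_scores_b):
    """Assumed proxy for the distributional term: penalize mismatch in
    score scale between two groups so merged rankings stay comparable."""
    mean_a = sum(group_scores_a) / len(group_scores_a)
    mean_b = sum(group_scores_b) / len(group_scores_b)
    return -abs(mean_a - mean_b)

def heterogeneous_reward(pred, relevance, other_group_pred, alpha=0.5):
    """Combine the ranking term with the cross-group alignment term."""
    return ndcg_reward(pred, relevance) + alpha * distribution_reward(pred, other_group_pred)
```

The design intent is that the first term rewards correct within-group ordering while the second discourages each group from drifting to its own score scale, since groupwise scores must remain comparable when the groups are merged into one ranking.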