ReLiK: Retrieve and LinK, Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget

Riccardo Orlando, Pere-Lluis Huguet-Cabot, Edoardo Barba, Roberto Navigli

2024-08-05

Summary

This paper presents ReLiK, a new system designed for quickly and accurately linking entities (like people or organizations) and extracting relationships between them from text. It uses a two-part model called a Retriever-Reader architecture to improve performance in natural language processing tasks.

What's the problem?

Entity linking and relation extraction are important tasks in understanding text, but existing methods often struggle with speed and accuracy. Traditional systems can be slow because they need to process each potential entity or relationship separately, which makes it hard to efficiently analyze large amounts of text.

What's the solution?

ReLiK addresses these issues with a Retriever-Reader architecture. The Retriever identifies candidate entities and relations for the text, and the Reader decides which of them are actually relevant and where they appear. A key innovation is the input representation: ReLiK appends all candidate entities and relations to the input text and processes everything in a single forward pass, rather than running one pass per candidate. This makes inference up to 40 times faster than competing methods while maintaining high accuracy, and the same architecture can handle both tasks simultaneously, making it more efficient than previous models.
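The single-pass idea can be illustrated with a small toy sketch (hypothetical code, not the authors' implementation; the string-matching "retriever" and "reader" stand in for the dense retriever and pre-trained language model described in the paper):

```python
# Toy sketch of ReLiK's Retriever-Reader flow. The key point: candidates
# are retrieved once, concatenated with the input text, and the Reader
# handles text + ALL candidates in one pass, not one pass per candidate.

def retrieve_candidates(text, index):
    """Stand-in retriever: return index entries whose surface form occurs in the text."""
    return [cand for cand in index if cand.lower() in text.lower()]

def build_reader_input(text, candidates, sep="[SEP]"):
    """ReLiK-style input representation: text followed by every candidate."""
    return f"{text} {sep} " + f" {sep} ".join(candidates)

def reader(reader_input, sep="[SEP]"):
    """Stand-in Reader: align each candidate with its span in a single pass."""
    text, *candidates = [part.strip() for part in reader_input.split(sep)]
    links = {}
    for cand in candidates:
        pos = text.lower().find(cand.lower())
        if pos != -1:
            links[cand] = (pos, pos + len(cand))  # character span of the mention
    return links

index = ["Rome", "Italy", "Paris"]
text = "Rome is the capital of Italy."
candidates = retrieve_candidates(text, index)          # ["Rome", "Italy"]
links = reader(build_reader_input(text, candidates))   # spans for both entities
```

In the real system the Reader is a transformer that contextualizes the text and all candidates jointly, so each candidate's representation is informed by the others; the per-candidate forward passes of earlier Retriever-Reader methods are what the 40x speedup eliminates.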

Why it matters?

This research is significant because it offers a more efficient way to extract useful information from text, which is essential for applications like knowledge graphs, automated research assistance, and data analysis. By improving how machines understand relationships in text, ReLiK can help organizations make better decisions based on the information they gather.

Abstract

Entity Linking (EL) and Relation Extraction (RE) are fundamental tasks in Natural Language Processing, serving as critical components in a wide range of applications. In this paper, we propose ReLiK, a Retriever-Reader architecture for both EL and RE, where, given an input text, the Retriever module undertakes the identification of candidate entities or relations that could potentially appear within the text. Subsequently, the Reader module is tasked to discern the pertinent retrieved entities or relations and establish their alignment with the corresponding textual spans. Notably, we put forward an innovative input representation that incorporates the candidate entities or relations alongside the text, making it possible to link entities or extract relations in a single forward pass and to fully leverage pre-trained language models' contextualization capabilities, in contrast with previous Retriever-Reader-based methods, which require a forward pass for each candidate. Our formulation of EL and RE achieves state-of-the-art performance on both in-domain and out-of-domain benchmarks while using academic budget training and with up to 40x faster inference compared to competitors. Finally, we show how our architecture can be used seamlessly for closed Information Extraction (cIE), i.e. EL + RE, setting a new state of the art by employing a shared Reader that simultaneously extracts entities and relations.