RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation

Daniel Fleischer, Moshe Berchansky, Moshe Wasserblat, Peter Izsak

2024-08-06

Summary

This paper introduces RAG Foundry, an open-source framework designed to enhance large language models (LLMs) for Retrieval-Augmented Generation (RAG) tasks. It streamlines the process of creating, training, and evaluating models that can retrieve information and generate text based on that information.

What's the problem?

Building effective RAG systems is complex and requires a deep understanding of data and design choices. Evaluating these systems is also challenging because it involves checking both how accurately they retrieve information and how well they generate text. This complexity makes it hard for researchers to develop and test new RAG techniques efficiently.

What's the solution?

RAG Foundry addresses these challenges by providing a unified workflow that integrates data creation, model training, inference (making predictions), and evaluation. This allows users to easily create datasets, train models, and assess their performance without having to manage each step separately. The framework has been tested with models like Llama-3 and Phi-3, showing consistent improvements in performance across various knowledge-intensive tasks.
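The workflow that the framework unifies can be pictured as a retrieve-augment-generate loop. The sketch below is purely illustrative and is not the RAG Foundry API: the function names are hypothetical, the retriever is a toy keyword-overlap ranker, and the generation step is a stub standing in for an LLM such as Llama-3 or Phi-3.

```python
# Minimal sketch of the RAG stages (retrieve -> augment -> generate).
# All names are hypothetical illustrations, not the RAG Foundry interface.

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (toy retriever)."""
    q_words = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Augment the query with retrieved context before generation."""
    context = "\n".join(f"- {d}" for d in docs)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

def generate(prompt: str) -> str:
    """Stub standing in for an LLM call conditioned on the prompt."""
    return "<model answer grounded in the retrieved context>"

corpus = [
    "RAG Foundry integrates data creation, training, inference and evaluation.",
    "Llama-3 and Phi-3 were fine-tuned with diverse RAG configurations.",
    "Unrelated document about cooking pasta.",
]
query = "What does RAG Foundry integrate?"
docs = retrieve(query, corpus)
answer = generate(build_prompt(query, docs))
```

In a real system the retriever would query a vector or keyword index over a knowledge source, and the generator would be the fine-tuned model; the point here is only the shape of the pipeline that the framework lets users configure end to end.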

Why it matters?

This research is important because it simplifies the development of advanced AI systems that can effectively combine retrieval and generation tasks. By making it easier for researchers and developers to work with RAG models, RAG Foundry can lead to better applications in areas like question answering, chatbots, and other interactive AI systems.

Abstract

Implementing Retrieval-Augmented Generation (RAG) systems is inherently complex, requiring deep understanding of data, use cases, and intricate design decisions. Additionally, evaluating these systems presents significant challenges, necessitating assessment of both retrieval accuracy and generative quality through a multi-faceted approach. We introduce RAG Foundry, an open-source framework for augmenting large language models for RAG use cases. RAG Foundry integrates data creation, training, inference and evaluation into a single workflow, facilitating the creation of data-augmented datasets for training and evaluating large language models in RAG settings. This integration enables rapid prototyping and experimentation with various RAG techniques, allowing users to easily generate datasets and train RAG models using internal or specialized knowledge sources. We demonstrate the framework's effectiveness by augmenting and fine-tuning Llama-3 and Phi-3 models with diverse RAG configurations, showcasing consistent improvements across three knowledge-intensive datasets. Code is released as open-source at https://github.com/IntelLabs/RAGFoundry.