FlexOlmo: Open Language Models for Flexible Data Use

Weijia Shi, Akshita Bhagia, Kevin Farhat, Niklas Muennighoff, Pete Walsh, Jacob Morrison, Dustin Schwenk, Shayne Longpre, Jake Poznanski, Allyson Ettinger, Daogao Liu, Margaret Li, Dirk Groeneveld, Mike Lewis, Wen-tau Yih, Luca Soldaini, Kyle Lo, Noah A. Smith, Luke Zettlemoyer, Pang Wei Koh, Hannaneh Hajishirzi, Ali Farhadi

2025-07-10

Summary

This paper introduces FlexOlmo, a new type of language model that can be trained on data from different sources without the underlying data ever being shared. It uses a mixture-of-experts architecture, in which separate modules, called experts, are trained independently on private datasets and then merged into a single model.
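The merging idea can be illustrated with a toy sketch. This is not the paper's implementation; the expert functions, dataset names, and uniform averaging below are hypothetical stand-ins for the actual architecture:

```python
import numpy as np

D = 4  # toy hidden size
rng = np.random.default_rng(0)

def make_expert(seed):
    """A toy feed-forward 'expert': a single linear map, standing in for
    a module trained independently on one (possibly private) dataset."""
    r = np.random.default_rng(seed)
    W = r.normal(size=(D, D))
    return lambda x: x @ W

# Experts trained separately by different data owners, then combined
# into one model with no joint retraining (hypothetical names).
experts = {"public": make_expert(1), "news": make_expert(2), "medical": make_expert(3)}

def moe_forward(x, active):
    """Combine outputs of only the opted-in experts (uniform weights here,
    as a simplification; the real model uses a learned router)."""
    outs = [experts[name](x) for name in active]
    return np.mean(outs, axis=0)

x = rng.normal(size=D)
y_all = moe_forward(x, ["public", "news", "medical"])
y_optout = moe_forward(x, ["public", "news"])  # "medical" owner opts out
```

The key point the sketch captures is that each expert is a self-contained function: one can be added or dropped at inference time without touching the others.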

What's the problem?

The problem is that most language models require all their training data to be gathered in one place, but data is often private, legally protected, or simply too large to share. This makes it hard for organizations to collaborate on building better models.

What's the solution?

The researchers designed FlexOlmo so that each expert is trained independently alongside a shared public model, which lets the experts be merged later without joint retraining. They also built a router that directs each input to the appropriate expert based on the data that expert was trained on, so users can include or exclude experts at inference time without any additional training.
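The routing step can be sketched as follows. This is a simplified, assumed mechanism, not the paper's exact router: each expert gets a fixed "domain embedding" summarizing its training data, and the router softmaxes input-domain similarity scores over only the currently opted-in experts:

```python
import numpy as np

D = 4  # toy hidden size
rng = np.random.default_rng(0)

# Hypothetical per-expert domain embeddings, one per data source.
domain_embs = {
    "public": rng.normal(size=D),
    "news": rng.normal(size=D),
    "code": rng.normal(size=D),
}

def route(x, active):
    """Return softmax routing weights over only the opted-in experts.
    Excluded experts simply never enter the score computation, so no
    retraining is needed when the active set changes."""
    scores = np.array([x @ domain_embs[name] for name in active])
    w = np.exp(scores - scores.max())  # stable softmax
    return dict(zip(active, w / w.sum()))

x = rng.normal(size=D)
w_full = route(x, ["public", "news", "code"])
w_sub = route(x, ["public", "news"])  # "code" excluded at inference time
```

Because the weights are renormalized over whichever experts remain, opting a data source out only redistributes routing mass; the rest of the model is untouched.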

Why it matters?

This matters because it makes it easier and safer for organizations to collaborate on AI development while keeping their data private. It also allows continuous updating and fine-grained control over how each data source influences the model, improving performance while respecting data privacy.

Abstract

FlexOlmo, a distributed language model using a mixture-of-experts architecture, allows independent training on closed datasets and flexible inference without further training, improving performance while respecting data privacy.