Seq vs Seq: An Open Suite of Paired Encoders and Decoders
Orion Weller, Kathryn Ricci, Marc Marone, Antoine Chaffin, Dawn Lawrie, Benjamin Van Durme
2025-07-17
Summary
This paper introduces the Ettin suite, a set of open language models that pairs encoder-only and decoder-only architectures trained under matched conditions, making it possible to study how well each type performs on different tasks.
What's the problem?
It is unclear which model architecture works best for tasks like classification, retrieval, and text generation. Existing encoders and decoders are trained on different data with different recipes, so comparisons between them are confounded, and it is also unknown whether a model of one type can be effectively adapted, through extra training, to tasks suited to the other type.
What's the solution?
The authors trained and evaluated paired encoder-only models, which are designed to understand and represent input text, and decoder-only models, which generate new text. They found that encoder-only models are better at tasks like classification and retrieval, while decoder-only models excel at generation. They also showed that adapting one type of model to the other type's tasks through continued training works less well than simply using the right architecture from the start.
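The key architectural difference between the two model types is the attention pattern: encoders let every token see the whole sequence, while decoders restrict each token to its left context so text can be generated one token at a time. A minimal sketch of this distinction (the function name and shapes are illustrative, not taken from the paper):

```python
import numpy as np

def attention_mask(seq_len: int, causal: bool) -> np.ndarray:
    """Return a boolean mask where entry (i, j) is True when
    position i is allowed to attend to position j."""
    if causal:
        # Decoder-only: each token attends only to itself and earlier
        # tokens, which is what enables left-to-right generation.
        return np.tril(np.ones((seq_len, seq_len), dtype=bool))
    # Encoder-only: every token attends to the full sequence in both
    # directions, which suits classification and retrieval.
    return np.ones((seq_len, seq_len), dtype=bool)

print(attention_mask(3, causal=False).astype(int))  # all ones
print(attention_mask(3, causal=True).astype(int))   # lower triangular
```

The bidirectional mask gives each token richer context for building representations, while the causal mask trades that away for the ability to predict the next token.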
Why it matters?
This matters because it helps researchers and engineers pick the right architecture for a given task, saving training time and compute while improving performance by matching models to their intended purpose.
Abstract
The Ettin suite of paired models demonstrates that each architecture performs best on the tasks it was designed for: encoder-only models excel at classification and retrieval, while decoder-only models excel at generation. Adapting a model of one type to the other type's tasks through continued training is less effective than using the appropriate architecture from the start.