Optimizing Diversity and Quality through Base-Aligned Model Collaboration

Yichen Wang, Chenghao Yang, Tenghao Huang, Muhao Chen, Jonathan May, Mina Lee

2025-11-12

Summary

This paper introduces a new method called Base-Aligned Model Collaboration, or BACo, which aims to get the best of both worlds from large language models: high-quality, helpful responses *and* diverse, creative outputs.

What's the problem?

Large language models have become really good at giving safe and useful answers, but in the process, they've become too similar in how they respond. You often get the same answer no matter how you ask the question, which limits their creativity and ability to explore different ideas. Existing attempts to increase diversity often sacrifice the quality of the responses or require a lot of extra computing power.

What's the solution?

BACo works by cleverly combining two versions of a language model during the response generation process: a 'base' model and an 'aligned' model. At each step of creating a response, the system decides which model should contribute the next word, based on how uncertain the model is about its prediction and what role that word plays in the overall meaning. This dynamic collaboration allows BACo to balance quality and diversity without needing to retrain the models or do a lot of extra processing.
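To make the idea concrete, here is a minimal, illustrative sketch of token-level routing between a base and an aligned model. The entropy threshold `tau` and the specific routing rule (defer to the base model when the aligned model is uncertain, keep the aligned model's prediction when it is confident) are simplifying assumptions for illustration; the paper explores a family of routers, including ones that also consider the token's semantic role.

```python
import numpy as np

def softmax(logits):
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def entropy(p):
    # Shannon entropy of a probability distribution (natural log)
    return float(-np.sum(p * np.log(p + 1e-12)))

def route_next_token(base_logits, aligned_logits, tau=1.0, rng=None):
    """Decide which model decodes the next token.

    Illustrative rule (an assumption, not the paper's exact router):
    if the aligned model's next-token distribution has high entropy,
    many continuations are plausible, so sample from the base model
    to inject diversity; otherwise keep the aligned model's
    higher-quality, confident prediction.
    """
    rng = rng or np.random.default_rng(0)
    p_aligned = softmax(np.asarray(aligned_logits, dtype=float))
    if entropy(p_aligned) > tau:      # uncertain -> let the base model explore
        p = softmax(np.asarray(base_logits, dtype=float))
        source = "base"
    else:                             # confident -> trust the aligned model
        p = p_aligned
        source = "aligned"
    token_id = int(rng.choice(len(p), p=p))
    return token_id, source
```

For example, a sharply peaked aligned distribution (logits like `[9, 0, 0, 0]`) has near-zero entropy and routes to the aligned model, while a near-uniform one (logits like `[1, 1, 1, 1]`) exceeds the threshold and routes to the base model. Because routing happens per token during a single decoding pass, no retraining or repeated sampling is required.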

Why it matters?

This research shows that we can improve the creativity and variety of large language model outputs without making them less helpful or accurate. It suggests a promising new direction for building AI systems that are both reliable and imaginative, offering more control over the kind of responses they generate.

Abstract

Alignment has greatly improved the output quality of large language models (LLMs) at the cost of diversity, yielding highly similar outputs across generations. We propose Base-Aligned Model Collaboration (BACo), an inference-time token-level model collaboration framework that dynamically combines a base LLM with its aligned counterpart to optimize diversity and quality. Inspired by prior work (Fei et al., 2025), BACo employs routing strategies that determine, at each token, from which model to decode based on next-token prediction uncertainty and the predicted content's semantic role. Prior diversity-promoting methods, such as retraining, prompt engineering, and multi-sampling methods, improve diversity but often degrade quality or require costly decoding or post-training. In contrast, BACo achieves both high diversity and quality post hoc within a single pass, while offering strong controllability. We explore a family of routing strategies; across three open-ended generation tasks and 13 metrics covering diversity and quality, BACo consistently surpasses state-of-the-art inference-time baselines. With our best router, BACo achieves a 21.3% joint improvement in diversity and quality. Human evaluations mirror these improvements. The results suggest that collaboration between base and aligned models can optimize and control diversity and quality.