
Configurable Preference Tuning with Rubric-Guided Synthetic Data

Víctor Gallego

2025-06-16

Summary

This paper introduces Configurable Preference Tuning (CPT), a new way to let large language models change how they behave based on easy-to-understand instructions from humans. Instead of having one fixed preferred way of answering, a model can adjust its style or content depending on the specific guidelines given to it at use time.

What's the problem?

The problem is that most models are trained with a single set of preferences, meaning they always respond in one fixed way and can't easily adapt to different situations or user needs. This limits how flexible and useful the models can be because they can't change behavior without retraining.

What's the solution?

The solution is to create synthetic training data guided by detailed rubrics: structured instructions that describe exactly how the model should behave, such as changing its tone or following certain rules. The model is fine-tuned on this data so it learns to respond differently depending on the rubric provided at runtime, letting it adjust its behavior dynamically without being retrained for each new style or preference.
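As a rough illustration of the idea, a rubric can be prepended to each training prompt and paired with a response that follows it ("chosen") and one that violates it ("rejected"), in the style of preference-tuning datasets. This sketch is illustrative only; the rubric texts, function names, and data layout are assumptions, not the paper's actual format.

```python
# Illustrative sketch of rubric-conditioned preference data
# (rubric wording and field names are hypothetical, not from the paper).

RUBRICS = {
    "formal": "Respond in a formal, precise tone; avoid slang.",
    "playful": "Respond in a playful, informal tone; humor is welcome.",
}

def make_preference_pair(prompt, rubric_name, preferred, rejected):
    """Package one training example. The rubric is prepended to the
    prompt so the model learns to condition its behavior on it."""
    rubric = RUBRICS[rubric_name]
    return {
        "prompt": f"{rubric}\n\nUser: {prompt}",
        "chosen": preferred,   # response that follows the rubric
        "rejected": rejected,  # response that violates the rubric
    }

pair = make_preference_pair(
    "Explain what an API is.",
    "formal",
    "An API is a defined interface through which software components communicate.",
    "lol an API is just how apps talk to each other, ez.",
)
```

At inference time, the same mechanism applies: swapping in a different rubric string steers the fine-tuned model toward a different style without any further training.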

Why it matters?

This matters because it makes AI language models more flexible and controllable, letting people customize how the AI answers in real time. It also helps the AI better match complicated and changing human preferences, making interactions more natural and useful without extra training each time preferences change.

Abstract

Configurable Preference Tuning enables language models to dynamically adjust their behavior based on human-interpretable directives, using rubric-guided preference data for fine-tuning and inference-time modulation.